Creating a Pipeline Tutorial

This tutorial will guide you through the process of creating an end-to-end pipeline with Octopipe. By the end of this guide, you will have a working pipeline that extracts data from a source, transforms it, and loads it into a destination.

Overview

Octopipe’s streamlined CLI and opinionated architecture make it easy to build data pipelines quickly. This tutorial covers:

  • Project initialization
  • Authentication
  • Adding a data source and destination
  • Defining a transformation
  • Creating, starting, and monitoring a pipeline

Step 1: Initialize Your Project

Begin by setting up your project directory. This command creates the necessary configuration files and folder structure:

octopipe init --name my_pipeline --description "ETL pipeline for sales data" --local

The --local flag indicates that you are setting up the project for local development.

Step 2: Authenticate

Authenticate your CLI session with your API key to access Octopipe features:

octopipe login --api-key YOUR_API_KEY_HERE

Note:

Make sure your API key is valid. If your account uses username/password authentication, you can supply those credentials instead.

Step 3: Add a Data Source

Configure a data source from which to extract your data. For example, to add a sales API:

octopipe source add --name sales_api --type api --option url=https://api.sales.com/data --option token=SALES_TOKEN

Key Points:

  • --name assigns a unique identifier.
  • --type specifies the kind of source.
  • Additional --option flags supply connection details.
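
Each --option flag is a simple key=value pair. As an illustration only (not Octopipe's actual implementation), repeated flags like these can be folded into a configuration dictionary:

```python
def parse_options(args):
    """Collect the value following each --option flag into a dict."""
    options = {}
    it = iter(args)
    for arg in it:
        if arg == "--option":
            # Split "key=value" on the first "=" only, so values may contain "=".
            key, _, value = next(it).partition("=")
            options[key] = value
    return options

# The options from the sales_api example above:
args = ["--option", "url=https://api.sales.com/data", "--option", "token=SALES_TOKEN"]
print(parse_options(args))
# {'url': 'https://api.sales.com/data', 'token': 'SALES_TOKEN'}
```

This is why the flag can be repeated freely: each occurrence contributes one more entry to the source's connection settings.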

Step 4: Add a Data Destination

Next, configure where the data will be loaded. In this example, we add a PostgreSQL destination:

octopipe destination add --name sales_db --type postgres --option host=localhost --option port=5432 --option user=dbuser --option password=secret --option database=sales

Details:

This command sets up the connection parameters for your destination.
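
For reference, the options above correspond to a standard libpq-style PostgreSQL connection URL. The mapping below is illustrative only; Octopipe manages the connection internally:

```python
# Destination options from the sales_db command above.
options = {
    "host": "localhost",
    "port": "5432",
    "user": "dbuser",
    "password": "secret",
    "database": "sales",
}

# Assemble the equivalent libpq-style connection URL.
dsn = "postgresql://{user}:{password}@{host}:{port}/{database}".format(**options)
print(dsn)  # postgresql://dbuser:secret@localhost:5432/sales
```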

Step 5: Define a Transformation

Create a transformation that maps the source API schema to your destination schema. Use a schema file to define the mapping:

octopipe transform add --name sales_transform --source sales_api --destination sales_db --schema-file ./schemas/sales_schema.json

What Happens:

The command generates a transformation layer which you can review and approve before execution.
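
The exact format of the schema file is Octopipe-specific; as a purely hypothetical sketch (all field names invented for this example), a mapping file for sales data might look like:

```json
{
  "mappings": [
    { "source": "order_id",   "target": "order_id", "type": "integer" },
    { "source": "amount_usd", "target": "amount",   "type": "numeric" },
    { "source": "sold_at",    "target": "sold_at",  "type": "timestamp" }
  ]
}
```

Consult the schema reference in the documentation for the fields your version of Octopipe actually supports.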

Step 6: Create the Pipeline

Once all components are in place, create the pipeline:

octopipe pipeline create --name daily_sales --source sales_api --destination sales_db --transform sales_transform --schedule "0 0 * * *"

Schedule Explanation:

The cron expression "0 0 * * *" schedules the pipeline to run daily at midnight.
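
A cron expression has five fields, read left to right. This explanatory sketch labels the fields of the schedule used above:

```python
# Standard crontab field order: minute, hour, day of month, month, day of week.
FIELDS = ["minute", "hour", "day of month", "month", "day of week"]

schedule = "0 0 * * *"
parsed = dict(zip(FIELDS, schedule.split()))
print(parsed)
# {'minute': '0', 'hour': '0', 'day of month': '*', 'month': '*', 'day of week': '*'}
```

Minute 0 of hour 0, with every other field a wildcard, means the pipeline fires once per day at 00:00.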

Step 7: Start the Pipeline

Launch your pipeline to begin processing data:

octopipe start daily_sales

Verification:

Ensure the pipeline starts correctly by checking for confirmation messages.

Step 8: Monitor Pipeline Execution

To view real-time logs and monitor the pipeline, run:

octopipe logs daily_sales --follow

Tip:

Use additional options such as --tail 50 to view the last 50 lines if needed.

Conclusion

Congratulations! You have successfully created and launched your first Octopipe pipeline. This tutorial covered the entire process from initialization to monitoring. For further customization and troubleshooting, refer to the other sections of our documentation.

Happy piping!