Self-Hosting Octopipe
Octopipe is designed to provide an outstanding local development experience. Self-hosting enables you to test and run pipelines on your own infrastructure, ensuring you have full control over the environment and can debug issues in real time.
Why Self-Host?
- Local Development:
Focus on rapid development and testing without the overhead of cloud deployment.
- Real-Time Monitoring:
Access detailed logs and status updates to troubleshoot and optimize pipeline performance.
- Full Control:
Customize your environment to suit specific development needs.
Setting Up Your Local Environment
Prerequisites
Ensure your system meets the following requirements:
- Python 3.8+ installed.
- Node.js and npm installed.
- Docker and Docker Compose (recommended for managing multiple services).
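The prerequisite list above can be checked with a small script. The following is a minimal sketch in POSIX sh that reports which of the required tools are on your PATH (it does not verify version numbers, such as Python 3.8+):

```shell
#!/bin/sh
# Report whether each tool needed for a local Octopipe setup is installed.
# Tool names mirror the prerequisites listed above; this is illustrative only.
check_prereqs() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s: found\n' "$tool"
    else
      printf '%s: MISSING\n' "$tool"
    fi
  done
}

check_prereqs python3 node npm docker git
```

Run it once before starting; any line reading MISSING points at a tool to install first.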
- Git for source control.
Step 1: Clone the Repository
Clone the Octopipe repository from GitHub:
git clone https://github.com/your-org/octopipe.git
cd octopipe
Step 2: Install Dependencies
Install Python dependencies:
pip install -r requirements.txt
If the project includes Node.js dependencies, install them as well:
npm install
Step 3: Set Up Docker Compose
For a self-hosted setup, Docker Compose can launch all required services (Meltano, Airflow, Kafka, Spark, etc.). Create or update the docker-compose.yml file with the required services:
version: '3.8'
services:
  octopipe:
    image: your-org/octopipe:latest
    ports:
      - "8000:8000"
    environment:
      - OCTOPIPE_ENV=local
  airflow:
    image: apache/airflow:2.2.2
    ports:
      - "8080:8080"
  kafka:
    image: confluentinc/cp-kafka:latest
  spark:
    image: bitnami/spark:latest
- Tip: Customize this configuration to suit your environment and available resources.
Step 4: Launch the Environment
Start all services using Docker Compose:
docker compose up -d
This single command brings up every required service in the background, making local development easier to manage.
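After starting the stack, it can take a few seconds before the Octopipe service (published on port 8000 in the compose file above) begins answering. One way to wait for it is a small polling helper; note that the root path "/" is an assumption here, not a documented endpoint, and curl is assumed to be installed:

```shell
#!/bin/sh
# Poll an HTTP URL until it answers, or give up after N tries.
# Illustrative sketch; the endpoint path is an assumption.
wait_for_service() {
  url="$1"; tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "up"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "timed out"
  return 1
}

# Example: wait_for_service "http://localhost:8000/" 30
```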
Running and Testing Pipelines Locally
- Initialize a New Pipeline:
octopipe init --name local_pipeline --description "Local development pipeline" --local
- Manage Components:
Add data sources, destinations, and transformations as your project requires.
- Start and Monitor Pipelines:
octopipe start local_pipeline
octopipe logs local_pipeline --follow
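The commands above can be wrapped in a single helper script. This is only a sketch: it replays the documented octopipe commands, and falls back to printing them (prefixed with "+") when the CLI is not on your PATH:

```shell
#!/bin/sh
# Create, start, and tail a local pipeline in one go.
# If octopipe is not installed, commands are echoed instead of run.
PIPELINE="${1:-local_pipeline}"
command -v octopipe >/dev/null 2>&1 || DRY_RUN=1

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run octopipe init --name "$PIPELINE" --description "Local development pipeline" --local
run octopipe start "$PIPELINE"
run octopipe logs "$PIPELINE" --follow
```

Setting DRY_RUN=1 explicitly is also a convenient way to preview what the script would do.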
Monitoring and Debugging
- Real-Time Logs:
Use the logs command to stream output to your terminal for on-the-fly debugging.
- Status Checks:
Regularly check pipeline status with:
octopipe status local_pipeline
- Step-by-Step Debugging:
If a pipeline fails, stop it, inspect the logs, adjust the configuration, and restart:
octopipe stop local_pipeline
octopipe start local_pipeline
Tips for an Amazing Local Experience
- Use a Dedicated Environment:
Run Octopipe in a separate virtual machine or container to avoid conflicts with other applications.
- Automate Routine Tasks:
Use scripts to automate repetitive tasks such as starting and stopping services.
- Document Local Configurations:
Keep notes on any local tweaks to facilitate quick troubleshooting and team onboarding.
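As one way to apply the automation tip, a small wrapper can bring the whole stack and pipeline up or down together. The function below is a hypothetical sketch: it chains docker compose with the octopipe commands shown earlier, and the pipeline name local_pipeline is taken from the examples above:

```shell
#!/bin/sh
# devenv: start or stop the local stack and pipeline together.
# Hypothetical helper; all commands mirror earlier sections.
devenv() {
  case "$1" in
    up)
      docker compose up -d && octopipe start local_pipeline
      ;;
    down)
      octopipe stop local_pipeline
      docker compose down
      ;;
    *)
      echo "usage: devenv up|down"
      return 1
      ;;
  esac
}

# Example: devenv up
```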
Conclusion
Self-hosting Octopipe offers a powerful and flexible way to develop, test, and optimize your data pipelines locally. With detailed logs, easy management of services through Docker Compose, and robust CLI tools, you can enjoy a development experience that is both efficient and scalable.
Embrace the freedom of local development, and fine-tune your pipelines before deploying them to production!