Transform
octopipe transform
Command Reference
The octopipe transform command manages data transformation settings within your pipelines. It bridges the gap between your source data and the destination schema by mapping the type-safe API schema to the labeled database schema.
Purpose
- Transformation Management: Create, update, and remove transformation logic that processes data between source and destination.
- Schema Mapping: Automate the mapping process to ensure data consistency and type safety.
- User Approval: Provide an opportunity for users to review and adjust transformation logic before deployment.
Usage
Subcommands
add
• Purpose: Add a new transformation.
• Usage Example:
• Options:
• --name <transform_name>: Unique name for the transformation.
• --source <source_name>: The name of the data source.
• --destination <destination_name>: The target data destination.
• --schema-file <path>: Path to the schema file that defines the mapping.
• --option <key>=<value>: Additional transformation options, if required.
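As a rough illustration of how the options above combine, the following Python sketch assembles an argument list for octopipe transform add. Only the flag names come from this reference. The helper build_add_command, every value passed to it, and the idea of repeating --option for multiple pairs are assumptions made for the example.

```python
import shlex


def build_add_command(name, source, destination, schema_file, options=None):
    """Assemble an argv list for `octopipe transform add` from the documented flags."""
    argv = [
        "octopipe", "transform", "add",
        "--name", name,
        "--source", source,
        "--destination", destination,
        "--schema-file", schema_file,
    ]
    # Assumption: several extra options are passed by repeating --option;
    # the reference itself only shows a single <key>=<value> pair.
    for key, value in (options or {}).items():
        argv += ["--option", f"{key}={value}"]
    return argv


# All values below are hypothetical placeholders, not names from a real project.
argv = build_add_command(
    name="users_transform",
    source="users_api",
    destination="users_db",
    schema_file="./schemas/users.json",
    options={"mode": "strict"},
)

# Print a shell-ready command line; pass `argv` to subprocess.run to execute it.
print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids shell quoting problems when names or paths contain spaces.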
list
• Purpose: List all configured transformations.
• Usage Example:
update
• Purpose: Update an existing transformation.
• Usage Example:
remove
• Purpose: Remove a transformation.
• Usage Example:
Detailed Behavior
• Mapping Process: The command reads the type-safe API schema and the labeled database schema, then generates a transformation mapping that is sent for user approval.
• Schema Validation: The provided schema file is validated to ensure that all fields align correctly between source and destination.
• Execution Preparation: Once approved, the transformation is written to Spark for execution during pipeline runs.
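The mapping, validation, and approval steps described above can be pictured with a small Python sketch. It is not octopipe's implementation: the schema representation (plain dicts of field name to type name), the name-based matching rule, and the functions propose_mapping and approve are all assumptions chosen to keep the example short.

```python
def propose_mapping(api_schema, db_schema):
    """Pair each API field with a same-named column and collect alignment problems."""
    mapping, problems = {}, []
    for field, api_type in api_schema.items():
        if field not in db_schema:
            problems.append(f"{field}: no matching column in destination")
        elif db_schema[field] != api_type:
            problems.append(f"{field}: type {api_type} does not match {db_schema[field]}")
        else:
            mapping[field] = field
    # Destination columns that nothing maps to are also reported.
    for column in db_schema:
        if column not in api_schema:
            problems.append(f"{column}: destination column has no source field")
    return mapping, problems


def approve(mapping, problems, confirm=input):
    """Return True only if validation passed and the reviewer accepts the mapping."""
    if problems:
        return False
    answer = confirm(f"Apply mapping {mapping}? [y/N] ")
    return answer.strip().lower() == "y"


# Hypothetical schemas: one field is named differently on each side.
api_schema = {"id": "int", "email": "str", "signup_ts": "datetime"}
db_schema = {"id": "int", "email": "str", "created_at": "datetime"}

mapping, problems = propose_mapping(api_schema, db_schema)
for problem in problems:
    print("validation:", problem)
```

In this sketch a mapping with outstanding validation problems is never offered for approval, which mirrors the order given above: validate first, then ask the user, then hand the result to the execution engine.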
Examples
Adding a Transformation
Listing Transformations
Updating a Transformation
Removing a Transformation
Best Practices
• Review Mappings: Always review the auto-generated mappings to ensure accuracy.
• Use Configuration Files: For complex transformations, use a schema file to document the expected field mappings.
• Iterative Testing: Test transformations in isolation before integrating them into full pipelines.
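One way to test a transformation in isolation is to apply its field mapping to a handful of sample records before any pipeline is involved. The sketch below assumes a mapping is a dict from source field to destination column. The function apply_mapping and the sample data are hypothetical and only illustrate the practice.

```python
def apply_mapping(record, mapping):
    """Rename source fields to destination columns; fail loudly on missing fields."""
    missing = [src for src in mapping if src not in record]
    if missing:
        raise KeyError(f"record is missing source fields: {missing}")
    # Fields that are not part of the mapping are dropped from the output row.
    return {dest: record[src] for src, dest in mapping.items()}


def test_apply_mapping():
    mapping = {"id": "user_id", "email": "email_address"}
    row = apply_mapping({"id": 7, "email": "a@example.com", "extra": 1}, mapping)
    assert row == {"user_id": 7, "email_address": "a@example.com"}


def test_missing_field_is_rejected():
    try:
        apply_mapping({"id": 7}, {"id": "user_id", "email": "email_address"})
    except KeyError:
        return
    raise AssertionError("expected a KeyError for the missing email field")


test_apply_mapping()
test_missing_field_is_rejected()
```

Checks like these run in milliseconds and catch renamed or missing fields before a full pipeline run does.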
Troubleshooting
• Mapping Errors: If the mapping fails, check the schema file for discrepancies.
• Validation Issues: Use verbose mode (--verbose) to get detailed error messages during transformation creation.
• Approval Delays: Do not bypass the interactive approval process; user input is crucial for accuracy.
Conclusion
The octopipe transform command is critical for creating the bridge between your source data and destination requirements. By leveraging the detailed options and following best practices, you can create reliable transformation logic that maintains data integrity throughout your pipeline.