octopipe transform Command Reference

The octopipe transform command manages data transformation settings within your pipelines. It bridges your source data and the destination schema by mapping the type-safe API schema to the labeled database schema.

Purpose

  • Transformation Management: Create, update, and remove transformation logic that processes data between source and destination.
  • Schema Mapping: Automate the mapping process to ensure data consistency and type safety.
  • User Approval: Provide an opportunity for users to review and adjust transformation logic before deployment.

Usage

octopipe transform <subcommand> [options]

Subcommands

add

Purpose: Add a new transformation.

Usage Example:

octopipe transform add --name sales_transform --source sales_api --destination sales_db --schema-file ./schemas/sales_schema.json

Options:

--name <transform_name>: Unique name for the transformation.

--source <source_name>: The name of the data source.

--destination <destination_name>: The target data destination.

--schema-file <path>: Path to the schema file that defines the mapping.

--option <key>=<value>: Additional transformation options, if required.

list

Purpose: List all configured transformations.

Usage Example:

octopipe transform list

update

Purpose: Update an existing transformation.

Usage Example:

octopipe transform update sales_transform --option new_parameter=value

remove

Purpose: Remove a transformation.

Usage Example:

octopipe transform remove sales_transform

Detailed Behavior

Mapping Process:

The command reads the type-safe API schema and the labeled database schema, then generates a transformation mapping and presents it to the user for approval.
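
To make the mapping step concrete, here is a minimal Python sketch of how a mapping proposal could be derived from two schemas. It is not octopipe's implementation: the schema shape (plain dicts of field name to type), the propose_mapping and normalize helpers, and the name-matching rule are all assumptions chosen for illustration.

```python
def normalize(name: str) -> str:
    """Lowercase and drop underscores/hyphens so 'orderId' and 'order_id' compare equal."""
    return name.replace("_", "").replace("-", "").lower()


def propose_mapping(api_schema: dict, db_schema: dict) -> dict:
    """Pair source fields with destination columns by normalized name.

    Hypothetical helper: both schemas are assumed to be {field_name: type_name}.
    """
    db_by_key = {normalize(col): col for col in db_schema}
    mapped, unmapped = {}, []
    for field in api_schema:
        column = db_by_key.get(normalize(field))
        if column is None:
            unmapped.append(field)  # no counterpart found; needs a manual decision
        else:
            mapped[field] = column
    return {"mapped": mapped, "unmapped": unmapped}


# Illustrative schemas, not taken from a real octopipe project
api_schema = {"orderId": "int", "totalAmount": "float", "couponCode": "str"}
db_schema = {"order_id": "BIGINT", "total_amount": "DOUBLE"}

proposal = propose_mapping(api_schema, db_schema)
print(proposal["mapped"])    # pairs a reviewer would be asked to approve
print(proposal["unmapped"])  # source fields with no matching column
```

Keeping unmatched fields in a separate list, rather than silently dropping them, gives the reviewer something explicit to act on during approval.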

Schema Validation:

The provided schema file is validated to ensure that all fields align correctly between source and destination.
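
The kind of alignment check described here can be sketched in Python. The schema file layout below (a JSON object with source_fields, destination_fields, and mappings keys) is an assumption made for this example, and find_misalignments is a hypothetical helper; octopipe's actual schema format and validation rules may differ.

```python
import json


def find_misalignments(schema: dict) -> list:
    """Return human-readable problems; an empty list means the mapping lines up."""
    source = set(schema["source_fields"])
    destination = set(schema["destination_fields"])
    problems = []
    for src, dst in schema["mappings"].items():
        if src not in source:
            problems.append(f"unknown source field: {src}")
        if dst not in destination:
            problems.append(f"unknown destination field: {dst}")
    # Destination fields that no mapping writes to
    for column in sorted(destination - set(schema["mappings"].values())):
        problems.append(f"unmapped destination field: {column}")
    return problems


# Assumed schema layout, inlined here instead of being read from a file
schema = json.loads("""
{
  "source_fields": ["id", "amount"],
  "destination_fields": ["sale_id", "amount", "region"],
  "mappings": {"id": "sale_id", "amount": "amount", "currency": "currency"}
}
""")

problems = find_misalignments(schema)
for problem in problems:
    print(problem)
```

Running a check like this on a schema file before calling octopipe transform add can surface misnamed fields early.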

Execution Preparation:

Once approved, the transformation is written to Spark for execution during pipeline runs.

Examples

Adding a Transformation

octopipe transform add --name sales_transform --source sales_api --destination sales_db --schema-file ./schemas/sales_schema.json

Listing Transformations

octopipe transform list

Updating a Transformation

octopipe transform update sales_transform --option new_parameter=value

Removing a Transformation

octopipe transform remove sales_transform

Best Practices

Review Mappings:

Always review the auto-generated mappings to ensure accuracy.

Use Configuration Files:

For complex transformations, use a schema file to document the expected field mappings.

Iterative Testing:

Test transformations in isolation before integrating them into full pipelines.
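
One way to test in isolation is to exercise the transformation logic against a few hand-written records before any pipeline run. The sketch below uses plain Python, with a hypothetical transform_sale function standing in for your transformation; the field names are invented for illustration, and the code does not call octopipe or Spark.

```python
def transform_sale(record: dict) -> dict:
    """Hypothetical transformation: rename fields and convert the amount to integer cents."""
    return {
        "sale_id": int(record["id"]),
        "amount_cents": round(float(record["amount"]) * 100),
    }


def test_transform_sale() -> None:
    result = transform_sale({"id": "42", "amount": "19.99"})
    assert result == {"sale_id": 42, "amount_cents": 1999}


def test_missing_field_is_rejected() -> None:
    try:
        transform_sale({"id": "42"})
    except KeyError:
        return  # expected: the record lacks 'amount'
    raise AssertionError("expected KeyError for a record without 'amount'")


# Run the checks directly; a test runner such as pytest would also discover them
test_transform_sale()
test_missing_field_is_rejected()
```

Covering one well-formed record and one malformed record is usually enough to catch renamed or missing fields before the transformation is wired into a full pipeline.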

Troubleshooting

Mapping Errors:

Check the schema file for discrepancies if the mapping fails.

Validation Issues:

Use the verbose mode (--verbose) to get detailed error messages during transformation creation.

Approval Delays:

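If a transformation appears stalled, check whether it is still awaiting approval. Do not bypass the interactive approval step: the transformation is written to Spark only once it is approved, and user review is crucial for accuracy.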

Conclusion

The octopipe transform command builds the bridge between your source data and your destination requirements. By using the options described above and following the best practices, you can create reliable transformation logic that maintains data integrity throughout your pipeline.