Type Safe API Generation in Octopipe
The type safe API is one of Octopipe’s key innovations. By automating the generation of a strongly-typed API for each connector, Octopipe minimizes runtime errors and improves developer productivity. This document details how the type safe API is generated, its underlying principles, and the benefits it brings to your data pipeline.
Introduction
- Objective: To provide a clear, robust, and error-resistant API interface for all data connectors.
- Method: Combining direct API calls with LLM-assisted type derivation to ensure that every data field is accurately typed.
The Generation Process
- API Invocation:
  - Octopipe starts by calling the external API through a configured connector.
  - The raw response is captured and analyzed to extract the data structure.
- LLM-Assisted Type Derivation:
  - A Large Language Model (LLM) analyzes the API response.
  - The LLM suggests appropriate data types based on common patterns and data examples.
  - This approach is intended to keep the derived types accurate and consistent across different endpoints; the validation step described under Key Features checks them against the data the connector actually returns.
- Schema Construction:
  - Using the insights from the LLM, Octopipe constructs a type safe API schema.
  - The schema is defined in a language-agnostic manner, making it easy to integrate with various programming languages.
  - This schema is then used as a contract for data processing throughout the pipeline.
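The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Octopipe's implementation: the raw response is a hard-coded sample rather than a real connector call, and the hypothetical `infer_type` function is a simple heuristic standing in for the LLM step. The function names and the type labels are chosen for the example.

```python
import json
from datetime import datetime


def infer_type(value):
    """Heuristic stand-in for the LLM step: map a sample value to a type name."""
    # bool is checked first because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        # Treat ISO 8601 strings as timestamps, everything else as plain strings
        try:
            datetime.fromisoformat(value)
            return "timestamp"
        except ValueError:
            return "string"
    return "unknown"


def build_schema(raw_response):
    """Construct a field-name -> type-name mapping from one raw JSON record."""
    record = json.loads(raw_response)
    return {field: infer_type(value) for field, value in record.items()}


# Hypothetical raw response, as it might be captured from a connector call
raw = (
    '{"user_id": 42, "user_name": "ada", '
    '"score": 9.5, "created_at": "2024-01-15T10:30:00"}'
)
schema = build_schema(raw)
print(schema)
```

Keeping the schema as a plain mapping of field names to type names is one way to stay language-agnostic: the same structure serializes to JSON and can be consumed from any language.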
Key Features of the Type Safe API
- Strong Typing: All fields are assigned explicit types (e.g., string, integer, float, timestamp), reducing the likelihood of type mismatches during runtime.
- Validation: The API schema undergoes automated validation to ensure consistency with the data returned by the connector.
- Documentation: Generated schemas include detailed descriptions for each field, aiding in developer understanding and debugging.
- Integration: The transform layer consumes the type safe API schema directly, so data is processed according to strict type definitions.
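To make the Validation feature concrete, the following sketch checks a record against a schema and reports mismatches. It assumes the schema is a simple mapping of field names to the type names listed above; `TYPE_CHECKS` and `validate_record` are hypothetical names used for illustration and are not part of a documented Octopipe API.

```python
from datetime import datetime


def _is_timestamp(value):
    """Accept only strings that parse as ISO 8601 date-times."""
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


# Hypothetical mapping from schema type names to runtime checks
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is excluded explicitly because it is a subclass of int
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, float),
    "timestamp": _is_timestamp,
}


def validate_record(record, schema):
    """Return a list of human-readable mismatches between a record and a schema."""
    errors = []
    for field, type_name in schema.items():
        if field not in record:
            errors.append(f"{field}: missing")
        elif not TYPE_CHECKS[type_name](record[field]):
            actual = type(record[field]).__name__
            errors.append(f"{field}: expected {type_name}, got {actual}")
    return errors


schema = {"user_id": "integer", "user_name": "string"}
good = validate_record({"user_id": 7, "user_name": "ada"}, schema)
bad = validate_record({"user_id": "7"}, schema)
print(good)
print(bad)
```

Returning a list of messages instead of raising on the first problem lets a pipeline report every mismatch in a record at once.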
Developer Benefits
- Error Reduction: Strong typing helps catch type errors early, at schema validation or static analysis time, rather than partway through pipeline execution.
- Improved Maintainability: Clear API contracts make it easier for developers to understand data flows and maintain the code.
- Rapid Onboarding: New team members can quickly get up to speed thanks to well-documented and consistent API schemas.
- Enhanced Productivity: Automation of type derivation reduces manual coding, letting developers focus on higher-level pipeline logic.
Implementation Details
- Technology Stack: The type safe API is generated using a combination of Python scripts and LLM-based processing.
- Customization: Users can adjust the default type inference rules by providing custom mappings or annotations.
- Fallback Mechanism: In cases where the LLM cannot confidently derive a type, Octopipe falls back to user-defined defaults or prompts for manual input.
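The interplay between customization and fallback can be sketched as a small resolution function. This is an assumption-laden illustration: the `Inference` class, the confidence score, the `threshold` parameter, and the precedence order (custom mapping first, then a confident inference, then the default) are all hypothetical and not taken from Octopipe's code. Prompting for manual input is left out of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Inference:
    """Hypothetical result of the LLM step for one field."""
    type_name: str
    confidence: float  # assumed to be a score between 0 and 1


def resolve_type(field, inference, custom_mappings, default_type="string", threshold=0.8):
    """Pick the final type for a field from user mappings, inference, or a default."""
    # 1. An explicit user-provided mapping always wins
    if field in custom_mappings:
        return custom_mappings[field]
    # 2. A sufficiently confident inference is accepted as-is
    if inference.confidence >= threshold:
        return inference.type_name
    # 3. Otherwise fall back to the user-defined default
    return default_type


# A ZIP code looks numeric, but the user wants it kept as a string
overridden = resolve_type("zip_code", Inference("integer", 0.95), {"zip_code": "string"})
accepted = resolve_type("user_id", Inference("integer", 0.95), {})
fallback = resolve_type("payload", Inference("float", 0.4), {})
print(overridden, accepted, fallback)
```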
Example Workflow
- Connector Call: The connector retrieves data from an external API endpoint.
- LLM Analysis: The LLM processes the JSON response and identifies that a field such as user_id should be an integer, while user_name is a string.
- Schema Generation: The resulting schema might look like:
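As a minimal sketch only, assuming a plain dictionary layout, the two fields from the LLM Analysis step could be represented as shown here. The `fields`, `type`, and `description` keys and the description texts are hypothetical and chosen for illustration; they are not Octopipe's documented schema format.

```python
import json

# Hypothetical schema layout: only the field names and their types
# (user_id -> integer, user_name -> string) come from the workflow above
schema = {
    "fields": {
        "user_id": {
            "type": "integer",
            "description": "Unique identifier of the user",
        },
        "user_name": {
            "type": "string",
            "description": "Display name of the user",
        },
    }
}

# Serializing to JSON keeps the schema usable from other languages
print(json.dumps(schema, indent=2))
```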