Airport Extension for Data-as-a-Service
The Airport extension for DuckDB enables you to offer powerful Data-as-a-Service (DaaS) solutions.
What is Data-as-a-Service?
Data-as-a-Service delivers clean, structured, and queryable data over the Internet, on demand—similar to SaaS (Software-as-a-Service), but for data instead of software. Consumers don’t need to manage infrastructure, storage, or ETL pipelines—they just connect and start querying.
Leveraging Arrow Flight, Airport can stream datasets or provide data locations with minimal client-side effort.
Key Benefits
- Zero Infrastructure Burden
- Clients only need DuckDB. No custom SDKs or platform dependencies.
- Scalable & Real-Time
- Data delivery is efficient and often near real-time.
- Easy Integration
- Seamlessly plugs into apps, dashboards, or analytics platforms.
- Flexible Delivery
- Clients can run queries, but they can also download the data just as easily using a COPY TO SQL statement.
- Cost Efficiency
- Data can be stored in the cloud or on-premises in a data center, and clients can be pointed to the most efficient location. Data can be served from CDNs or streamed directly from the Arrow Flight server as necessary.
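As a rough sketch of the client side, the Python helper below assembles the SQL a consumer might run to attach a remote Airport server and export a table with COPY TO. The server URL, catalog name, table name, and output path are hypothetical, and the ATTACH options reflect an assumed syntax that should be checked against the Airport reference before use.

```python
def build_client_sql(location: str, alias: str, table: str, out_path: str) -> list[str]:
    """Return the SQL statements a DuckDB client might run, in order."""
    return [
        # Install and load the extension (assumed to ship as a community extension).
        "INSTALL airport FROM community;",
        "LOAD airport;",
        # Attach the remote Arrow Flight server as a catalog (option names are assumptions).
        f"ATTACH '{alias}' (TYPE AIRPORT, LOCATION '{location}');",
        # Download the data locally instead of querying it in place.
        f"COPY (SELECT * FROM {alias}.{table}) TO '{out_path}' (FORMAT PARQUET);",
    ]


def run_statements(statements: list[str]) -> None:
    """Execute the statements with the duckdb Python package, if it is installed."""
    import duckdb  # imported lazily so the sketch loads without the package

    con = duckdb.connect()
    for statement in statements:
        con.execute(statement)


statements = build_client_sql(
    location="grpc://flight.example.com:50051/",  # hypothetical server URL
    alias="daas",                                 # hypothetical catalog name
    table="main.prices",                          # hypothetical schema.table
    out_path="prices.parquet",
)
print("\n".join(statements))
```

Calling `run_statements(statements)` would execute the same statements against a reachable server; nothing beyond DuckDB itself is needed on the client.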
Architecture
A typical DaaS deployment using Airport consists of the components described below.
Component Overview
- DuckDB as the Client
- Requires version 1.3.0 or newer. It can run standalone or be embedded.
- Load Balancer
- Distributes requests across multiple Arrow Flight server instances. Must support HTTP/2.
- Arrow Flight Server
- Built using any Arrow-compatible language (Python, Java, Rust, C++, Go). Handles query requests.
- Data Store
- The source of the data. The Arrow Flight server can:
- Stream data directly from the server in the Arrow IPC format.
- Return a reference via a data:// URI (e.g., a URL to Parquet or CSV on a CDN).
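The two delivery paths can be sketched as a small decision function. This is illustrative only: the `Dataset` fields, the size threshold, and the example CDN URL are hypothetical, and a real server would make this choice inside its Arrow Flight handlers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    name: str
    size_bytes: int
    cdn_url: Optional[str] = None  # set when a Parquet/CSV copy exists on a CDN


# Hypothetical cutoff: larger datasets are served by reference when possible.
STREAM_LIMIT_BYTES = 64 * 1024 * 1024


def choose_delivery(dataset: Dataset) -> dict:
    """Pick between streaming from the server and returning a data location."""
    if dataset.cdn_url is not None and dataset.size_bytes > STREAM_LIMIT_BYTES:
        # Point the client at the file; the server does not stream the bytes itself.
        return {"mode": "reference", "uri": dataset.cdn_url}
    # Otherwise the server streams the record batches itself (Arrow IPC).
    return {"mode": "stream", "dataset": dataset.name}


small = Dataset("fx_rates", size_bytes=2_000_000)
large = Dataset("trades", size_bytes=5_000_000_000,
                cdn_url="https://cdn.example.com/trades.parquet")
print(choose_delivery(small))
print(choose_delivery(large))
```

A size threshold is only one possible policy; client location or storage cost could drive the same decision.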
Additional Elements
These components enhance security, observability, and monetization:
- Authentication & Authorization
- Validate requests and enforce access rules. Column-level and row-level filtering is possible.
- Observability & Logging
- Log all requests for auditing, debugging, and performance insights.
- Subscription Management
- Enable paid access models by integrating billing with authorization logic.
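One way these three concerns could fit together is sketched below: a single check looks up the caller's subscription, logs the request, and returns the columns and row filter the server should apply. The client IDs, plan contents, and the `authorize` function are all hypothetical, and the sketch assumes the client ID comes from a credential that was already verified upstream.

```python
import logging
from dataclasses import dataclass, field
from typing import Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("daas.access")


@dataclass
class Subscription:
    active: bool
    allowed_columns: set = field(default_factory=set)
    row_filter: Optional[str] = None  # e.g. a SQL predicate applied server-side


# Hypothetical lookup table; a real service would query its billing system.
SUBSCRIPTIONS = {
    "client-basic": Subscription(True, {"symbol", "close"}, "region = 'EU'"),
    "client-lapsed": Subscription(False),
}


def authorize(client_id: str, requested_columns: list) -> dict:
    """Validate a request, log it, and return the columns and row filter to apply."""
    sub = SUBSCRIPTIONS.get(client_id)
    if sub is None or not sub.active:
        log.warning("denied client=%s columns=%s", client_id, requested_columns)
        raise PermissionError("no active subscription")
    # Column-level filtering: keep only the columns this plan includes.
    columns = [c for c in requested_columns if c in sub.allowed_columns]
    log.info("allowed client=%s columns=%s filter=%s", client_id, columns, sub.row_filter)
    return {"columns": columns, "row_filter": sub.row_filter}


print(authorize("client-basic", ["symbol", "close", "volume"]))
```

Tying the subscription record to the authorization result keeps billing and access rules in one place, and every decision leaves a log entry for auditing.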