Airport Extension for Data-as-a-Service

The Airport extension for DuckDB enables you to offer powerful Data-as-a-Service (DaaS) solutions.

What is Data-as-a-Service?

Data-as-a-Service delivers clean, structured, and queryable data over the Internet, on demand—similar to SaaS (Software-as-a-Service), but for data instead of software. Consumers don’t need to manage infrastructure, storage, or ETL pipelines—they just connect and start querying.

Leveraging Arrow Flight, Airport can stream datasets or provide data locations with minimal client-side effort.

Key Benefits

  1. Zero Infrastructure Burden
    • Clients only need DuckDB. No custom SDKs or platform dependencies.
  2. Scalable & Real-Time
    • Data delivery is efficient and often near real-time.
  3. Easy Integration
    • Seamlessly plugs into apps, dashboards, or analytics platforms.
  4. Flexible Delivery
    • Clients can run queries but they can also download the data just as easily. They can just use a COPY TO SQL statement.
  5. Cost Efficiency
    • Data can be stored in the cloud or on-premises in a data center, clients can be pointed to the most efficient location. Data can be served from CDNs or streamed directly from the Arrow Flight server as necessary.

Architecture

A typical DaaS deployment using Airport looks like this:

Component Overview

  1. DuckDB as the Client
    • Requires version 1.3.0 or newer. It can run standalone or be embedded.
  2. Load Balancer
    • Distributes requests across multiple Arrow Flight server instances. Must support HTTP/2.
  3. Arrow Flight Server
    • Built using any Arrow-compatible language (Python, Java, Rust, C++, Go). Handles query requests.
  4. Data Store

Additional Elements

These components enhance security, observability, and monetization:

  1. Authentication & Authorization
  2. Observability & Logging
    • Log all requests for auditing, debugging, and performance insights.
  3. Subscription Management
    • Enable paid access models by integrating billing with authorization logic.