Table Returning Functions

The Airport extension for DuckDB supports having Arrow Flight servers provide table returning functions. Table returning functions are SQL functions that return an arbitrary number of rows conforming to a schema. The schema can be dynamically defined based on the input to the function.

Example

An example of a SQL query with a table returning function is:

SELECT * from custom_function('arg1');

In addition to taking scalar parameters table returning functions can be what are called “in-out” functions, where can they receive the output of another query as an input.

An example of calling an “in-out” table returning function is:

SELECT * from in_out_function('arg1',
 (SELECT * from employees where name ilike 'R%')
);

The rows from the employees table will be sent incrementally to Arrow Flight server.

Arrow Flight Server Implementation Notes

DuckDB Catalog Integration

For information about how to register an Arrow Flight as a table returning function refer to Server Catalog Integration.

Supporting the ANY DuckDB Data Type

If you wish to use the DuckDB ANY datatype, create a standard Arrow field in a schema, but add additional metadata with the name of is_any_type with a non empty table. The Airport extension will treat that column as being defined as the ANY type.

Schemas for Table Returning In-Out Functions

If the table returning function is desired to be an “in-out” function, there must be one and only one field in the Arrow input schema with a metadata key named is_table_type with a non empty string value. This column will be treated as the TABLE DuckDB data type by the Airport extension.

Implementation

The Airport extension does two different things, depending if the table returning function is an “in-out” function or not.

Regular Table Returning Function

When the table returning function is called. The Airport extension will make a DoAction Arrow Flight RPC to the server with an action named table_function_flight_info to obtain a FlightInfo which will then be called with the DoGet Arrow Flight RPC method. It is expected the server may interpret the parameters and return a new schema or otherwise encode the parameters in the returned ticket field of the FlightInfo.

In-Out Table Function

In the case of an “in-out” table returning function a DoExchange Arrow RPC is made.

The input schema for the DoExchange call is the schema of the “in” table. The serialized parameters to the table returning function are sent an initial metadata message on the DoExchange stream. This message is msgpack encoded.

struct AirportTableFunctionInOutParameters
{
  std::string json_filters;
  std::vector<idx_t> column_ids;

  // The parameters to the table function, which should
  // be included in the opaque ticket data returned
  // for each endpoint.
  std::string parameters;

  std::string at_unit;
  std::string at_value;

  MSGPACK_DEFINE_MAP(json_filters, column_ids, parameters, at_unit, at_value)
};

The rows produced by the function are read from the DoExchange stream.