column_statistics Action

The column_statitics action is used to provide DuckDB with statistics about a field in a table. These statistics allow DuckDB to execute queries more efficiently. An Arrow Flight server can optionally choose to implement this action on a table by table basis.

Input Parameters

There is a single msgpack serialized parameter passed to the action.

struct GetFlightColumnStatistics
{
  std::string flight_descriptor;
  std::string column_name;
  std::string type;

  MSGPACK_DEFINE_MAP(flight_descriptor, column_name, type)
};

The flight_descriptor field is the Arrow Flight serialized FlightDescriptor structure.

The type field is the DuckDB data type name, i.e. VARCHAR, TIMESTAMP WITH TIME ZONE.

Output Results

The response is a Arrow RecordBatch serialized using the IPC Format. The schema of the response should contain these fields.

Field Name Type Description
has_not_null BOOLEAN Indicate if the field contains a value that is not null.
has_null BOOLEAN Indicate if the field contains a value that is null.
distinct_count UINT64 Indicate the number of distinct values in the field.
min Depends on column type The minimum value of the field.
max Depends on column type The maximum value of the field.
max_string_length UINT64 The maximum length of a string value
contains_unicode BOOLEAN Indicate if the field contains Unicode text.