list_schemas
Action
This list_schemas
action is used by DuckDB to determine the contents of a database attached via Airport. The Arrow Flight server returns information about the schemas that exist in the database as well as all of the tables, scalar functions and table returning functions.
Input Parameters
There is a single msgpack
serialized parameter passed to the action.
struct AirportSerializedCatalogSchemaRequest
{
std::string catalog_name;
(catalog_name)
MSGPACK_DEFINE_MAP};
Output Result
The Arrow Flight server responds with a single msgpack
-serialized message containing a SerializedCompressedContent
structure:
struct AirportSerializedCompressedContent
{
// The uncompressed length of the data.
uint32_t length;
// The compressed data using ZStandard.
std::string data;
(length, data)
MSGPACK_DEFINE};
After decompression, the data is deserialized into a AirportSerializedCatalogRoot
structure:
struct AirportSerializedCatalogRoot
{
// The contents of the catalog itself, this is optional, if its
// more efficient to provide the entire catalog at once rather than
// having each schema listed individually.
;
AirportSerializedContentsWithSHA256Hash contents// A list of schemas.
std::vector<AirportSerializedSchema> schemas;
(contents, schemas)
MSGPACK_DEFINE_MAP};
Each schema is represented as a AirportSerializedSchema
structure:
struct AirportSerializedSchema
{
// The name of the schema
std::string name;
// The description of the schema
std::string description;
// Any tags to apply to the schema.
std::unordered_map<std::string, std::string> tags;
// The contents of the schema itself, which can be external
// or provided inline.
;
AirportSerializedContentsWithSHA256Hash contents
(schema, description, tags, contents)
MSGPACK_DEFINE_MAP};
To help with efficiency, the entire contents of a catalog can be provided via a single URL or inline. Or each individual schema can provide its own URL or inline serialization. This is accomplished by having a AirportSerializedContentsWithSHA256Hash
at catalog level AirportSerializedCatalogRoot
as well as in each AirportSerializedSchema
structure.
Schema Serialization Methods
The AirportSerializedContentsWithSHA256Hash
structure provides schema or catalog content, either inline or via an external URL with SHA256 validation. If an external URL is provided, the contents is assumed to the same as the inline serialization with the SHA256 value of the contents being verified.
// This is a generic type that applies a SHA256
// checksum to the data it contains.
//
// The data can either be provied inline by
// inside of the serialized field, or it can
// be retrieved from the specified URL.
//
// URLs are considered to be immutable, and
// their contents may be cached on disk.
struct AirportSerializedContentsWithSHA256Hash
{
// The SHA256 of the serialized contents.
// or the external url.
std::string sha256;
// The external URL where the contents should be obtained.
std::optional<std::string> url;
// The inline serialized contents.
std::optional<std::string> serialized;
(sha256, url, serialized)
MSGPACK_DEFINE_MAP};
The type and format of the inline serialized data varies depends on if the content is describing an entire catalog or a specific schema.
Catalog Inline Serialization
If the AirportSerializedContentsWithSHA256Hash
is describing an entire catalog the serialized data consists of a vector of pairs of strings. The pairs of strings are:
- The SHA256 value of the following data.
- A ZStandard compressed
msgpack
array of serialized Apache ArrowFlightInfo
structures.
The SHA256 value is required since, it is referenced in the schemas
field of the AirportSerializedCatalogRoot
. Since each schema has the SHA256 provided the schemas will be cached by DuckDB on disk. If a schema can change, it should not be provided inline at the catalog level, instead it should be provided inline at the schema
field of the AirportSerializedCatalogRoot
.
Schema Inline Serialization
If the AirportSerializedContentsWithSHA256Hash
is describing a single schema the serialized data consists of a ZStandard compressed msgpack
array of serialized Apache Arrow FlightInfo
structures.