Lance Iceberg Namespace
Lance Iceberg Namespace is a Lance namespace implementation that uses the Apache Iceberg REST Catalog (IRC). For more details about IRC, see the IRC Specification.
Note
This implementation is designed against the IRC spec as of Iceberg release version 1.9.0.
Namespace Mapping
An IRC server can be viewed as the root Lance namespace. Iceberg multi-level namespaces map to multi-level child namespaces under that root. Whether namespaces can be nested, and how many levels are allowed, depends on the specific IRC provider.
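As a rough illustration of this mapping, the sketch below turns a multi-level child namespace into the path segment used in IRC request URLs. The function name `irc_namespace_path_segment` is hypothetical, and the sketch assumes the unit separator (`0x1F`) that the IRC spec uses by default to join namespace levels; a given provider may restrict nesting or use a different separator.

```python
from urllib.parse import quote

# Assumption: namespace levels are joined with the unit separator (0x1F),
# the default described by the IRC spec. Providers may differ.
UNIT_SEPARATOR = "\x1f"


def irc_namespace_path_segment(levels: list[str]) -> str:
    """Encode a multi-level namespace for use in an IRC URL path.

    An empty list stands for the root namespace (the IRC server itself)
    and yields an empty segment.
    """
    # Percent-encode everything, including the separator itself.
    return quote(UNIT_SEPARATOR.join(levels), safe="")
```

For example, the child namespace `["prod", "sales"]` becomes the single path segment `prod%1Fsales`.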
Table Definition
A Lance table should appear as a table object in IRC in the shape of Iceberg TableMetadata, with the following requirements:
- the `location` must point to the root location of the Lance table
- the `properties` must follow:
    - there is a key `table_type` set to `lance` (case insensitive)
    - there is a key `managed_by` set to either `storage` or `impl` (case insensitive). If not set, default to `storage`
- the `current-snapshot-id` is set to the latest numeric version number of the table. This field will only be respected if `managed_by=impl`
When a user performs a `LoadTable` API call to retrieve the table metadata, the server must not return a `metadata-location` in the `LoadTableResponse`.
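The requirements above can be checked mechanically. The following is a minimal sketch, not part of the specification: `check_lance_table` is a hypothetical helper that takes an already-parsed `LoadTableResponse` as a dictionary (assumed to hold the TableMetadata under the `metadata` key, as in the IRC spec) and returns a list of violations.

```python
from typing import Any

VALID_MANAGED_BY = {"storage", "impl"}


def check_lance_table(load_table_response: dict[str, Any], latest_version: int) -> list[str]:
    """Return a list of violations; an empty list means the response conforms."""
    problems: list[str] = []
    metadata = load_table_response.get("metadata", {})
    properties = metadata.get("properties", {})

    # The server must not return a metadata-location for a Lance table.
    if load_table_response.get("metadata-location") is not None:
        problems.append("metadata-location must not be returned")

    # location must point to the root location of the Lance table.
    if not metadata.get("location"):
        problems.append("location is missing")

    # table_type must be "lance", compared case-insensitively.
    if str(properties.get("table_type", "")).lower() != "lance":
        problems.append("table_type must be 'lance'")

    # managed_by defaults to "storage" when it is not set.
    managed_by = str(properties.get("managed_by", "storage")).lower()
    if managed_by not in VALID_MANAGED_BY:
        problems.append("managed_by must be 'storage' or 'impl'")

    # current-snapshot-id is only respected when managed_by=impl.
    if managed_by == "impl" and metadata.get("current-snapshot-id") != latest_version:
        problems.append("current-snapshot-id must equal the latest table version")

    return problems
```

A response whose properties are `{"table_type": "Lance", "managed_by": "impl"}` and whose `current-snapshot-id` matches the latest Lance version passes with no violations; the same response with a `metadata-location` set is reported as non-conforming.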
Requirement for Implementation Managed Table
An update to the implementation managed table must go through the IRC `UpdateTable` API or `CommitTransaction` API with a requirement that the `assert-ref-snapshot-id` is the current Lance table version.
If the commit fails due to unresolvable concurrent commits, the IRC server must fail with `409 Conflict` according to the IRC spec.
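The sketch below shows what such a commit request body might look like and how a client could react to the conflict response. It is illustrative only: `build_commit_request`, `check_commit_status`, and `CommitConflict` are hypothetical names, the `main` ref is an assumption (this document does not name the ref to assert against), and the contents of `updates` are left to the caller because they are not specified here. No HTTP call is made.

```python
from typing import Any


class CommitConflict(Exception):
    """Raised when the IRC server answers a commit with 409 Conflict."""


def build_commit_request(
    namespace: list[str],
    table: str,
    current_version: int,
    updates: list[dict[str, Any]],
    ref: str = "main",  # assumption: the ref name is not given by this document
) -> dict[str, Any]:
    """Build an UpdateTable request body guarded by assert-ref-snapshot-id."""
    return {
        "identifier": {"namespace": namespace, "name": table},
        "requirements": [
            {
                "type": "assert-ref-snapshot-id",
                "ref": ref,
                # The commit only succeeds if the table is still at this version.
                "snapshot-id": current_version,
            }
        ],
        "updates": updates,
    }


def check_commit_status(status_code: int) -> None:
    """Translate the 409 response for unresolvable concurrent commits."""
    if status_code == 409:
        raise CommitConflict("table version changed; reload the table and retry")
```

A caller would typically reload the table on `CommitConflict`, rebuild the request against the new version, and retry.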
Using Lance Table in IRC with Iceberg Tooling
In order to use the table with Iceberg tooling (e.g. `pyiceberg`), the implementation can optionally set the following in Iceberg TableMetadata:
- there is at least one schema in the list of `schemas`
    - the schema reflects the latest schema of the Lance table
    - the schema has ID `1`
    - the data type conversion follows Apache Arrow to Apache Iceberg data type conversion.
- the `current-schema-id` is set to `1`
- there is at least one snapshot in the list of `snapshots`.
    - the snapshot should have `snapshot-id` set to the latest numeric version number of the table.
- there is at least one snapshot log in the list of `snapshot-log`
    - the snapshot log should have `snapshot-id` set to the latest numeric version number of the table.
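The optional fields listed above could be assembled as in the following sketch. It rests on several assumptions: `iceberg_tooling_fields` is a hypothetical helper, the column types are taken as already converted from Arrow to Iceberg type names, field IDs are simply assigned in order starting at 1, and only the fields named in this section plus a timestamp are filled in. Real Iceberg tooling may require further snapshot fields that this document does not specify.

```python
import time
from typing import Any, Optional


def iceberg_tooling_fields(
    columns: list[tuple[str, str, bool]],  # (name, Iceberg type name, required)
    latest_version: int,
    timestamp_ms: Optional[int] = None,
) -> dict[str, Any]:
    """Build the optional TableMetadata fields for a Lance table."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)

    schema = {
        "type": "struct",
        "schema-id": 1,  # the schema has ID 1
        "fields": [
            # Assumption: field IDs are assigned sequentially from 1.
            {"id": i, "name": name, "required": required, "type": type_name}
            for i, (name, type_name, required) in enumerate(columns, start=1)
        ],
    }
    # Both entries carry the latest numeric Lance version as snapshot-id.
    entry = {"snapshot-id": latest_version, "timestamp-ms": timestamp_ms}

    return {
        "schemas": [schema],
        "current-schema-id": 1,
        "snapshots": [dict(entry)],
        "snapshot-log": [dict(entry)],
    }
```

The returned dictionary is meant to be merged into the TableMetadata that the server returns alongside `location`, `properties`, and `current-snapshot-id`.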