Lance Glue Namespace
The Lance Glue Namespace is a Lance namespace implementation backed by the AWS Glue Data Catalog. For more details about AWS Glue, see the AWS Glue Data Catalog documentation.
Configuration
The Lance Glue namespace accepts the following configuration properties:
Property | Required | Description | Default | Example |
---|---|---|---|---|
`catalog_id` | No | The Catalog ID of the Glue catalog (defaults to AWS account ID) | | `123456789012` |
`endpoint` | No | Custom Glue service endpoint for API compatible metastores | | `https://glue.example.com` |
`region` | No | AWS region for all Glue operations | | `us-west-2` |
`access_key_id` | No | AWS access key ID for static credentials | | |
`secret_access_key` | No | AWS secret access key for static credentials | | |
`session_token` | No | AWS session token for temporary credentials | | |
`root` | No | Storage root location of the lakehouse on Glue catalog | Current working directory | `/my/dir`, `s3://bucket/prefix` |
`storage.*` | No | Additional storage configurations to access table | | `storage.region=us-west-2` |
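As a rough illustration, the properties above can be collected into a flat string-to-string map. How that map is handed to a particular Lance client is not covered in this section, so the sketch below only builds the map and flags keys that do not appear in the table. `KNOWN_PROPERTIES` and `unknown_properties` are hypothetical names used for illustration, not part of any Lance API.

```python
# Property names taken from the configuration table above.
KNOWN_PROPERTIES = {
    "catalog_id",
    "endpoint",
    "region",
    "access_key_id",
    "secret_access_key",
    "session_token",
    "root",
}


def unknown_properties(props: dict) -> list:
    """Return keys that are neither listed in the table nor storage.* options."""
    return sorted(
        key
        for key in props
        if key not in KNOWN_PROPERTIES and not key.startswith("storage.")
    )


# Example values are the ones shown in the table.
glue_properties = {
    "region": "us-west-2",
    "root": "s3://bucket/prefix",
    "storage.region": "us-west-2",
}

unrecognized = unknown_properties(glue_properties)
```

A check like this catches misspelled keys early, which matters here because every property is optional and a typo would otherwise fall back silently to a default.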
Authentication
The Glue namespace supports multiple authentication methods:
- Default AWS credential provider chain: When no explicit credentials are provided, the client uses the default AWS credential provider chain
- Static credentials: Set `access_key_id` and `secret_access_key` for basic AWS credentials
- Session credentials: Additionally provide `session_token` for temporary AWS credentials
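The choice between these methods can be sketched as a small translation step, assuming the namespace talks to Glue through a boto3 client. The helper name `glue_client_kwargs` is hypothetical. The keyword names it emits (`region_name`, `endpoint_url`, `aws_access_key_id`, `aws_secret_access_key`, `aws_session_token`) are standard `boto3.client` arguments. When static credentials are absent, no credential arguments are produced, so boto3 falls back to its default provider chain.

```python
def glue_client_kwargs(props: dict) -> dict:
    """Translate namespace properties into keyword arguments for boto3.client."""
    kwargs = {}
    if props.get("region"):
        kwargs["region_name"] = props["region"]
    if props.get("endpoint"):
        kwargs["endpoint_url"] = props["endpoint"]

    access_key = props.get("access_key_id")
    secret_key = props.get("secret_access_key")
    if access_key and secret_key:
        # Static credentials: both halves of the key pair are required.
        kwargs["aws_access_key_id"] = access_key
        kwargs["aws_secret_access_key"] = secret_key
        if props.get("session_token"):
            # Session credentials: the token is only meaningful with a key pair.
            kwargs["aws_session_token"] = props["session_token"]
    return kwargs


# Usage, assuming boto3 is installed:
#   import boto3
#   glue = boto3.client("glue", **glue_client_kwargs(props))
```

Ignoring a `session_token` that arrives without a key pair is a design choice of this sketch; a stricter implementation might reject that configuration instead.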
Namespace Mapping
An AWS Glue Data Catalog can be viewed as the root Lance namespace. Each database in Glue maps to a first-level Lance namespace, so the catalog and its databases together form a 2-level Lance namespace.
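The mapping can be illustrated with a small resolver. It assumes identifiers are given as lists of strings, with an empty list standing for the root namespace and a table addressed as a database name followed by a table name. That representation and the function name `resolve_identifier` are assumptions made for this sketch.

```python
def resolve_identifier(identifier: list) -> dict:
    """Map a list-style identifier onto the Glue object it would refer to."""
    if len(identifier) == 0:
        # The root namespace corresponds to the Glue Data Catalog itself.
        return {"kind": "catalog"}
    if len(identifier) == 1:
        # A first-level namespace corresponds to a Glue database.
        return {"kind": "database", "DatabaseName": identifier[0]}
    if len(identifier) == 2:
        # A table lives inside a database.
        return {
            "kind": "table",
            "DatabaseName": identifier[0],
            "Name": identifier[1],
        }
    raise ValueError(
        "Glue supports only catalog -> database -> table, got: %r" % (identifier,)
    )
```

Rejecting deeper identifiers up front keeps the error close to the caller rather than surfacing later as a missing Glue database.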
Table Definition
When fully implemented, a Lance table should appear as a Table object in AWS Glue with the following requirements:
- the `TableType` must be set to `EXTERNAL_TABLE` to indicate this is not a Glue managed table
- the `StorageDescriptor.Location` must point to the root location of the Lance table
- the `Parameters` must follow:
    - there is a key `table_type` set to `lance` (case insensitive)
    - there is a key `managed_by` set to either `storage` or `impl` (case insensitive). If not set, default to `storage`
    - there is a key `version` set to the latest numeric version number of the table. This field will only be respected if `managed_by=impl`
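The requirements above can be sketched as a builder for a Glue `TableInput` structure plus a few readers that apply the case-insensitivity and defaulting rules. All function names here are hypothetical, and the sketch assumes that "case insensitive" applies to the parameter values. The field names (`Name`, `TableType`, `StorageDescriptor`, `Parameters`) are those of the Glue table structure.

```python
def lance_table_input(name, location, managed_by="storage", version=None):
    """Build a Glue TableInput dict describing a Lance table."""
    parameters = {"table_type": "lance", "managed_by": managed_by}
    if version is not None:
        # Glue parameter values are strings, so the number is stringified.
        parameters["version"] = str(version)
    return {
        "Name": name,
        "TableType": "EXTERNAL_TABLE",
        "StorageDescriptor": {"Location": location},
        "Parameters": parameters,
    }


def is_lance_table(table):
    """True when a Glue table dict is marked as an external Lance table."""
    params = table.get("Parameters") or {}
    return (
        table.get("TableType") == "EXTERNAL_TABLE"
        and params.get("table_type", "").lower() == "lance"
    )


def effective_managed_by(table):
    """Return 'storage' or 'impl', defaulting to 'storage' when unset."""
    params = table.get("Parameters") or {}
    return params.get("managed_by", "storage").lower()


def effective_version(table):
    """Return the recorded version, or None when it must not be respected."""
    if effective_managed_by(table) != "impl":
        return None
    raw = (table.get("Parameters") or {}).get("version")
    return int(raw) if raw is not None else None
```

For example, `lance_table_input("orders", "s3://bucket/prefix/orders", managed_by="impl", version=4)` yields a structure for which `effective_version` returns `4`, while the same table with `managed_by` left at its default yields `None`.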
Requirement for Implementation Managed Table
Updates to implementation-managed Lance tables must use AWS Glue's `VersionId` for conditional updates through the `UpdateTable` API. If the `VersionId` does not match the expected version, the update fails, which prevents concurrent modification conflicts.
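A minimal sketch of this read-then-conditionally-write pattern follows. It assumes a boto3-style Glue client whose `get_table` and `update_table` calls mirror the Glue API, including the `VersionId` argument on `update_table`. The helper names `read_table` and `commit_version` are hypothetical, and the exception raised on a version mismatch is left to the client and simply propagates.

```python
def read_table(glue, database: str, name: str) -> dict:
    """Fetch the Glue table, including the VersionId observed at read time."""
    return glue.get_table(DatabaseName=database, Name=name)["Table"]


def commit_version(glue, database: str, table: dict, new_version: int) -> None:
    """Record a new Lance version, conditional on the VersionId read earlier.

    `table` is the dict returned by read_table. If another writer has
    updated the Glue table since then, the VersionId no longer matches and
    the client call is expected to fail instead of overwriting that change.
    """
    params = dict(table.get("Parameters") or {})
    params["version"] = str(new_version)

    # Only fields accepted by TableInput are copied; read-only fields such
    # as VersionId or DatabaseName from the get_table response are dropped.
    table_input = {
        "Name": table["Name"],
        "TableType": table["TableType"],
        "StorageDescriptor": table["StorageDescriptor"],
        "Parameters": params,
    }
    glue.update_table(
        DatabaseName=database,
        TableInput=table_input,
        VersionId=table["VersionId"],
    )
```

A caller would typically wrap `commit_version` in a retry loop that re-reads the table after a failed update, since a mismatch means the local view of the table is stale.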