Connection

geneva.db.Connection

Bases: DBConnection

Geneva Connection.

namespace_impl

namespace_impl = namespace_impl

namespace_properties

namespace_properties = namespace_properties

system_namespace

system_namespace = (
    system_namespace if uses_namespace_system_tables else []
)

user_db_uri

user_db_uri: str

Original user database URI for this connection.

effective_system_namespace

effective_system_namespace: list[str]

Effective namespace after backend-compatibility fallback.

system_table_db_uri

system_table_db_uri: str

Database URI where system tables should live.

system_table_connection

system_table_connection: Connection

Connection used to access system tables.

namespace_client

namespace_client: LanceNamespace | None

Returns namespace client if using namespace connection.

flight_client

flight_client: Any

close

close() -> None

Close the connection.

resolve_system_table_location

resolve_system_table_location(
    namespace: list[str] | None = None,
) -> tuple[Connection, list[str]]

Resolve the target connection/namespace for system table operations.

migrate_legacy_system_table_if_needed

migrate_legacy_system_table_if_needed(
    *,
    table_name: str,
    target_conn: Connection,
    target_namespace: list[str] | None,
) -> None

Best-effort migration from legacy system-table locations.

Legacy locations:

  • Local/object-store direct connections: /.lance (root)
  • Remote connections: root namespace []
  • Namespace connections: default namespace ["default"]

prepare_system_table_target

prepare_system_table_target(
    table_name: str, namespace: list[str] | None = None
) -> tuple[Connection, list[str]]

Resolve and prepare the destination for a system-table operation.

alter_or_create_system_table

alter_or_create_system_table(
    table_name: str,
    model: Any,
    namespace: list[str] | None = None,
) -> tuple[Connection, Table]

Open or create a Geneva internal table in its resolved location.

table_names

table_names(
    page_token: str | None = None,
    limit: int | None = None,
    *args,
    **kwargs,
) -> Iterable[str]

List all available tables and views.
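The page_token and limit parameters suggest paged listing. The following is a minimal sketch of draining every page, assuming that the last name of a page can be passed as the next page_token (the convention documented for LanceDB's table_names; not confirmed here for Geneva). The helper name iter_table_names and the conn argument are illustrative, not part of the API.

```python
from typing import Iterator


def iter_table_names(conn, page_size: int = 100) -> Iterator[str]:
    """Yield every table name from `conn`, fetching one page at a time.

    `conn` is any object exposing table_names(page_token=..., limit=...).
    """
    page_token = None
    while True:
        page = list(conn.table_names(page_token=page_token, limit=page_size))
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        # Assumption: the last name on a page is the token for the next page.
        page_token = page[-1]
```

Calling list(iter_table_names(conn, page_size=50)) would then collect all names while keeping each request bounded.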

open_table

open_table(
    name: str,
    storage_options: dict[str, str] | None = None,
    index_cache_size: int | None = None,
    version: int | None = None,
    namespace: list[str] | None = None,
    *args,
    **kwargs,
) -> Table

Open a Lance Table.

Parameters:

  • name (str) –

    Name of the table.

  • storage_options (dict[str, str] | None, default: None ) –

    Additional options for the storage backend. Options already set on the connection will be inherited by the table, but can be overridden here. See available options at https://lancedb.github.io/lancedb/guides/storage/

  • namespace (list[str] | None, default: None ) –

    Namespace path for the table (e.g., ["workspace"])
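As a rough illustration of the parameters above, the sketch below opens a table inside a single-level namespace and overrides one storage option for that table only. The helper name open_workspace_table is hypothetical, and the "region" key is an assumed example of a storage option; check the storage guide linked above for the keys your backend accepts.

```python
def open_workspace_table(conn, name, workspace, region=None):
    """Open `name` inside the namespace [workspace].

    `conn` is a Geneva Connection (or anything with a compatible open_table).
    """
    # Options set here override those inherited from the connection.
    # "region" is an assumed example key, not taken from this page.
    storage_options = {"region": region} if region else None
    return conn.open_table(
        name,
        storage_options=storage_options,
        namespace=[workspace],
    )
```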

create_table

create_table(
    name: str,
    data: DATA | None = None,
    schema: Schema | LanceModel | None = None,
    mode: str = "create",
    exist_ok: bool = False,
    on_bad_vectors: str = "error",
    fill_value: float = 0.0,
    storage_options: dict[str, str] | None = None,
    *args,
    **kwargs,
) -> Table

Create a table in the lake.

Parameters:

  • name (str) –

    The name of the table

  • data (DATA | None, default: None ) –

    User must provide at least one of data or schema. Acceptable types are:

    • list-of-dict
    • pandas.DataFrame
    • pyarrow.Table or pyarrow.RecordBatch
  • schema (Schema | LanceModel | None, default: None ) –

    Acceptable types are:

    • pyarrow.Schema
    • lancedb.pydantic.LanceModel
  • mode (str, default: 'create' ) –

    The mode to use when creating the table. Can be either "create" or "overwrite". By default, if the table already exists, an exception is raised. If you want to overwrite the table, use mode="overwrite".

  • exist_ok (bool, default: False ) –

    Controls what happens when a table with the same name already exists. If exist_ok=False, an exception is raised. If exist_ok=True, the existing table is opened instead; the provided data is not added, but the table is validated against any schema that is specified.

  • on_bad_vectors (str, default: 'error' ) –

    What to do if any of the vectors are not the same size or contain NaNs. One of "error", "drop", "fill".
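The mode and exist_ok parameters cover two different intents: reusing a table that may already exist, and replacing it outright. A minimal sketch of both, assuming conn is a Geneva Connection obtained elsewhere and rows is a list of dicts; the helper names are illustrative only.

```python
def create_or_open(conn, name, rows):
    """Create `name` from `rows`, or open the existing table unchanged.

    With exist_ok=True an existing table is opened and `rows` are not added.
    """
    return conn.create_table(name, data=rows, exist_ok=True)


def replace_table(conn, name, rows):
    """Create `name` from `rows`, overwriting any existing table."""
    return conn.create_table(name, data=rows, mode="overwrite")
```

For example, create_or_open(conn, "docs", [{"id": 1, "text": "hello"}]) is safe to call repeatedly, whereas replace_table discards the previous contents on every call.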

create_view

create_view(
    name: str, query: str, materialized: bool = False
) -> Table

Create a View from a Query.

Parameters:

  • name (str) –

    Name of the view.

  • query (str) –

    SQL query to create the view.

  • materialized (bool, default: False ) –

    If True, the view is materialized.

create_materialized_view

create_materialized_view(
    name: str,
    query: GenevaQueryBuilder,
    with_no_data: bool = True,
) -> Table

Create a materialized view.

Parameters:

  • name (str) –

    Name of the materialized view.

  • query (GenevaQueryBuilder) –

    Query to create the view.

  • with_no_data (bool, default: True ) –

    If True, the view is created without data and is ready for refresh; if False, the view is materialized when it is created.

create_udtf_view

create_udtf_view(
    name: str, source: GenevaQueryBuilder, udtf: UDTF
) -> Table

Create a UDTF-backed materialized view.

The view is created empty; call view.refresh() to populate it.

Parameters:

  • name (str) –

    Name for the new view table.

  • source (GenevaQueryBuilder) –

    Query defining the source data.

  • udtf (UDTF) –

    The UDTF to execute on refresh.
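Because the view is created empty, creation and population are two separate steps. The sketch below chains them, assuming conn is a Geneva Connection and that source and udtf were built elsewhere; the helper name is hypothetical.

```python
def create_and_populate_udtf_view(conn, name, source, udtf):
    """Create a UDTF-backed view and immediately populate it."""
    # Step 1: register the view; it contains no rows yet.
    view = conn.create_udtf_view(name, source, udtf)
    # Step 2: run the UDTF over the source query to fill the view.
    view.refresh()
    return view
```

Keeping the two steps apart is also reasonable when the refresh should run later, for instance inside a provisioned cluster context.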

create_scalar_udtf_view

create_scalar_udtf_view(
    name: str,
    source: GenevaQueryBuilder,
    scalar_udtf: ScalarUDTF,
) -> Table

Create a scalar UDTF-backed materialized view (1:N row expansion).

The view is created with placeholder rows (one per source row) that are populated on view.refresh(). Each source row expands to zero or more output rows via the scalar UDTF.

Parameters:

  • name (str) –

    Name for the new view table.

  • source (GenevaQueryBuilder) –

    Query defining the source data.

  • scalar_udtf (ScalarUDTF) –

    The scalar UDTF to execute on refresh.
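The 1:N expansion can be pictured in plain Python. This is not Geneva's implementation and does not use the ScalarUDTF API; it only illustrates the row semantics described above, with hypothetical names expand_rows and split_words.

```python
from typing import Callable, Iterable, Iterator


def expand_rows(
    source_rows: Iterable[dict],
    expand: Callable[[dict], Iterable[dict]],
) -> Iterator[dict]:
    """Apply `expand` to each source row and flatten the results."""
    for row in source_rows:
        # Each source row may contribute zero, one, or many output rows.
        yield from expand(row)


def split_words(row: dict) -> Iterator[dict]:
    """Example expansion: one output row per word in the 'text' field."""
    for position, word in enumerate(row["text"].split()):
        yield {"doc_id": row["doc_id"], "position": position, "word": word}
```

A source row with empty text produces no output rows, while a row with three words produces three.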

drop_view

drop_view(name: str) -> Table

Drop a view.

drop_table

drop_table(name: str, *args, **kwargs) -> None

Drop a table.

define_cluster

define_cluster(name: str, cluster: GenevaCluster) -> None

Define a persistent Geneva cluster. This will upsert the cluster definition by name. The cluster can then be provisioned using context(cluster=name).

Parameters:

  • name (str) –

    Name of the cluster. This will be used as the key when upserting and provisioning the cluster. The cluster name must comply with RFC 1123.

  • cluster (GenevaCluster) –

    The cluster definition to store.
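A name can be checked locally before calling define_cluster(). The sketch below assumes the naming rule means an RFC 1123 label in the stricter form Kubernetes uses (lowercase alphanumerics and hyphens, starting and ending with an alphanumeric, at most 63 characters). Whether Geneva enforces exactly this rule is an assumption, and is_valid_cluster_name is a hypothetical helper.

```python
import re

# First character alphanumeric, then up to 61 alphanumerics or hyphens,
# then a final alphanumeric: 63 characters at most.
_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


def is_valid_cluster_name(name: str) -> bool:
    """Return True if `name` is a lowercase RFC 1123 label."""
    return _LABEL.fullmatch(name) is not None
```

Under this rule "gpu-pool-1" is accepted, while "GPU_pool" and "-gpu" are rejected.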

list_clusters

list_clusters() -> list[GenevaCluster]

List the cluster definitions. These can be defined using define_cluster().

Returns:

  • list[GenevaCluster] –

    The persisted cluster definitions.

delete_cluster

delete_cluster(name: str) -> None

Delete a Geneva cluster definition.

Parameters:

  • name (str) –

    Name of the cluster to delete.

define_manifest

define_manifest(
    name: str,
    manifest: GenevaManifest,
    uploader: Uploader | None = None,
) -> None

Define a persistent Geneva Manifest that represents the files and dependencies used in the execution environment. This will upsert the manifest definition by name and upload the required artifacts. The manifest can then be used with context(manifest=name).

Parameters:

  • name (str) –

    Name of the manifest. This will be used as the key when upserting and loading the manifest.

  • manifest (GenevaManifest) –

    The manifest definition to use.

  • uploader (Uploader | None, default: None ) –

    An optional, custom Uploader to use. If not provided, the uploader will be auto-detected based on the environment configuration.

list_manifests

list_manifests() -> list[GenevaManifest]

List the manifest definitions. These can be defined using define_manifest().

Returns:

  • list[GenevaManifest] –

    The persisted manifest definitions.

delete_manifest

delete_manifest(name: str) -> None

Delete a Geneva manifest definition.

Parameters:

  • name (str) –

    Name of the manifest to delete.

local_ray_context

local_ray_context() -> AbstractContextManager[None]

Context manager for a local Ray instance. This will provision a local Ray instance and return a context manager. This is useful for development or small jobs.

context

context(
    cluster: str,
    manifest: str | None = None,
    on_exit=None,
    wait_timeout: float | None = None,
    log_to_driver: bool = True,
    logging_level=INFO,
) -> AbstractContextManager[None]

Context manager for a Geneva Execution Environment. This will provision a cluster based on the cluster definition and the manifest provided. By default, the context manager will delete the cluster on exit. This can be configured with the on_exit parameter.

Parameters:

  • cluster (str) –

    Name of the persisted cluster definition to use. Required. This will raise an exception if the cluster definition was not defined via define_cluster().

  • manifest (str | None, default: None ) –

    Optional name of the persisted manifest to use. This will raise an exception if the manifest definition was not defined via define_manifest(). If manifest is not provided, the local environment will be uploaded.

  • on_exit

    Exit mode for the cluster. By default, the cluster waits for all running jobs to complete before deleting. To retain the cluster when any job fails or the context body raises an exception, use ExitMode.RETAIN_ON_FAILURE. To always retain the cluster, use ExitMode.RETAIN.

  • wait_timeout (float | None, default: None ) –

    Internal/experimental. Maximum seconds to wait for tracked jobs during context exit. Only applies with DELETE or RETAIN_ON_FAILURE. None means wait indefinitely. For RETAIN_ON_FAILURE, a timeout is treated as a failure and the cluster is retained.

  • log_to_driver (bool, default: True ) –

    Whether to send Ray worker logs to the driver. Defaults to True for better visibility in tests and debugging.

  • logging_level

    The logging level for Ray workers. Use logging.DEBUG for detailed logs.
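A minimal sketch of running work inside a provisioned environment, assuming conn is a Geneva Connection and the named cluster (and manifest, if given) were defined beforehand. The helper run_in_cluster and the job callable are hypothetical; on_exit and wait_timeout are left at their defaults.

```python
import logging


def run_in_cluster(conn, cluster_name, job, manifest_name=None, debug=False):
    """Provision the named cluster, run `job()`, and return its result."""
    # Use DEBUG for detailed Ray worker logs, INFO otherwise.
    level = logging.DEBUG if debug else logging.INFO
    with conn.context(
        cluster=cluster_name,
        manifest=manifest_name,
        logging_level=level,
    ):
        # Work started here runs while the cluster is available.
        return job()
```

With the default exit behavior described above, the cluster is deleted once the with block exits and running jobs have completed.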

is_remote

is_remote() -> bool