static lance.LanceDataset.commit_batch(dest: str | Path | LanceDataset, transactions: collections.abc.Sequence[Transaction], commit_lock: CommitLock | None = None, storage_options: dict[str, str] | None = None, enable_v2_manifest_paths: bool | None = None, detached: bool | None = False, max_retries: int = 20) BulkCommitResult

Create a new version of a dataset from multiple transactions.

This is an advanced method that lets users describe changes that have already been made to the data files. It is not needed when Lance itself applies the changes (e.g. when using LanceDataset or write_dataset()).

Parameters:
dest : str, Path, or LanceDataset

The base uri of the dataset, or the dataset object itself. Using the dataset object can be more efficient because it can re-use the file metadata cache.

transactions : Sequence[Transaction]

The transactions to apply to the dataset. These will be merged into a single transaction and applied to the dataset. Note: Only append transactions are currently supported. Other transaction types will be supported in the future.

commit_lock : CommitLock, optional

A custom commit lock. Only needed if your object store does not support atomic commits. See the user guide for more details.

storage_options : dict, optional

Extra options for a particular storage connection. Use this to pass connection parameters such as credentials or an endpoint.

enable_v2_manifest_paths : bool, optional

If True, and this is a new dataset, uses the new V2 manifest paths. These paths provide more efficient opening of datasets with many versions on object stores. This parameter has no effect if the dataset already exists. To migrate an existing dataset, instead use the migrate_manifest_paths_v2() method. Default is False. WARNING: turning this on will make the dataset unreadable for older versions of Lance (prior to 0.17.0).

detached : bool, optional

If True, then the commit will not be part of the dataset lineage. It will never show up as the latest dataset and the only way to check it out in the future will be to specifically check it out by version. The version will be a random version that is only unique amongst detached commits. The caller should store this somewhere as there will be no other way to obtain it in the future.

max_retries : int

The maximum number of retries to perform when committing the dataset. Default is 20.
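As a rough illustration of what a retry budget like max_retries means, the sketch below models an optimistic commit loop in plain Python. It is a toy model, not Lance's internal logic: CommitConflict, commit_with_retries, and make_flaky_commit are hypothetical names, and counting one initial attempt plus up to max_retries retries is an assumption of this sketch.

```python
class CommitConflict(Exception):
    """Hypothetical error: another writer committed first."""


def commit_with_retries(try_commit, max_retries=20):
    """Call try_commit until it succeeds or the retry budget is spent.

    try_commit is a hypothetical zero-argument callable that returns the
    new version number or raises CommitConflict.
    Returns (version, number_of_attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return try_commit(), attempts
        except CommitConflict:
            # attempts - 1 retries have been used; give up once the
            # budget of max_retries retries is exhausted.
            if attempts > max_retries:
                raise


def make_flaky_commit(failures, version):
    """Build a commit callable that conflicts `failures` times, then succeeds."""
    state = {"left": failures}

    def try_commit():
        if state["left"] > 0:
            state["left"] -= 1
            raise CommitConflict("concurrent writer won")
        return version

    return try_commit


version, attempts = commit_with_retries(make_flaky_commit(2, 7), max_retries=20)
print(version, attempts)
```

In this model a larger max_retries tolerates more concurrent writers, at the cost of more repeated commit attempts under contention.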

Returns:

dataset : LanceDataset

A new version of the Lance dataset.

merged : Transaction

The merged transaction that was applied to the dataset.

Return type:

dict with the keys dataset and merged (BulkCommitResult)
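A minimal usage sketch follows. It assumes that pylance and pyarrow are installed, that the dataset at dest already exists, that Transaction can be imported from lance.dataset, and that LanceFragment.create() is used to write each data file before it is described in an append transaction. The helper name commit_appends is hypothetical.

```python
def commit_appends(dest, tables, max_retries=20):
    """Write each table as a new fragment, then commit all appends at once.

    Hypothetical helper: `dest` is a dataset URI, Path, or LanceDataset, and
    `tables` is a sequence of pyarrow.Table objects matching the schema.
    """
    # Imported lazily so that defining this helper does not require Lance.
    import lance
    from lance.dataset import Transaction  # assumed import location
    from lance.fragment import LanceFragment

    ds = dest if isinstance(dest, lance.LanceDataset) else lance.dataset(str(dest))
    read_version = ds.version

    transactions = []
    for table in tables:
        # Write the data file first; the transaction only describes it.
        fragment = LanceFragment.create(ds.uri, table)
        operation = lance.LanceOperation.Append([fragment])
        transactions.append(
            Transaction(read_version=read_version, operation=operation)
        )

    # Passing the dataset object lets Lance reuse its file metadata cache.
    result = lance.LanceDataset.commit_batch(
        ds, transactions, max_retries=max_retries
    )
    return result["dataset"], result["merged"]
```

The returned pair mirrors the two keys of the result: the first element is the dataset at its new version, and the second is the single transaction that the appends were merged into.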