class lance.LanceOperation.Overwrite(lance.LanceOperation.BaseOperation)

Overwrite or create a new dataset.

new_schema

The schema of the new dataset.

Type:

pyarrow.Schema

fragments

The fragments that make up the new dataset.

Type:

list[FragmentMetadata]

Warning

This is an advanced API for distributed operations. To overwrite or create a new dataset on a single machine, use lance.write_dataset() instead.

Examples

To create or overwrite a dataset, first use lance.fragment.LanceFragment.create() to create fragments. Then collect the fragment metadata into a list and pass it along with the schema to this class. Finally, pass the operation to the LanceDataset.commit() method to create the new dataset.

>>> import lance
>>> import pyarrow as pa
>>> tab1 = pa.table({"a": [1, 2], "b": ["a", "b"]})
>>> tab2 = pa.table({"a": [3, 4], "b": ["c", "d"]})
>>> fragment1 = lance.fragment.LanceFragment.create("example", tab1)
>>> fragment2 = lance.fragment.LanceFragment.create("example", tab2)
>>> fragments = [fragment1, fragment2]
>>> operation = lance.LanceOperation.Overwrite(tab1.schema, fragments)
>>> dataset = lance.LanceDataset.commit("example", operation)
>>> dataset.to_table().to_pandas()
   a  b
0  1  a
1  2  b
2  3  c
3  4  d

Public members

Overwrite(new_schema: LanceSchema | Schema, fragments)

Create an Overwrite operation from the schema and the fragments of the new dataset.

__repr__()

Return repr(self).

__eq__(other)

Return self==value.

new_schema : LanceSchema | Schema
fragments : Iterable[FragmentMetadata]