- class lance.LanceOperation.Overwrite(lance.LanceOperation.BaseOperation)
Overwrite or create a new dataset.
- new_schema
The schema of the new dataset.
- Type:
- fragments
The fragments that make up the new dataset.
- Type:
list[FragmentMetadata]
Warning
This is an advanced API for distributed operations. To overwrite or create a new dataset on a single machine, use lance.write_dataset().
Examples
To create or overwrite a dataset, first use lance.fragment.LanceFragment.create() to create fragments. Then collect the fragment metadata into a list and pass it along with the schema to this class. Finally, pass the operation to the LanceDataset.commit() method to create the new dataset.

>>> import lance
>>> import pyarrow as pa
>>> tab1 = pa.table({"a": [1, 2], "b": ["a", "b"]})
>>> tab2 = pa.table({"a": [3, 4], "b": ["c", "d"]})
>>> fragment1 = lance.fragment.LanceFragment.create("example", tab1)
>>> fragment2 = lance.fragment.LanceFragment.create("example", tab2)
>>> fragments = [fragment1, fragment2]
>>> operation = lance.LanceOperation.Overwrite(tab1.schema, fragments)
>>> dataset = lance.LanceDataset.commit("example", operation)
>>> dataset.to_table().to_pandas()
   a  b
0  1  a
1  2  b
2  3  c
3  4  d
Public members
- Overwrite(new_schema: LanceSchema | Schema, fragments)
Initialize self. See help(type(self)) for accurate signature.
- __repr__()
Return repr(self).
- new_schema : LanceSchema | Schema
- fragments : Iterable[FragmentMetadata]