class lance.LanceOperation.Delete(lance.LanceOperation.BaseOperation)

Remove fragments or rows from the dataset.

updated_fragments

The fragments that have been updated with new deletion vectors.

Type: list[FragmentMetadata]

deleted_fragment_ids

The ids of the fragments that have been deleted entirely. These are the fragments where LanceFragment.delete() returned None.

Type: list[int]

predicate

The original SQL predicate used to select the rows to delete.

Type: str

Warning

This is an advanced API for distributed operations. To delete rows from a dataset on a single machine, use lance.LanceDataset.delete().

Examples

To delete rows from a dataset, call lance.fragment.LanceFragment.delete() on each fragment. If the call returns a new fragment, append it to the updated_fragments list. If it returns None, the whole fragment was deleted, so append the fragment's id to deleted_fragment_ids instead. Finally, construct the Delete operation from both lists and the predicate, and pass it to LanceDataset.commit() to complete the deletion.

>>> import lance
>>> import pyarrow as pa
>>> table = pa.table({"a": [1, 2], "b": ["a", "b"]})
>>> dataset = lance.write_dataset(table, "example")
>>> table = pa.table({"a": [3, 4], "b": ["c", "d"]})
>>> dataset = lance.write_dataset(table, "example", mode="append")
>>> dataset.to_table().to_pandas()
   a  b
0  1  a
1  2  b
2  3  c
3  4  d
>>> predicate = "a >= 2"
>>> updated_fragments = []
>>> deleted_fragment_ids = []
>>> for fragment in dataset.get_fragments():
...     new_fragment = fragment.delete(predicate)
...     if new_fragment is not None:
...         updated_fragments.append(new_fragment)
...     else:
...         deleted_fragment_ids.append(fragment.fragment_id)
>>> operation = lance.LanceOperation.Delete(updated_fragments,
...                                         deleted_fragment_ids,
...                                         predicate)
>>> dataset = lance.LanceDataset.commit("example", operation,
...                                     read_version=dataset.version)
>>> dataset.to_table().to_pandas()
   a  b
0  1  a
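
The split between updated_fragments and deleted_fragment_ids shown above can also be done after the fact, which may be convenient when the per-fragment delete calls run on separate workers and a coordinator performs the commit. The following is a minimal sketch under that assumption, not part of the lance API: partition_delete_results is a hypothetical helper, and it assumes each worker reports a (fragment_id, result) pair, where result is whatever LanceFragment.delete() returned. Placeholder strings stand in for FragmentMetadata objects so the sketch runs without a dataset.

```python
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")  # stands in for FragmentMetadata in this sketch


def partition_delete_results(
    results: Iterable[Tuple[int, Optional[T]]],
) -> Tuple[List[T], List[int]]:
    """Hypothetical helper: split per-fragment delete results.

    Each item is (fragment_id, result). A result of None means the
    whole fragment was deleted; anything else is the updated fragment.
    """
    updated_fragments: List[T] = []
    deleted_fragment_ids: List[int] = []
    for fragment_id, new_fragment in results:
        if new_fragment is None:
            # Every row matched the predicate: record only the id.
            deleted_fragment_ids.append(fragment_id)
        else:
            # Some rows remain: keep the updated fragment.
            updated_fragments.append(new_fragment)
    return updated_fragments, deleted_fragment_ids


# Assumed worker output: fragment 0 was partially deleted,
# fragment 1 was deleted entirely.
worker_results = [(0, "metadata-for-fragment-0"), (1, None)]
updated, deleted_ids = partition_delete_results(worker_results)
print(updated)
print(deleted_ids)
```

The two returned lists correspond to the first two arguments of lance.LanceOperation.Delete, with the predicate as the third, as in the example above.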

Public members

Delete(updated_fragments: Iterable[FragmentMetadata], ...)

Initialize self. See help(type(self)) for accurate signature.

__repr__()

Return repr(self).

__eq__(other)

Return self==value.

updated_fragments : Iterable[FragmentMetadata]
deleted_fragment_ids : Iterable[int]
predicate : str