Manage Tags

Much like Git, Lance lets you tag specific versions in a dataset’s history. Tags are managed through the LanceDataset.tags property.

Tags are particularly useful for tracking the evolution of datasets, especially in machine learning workflows where datasets are frequently updated. You can create, update, delete, and list tags.

Note

Creating or deleting tags does not generate new dataset versions. Tags exist as auxiliary metadata stored in a separate directory.

>>> import lance
>>> ds = lance.dataset("./tags.lance")
>>> len(ds.versions())
2
>>> ds.tags.list()
{}
>>> ds.tags.create("v1-prod", 1)
>>> ds.tags.list()
{'v1-prod': {'version': 1, ...}}
>>> ds.tags.update("v1-prod", 2)
>>> ds.tags.list()
{'v1-prod': {'version': 2, ...}}
>>> ds.tags.delete("v1-prod")
>>> ds.tags.list()
{}

Note

Tagged versions are exempted from the LanceDataset.cleanup_old_versions() process.

To remove a version that has been tagged, you must first delete the associated tag with LanceDataset.tags.delete().