Data Evolution
add_columns
Add columns to an existing Lance dataset using Ray's distributed processing.
Parameters:
uri
: Path to the Lance dataset (either uri OR namespace+table_id required)
namespace
: LanceNamespace instance for metadata catalog integration (requires table_id)
table_id
: Table identifier as list of strings (requires namespace)
transform
: Transform function to apply for adding columns
filter
: Optional filter expression to apply
read_columns
: Optional list of columns to read from original dataset
reader_schema
: Optional schema for the reader
read_version
: Optional version to read
ray_remote_args
: Optional kwargs for Ray remote tasks
storage_options
: Optional storage configuration dictionary
batch_size
: Batch size for processing (default: 1024)
concurrency
: Optional number of concurrent processes
Returns: None
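
The parameters above can be combined roughly as in the following sketch. It is illustrative rather than authoritative and rests on several assumptions this section does not state: that `add_columns` is importable from the `lance_ray` package, that `transform` is called with a `pyarrow.RecordBatch` and returns a `RecordBatch` holding only the new columns, and that the dataset has a numeric `height` column. The names `size_label`, `add_size_label`, and `run` are hypothetical helpers.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:  # used for type hints only; runtime imports are lazy below
    import pyarrow as pa


def size_label(height: Optional[float]) -> Optional[str]:
    """Rule for the new column value; missing heights stay missing."""
    if height is None:
        return None
    return "big" if height > 5 else "small"


def add_size_label(batch: pa.RecordBatch) -> pa.RecordBatch:
    """Transform passed to add_columns.

    Assumption: receives a RecordBatch with the requested read_columns and
    returns a RecordBatch of new columns with the same number of rows.
    """
    import pyarrow as pa

    heights = batch.column("height").to_pylist()
    labels = [size_label(h) for h in heights]
    return pa.RecordBatch.from_pydict(
        {"size_label": pa.array(labels, type=pa.string())}
    )


def run(uri: str) -> None:
    """Add a size_label column to the dataset at `uri` (hypothetical driver)."""
    import ray
    from lance_ray import add_columns  # assumed import path

    ray.init(ignore_reinit_error=True)
    add_columns(
        uri=uri,                    # or namespace=... together with table_id=[...]
        transform=add_size_label,
        read_columns=["height"],    # read only the column the transform needs
        batch_size=1024,            # the documented default, shown explicitly
    )
```

Calling `run("/path/to/dataset.lance")` from a driver script would apply the transform across the dataset. Keeping the labeling rule in a plain function such as `size_label` makes it easy to check without a Ray cluster or a dataset on hand.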