Skip to content

Data Evolution

add_columns

add_columns(
    uri=None, 
    *, 
    namespace=None, 
    table_id=None, 
    transform, 
    **kwargs)

Add columns to an existing Lance dataset using Ray's distributed processing.

Parameters:

  • uri: Path to the Lance dataset (either uri OR namespace+table_id required)
  • namespace: LanceNamespace instance for metadata catalog integration (requires table_id)
  • table_id: Table identifier as list of strings (requires namespace)
  • transform: Transform function to apply for adding columns
  • filter: Optional filter expression to apply
  • read_columns: Optional list of columns to read from original dataset
  • reader_schema: Optional schema for the reader
  • read_version: Optional version to read
  • ray_remote_args: Optional kwargs for Ray remote tasks
  • storage_options: Optional storage configuration dictionary
  • batch_size: Batch size for processing (default: 1024)
  • concurrency: Optional number of concurrent processes

Returns: None