Lance ❤️ Ray

Ray effortlessly scales ML workloads up to large distributed compute environments.

Ray Data can be written directly in Lance format by using the lance.ray.sink.LanceDatasink class. First, install pylance with the ray extra:

pip install "pylance[ray]"

A Ray Data Dataset can then be written to Lance format using the following code:

import ray
from lance.ray.sink import LanceDatasink

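# Start (or connect to) a Ray cluster.
ray.init()

# The sink writes the dataset to the given Lance dataset URI.
sink = LanceDatasink("s3://bucket/to/data.lance")

# Build a 10-row dataset, add a string column, and write it through the sink.
ray.data.range(10).map(
    lambda x: {"id": x["id"], "str": f"str-{x['id']}"}
).write_datasink(sink)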
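
To check that a write succeeded, the resulting dataset can be opened with the regular Lance Python API. The following is a minimal sketch, not part of the original example. It assumes a local temporary directory in place of the S3 URI, and the helper names add_str_column and write_and_read_back are hypothetical. It uses lance.dataset and count_rows from the Lance Python API to read the data back.

```python
import tempfile
from pathlib import Path


def add_str_column(row: dict) -> dict:
    """Row-wise transform: keep "id" and add a derived string column."""
    return {"id": row["id"], "str": f"str-{row['id']}"}


def write_and_read_back(num_rows: int = 10) -> int:
    """Write a small Ray Dataset to a local Lance dataset and count its rows.

    Hypothetical helper for illustration; the local path is an assumption
    standing in for the S3 URI used in the example above.
    """
    # Imported here so add_str_column can be used without Ray installed.
    import lance
    import ray
    from lance.ray.sink import LanceDatasink

    # A fresh temporary directory, so no existing dataset is in the way.
    uri = str(Path(tempfile.mkdtemp()) / "data.lance")

    # Same pipeline as above, with a named function instead of a lambda.
    sink = LanceDatasink(uri)
    ray.data.range(num_rows).map(add_str_column).write_datasink(sink)

    # Open the result with the Lance API to confirm the rows were written.
    return lance.dataset(uri).count_rows()
```

Calling write_and_read_back() should return the number of rows that were written. Using a named function for the transform, rather than a lambda, also makes the row logic easy to test on its own without starting Ray.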