Lance ❤️ Ray
Ray effortlessly scales ML workloads up to large distributed compute environments.
Ray Data can be written directly in Lance format by using the lance.ray.sink.LanceDatasink class. First, install Lance with the Ray extra:
pip install pylance[ray]
A Ray Data Dataset can then be written to Lance format using the following code:
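import ray
from lance.ray.sink import LanceDatasink

ray.init()

# The sink targets the Lance dataset URI that Ray will write to.
sink = LanceDatasink("s3://bucket/to/data.lance")

# Build a 10-row dataset, add a derived string column, and write it through the sink.
ray.data.range(10).map(
    lambda x: {"id": x["id"], "str": f"str-{x['id']}"}
).write_datasink(sink)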
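To check that a write like the one above landed, one option is to open the result with Lance and count its rows. The sketch below is a minimal illustration, not part of the Lance API: `to_row` and `write_and_count` are hypothetical helper names, and it assumes `pylance[ray]` is installed and that the target URI does not already hold a dataset.

```python
def to_row(x: dict) -> dict:
    """Same mapping as the lambda above: keep `id`, add a derived string column."""
    return {"id": x["id"], "str": f"str-{x['id']}"}


def write_and_count(uri: str, n: int = 10) -> int:
    """Hypothetical helper: write n rows to `uri`, then count them back with Lance."""
    # Imported lazily so `to_row` can be used without Ray or Lance installed.
    import lance
    import ray
    from lance.ray.sink import LanceDatasink

    # Avoid an error if Ray was already initialized in this process.
    ray.init(ignore_reinit_error=True)

    # Same pipeline as above, with the lambda replaced by a named function.
    ray.data.range(n).map(to_row).write_datasink(LanceDatasink(uri))

    # Open the written dataset and report how many rows it holds.
    return lance.dataset(uri).count_rows()
```

Calling `write_and_count("/tmp/ray_example.lance")` (a hypothetical local path, used here in place of the S3 URI) should return the number of rows written. A named function instead of a lambda also makes the row mapping easy to unit-test on its own.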