Working with SQL
You can use DuckDB and Apache Datafusion to query your LanceDB tables using SQL. This guide will show how to query Lance tables them using both.
We will re-use the dataset created previously:
import lancedb
db = lancedb.connect("data/sample-lancedb")
data = [
{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
{"vector": [5.9, 26.5], "item": "bar", "price": 20.0}
]
table = db.create_table("pd_table", data=data)
Querying a LanceDB Table with DuckDb
The to_lance
method converts the LanceDB table to a LanceDataset
, which is accessible to DuckDB through the Arrow compatibility layer.
To query the resulting Lance dataset in DuckDB, all you need to do is reference the dataset by the same name in your SQL query.
βββββββββββββββ¬ββββββββββ¬βββββββββ
β vector β item β price β
β float[] β varchar β double β
βββββββββββββββΌββββββββββΌβββββββββ€
β [3.1, 4.1] β foo β 10.0 β
β [5.9, 26.5] β bar β 20.0 β
βββββββββββββββ΄ββββββββββ΄βββββββββ
Querying a LanceDB Table with Apache Datafusion
Have the required imports before doing any querying.
Register the table created with the Datafusion session context.
βββββββββββββββ¬ββββββββββ¬βββββββββ
β vector β item β price β
β float[] β varchar β double β
βββββββββββββββΌββββββββββΌβββββββββ€
β [3.1, 4.1] β foo β 10.0 β
β [5.9, 26.5] β bar β 20.0 β
βββββββββββββββ΄ββββββββββ΄βββββββββ