Quick start

LanceDB can be run in a number of ways:

  • Embedded within an existing backend (like your Django, Flask, Node.js or FastAPI application)
  • Accessed directly from a client application, such as a Jupyter notebook, for analytical workloads
  • Deployed as a remote serverless database

Installation

pip install lancedb
npm install vectordb

The Rust SDK is experimental and may introduce breaking changes in the near future.

cargo add vectordb

To use the vectordb crate, you first need to install protobuf.

brew install protobuf
sudo apt install -y protobuf-compiler libssl-dev

Also make sure you are using the same version of Arrow as the vectordb crate.

How to connect to a database

import lancedb
uri = "data/sample-lancedb"
db = lancedb.connect(uri)
// ES modules / TypeScript
import * as lancedb from "vectordb";
import { Schema, Field, Float32, FixedSizeList, Int32, Float16, Utf8 } from "apache-arrow";

// CommonJS alternative to the first import (use one or the other, not both):
// const lancedb = require("vectordb");
const uri = "data/sample-lancedb";
const db = await lancedb.connect(uri);
#[tokio::main]
async fn main() -> Result<()> {
}

See examples/simple.rs for a full working example.

LanceDB will create the directory if it doesn't exist (including parent directories).

If you need a reminder of the uri, you can call db.uri().

How to create a table

tbl = db.create_table("my_table",
                data=[{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
                      {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}])

If the table already exists, LanceDB will raise an error by default. If you want to overwrite the table, you can pass in mode="overwrite" to the create_table method.

You can also pass in a pandas DataFrame directly:

import pandas as pd
df = pd.DataFrame([{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
                   {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}])
tbl = db.create_table("table_from_df", data=df)
const tbl = await db.createTable(
  "myTable",
  [
    { vector: [3.1, 4.1], item: "foo", price: 10.0 },
    { vector: [5.9, 26.5], item: "bar", price: 20.0 },
  ],
  { writeMode: lancedb.WriteMode.Overwrite }
);

If the table already exists, LanceDB will raise an error by default. If you want to overwrite the table, pass { writeMode: lancedb.WriteMode.Overwrite } as the options argument to createTable, as in the example above.

use arrow_schema::{DataType, Schema, Field};
use arrow_array::{RecordBatch, RecordBatchIterator};

If the table already exists, LanceDB will raise an error by default.

Under the hood, LanceDB converts the input data into an Apache Arrow table and persists it to disk in Lance format.

Creating an empty table

Sometimes you may not have the data to insert into the table at creation time. In this case, you can create an empty table and specify the schema.

import pyarrow as pa
schema = pa.schema([pa.field("vector", pa.list_(pa.float32(), list_size=2))])
tbl = db.create_table("empty_table", schema=schema)
const schema = new Schema([
  new Field("id", new Int32()),
  new Field("name", new Utf8()),
]);
const empty_tbl = await db.createTable({ name: "empty_table", schema });

How to open an existing table

Once created, you can open a table using the following code:

tbl = db.open_table("my_table")
const tbl = await db.openTable("myTable");

If you forget the name of your table, you can always get a listing of all table names:

print(db.table_names())
console.log(await db.tableNames());

How to add data to a table

After a table has been created, you can always add more data to it using the add() method:

# Option 1: Add a list of dicts to a table
data = [{"vector": [1.3, 1.4], "item": "fizz", "price": 100.0},
        {"vector": [9.5, 56.2], "item": "buzz", "price": 200.0}]
tbl.add(data)

# Option 2: Add a pandas DataFrame to a table
df = pd.DataFrame(data)
tbl.add(df)
const newData = Array.from({ length: 500 }, (_, i) => ({
  vector: [i, i + 1],
  item: "fizz",
  price: i * 0.1,
}));
await tbl.add(newData);

How to search for (approximate) nearest neighbors

Once you've embedded the query, you can find its nearest neighbors using the following code:

tbl.search([100, 100]).limit(2).to_pandas()

This returns a pandas DataFrame with the results.

const query = await tbl.search([100, 100]).limit(2).execute();
use futures::TryStreamExt;

By default, LanceDB runs a brute-force scan over the dataset to find the K nearest neighbors (KNN). For tables with more than 50K vectors, creating an ANN index is recommended to speed up search.

tbl.create_index()
await tbl.createIndex({
  type: "ivf_pq",
  num_partitions: 2,
  num_sub_vectors: 2,
});

See the Approximate Nearest Neighbor (ANN) Indexes section for more details.
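To make the default brute-force scan concrete, here is a conceptual sketch in plain Python. It is not LanceDB's implementation: brute_force_knn is a hypothetical helper, and Euclidean (L2) distance is assumed as the metric. Every row is compared against the query, which is why the cost grows linearly with table size and why an index pays off on larger tables.

```python
import heapq
import math


def brute_force_knn(rows, query, k):
    """Return the k rows closest to `query` by Euclidean (L2) distance."""
    def distance(row):
        return math.dist(row["vector"], query)

    # Every row is scored once: O(n) distance computations per query.
    return heapq.nsmallest(k, rows, key=distance)


# The four rows used in the examples above.
rows = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
    {"vector": [1.3, 1.4], "item": "fizz", "price": 100.0},
    {"vector": [9.5, 56.2], "item": "buzz", "price": 200.0},
]

nearest = brute_force_knn(rows, [100, 100], k=2)
nearest_items = [row["item"] for row in nearest]
```

An ANN index trades a small amount of recall for avoiding this full scan.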

How to delete rows from a table

Use the delete() method on tables to delete rows from a table. To choose which rows to delete, provide a filter that matches on the metadata columns. This can delete any number of rows that match the filter.

tbl.delete('item = "fizz"')
await tbl.delete('item = "fizz"');

The deletion predicate is a SQL expression that supports the same expressions as the where() clause on a search. It can be as simple or as complex as needed. To see which expressions are supported, see the SQL filters section.

How to remove a table

Use the drop_table() method on the database to remove a table.

db.drop_table("my_table")

This permanently removes the table and is not recoverable, unlike deleting rows. By default, if the table does not exist an exception is raised. To suppress this, you can pass in ignore_missing=True.

await db.dropTable("myTable");

This permanently removes the table and is not recoverable, unlike deleting rows. If the table does not exist an exception is raised.


Bundling vectordb apps with Webpack

The vectordb module ships with a prebuilt native Node binary, so if you use it in JavaScript you must configure next.config.js to exclude it from the webpack bundle. This is required both when using Next.js and when deploying a LanceDB app on Vercel.

/** @type {import('next').NextConfig} */
module.exports = {
  webpack(config) {
    config.externals.push({ vectordb: "vectordb" });
    return config;
  },
};

What's next

This section covered the very basics of using LanceDB. If you're learning about vector databases for the first time, you may want to read the page on indexing to get familiar with the concepts.

If you've already worked with other vector databases, you may want to read the guides to learn how to work with LanceDB in more detail.