LanceDB Changelog
May 2025
Revamped LanceDB Cloud onboarding, added Umap visualization and improved performance for upsert
Features
- Reduced Commit Conflict Upsert: Upsert operations to the same table are now designed to be conflict-free under typical concurrent workloads, enabling more reliable and higher-throughput parallel data ingestion and updates.
- Added timeout parameter for
merge_insert
for better control over long-running upserts [lancedb#2378]
- Added timeout parameter for
- Reduced IOPS to object store:
- Optimized I/O Patterns for small tables: Significant improvements reduce total IOPS to the object store by up to 95%, especially benefiting small-table workloads. [lance#3764]
- Scan cache: Introduced a scan cache to further minimize object store IOPS and accelerate query performance. #enterprise
- Faster & More Reliable Indexing:
- Indexing is now more robust and efficient, with dynamic job sizing based on table and row size, plus increased retry logic for reliability.
- IVF_PQ indexing performance: Eliminated unnecessary data copying during index creation, resulting in faster PQ training and reduced memory usage. [lance#3894]
- Configurable Scan Concurrency: Query nodes now support configurable concurrency limits for scan requests to plan executors, allowing for better resource management in enterprise deployments. #enterprise
- New
grpc.concurrency_limit_per_connection
setting in the plan executor for fine-grained control. performance. #enterprise
- New
- Improved Enterprise Deployment:
- Automate deployment for AWS environments, making setup and scaling easier for enterprise users.
- GCP deployments now support configuration of weak consistency and concurrency limits for greater flexibility and cost control.
- Filter on
large_binary
column: Users can now filter on large binary columns in their queries. [lance#3797] - LanceDB Cloud UI:
- Revamped Cloud Onboarding: Streamlined LanceDB Cloud onboarding for a smoother user experience.
- Added UMAP visualization to help users visually explore embeddings in their tables.
Bug Fixes
- Upsert Page Size Calculation: Page size for upsert operations now correctly considers pod memory instead of node memory, reducing the risk of out-of-memory errors in the plan executor. #enterprise
- Scalar Index:
- Fixed incremental indexing for
LABEL_LIST
columns, ensuring scalar indices are updated correctly on data changes. - Addressed a bug in bitmap scalar index remapping, so partial remapping during compaction no longer drops rows unexpectedly.
- Fixed incremental indexing for
- Partition Count for Small Tables: Improved partitioning logic ensures the correct number of partitions for small tables, leading to more efficient queries.
- Accurate Error for Non-Existent Index: Dropping a non-existent index now returns an IndexNotFound error (instead of a TableNotFound error). [lancedb#2380]
- Index Consistency After Horizontal
merge_insert
:Any index fragments associated with modified data are now properly removed during a horizontal merge_insert, preventing index corruption and ensuring indices always reflect the current state of the data. [lancedb#3863] - Fixed issues in create_index with empty tables:
- Resolved an error that could occur when performing operations such as deleting the last row or creating an index on an empty table. The function now safely handles empty tables, preventing division-by-zero errors during these events.
- Corrected a bug where events could be dropped if processed in the same batch as other events for empty tables. This was caused by Lance datasets evaluating to False when empty; the check now properly distinguishes between None and empty datasets, ensuring all events are processed as intended.
April 2025
Enhanced Performance and Improved Version Control
Features
- Reduced Commit Conflicts: Mitigated upsert-induced table conflicts through client/server-side retry mechanisms.
- New SDK APIs:
table.tags.create/list/update/delete/checkout
: Enables semantic versioning through intuitive tagging instead of numeric versioningwait_for_index
: Ensures complete data indexing with configurablewait_timeout
.
- Performance Improvements:
- Query Latency: Eliminated full cache ring scans when nodes < replication factor.
- Full-Text Search: Introduced configurable FTS index prewarming to accelerate search operations.
enterprise
- Table Creation: Leveraged cached database connections to reduce overhead—ideal for bulk table creation scenarios.
- Session-Level Object Store Caching: Shared connection pools via weak references to object stores. Example: 100 tables → 1 S3 connection pool.
- IVF_PQ float64 support: Expanded vector indexing to float64 datasets (previously limited to float16/32).
- Full-text search on string arrays: Extended full-text search to support string array columns for efficient multi-value keyword search.
- UI - Enhanced Table Preview:
- One-click vector copying from cells
- Full row expansion on click (no truncation)
- Consistent right-panel binary data display
- UI - Direct Page Navigation: Added page selector for instant access to specific data pages in table preview.
Bug Fixes
- Query Performance:
- Fixed hybrid search distance range filtering to properly respect lower and upper bounds. [lancedb#2356]
- Enforced valid
k
parameters to prevent overflow. [lancedb#2354] BETWEEN clause
: Improved BETWEEN query handling to return 0 results whenstart
>end
instead of panicking. [lance#3706]
- SDK/API Fixes:
- Fixed TypeScript SDK support for FTS advanced features (fuzzy search and boosting). [lancedb#2314]
- Added scalar index support for small FixedSizeBinary columns (e.g., UUID). [lancedb#2297]
- Index:
- Fixed IVF_PQ index on vector columns with NaN/INFs. [lance#3648]
- Resolved GPU-based indexing crashes on non-contiguous arrays. [lance#3675]
enterprise
- Fixed B-tree index corruption during null remapping. [lance#3704]
- Cloud Dashboard: Optimized chart rendering for better visibility in the Cloud usage dashboard.
March 2025
Enhanced Full-Text Search and Advanced Query Debugging Features
Features
- Enhanced Full-Text Search (FTS): Fuzzy Search & Boosting Now Available: it improves user experience with resilient, typo-tolerant searches and can surface the most contextually relevant results faster.
- New SDK APIs:
explain_plan
: Diagnose query performance and debug unexpected results by inspecting the execution plan.analyze_plan
: Analyze query execution metrics to optimize performance and resource usage. Such metrics include execution time, number of rows processed, I/O stats, and more.restore
: Revert to a specific prior version of your dataset and modify it from a verified, stable state.
- Scalar Indexing for Extended Data Types: LanceDB now supports scalar indexing on UUID columns of FixedSizeBinary type.
- Binary vector support in TypeScript SDK: LanceDB's TypeScript SDK now natively supports binary vector indexing and querying with production-grade efficiency.
- Support S3-compatible object store: Extended LanceDB Enterprise deployment to work with S3-compatible object stores, such as Tigris, Minio and etc.
enterprise
Bug Fixes
- Improved Merge insert performance: Enhanced the merge-insert operation to reduce error rates during upsert operations, improving data reliability.[lance#3603].
- Batch Ingestion Error: Updated error codes for batch ingestion failures from 500 to 409 to accurately reflect resource conflict scenarios.
- Cloud Signup Workflow Fix: Resolved an issue where users encountered a blank page after organization creation during the Cloud signup process.
- Table Preview Data Display Issue: Fixed a bug causing the table preview page to show a generic "Something went wrong" error for datasets containing
datetime
columns, which previously prevented data inspection. - Rate Limit Error Clarity: Added explicit error messaging for system rate limits (e.g., API keys per table), replacing vague notifications that confused users.
February 2025
Multivector Search ready and Table data preview available in Cloud UI
Features
- Multivector Search is now live: documents can be stored as contextualized vector lists. Fast multi-vector queries are supported at scale, powered by our XTR optimization.
Drop_index
added to SDK: users can remove unused or outdated indexes from your tables.- Explore Your Data at a Glance: preview sample data from any table with a single click.
lancedb-cloud
- Search by Project/Table in Cloud UI: allow users to quickly locate the desired project/table.
lancedb-cloud
Bug Fixes
- FTS stability fix: Resolved a crash in Full Text Search (FTS) during flat-mode searches.
prefilter
parameter enforcement: Fixed a bug where the prefilter parameter was not honored in FTS queries.- Vector index bounds error: Addressed an out-of-bounds indexing issue during vector index creation.
distance_range()
compatibility: Fixed errors when performing vector searches withdistance_range()
on unindexed rows.- Error messaging improvements: Replaced generic HTTP 500 errors with detailed, actionable error messages for easier debugging.
January 2025
Support Hamming Distance and GPU based indexing ready
Features
- Support Hamming distance and binary vector: Added
hamming
as a distance metric (joiningl2
,cosine
,dot
) for binary vector similarity search. - GPU-Accelerated IVF-PQ indexing: Build IVF-PQ indexes 10x faster.
enterprise
- AWS Graviton 4 & Google Axion build optimizations: ARM64 SIMD acceleration cuts query costs.
enterprise
- float16 Vector Index Supports: reduce storage size while maintaining search quality.
- Self-Serve Cloud Onboarding: new workflow-based UI guides users for smooth experience.
lancedb-cloud
Bug Fixes
list_indices
andindex_stats
now always fetch the latest version of the table by default unless a specific version is explicitly provided.- Error message fix: Improved clarity for cases where
create_index
is called to create a vector index on tables with fewer than 256 rows. - TypeScript SDK fixes: Resolved an issue where
createTable()
failed to correctly save embeddings and formergeInsert
not utilizing saved embeddings. lancedb#2065 - Multi-vector schema inference: Addressed an issue where the vector column could not be inferred for multi-vector indexes. lancedb#2026
- Hybrid search consistency: Fixed a discrepancy where hybrid search returned distance values inconsistent with standalone vector search. lancedb#2061
December 2024
Performant SQL queries at scale and more cost-effective vector search
Features
- Run SQL with massive datasets: added Apache Arrow flight-SQL protocol to run SQL queries with billions of data and return in seconds.
enterprise
. - Accelerate vector search: added our Quantinized-IVF algorithm and other optimization techniques to improve QPS/core.
enterprise
- Azure Stack Router Deployment: route traffic efficiently to serve low query latency.
enterprise
- Distance range filtering: filter query results using
distance_range()
to return search results with a lowerbound, upperbound or a range [lance#3326]. - Full-Text Search(FTS) indexing options: configure tokenizers, stopword lists and more at FTS index creation.
Bug Fixes
- Full-text search parameters: Fixed an issue where full-text search index configurations were not applied correctly. lancedb#1928
- Float16 vector queries: Addressed a bug preventing the use of lists of Float16 values in vector queries. lancedb#1931
- Versioned
checkout
API: Resolved inconsistencies in thecheckout
method when specifying theversion
parameter. lancedb#1988 - Table recreation error: Fixed an issue where dropping and recreating a table with the same name resulted in a table creation error.