Pydantic
Pydantic is a data validation library in Python. LanceDB integrates with Pydantic for schema inference, data ingestion, and query result casting.
Schema
LanceDB can create an Apache Arrow Schema from a Pydantic BaseModel with the pydantic_to_schema() function.
lancedb.pydantic.pydantic_to_schema
pydantic_to_schema(model: Type[BaseModel]) -> Schema
Convert a Pydantic model to a PyArrow Schema.
Parameters:
- model (Type[BaseModel]): The Pydantic BaseModel to convert to Arrow Schema.
Returns:
- Schema
Examples:
>>> from typing import List
>>> import pyarrow as pa
>>> import pydantic
>>> from lancedb.pydantic import pydantic_to_schema
>>> class FooModel(pydantic.BaseModel):
...     id: int
...     s: str
...     vec: List[float]
...     li: List[int]
...
>>> schema = pydantic_to_schema(FooModel)
>>> assert schema == pa.schema([
...     pa.field("id", pa.int64(), False),
...     pa.field("s", pa.utf8(), False),
...     pa.field("vec", pa.list_(pa.float64()), False),
...     pa.field("li", pa.list_(pa.int64()), False),
... ])
Vector Field
LanceDB provides a Vector(dim) function to define a vector field in a Pydantic model.
lancedb.pydantic.Vector
Vector(dim: int, value_type: DataType = pa.float32(), nullable: bool = True) -> Type[FixedSizeListMixin]
Pydantic Vector Type.
Warning
Experimental feature.
Parameters:
- dim (int): The dimension of the vector.
- value_type (DataType, default: pa.float32()): The value type of the vector.
- nullable (bool, default: True): Whether the vector is nullable.
Examples:
>>> import pyarrow as pa
>>> import pydantic
>>> from lancedb.pydantic import Vector, pydantic_to_schema
>>> class MyModel(pydantic.BaseModel):
...     id: int
...     url: str
...     embeddings: Vector(768)
...
>>> schema = pydantic_to_schema(MyModel)
>>> assert schema == pa.schema([
...     pa.field("id", pa.int64(), False),
...     pa.field("url", pa.utf8(), False),
...     pa.field("embeddings", pa.list_(pa.float32(), 768))
... ])
Type Conversion
LanceDB automatically converts Pydantic fields to Apache Arrow DataTypes.
Currently supported type conversions:
Pydantic Field Type | PyArrow Data Type |
---|---|
int | pyarrow.int64 |
float | pyarrow.float64 |
bool | pyarrow.bool_ |
str | pyarrow.utf8() |
list | pyarrow.List |
BaseModel | pyarrow.Struct |
Vector(n) | pyarrow.FixedSizeList(float32, n) |
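To make the table concrete, here is a minimal standalone sketch of how such a mapping can be applied recursively to type annotations. It is an illustration only, not LanceDB's implementation: it uses just the standard library, dataclasses stand in for nested BaseModel classes, the Arrow types are written as descriptive strings rather than real pyarrow objects, and the names arrow_type_name, SCALAR_TYPES, Location, and Place are hypothetical. The Vector(n) row is left out.

```python
from dataclasses import dataclass, is_dataclass
from typing import List, get_args, get_origin, get_type_hints

# Scalar rows of the conversion table, written as illustrative type names.
SCALAR_TYPES = {int: "int64", float: "float64", bool: "bool", str: "utf8"}


def arrow_type_name(annotation) -> str:
    """Return an illustrative Arrow type name for a Python annotation."""
    if annotation in SCALAR_TYPES:
        return SCALAR_TYPES[annotation]
    if get_origin(annotation) is list:
        # List[T] maps to a list whose item type is the mapping of T.
        (item,) = get_args(annotation)
        return f"list<{arrow_type_name(item)}>"
    if is_dataclass(annotation):
        # A nested model maps to a struct with one entry per field.
        inner = ", ".join(
            f"{name}: {arrow_type_name(tp)}"
            for name, tp in get_type_hints(annotation).items()
        )
        return f"struct<{inner}>"
    raise TypeError(f"unsupported annotation: {annotation!r}")


@dataclass
class Location:
    lat: float
    lon: float


@dataclass
class Place:
    id: int
    name: str
    open_now: bool
    tags: List[str]
    location: Location


# Map every field of the outer model to its illustrative Arrow type name.
place_types = {n: arrow_type_name(t) for n, t in get_type_hints(Place).items()}
```

The recursion is the point of the sketch: list items and nested model fields are converted with the same rules as top-level fields, so arbitrarily nested combinations of the supported types resolve to a single Arrow type.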