IBM watsonx.ai Embeddings
Generate text embeddings using IBM's watsonx.ai platform.
Supported Models
You can find a list of supported models at IBM watsonx.ai Documentation. The currently supported model names are:
ibm/slate-125m-english-rtrvr
ibm/slate-30m-english-rtrvr
sentence-transformers/all-minilm-l12-v2
intfloat/multilingual-e5-large
Parameters
The following parameters can be passed to the create
method:
Parameter | Type | Default Value | Description |
---|---|---|---|
name | str | "ibm/slate-125m-english-rtrvr" | The model ID of the watsonx.ai model to use |
api_key | str | None | Optional IBM Cloud API key (or set WATSONX_API_KEY ) |
project_id | str | None | Optional watsonx project ID (or set WATSONX_PROJECT_ID ) |
url | str | None | Optional custom URL for the watsonx.ai instance |
params | dict | None | Optional additional parameters for the embedding model |
Usage Example
First, the watsonx.ai library is an optional dependency, so must be installed seperately:
Optionally set environment variables (if not passing credentials to create
directly):
import os
import lancedb
from lancedb.pydantic import LanceModel, Vector
from lancedb.embeddings import EmbeddingFunctionRegistry
watsonx_embed = EmbeddingFunctionRegistry
.get_instance()
.get("watsonx")
.create(
name="ibm/slate-125m-english-rtrvr",
# Uncomment and set these if not using environment variables
# api_key="your_api_key_here",
# project_id="your_project_id_here",
# url="your_watsonx_url_here",
# params={...},
)
class TextModel(LanceModel):
text: str = watsonx_embed.SourceField()
vector: Vector(watsonx_embed.ndims()) = watsonx_embed.VectorField()
data = [
{"text": "hello world"},
{"text": "goodbye world"},
]
db = lancedb.connect("~/.lancedb")
tbl = db.create_table("watsonx_test", schema=TextModel, mode="overwrite")
tbl.add(data)
rs = tbl.search("hello").limit(1).to_pandas()
print(rs)