Different ways to embed data into vectors using embedding models with the LangChain framework

OpenAI Embeddings API
Using the OpenAI API with openai.Embedding.create()
Using OpenAIEmbeddings from LangChain

Ollama (Local Self-hosted Models)
Using the Ollama API with ollama.embed
Using OllamaEmbeddings from LangChain

Hugging Face (Transformers & Sentence-BERT)
Using the HuggingFaceEmbeddings wrapper class
Using the Hugging Face API (direct HTTP request)

OpenAI Embeddings API

There are two ways:

Using the OpenAI API with openai.Embedding.create()
Using OpenAIEmbeddings from LangChain

Using OpenAI API with openai.Embedding.create()
OpenAI provides an API to generate embeddings using models such as text-embedding-ada-002. This method is cloud-based and requires an API key. (The openai.Embedding.create() call shown below is the interface of the pre-1.0 openai Python SDK; newer SDK versions expose the same functionality as client.embeddings.create().)

import openai

# Set your OpenAI API key
openai.api_key = 'your-api-key-here'

def get_openai_embedding(text):
    # Use the text-embedding-ada-002 model to get text embeddings
    response = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=text
    )
    embedding = response['data'][0]['embedding']
    return embedding

# Example usage
text = "OpenAI provides cutting-edge AI models."
embedding = get_openai_embedding(text)
print(embedding)

output

Printing the raw response inside the function (print(response)) shows:
{
    "object": "list",
    "data": [
        {
            "object": "embedding",
            "embedding": [0.001, -0.003, 0.024, ...],  // The embedding vector
            "index": 0
        }
    ],
    "model": "text-embedding-ada-002",
    "usage": {
        "prompt_tokens": 5,
        "total_tokens": 5
    }
}
And print(embedding) shows just the vector:

[0.001, -0.003, 0.024, 0.010, -0.012, ...]


Embedding multiple documents using the OpenAI API

import openai

# Set your OpenAI API key
openai.api_key = 'your-api-key-here'

def get_openai_embeddings(texts):
    # Request embeddings for multiple documents
    response = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=texts
    )
    # Extract embeddings for each document
    embeddings = [item['embedding'] for item in response['data']]
    return embeddings

# Example usage
documents = [
    "Alpha is the first letter of the Greek alphabet",
    "Beta is the second letter of the Greek alphabet"
]
embeddings = get_openai_embeddings(documents)

# Print the embedding for the second document
print(embeddings[1])

Advantages:
High Quality: OpenAI’s models (like text-embedding-ada-002) are well-optimized for generating highly accurate and meaningful embeddings.
No Setup Required: You don’t need to worry about model architecture, training, or running inference on your hardware.
Scalability: OpenAI’s API is highly scalable for large volumes of requests.
Disadvantages:
Cost: Using OpenAI’s API incurs ongoing costs, especially with large volumes of data or frequent API calls.
Dependency: Requires a stable internet connection and a valid API key, meaning you depend on OpenAI’s availability and service uptime.
Data Privacy: Data sent to OpenAI's servers is processed externally, which may not be suitable for highly sensitive information.
When to Use:
Best for Commercial Applications: If you want a powerful, pre-trained model with high-quality embeddings without the need for infrastructure setup.
When Scalability is Needed: If your application needs to handle a large number of requests efficiently without worrying about infrastructure.

Using OpenAIEmbeddings from LangChain

import os
from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings

# Load environment variables (expects OPENAI_API_KEY in .env)
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

text = "This is a tutorial on OpenAI embeddings"
query_result = embeddings.embed_query(text)

output

Inspecting the embeddings object shows its configuration:
OpenAIEmbeddings(client=<openai.resources.embeddings.Embeddings object at 0x000001A000D82C20>, async_client=<openai.resources.embeddings.AsyncEmbeddings object at 0x000001A003551630>, model='text-embedding-3-large', dimensions=None, deployment='text-embedding-ada-002', openai_api_version='', openai_api_base=None, openai_api_type='', openai_proxy='', embedding_ctx_length=8191, openai_api_key=SecretStr('**********'), openai_organization=None, allowed_special=None, disallowed_special=None, chunk_size=1000, max_retries=2, request_timeout=None, headers=None, tiktoken_enabled=True, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={}, skip_empty=False, default_headers=None, default_query=None, retry_min_seconds=4, retry_max_seconds=20, http_client=None, http_async_client=None, check_embedding_ctx_length=True)
And query_result holds the vector:
[0.001956823281943798,
 0.041745562106370926,
 -0.013878178782761097,
 -0.039858128875494,
 0.023981492966413498,
 0.004118349868804216,
 0.016626058146357536,
 -0.01966537907719612,
 0.005950269289314747,
 -0.003684656461700797,
 ...]
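text-embedding-3-large produces 3072-dimensional vectors by default, which is easy to confirm:

print(len(query_result))  # 3072 by default for text-embedding-3-large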

Controlling the Embedding Dimension

The dimensions parameter of the text-embedding-3 models shortens the output vector, which is useful for cheaper storage and faster similarity search. It cannot exceed the model's native size (3072 for text-embedding-3-large), so a value like 4096 would be rejected.

from langchain_openai import OpenAIEmbeddings

# Request a reduced dimension (e.g. 1024)
embeddings_1024 = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=1024)
text = "Deep learning is a powerful technique in machine learning"
query_result = embeddings_1024.embed_query(text)

# Output the embedding and its length
print(query_result)
print(len(query_result))  # 1024

Embedding multiple documents using OpenAIEmbeddings

from langchain_openai import OpenAIEmbeddings

# Initialize OpenAIEmbeddings with the desired model
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

# Embed multiple documents
r1 = embeddings.embed_documents([
    "Alpha is the first letter of the Greek alphabet",
    "Beta is the second letter of the Greek alphabet"
])

# Print the embedding for the second document
print(r1[1])
Output:

[0.0034, -0.0012, 0.0051, ..., 0.0023]  # A list of 1536 float values

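Once you have vectors, comparing them is the typical next step. A minimal sketch (assuming numpy is installed and reusing r1 from the example above) that computes cosine similarity between the two document embeddings:

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the two vectors divided by their norms
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(r1[0], r1[1]))  # close to 1.0 for such similar sentences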

Ollama (Local Self-hosted Models)

Ollama provides AI models that can be run locally, offering a more flexible solution. You can either use a local Ollama server or access models via the Python API. This is particularly useful for those who want to avoid cloud costs or need more control over data privacy.

There are two ways:

Using the Ollama API with ollama.embed
Using OllamaEmbeddings from LangChain

Both assume a running Ollama server with the embedding model already pulled; a sketch of that setup follows.
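A minimal sketch of pulling the model with the ollama Python client (assumes the Ollama server is running locally on its default port):

import ollama

# Download the embedding model if it is not already present
ollama.pull("mxbai-embed-large")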

Embedding a single document using the Ollama API with ollama.embed

import ollama

def get_ollama_embedding(text):
    # ollama.embed returns vectors under the "embeddings" key (a list of
    # lists, even for a single input), so take the first one
    response = ollama.embed(model="mxbai-embed-large", input=text)
    return response['embeddings'][0]

# Example usage
text = "Ollama provides powerful local models."
embedding = get_ollama_embedding(text)
print(embedding)
Output:

[0.0025, -0.0043, 0.0011, 0.0078, -0.0065, ..., 0.0032]

Embedding multiple documents using the Ollama API

import ollama

def get_ollama_embeddings(texts):
    # Request embeddings for multiple documents, one call per document
    embeddings = []  # To store the embeddings
    for text in texts:
        response = ollama.embed(model="mxbai-embed-large", input=text)
        embeddings.append(response['embeddings'][0])
    return embeddings

# Example usage
documents = [
    "Ollama provides powerful local models.",
    "It allows efficient embeddings and processing of text.",
    "Embeddings are used for various NLP tasks."
]
embeddings = get_ollama_embeddings(documents)

# Print embeddings for each document
for i, embedding in enumerate(embeddings):
    print(f"Document {i + 1}: {embedding[:5]}...")  # Print the first 5 values for brevity

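ollama.embed also accepts a list of inputs, so the per-document loop above can be collapsed into a single call; a minimal sketch under the same assumptions:

import ollama

def get_ollama_embeddings_batch(texts):
    # One request for all documents; "embeddings" holds one vector per input
    response = ollama.embed(model="mxbai-embed-large", input=texts)
    return response['embeddings']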

How to embed a single document using OllamaEmbeddings

from langchain_community.embeddings import OllamaEmbeddings

# Initialize OllamaEmbeddings
embeddings = OllamaEmbeddings(model="gemma:2b")  # defaults to llama2 if no model is specified

# Single document to embed
text = "Alpha is the first letter of the Greek alphabet."

# Embed the single document
embedding = embeddings.embed_query(text)

# Print the embedding for the single document
print(embedding)


Embedding multiple documents using the OllamaEmbeddings wrapper class

from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="gemma:2b")  # defaults to llama2 if no model is specified

r1 = embeddings.embed_documents([
    "Alpha is the first letter of the Greek alphabet",
    "Beta is the second letter of the Greek alphabet",
])

print(r1[1])

Output:
[-2.3592045307159424,
 -0.8716640472412109,
 -0.22409206628799438,
 2.4858193397521973, 
 1.262244462966919,
 -1.7182726860046387,
 -0.11123710125684738,
 0.7157507538795471,
 1.6657123565673828,
 -0.8437683582305908,
 0.7881910800933838,
 0.3762670159339905,
 ...]


Advantages:
Local Processing: Since it runs locally, data does not need to leave your premises, providing better privacy and security.
No API Costs: There are no ongoing costs associated with API calls, as you can run the models on your own hardware.
Customizability: Ollama allows more flexibility in customizing the models and deploying them in specific environments.
Disadvantages:
Requires Local Hardware: To run models locally, you need adequate hardware (GPUs or TPUs for larger models).
Complex Setup: Setting up Ollama models locally can be more complex than using a cloud-based API.
Performance: Local models may not always match the performance and optimization of cloud-based models like OpenAI, especially for large-scale tasks.
When to Use:
When Data Privacy is Critical: If you're working with sensitive information and cannot afford to send data to external servers.
When Cloud Services are Not Feasible: If you are working in a restricted environment where internet access or API usage is limited, running Ollama locally can be a good alternative.
For Cost-Effective Solutions: If you have the hardware to run models and want to avoid ongoing costs from API usage.

Hugging Face (Transformers & Sentence-BERT)

Hugging Face provides a broad range of transformer models, including Sentence-BERT, which is widely used for sentence embeddings. You can use Hugging Face's transformers library to easily download and use pre-trained models for generating embeddings.

Using HuggingFaceEmbeddings Wrapper Class

import os
from dotenv import load_dotenv
from langchain_huggingface import HuggingFaceEmbeddings

# Load all the environment variables (expects HF_TOKEN in .env)
load_dotenv()
os.environ['HF_TOKEN'] = os.getenv("HF_TOKEN")

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

text = "this is a test document"
query_result = embeddings.embed_query(text)
print(query_result)
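The same wrapper embeds multiple documents through embed_documents, mirroring the OpenAI and Ollama examples above; a minimal sketch reusing the embeddings object:

doc_results = embeddings.embed_documents([
    "Alpha is the first letter of the Greek alphabet",
    "Beta is the second letter of the Greek alphabet",
])
print(len(doc_results), len(doc_results[0]))  # 2 vectors of 384 floats each (all-MiniLM-L6-v2)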

Using the Hugging Face API (direct HTTP request)

import os
from dotenv import load_dotenv
import requests

# Load environment variables from .env file
load_dotenv()
HF_TOKEN = os.getenv("HF_TOKEN")  # Your Hugging Face API Token

# API URL for the embedding model
API_URL = "https://api-inference.huggingface.co/models/sentence-transformers/all-MiniLM-L6-v2"

# Headers with the authorization token
headers = {
    "Authorization": f"Bearer {HF_TOKEN}"
}

# Text to embed
text = "this is a test document"

# API Request
def get_huggingface_embedding(text):
    response = requests.post(API_URL, headers=headers, json={"inputs": text})
    response.raise_for_status()  # Raise an error if the request fails
    return response.json()  # Return the embedding

# Generate the embedding
embedding = get_huggingface_embedding(text)

# Print the embedding
print("Embedding for the text:", text)
print(embedding[:5], "...")  # Display first 5 values for brevity

output

Embedding for the text: this is a test document
[0.1234, -0.0023, 0.0456, 0.0678, -0.0031] ...
Using sentence-transformers locally

Instead of calling the HTTP API, the same model can run locally with the sentence-transformers library:

from sentence_transformers import SentenceTransformer

# Load the pre-trained model for sentence embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

def get_huggingface_embedding(text):
    # Get the vector embedding for the text
    embedding = model.encode(text)
    return embedding

# Example usage
text = "Hugging Face offers powerful NLP models."
embedding = get_huggingface_embedding(text)
print(embedding)
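model.encode also accepts a list of sentences and returns one vector per sentence; a minimal sketch reusing the model above:

sentences = [
    "Hugging Face offers powerful NLP models.",
    "Sentence-BERT produces sentence-level embeddings.",
]
batch_embeddings = model.encode(sentences)
print(batch_embeddings.shape)  # (2, 384) for all-MiniLM-L6-v2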

Handling Multiple Documents

def get_huggingface_embeddings(texts):
    # Reuses API_URL and HF_TOKEN from the single-document example above
    headers = {"Authorization": f"Bearer {HF_TOKEN}"}
    response = requests.post(API_URL, headers=headers, json={"inputs": texts})
    response.raise_for_status()
    return response.json()

# Batch embedding
texts = [
    "Hugging Face provides state-of-the-art machine learning models.",
    "Transformers library is very powerful for NLP tasks."
]
embeddings = get_huggingface_embeddings(texts)

# Print embeddings for each document
for i, embedding in enumerate(embeddings):
    print(f"Embedding for document {i+1}: {embedding[:5]} ...")

Wrapper method (batch processing) with the transformers library directly:

import torch
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and model; mean-pooled hidden states serve as embeddings
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

def get_huggingface_embeddings_wrapper(texts):
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool the token embeddings to get one vector per document
    embeddings = outputs.last_hidden_state.mean(dim=1)
    return embeddings.tolist()

# Batch embedding
texts = [
    "Hugging Face provides state-of-the-art machine learning models.",
    "Transformers library is very powerful for NLP tasks."
]
embeddings = get_huggingface_embeddings_wrapper(texts)

# Print embeddings for each document
for i, embedding in enumerate(embeddings):
    print(f"Embedding for document {i+1}: {embedding[:5]} ...")


Advantages:
Open Source: Hugging Face offers a large collection of pre-trained models for various tasks, including sentence embeddings, which are open source and free to use.
Flexibility: You can fine-tune models on your own data or choose from a wide variety of pre-trained models for different NLP tasks.
Local or Cloud Execution: You can run the models locally or use Hugging Face’s API for cloud execution, giving you flexibility based on your requirements.
Disadvantages:
Requires Setup: You need to handle model management and possibly deal with hardware limitations when running models locally.
Model Size: Some transformer models are quite large, requiring significant memory and computation resources for inference.
Performance: While Hugging Face models are generally high-quality, they might not be as optimized for speed or accuracy as OpenAI's offerings.


SUMMARY

OpenAI Embeddings API
Using the OpenAI API with openai.Embedding.create() [single and multiple]
Using OpenAIEmbeddings from LangChain [single and multiple]

Ollama (Local Self-hosted Models)
Using the Ollama API with ollama.embed [single and multiple]
Using OllamaEmbeddings from LangChain [single and multiple]

Hugging Face (Transformers & Sentence-BERT)
Using the Hugging Face API (direct HTTP request) [single and multiple]

Key calls for the OpenAI API:

response = openai.Embedding.create(model="text-embedding-ada-002", input=text)
embedding = response['data'][0]['embedding']

For multiple documents, pass a list as input and extract each vector:

embeddings = [item['embedding'] for item in response['data']]
