Enhancing Retrieval Efficiency in RAG Pipelines with Hybrid Search | by Leonie Monigatti | Nov, 2023

How to find more relevant search results by combining traditional keyword-based search with modern vector search

Search bar with hybrid search capabilities

With the recent interest in Retrieval-Augmented Generation (RAG) pipelines, developers have started discussing the challenges of building RAG pipelines with production-ready performance. Just like in many aspects of life, the Pareto Principle also comes into play with RAG pipelines, where achieving the initial 80% is relatively straightforward, but attaining the remaining 20% for production readiness proves challenging.

Developers who’ve already gained experience building RAG pipelines have started sharing their insights. One commonly repeated theme is to improve the retrieval component of a RAG pipeline with hybrid search.

This article introduces you to the concept of hybrid search, how it can help you improve your RAG pipeline performance by retrieving more relevant results, and when to use it.

Hybrid search is a search technique that combines two or more search algorithms to improve the relevance of search results. Although it is not defined which algorithms are combined, hybrid search most commonly refers to the combination of traditional keyword-based search and modern vector search.

Traditionally, keyword-based search was the obvious choice for search engines. But with the advent of Machine Learning (ML) algorithms, vector embeddings enabled a new search technique, called vector or semantic search, that allowed us to search across data semantically. However, both search techniques have essential tradeoffs to consider:

  • Keyword-based search: While its exact keyword-matching capabilities are beneficial for specific terms, such as product names or industry jargon, it is sensitive to typos and synonyms, which can cause it to miss important context.
  • Vector or semantic search: While its semantic search capabilities allow multi-lingual and multi-modal search based on the data’s semantic meaning and make it robust to typos, it can miss essential keywords. Additionally, it depends on the quality of the generated vector embeddings and is sensitive to out-of-domain terms.

Combining keyword-based and vector searches into a hybrid search allows you to leverage the advantages of both search techniques to improve the relevance of search results, especially for text-search use cases.

For example, consider the search query “How to merge two Pandas DataFrames with .concat()?”. The keyword search would help find relevant results for the method .concat(). However, since the word “merge” has synonyms such as “combine”, “join”, and “concatenate”, it would be helpful if we could leverage the context awareness of semantic search (see more details in When Would You Use Hybrid Search).

If you are interested, you can play around with the different keyword-based, semantic, and hybrid search queries to search for movies in this live demo (its implementation is detailed in this article).

Hybrid search combines keyword-based and vector search techniques by fusing their search results and reranking them.

Keyword-based search

Keyword-based search in the context of hybrid search often uses a representation called sparse embeddings, which is why it is also called sparse vector search. Sparse embeddings are vectors with mostly zero values and only a few non-zero values, as shown below.

[0, 0, 0, 0, 0, 1, 0, 0, 0, 24, 3, 0, 0, 0, 0, ...]

Sparse embeddings can be generated with different algorithms. The most commonly used algorithm for sparse embeddings is BM25 (Best Match 25), which builds upon the TF-IDF (Term Frequency-Inverse Document Frequency) approach and refines it. In simple terms, BM25 emphasizes the importance of terms based on their frequency in a document relative to their frequency across all documents.
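To make the idea behind BM25 more concrete, here is a minimal sketch of an Okapi-BM25-style scoring function in plain Python. It is an illustration under assumptions, not the implementation used by any particular search engine: the whitespace tokenizer, the toy documents, and the parameter values k1 = 1.5 and b = 0.75 are chosen only for this example.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with a simplified Okapi BM25."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Terms that are rare across all documents get a higher weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates (k1) and is normalized by document length (b).
            tf = term_freq[term]
            tf_part = tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
            score += idf * tf_part
        scores.append(score)
    return scores


# Toy documents, made up for this example.
documents = [
    "concat joins two pandas dataframes",
    "merge two dataframes on a key column",
    "plot a histogram with matplotlib",
]
scores = bm25_scores("concat dataframes", documents)
for doc, score in zip(documents, scores):
    print(f"{score:.3f}  {doc}")
```

The first document matches both query terms and scores highest, the second matches only the more common term “dataframes”, and the third matches nothing and receives a score of zero.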

Vector search

Vector search is a modern search technique that has emerged with the advances in ML. Modern ML algorithms, such as Transformers, can generate a numerical representation of data objects in various modalities (text, images, etc.) called vector embeddings.

These vector embeddings are usually densely packed with information and mostly composed of non-zero values (dense vectors), as shown below. This is why vector search is also known as dense vector search.

[0.634, 0.234, 0.867, 0.042, 0.249, 0.093, 0.029, 0.123, 0.234, ...]

A search query is embedded into the same vector space as the data objects. Then, its vector embedding is used to calculate the closest data objects based on a specified similarity metric, such as cosine distance. The returned search results list the closest data objects ranked by their similarity to the search query.
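The ranking step can be sketched in a few lines of Python. The three-dimensional vectors below are made up for illustration (real embeddings come from an embedding model and have hundreds of dimensions), and the helper names are hypothetical. The sketch ranks by cosine similarity, which is simply 1 minus the cosine distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, similarity) pairs, most similar first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings, not produced by a real model.
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.7, 0.7, 0.1],
    "doc_c": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

results = vector_search(query_vec, doc_vecs)
for doc_id, similarity in results:
    print(doc_id, round(similarity, 3))
```

In practice, a vector database replaces this exhaustive comparison with an approximate nearest neighbor index, but the ranking principle stays the same.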

Fusion of keyword-based and vector search results

Both the keyword-based search and the vector search return a separate set of results, usually a list of search results sorted by their calculated relevance. These separate sets of search results must be combined.

There are many different strategies to combine the ranked results of two lists into one single ranking, as outlined in a paper by Benham and Culpepper [1].

Generally speaking, the search results are usually first scored. These scores can be calculated based on a specified metric, such as cosine distance, or simply the rank in the search results list.

Then, the calculated scores are weighted with a parameter alpha, which dictates each algorithm’s weighting and impacts the re-ranking of the results.

hybrid_score = (1 - alpha) * sparse_score + alpha * dense_score

Usually, alpha takes a value between 0 and 1, with

  • alpha = 1: Pure vector search
  • alpha = 0: Pure keyword search

Below, you can see a minimal example of the fusion between keyword and vector search with scoring based on the rank and an alpha = 0.5.

Minimal example of how keyword and vector search results can be fused in hybrid search with scoring based on ranking and an alpha of 0.5 (Image by the author, inspired by Hybrid search explained)
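The same kind of rank-based fusion can be sketched in Python. This is a minimal sketch under assumptions rather than the exact scheme of any search engine: each result is scored as 1 / rank, a document missing from one list gets a score of 0 for that list, and the document ids are made up.

```python
def rank_scores(ranking):
    """Turn a ranked list of document ids into scores of 1 / rank."""
    return {doc_id: 1.0 / rank for rank, doc_id in enumerate(ranking, start=1)}


def hybrid_fusion(keyword_ranking, vector_ranking, alpha=0.5):
    """Fuse two rankings with hybrid_score = (1 - alpha) * sparse + alpha * dense."""
    sparse = rank_scores(keyword_ranking)
    dense = rank_scores(vector_ranking)
    fused = {}
    for doc_id in set(sparse) | set(dense):
        # A document missing from one list contributes 0 for that list.
        fused[doc_id] = (1 - alpha) * sparse.get(doc_id, 0.0) + alpha * dense.get(doc_id, 0.0)
    # Highest score first; ties are broken alphabetically by id.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


# Hypothetical result lists from the two search techniques.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_c", "doc_a", "doc_d"]

fused_ranking = hybrid_fusion(keyword_ranking, vector_ranking, alpha=0.5)
print(fused_ranking)
```

With alpha = 0.5, doc_a comes out on top because it ranks well in both lists, while doc_c moves ahead of doc_b thanks to its first place in the vector search.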

A RAG pipeline has many knobs you can tune to improve its performance. One of these knobs is to improve the relevance of the retrieved context that is then fed into the LLM, because if the retrieved context is not relevant for answering a given question, the LLM won’t be able to generate a relevant answer either.

Depending on your context type and query, you have to determine which of the three search techniques is most beneficial for your RAG application. Thus, the parameter alpha, which controls the weighting between keyword-based and semantic search, can be viewed as a hyperparameter that needs to be tuned.
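One way to treat alpha as a hyperparameter is to sweep over a few values and compare a retrieval metric on a small labeled set. The sketch below assumes rank-based scoring (1 / rank) and uses mean reciprocal rank as the metric. The evaluation set, the document ids, and the helper names are hypothetical and only serve to illustrate the procedure.

```python
def fuse(keyword_ranking, vector_ranking, alpha):
    """Rank-based hybrid fusion: each list contributes weight / rank per document."""
    scores = {}
    for weight, ranking in ((1 - alpha, keyword_ranking), (alpha, vector_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / rank
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def mean_reciprocal_rank(eval_set, alpha):
    """Average of 1 / (position of the relevant document) over all queries."""
    total = 0.0
    for keyword_ranking, vector_ranking, relevant in eval_set:
        fused = fuse(keyword_ranking, vector_ranking, alpha)
        total += 1.0 / (fused.index(relevant) + 1)
    return total / len(eval_set)


# Hypothetical evaluation set: (keyword ranking, vector ranking, relevant doc id).
eval_set = [
    (["d1", "d2", "d3"], ["d3", "d1", "d2"], "d1"),
    (["d5", "d4", "d6"], ["d4", "d6", "d5"], "d4"),
    (["d7", "d8", "d9"], ["d8", "d9", "d7"], "d8"),
]

alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
results = {alpha: mean_reciprocal_rank(eval_set, alpha) for alpha in alphas}
best_alpha = max(results, key=results.get)

for alpha, mrr in results.items():
    print(f"alpha={alpha:.2f}  MRR={mrr:.3f}")
print("best alpha:", best_alpha)
```

On real data, the best value depends on your documents and queries, so the sweep should be run on a representative evaluation set rather than on a handful of examples.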

In a typical RAG pipeline using LangChain, you would define the retriever component by setting the used vectorstore component as the retriever with the .as_retriever() method as follows:

# Define and populate vector store
# See details here
vectorstore = ...

# Set vectorstore as retriever
retriever = vectorstore.as_retriever()

However, this method only enables semantic search. If you want to enable hybrid search in LangChain, you will need to define a specific retriever component with hybrid search capabilities, such as the WeaviateHybridSearchRetriever:

from langchain.retrievers.weaviate_hybrid_search import WeaviateHybridSearchRetriever

retriever = WeaviateHybridSearchRetriever(
    alpha = 0.5,               # defaults to 0.5, which is equal weighting between keyword and semantic search
    client = client,           # keyword arguments to pass to the Weaviate client
    index_name = "LangChain",  # The name of the index to use
    text_key = "text",         # The name of the text key to use
    attributes = [],           # The attributes to return in the results
)

The rest of the vanilla RAG pipeline will stay the same.

This small code change allows you to experiment with different weightings between keyword-based and vector searches. Note that setting alpha = 1 equals a fully semantic search and is the equivalent of defining the retriever from the vectorstore component directly (retriever = vectorstore.as_retriever()).

Hybrid search is ideal for use cases where you want to enable semantic search capabilities for a more human-like search experience but also require exact phrase matching for specific terms, such as product names or serial numbers.

An excellent example is the platform Stack Overflow, which has recently extended its search capabilities with semantic search by using hybrid search.

Initially, Stack Overflow used TF-IDF to match keywords to documents [2]. However, describing the coding problem you are trying to solve can be difficult. It may lead to different results based on the words you use to describe your problem (e.g., combining two Pandas DataFrames can be done with different methods such as merging, joining, and concatenating). Thus, a more context-aware search method, such as semantic search, would be more beneficial for these cases.

On the other hand, a common use case of Stack Overflow is to copy-paste error messages. For this case, exact keyword matching is the preferred search method. You will also want exact keyword-matching capabilities for method and argument names (e.g., .read_csv() in Pandas).

As you can guess, many similar real-world use cases benefit from context-aware semantic searches but still rely on exact keyword matching. These use cases can strongly benefit from implementing a hybrid search retriever component.

This article introduced the concept of hybrid search as a combination of keyword-based and vector searches. Hybrid search merges the search results of the separate search algorithms and re-ranks the search results accordingly.

In hybrid search, the parameter alpha controls the weighting between keyword-based and semantic searches. This parameter alpha can be viewed as a hyperparameter to tune in RAG pipelines to improve the accuracy of search results.

Using the Stack Overflow [2] case study, we showcased how hybrid search can be useful for use cases where semantic search can improve the search experience. However, exact keyword matching is still important when specific terms are frequent.
