Rescore only the top lexical candidates with vectors when full vector retrieval would be too broad or too slow.

Vector Rescoring for Faster Hybrid Search

Full vector retrieval is not always the best first step.

If a lexical or filtered query already narrows the result set to a few hundred documents, it is often faster and easier to:

run the normal query first
take the top N hits
rescore only that window with vector similarity

That is the same general idea as Elasticsearch’s rescore phase.

Why this helps

Hybrid search usually has two separate jobs:

lexical or structured retrieval finds the candidate set
vector similarity improves the order inside that candidate set

When those jobs are separated, vector work stays bounded. You do not spend time comparing embeddings for documents that were never strong candidates in the first place.

This is a good fit when:

filters already removed most of the corpus
lexical search returns a few hundred plausible hits
you want semantic reranking, not semantic-only retrieval

Use `VectorRescoreQuery`

VectorRescoreQuery wraps another query and rescales only the top window.

import {
  BoolQuery,
  DocumentIndex,
  MatchQuery,
  TextFieldIndex,
  VectorFieldIndex,
  VectorRescoreQuery,
  bigramVector,
  createSeededRandom
} from "@tryformation/querylight-ts";

const index = new DocumentIndex({
  title: new TextFieldIndex(),
  embedding: new VectorFieldIndex(8, 36 * 36, createSeededRandom(42))
});

index.index({ id: "1", fields: { title: ["vector search tutorial"] } });
index.index({ id: "2", fields: { title: ["vector search with filters"] } });
index.index({ id: "3", fields: { title: ["geo search guide"] } });

const embeddingIndex = index.getFieldIndex("embedding") as VectorFieldIndex;
embeddingIndex.insert("1", [bigramVector("vector embeddings tutorial")]);
embeddingIndex.insert("2", [bigramVector("filtered semantic retrieval")]);
embeddingIndex.insert("3", [bigramVector("map polygons and geohashes")]);

const baseQuery = new BoolQuery(
  [],
  [new MatchQuery("title", "vector search")],
  [],
  []
);

const hits = index.search(new VectorRescoreQuery(
  "embedding",
  bigramVector("filtered vector retrieval"),
  baseQuery,
  {
    windowSize: 100,
    queryWeight: 1.0,
    rescoreQueryWeight: 1.0
  }
));

What happens:

baseQuery determines the candidate list
only the top 100 results are rescored with vector similarity
results after that window keep their original lexical order

Choosing the window size

Start with a small window and expand only if you need more semantic movement near the top.

Typical ranges:

25 to 50 for tight UI search boxes
100 to 200 for broader docs or catalog search
larger only when the lexical stage is still too noisy near the top

If you already have a few hundred hits after filters, rescoring the top 100 is often a better tradeoff than running broad vector retrieval across all of them.

Hybrid search: cheap filters first, vector scoring second

This is usually the best high-level pattern:

apply cheap structured constraints first
optionally apply geo filtering
run lexical retrieval inside that reduced set
rescore the top window with vectors

The important idea is that boolean filters and geo constraints are often much cheaper than vector similarity over a broad candidate pool.

Example with filters and geo

import {
  BoolQuery,
  GeoPolygonQuery,
  MatchQuery,
  TermQuery,
  VectorRescoreQuery,
  bigramVector,
  rectangleToPolygon
} from "@tryformation/querylight-ts";

const baseQuery = new BoolQuery(
  [],
  [new MatchQuery("title", "coffee shop")],
  [
    new TermQuery("city", "berlin"),
    new GeoPolygonQuery("location", rectangleToPolygon(13.35, 52.48, 13.45, 52.55))
  ],
  []
);

const hits = index.search(new VectorRescoreQuery(
  "embedding",
  bigramVector("quiet place for espresso and work"),
  baseQuery,
  { windowSize: 75 }
));

This does three useful things:

non-Berlin documents are removed before vector work starts
documents outside the map area are removed before vector work starts
only the best lexical matches inside that filtered region are rescored

That is often the right pattern for:

local search
product catalogs with hard facet filters
documentation search scoped to one product area
recommendation flows with strict eligibility rules

Why this matters for performance

Suppose the full corpus has 50,000 documents.

a category filter might reduce that to 2,000
a geo filter might reduce that to 300
lexical ranking might put the best 75 candidates at the top
vector rescoring then runs only on those 75

That is a much smaller and more purposeful workload than trying to run vector retrieval as the first step across everything.

When to use rescoring instead of direct vector retrieval

Prefer rescoring when:

exact terms and filters matter
lexical recall is already decent
you want Elasticsearch-style reranking behavior

Prefer direct vector retrieval when:

the initial lexical query misses too many relevant documents
similarity is the main retrieval signal
you are doing related-content or recommendation lookup without a strong text query

Use Approximate Nearest Neighbor Vector Search for direct vector retrieval.
Use BoolQuery for Must, Should, Filter, MustNot, and MinimumShouldMatch to build the cheap prefilter stage.
Use Geo Indexing with Points and Polygons when a map or region constraint should narrow the candidate set before vector scoring.
Use Reciprocal Rank Fusion when you want to combine separate ranked lists instead of rescoring one candidate window.
Use Relevance Tuning with BM25, TF-IDF, and RRF when ranking quality is the broader problem.

Vector Rescoring for Faster Hybrid Search

Vector Rescoring for Faster Hybrid Search

Relevant APIs

Tags

Vector Rescoring for Faster Hybrid Search

Why this helps

Use `VectorRescoreQuery`

Choosing the window size

Hybrid search: cheap filters first, vector scoring second

Example with filters and geo

Why this matters for performance

When to use rescoring instead of direct vector retrieval

Relevant APIs

Tags

Vector Rescoring for Faster Hybrid Search

Why this helps

Use `VectorRescoreQuery`

Choosing the window size

Hybrid search: cheap filters first, vector scoring second

Example with filters and geo

Why this matters for performance

When to use rescoring instead of direct vector retrieval

Related patterns