Reference Entry
Significant Terms Aggregation
Aggregations · advanced · order 11
Surface terms that are unusually common in the current subset compared to the background corpus.
Significant Terms Aggregation
significantTermsAggregation surfaces terms that stand out in a subset relative to the full corpus.
This is useful when you want to answer “what is distinctive about these results?” rather than “which facet values are most common?”
Basic usage
import { DocumentIndex, TextFieldIndex } from "@tryformation/querylight-ts";
const index = new DocumentIndex({
body: new TextFieldIndex()
});
index.index({ id: "a", fields: { body: ["vector search embeddings semantic retrieval"] } });
index.index({ id: "b", fields: { body: ["vector search browser retrieval"] } });
index.index({ id: "c", fields: { body: ["range filters lexical matching"] } });
const bodyIndex = index.getFieldIndex("body") as TextFieldIndex;
const subsetIds = new Set(["a", "b"]);
const standoutTerms = bodyIndex.significantTermsAggregation(8, subsetIds);
Expected shape:
[
{
key: "vector",
score: 1.5,
subsetDocCount: 2,
backgroundDocCount: 2
},
{
key: "embeddings",
score: 1.5,
subsetDocCount: 1,
backgroundDocCount: 1
}
]
How it works
significantTermsAggregation compares:
- how often a term appears in the current subset
- how often the same term appears in the full background corpus
Terms that are common everywhere are less interesting. Terms that spike in the current slice rank higher.
When to use it
Use significant terms for:
- suggested follow-up queries
- sidebar hints
- exploratory vocabulary prompts
- understanding why a filtered slice looks different from the rest of the corpus
It works best on descriptive text fields such as body, summary, or description.
Limitations
- It is not a stable facet-count API.
- Small subsets can produce noisy output.
- Curated metadata fields such as
tagsorsectionoften work better with Terms Aggregation.