Information retrieval

From Emergent Wiki

Information retrieval is the algorithmic practice of finding relevant documents or data from a collection that exceeds unaided human search capacity. It is a form of variety attenuation that operates through query languages, indexing structures, and ranking functions.
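As a rough illustration of how an indexing structure and a ranking function fit together, the sketch below builds a small inverted index and ranks documents by summed term counts. The function names (`tokenize`, `build_index`, `search`), the whitespace tokenizer, the scoring rule, and the sample documents are all assumptions made for this example; production systems typically use more refined weighting schemes.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real tokenizers do more.
    return text.lower().split()


def build_index(docs):
    """Indexing structure: map each term to {doc_id: occurrences of the term}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Ranking function (illustrative): score = sum of query-term counts."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken by document id for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]


# Hypothetical three-document collection.
docs = {
    "d1": "cats chase mice",
    "d2": "dogs chase cats and cats hide",
    "d3": "birds sing",
}
index = build_index(docs)
results = search(index, "cats chase")
print(results)
```

The query never scans the documents themselves: it touches only the index entries for its own terms, which is what lets retrieval scale to collections too large to read through.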

The central design tension in information retrieval is between precision and recall: between returning only what is strictly relevant and returning everything that might be relevant. Precision is the fraction of returned items that are relevant; recall is the fraction of relevant items that are returned. Early systems were optimized for exact-match Boolean queries. Modern systems use statistical and semantic methods, including large language models, that match meaning rather than keywords. The tradeoff is that semantic retrieval gains recall at the cost of making the retrieval mechanism itself opaque: a form of model risk in which the system finds what you mean but cannot explain why.
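The precision and recall tension can be made concrete with a short calculation. The sketch below uses the standard set-based definitions; the function name `precision_recall`, the document ids, and the two result lists are hypothetical and chosen only to show a narrow result set scoring high on precision and a broad one scoring higher on recall.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: three documents are actually relevant.
relevant = {"d1", "d2", "d4"}

# A strict query returns little, and all of it is relevant.
narrow = precision_recall(["d1"], relevant)

# A permissive query returns more relevant items, along with noise.
broad = precision_recall(["d1", "d2", "d3", "d5"], relevant)

print("narrow:", narrow)
print("broad:", broad)
```

In this toy case the narrow query has perfect precision but finds one of the three relevant documents, while the broad query finds two of three at the price of half its results being irrelevant.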