Engineering · 5 min read

How We Replaced Keyword Search with AI-Powered Knowledge Retrieval

When exact keyword matching stopped scaling, we built a semantic search layer that understands what you mean, not just what you type. Here is how it transformed our platform's AI capabilities.

Lora

Our platform helps IT teams manage Microsoft 365 security across many tenants. Behind the scenes, we maintain a large knowledge base covering Graph API documentation, compliance frameworks, security checks, and assessment test definitions across multiple systems. To be clear: this is platform-level reference data, not customer tenant data. No customer information is stored in or queried through these search indexes.

For a long time, our AI features relied on keyword matching to find relevant information. It worked when users typed exactly the right terms. It fell apart when they didn't.

The breaking point

Search "MFA" and you'd find everything about multi-factor authentication. Search "protect against credential theft" and you'd get nothing, even though dozens of relevant articles existed.

Worse, when our AI needed to reference specific checks or test definitions, it had no reliable way to verify they actually existed. It would confidently suggest IDs that looked plausible but didn't map to anything real. The AI was hallucinating references, and we couldn't ship that to customers.

What we learned from NotebookLM

Before building anything, we studied what makes the best retrieval systems work. Three principles from Google's NotebookLM stood out:

  1. Source-grounded retrieval - answers come from real data, not model guesses
  2. Semantic understanding - matching by meaning, not just keywords
  3. Content quality matters - the richness of your indexed content determines the quality of your results

That third point turned out to be the most important. Before choosing any technology, we audited our actual production data to understand what we had to work with.

The results were surprising. Most of our content was already rich enough for semantic search. Descriptions, guidance text, and framework context were detailed and well-structured. We didn't need to generate AI summaries for every record. We just needed to combine the right fields and let the embedding model understand them.
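As a rough illustration of "combining the right fields", the sketch below joins a record's descriptive fields into a single labeled text block that could then be passed to an embedding model. The record shape and field names are hypothetical, chosen only to mirror the kinds of content mentioned above; the embedding call itself is left out.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeRecord:
    # Hypothetical record shape; the real schema is not described in this post.
    record_id: str
    title: str
    description: str = ""
    guidance: str = ""
    framework_context: str = ""


def build_embedding_text(record: KnowledgeRecord) -> str:
    """Join the non-empty descriptive fields into one labeled text block."""
    labeled_fields = [
        ("Title", record.title),
        ("Description", record.description),
        ("Guidance", record.guidance),
        ("Framework", record.framework_context),
    ]
    parts = [
        f"{label}: {value.strip()}"
        for label, value in labeled_fields
        if value and value.strip()  # skip fields with no usable content
    ]
    return "\n".join(parts)
```

Labeling each field keeps some structure in the text the model sees, and skipping empty fields avoids padding the input with noise.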

Architecture philosophy

Narrow, purpose-built indexes

We chose multiple focused indexes over one large unified index. Each knowledge domain (API documentation, compliance requirements, security checks, test definitions) gets its own index that evolves independently.

This matters because different AI features need different slices of knowledge. Our compliance mapping engine never needs assessment findings. Our chat assistant rarely needs the compliance control catalog. Narrow indexes mean faster queries, more relevant results, and simpler maintenance.
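One way to picture this is a small routing table that maps each AI feature to the indexes it is allowed to query. The feature and index names below are made up for illustration and are not the platform's actual identifiers.

```python
# Hypothetical index and feature names, for illustration only.
FEATURE_INDEXES: dict[str, tuple[str, ...]] = {
    "compliance_mapping": ("compliance-requirements", "security-checks"),
    "chat_assistant": ("api-documentation", "security-checks", "test-definitions"),
}


def indexes_for(feature: str) -> tuple[str, ...]:
    """Return the indexes a feature may query; unknown features get none."""
    return FEATURE_INDEXES.get(feature, ())
```

Keeping the mapping explicit means a feature only ever searches the slices of knowledge it needs, which is the point of keeping the indexes narrow.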

Let the database drive the pipeline

Instead of building complex ETL pipelines in application code, we designed the system so search indexes pull directly from structured database views. When a new field is added to a view, the indexer picks it up on its next sync. No deployment, no code change.

This turned out to be one of the highest-leverage decisions we made. The entire pipeline from source data to searchable index is declarative and self-maintaining.
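The idea can be sketched with an in-memory SQLite database standing in for the real data store. This is a toy model under stated assumptions, not the production indexer: the table, view, and function names are hypothetical. The sync function reads whatever columns the view exposes, so redefining the view changes what gets indexed without touching the sync code.

```python
import sqlite3


def sync_from_view(conn: sqlite3.Connection, view_name: str) -> list[dict]:
    """Read every column the view exposes; no column list lives in code."""
    # view_name is assumed to come from trusted configuration, not user input.
    cursor = conn.execute(f'SELECT * FROM "{view_name}"')
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checks (id TEXT, title TEXT, guidance TEXT)")
conn.execute("INSERT INTO checks VALUES ('chk-1', 'Require MFA', 'Enable MFA for admins')")
conn.execute("CREATE VIEW search_checks AS SELECT id, title FROM checks")
first_sync = sync_from_view(conn, "search_checks")

# Redefine the view with one more field; the sync code itself is unchanged.
conn.execute("DROP VIEW search_checks")
conn.execute("CREATE VIEW search_checks AS SELECT id, title, guidance FROM checks")
second_sync = sync_from_view(conn, "search_checks")
```

After the view is redefined, the second sync includes the `guidance` field even though `sync_from_view` was never edited.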

Hybrid search: keywords and vectors working together

Pure vector search is powerful but sometimes misses exact matches. Pure keyword search is precise but brittle. We use both simultaneously.

When someone searches "protect against credential theft," the keyword component catches any documents containing those specific words. The vector component finds related documents about browser credential stores, token delegation, and authentication policy gaps, even when there's no word overlap at all.

The results are fused into a single ranked list, so exact matches rank high and semantically related content still surfaces. Best of both worlds.
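This post does not say which fusion method is used. A common choice for merging keyword and vector rankings is Reciprocal Rank Fusion (RRF), sketched below as an assumption. The document IDs are invented, and `k = 60` is a conventional default rather than a value taken from our system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc-mfa", "doc-passwords"]
vector_hits = ["doc-token-delegation", "doc-mfa", "doc-credential-stores"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# fused == ["doc-mfa", "doc-token-delegation", "doc-passwords", "doc-credential-stores"]
```

A document that appears in both lists accumulates score from each, which is why `doc-mfa` ends up first, while documents found only by the vector component still make the final list.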

Eliminating hallucination

This was the real motivation. Our AI features needed to reference specific, verifiable entities: test definitions, compliance controls, security checks. When the AI guessed these references from its training data, it was wrong often enough to be a problem.

With semantic search, the AI no longer guesses. It searches by meaning, retrieves verified results from our actual data, and uses those real references in its output. Every test ID, every compliance control, every check it references is something that actually exists in the system.
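A simple guard in this spirit is to check every ID the model cites against the set of IDs that retrieval actually returned. The sketch below assumes a hypothetical ID format such as `CHK-0042`; the real ID scheme is not described here.

```python
import re

# Hypothetical ID format such as "CHK-0042"; the real format is not given in this post.
ID_PATTERN = re.compile(r"\b[A-Z]{2,5}-\d{3,6}\b")


def find_ungrounded_ids(answer: str, retrieved_ids: set[str]) -> set[str]:
    """Return IDs cited in the answer that were not in the retrieved results."""
    cited = set(ID_PATTERN.findall(answer))
    return cited - retrieved_ids
```

An empty result means every cited ID came from retrieval. A non-empty result is a signal to reject or regenerate the answer rather than show it to a user.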

The difference for customers is trust. When our AI suggests a security configuration or maps a compliance requirement, it's grounded in verified reference data from our knowledge base, not a language model's best guess.

What changed

The shift showed up in five areas:

  • Search quality: Users find what they need using natural language, not memorized keywords
  • AI accuracy: References to tests, controls, and checks are verified from real data
  • Context efficiency: Instead of dumping entire databases into AI prompts, we send only the most relevant results
  • Cross-system discovery: A single query can surface related content across all knowledge domains
  • Maintenance: The pipeline syncs automatically with no manual intervention
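To make the context-efficiency point concrete, here is a minimal sketch of packing only the top-ranked results into a prompt under a size budget. The character budget, the `(doc_id, text)` result shape, and the function name are assumptions for illustration; a real system would more likely count tokens.

```python
def build_context(results: list[tuple[str, str]], max_chars: int = 2000) -> str:
    """Concatenate ranked (doc_id, text) results until the budget is spent."""
    chunks: list[str] = []
    used = 0
    for doc_id, text in results:
        chunk = f"[{doc_id}] {text}"
        if used + len(chunk) > max_chars:
            break  # results are ranked, so stop at the first one that does not fit
        chunks.append(chunk)
        used += len(chunk) + 1  # +1 for the newline separator
    return "\n".join(chunks)
```

Prefixing each chunk with its ID also gives the model a verifiable reference to cite.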

Principles worth sharing

A few things we'd tell anyone building retrieval systems:

Audit your data before building infrastructure. We almost over-engineered the content enrichment layer. The audit showed our existing data was richer than we assumed. Understanding what you have saves you from building what you don't need.

Start with keyword search, add vectors later. We proved the entire pipeline worked with basic keyword matching first. Then we layered in vector embeddings. Each step was independently testable, which made debugging straightforward.

Managed identity everywhere. Zero API keys in configuration. Every service-to-service connection authenticates through managed identity. This isn't just a security choice; it also eliminates an entire class of operational problems.

Test the full chain before committing. We validated every link in the pipeline (data source, indexer, index, query) with test data before writing the real implementation. An hour of testing saved days of debugging.

What's next

We're exploring agentic retrieval, where complex queries are automatically decomposed into focused sub-queries for more precise results. We're also expanding the knowledge base with richer content across all domains.

The foundation is solid: hybrid search, automatic sync, multi-tenant isolation, and a clean service layer that any AI feature can plug into. Every new AI capability we build inherits this search infrastructure from day one.

Tags: ai, semantic-search, microsoft-365, azure-ai, knowledge-retrieval

Written by Lora, Implora's AI. Reviewed and approved by the Implora team.