March 17, 2026

How to Build Deep Context for AI Sales Agents

Sales Agents

How Knock2 Builds Deep Context for AI Sales Agents
Product · Engineering


Most sales AI starts blind. We built a research pipeline that gives our scoring and personalization engines the same context your best SDR carries on day one — automatically.

The Context Problem With AI Sales Tools

Every AI-powered sales tool has the same dirty secret: it doesn't know anything about your business.

You connect your CRM. Install the tracking pixel. And then you hit a wall. Lead scoring is generic. Outreach is bland. The tool treats a Fortune 500 enterprise buyer the same as a 10-person startup — because it has no context on who you actually sell to, what makes you different, or why your customers chose you over the competition.

The typical fix? A long onboarding questionnaire. A sales enablement team filling out profile fields. A "ramp period" where the tool slowly learns from your data over weeks. We didn't want that for Knock2.

We wanted Knock2 to understand a customer's business — deeply — from the moment they sign up. To give our agents the kind of context that usually takes a rep months to accumulate. Here's how we built it.

The Research Pipeline

When a customer onboards to Knock2, we automatically kick off a parallel web research pipeline using neural search APIs. The system fires a series of concurrent queries — split across two surfaces — and synthesizes everything into a structured knowledge base before the customer ever clicks through to their dashboard.

[Diagram: knock2/context-pipeline during a live research pass. Research surfaces (product pages, about & team, case studies, review sites, funding signals, competitive "vs" pages) feed the Knock2 context engine, which synthesizes five knowledge base sections: 01 Company Overview (stage, size, geo, ICP); 02 Use Cases & Logos (outcomes, verticals); 03 Buyer Personas (champions, objections); 04 Competitive Position (differentiators, wins); 05 Win Conditions (triggers, aha moments).]
research-pipeline.ts runs in three stages.

Stage 01 · Parallel Web Research: concurrent searches across two surfaces

We fire concurrent queries — scoped to the customer's own domain and across the open web — in a single async pass. Own-domain searches cover product pages, case studies, blog content, pricing, and implementation guides. Web-wide searches pull competitive comparisons, third-party reviews, funding signals, buyer persona indicators, and urgency triggers. The full pass completes in seconds.

neural search · domain-scoped · Promise.all() · async I/O
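The parallel pass described above can be sketched as follows. This is a minimal illustration, not Knock2's production code: `neuralSearch` is a hypothetical stand-in for a neural search client (such as Exa's SDK), and the query strings are illustrative.

```typescript
// Sketch of the parallel research pass. `neuralSearch` stands in for a
// neural search client; its signature here is an assumption.
type Snippet = { query: string; url: string; text: string };

async function neuralSearch(
  query: string,
  opts: { includeDomains?: string[] } = {},
): Promise<Snippet[]> {
  // Stubbed for illustration: a real client would call the search API.
  const scope = opts.includeDomains?.join(",") ?? "web";
  return [{ query, url: `https://${scope}/result`, text: `snippet for "${query}"` }];
}

// Two surfaces, one async pass: the customer's own domain and the open web.
async function researchPass(domain: string, company: string): Promise<Snippet[]> {
  const ownDomain = [
    `${company} product pages`,
    `${company} case studies`,
    `${company} pricing`,
  ].map((q) => neuralSearch(q, { includeDomains: [domain] }));

  const openWeb = [
    `${company} vs competitors`,
    `${company} reviews`,
    `${company} funding`,
  ].map((q) => neuralSearch(q));

  // Every query is in flight at once; total latency is roughly the
  // slowest single query, not the sum of all of them.
  const results = await Promise.all([...ownDomain, ...openWeb]);
  return results.flat();
}
```

The design point is the single `Promise.all()` over both surfaces: own-domain and web-wide queries run in the same pass rather than as sequential phases.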
Stage 02 · AI Synthesis: an LLM structures raw snippets into knowledge

All search output feeds a large language model with a structured prompt. The model answers specific questions across five categories. One strict rule: answer only from the evidence provided — no hallucination. If research doesn't support a confident answer, we return null. We'd rather have a gap than a fabrication.

structured prompt · JSON output · null-safe · evidence-grounded
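The null-over-fabrication rule translates into a strict output contract. A sketch of what that contract could look like, with illustrative field names that are not Knock2's actual schema:

```typescript
// Five categories, each nullable: a gap is acceptable, a fabrication is not.
interface CompanyKnowledge {
  overview: string | null;      // stage, size, geo, ICP
  customers: string | null;     // use cases, logos, verticals
  personas: string | null;      // champions, objections
  competition: string | null;   // differentiators, wins
  winConditions: string | null; // triggers, aha moments
}

// Defensive parse of the model's JSON output: anything that isn't a
// non-empty string collapses to null rather than leaking through.
function parseSynthesis(raw: unknown): CompanyKnowledge {
  const obj = (raw ?? {}) as Record<string, unknown>;
  const field = (k: string): string | null =>
    typeof obj[k] === "string" && obj[k] !== "" ? (obj[k] as string) : null;
  return {
    overview: field("overview"),
    customers: field("customers"),
    personas: field("personas"),
    competition: field("competition"),
    winConditions: field("winConditions"),
  };
}
```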
Stage 03 · Knowledge Base Storage: versioned, section-organized, human-in-the-loop

Synthesized answers are stored with metadata — AI-generated vs. human-written, last-updated timestamp. Human overrides are respected across refresh cycles. Only AI-generated answers get re-run. Customers get full context on day one, with a clean path to refine the nuances that don't live in public content.

versioned · override-safe · human-in-the-loop
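The override-safe refresh behavior can be sketched in a few lines. The record shape and function names are illustrative assumptions; the invariant they demonstrate is the one described above: human-written entries survive every refresh cycle, only AI-generated entries get re-run.

```typescript
type Source = "ai" | "human";

interface KnowledgeEntry {
  section: string;
  answer: string | null;
  source: Source;     // AI-generated vs. human-written
  updatedAt: string;  // ISO last-updated timestamp
}

// Re-run synthesis for AI-generated entries only; human overrides
// pass through untouched.
function refresh(
  entries: KnowledgeEntry[],
  regenerate: (section: string) => string | null,
): KnowledgeEntry[] {
  return entries.map((e) =>
    e.source === "human"
      ? e
      : { ...e, answer: regenerate(e.section), updatedAt: new Date().toISOString() },
  );
}
```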

How This Powers Knock2's Core Engines

The knowledge base isn't a settings panel artifact. It's a live input to the two most important functions in the product — and what makes both work without a manual setup phase.

Lead Scoring Engine
ICP-aware from day one

When Knock2 de-anonymizes a visitor, the scoring engine compares them against the knowledge base — not a generic model. It knows which industries you've won, which company sizes fit, which personas are in the buying committee. A mid-market fintech visitor hits differently when the system knows you've closed three named logos in that exact segment.

Industry & vertical fit weighting
Company size & stage matching
Buyer persona signal detection
Named logo segment proximity
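A toy sketch of what ICP-aware scoring means in practice. The weights, fields, and thresholds here are invented for illustration and are not Knock2's production formula; the point is that every signal is checked against the customer's own knowledge base rather than a generic model.

```typescript
// Illustrative ICP derived from the knowledge base.
interface ICP {
  industries: string[];          // verticals with named-logo wins
  sizeRange: [number, number];   // employee-count fit
  personas: string[];            // titles seen in the buying committee
}

interface Visitor {
  industry: string;
  employees: number;
  title: string;
}

// Hypothetical weighting: vertical fit, size fit, persona fit.
function scoreVisitor(v: Visitor, icp: ICP): number {
  let score = 0;
  if (icp.industries.includes(v.industry)) score += 40;
  const [lo, hi] = icp.sizeRange;
  if (v.employees >= lo && v.employees <= hi) score += 30;
  if (icp.personas.some((p) => v.title.toLowerCase().includes(p))) score += 30;
  return score; // 0-100
}
```

A mid-market fintech CTO scores 100 against an ICP built from fintech wins; the same formula scores an out-of-segment visitor near zero.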
Personalization Engine
Outreach that mirrors your best SDR

When it's time to engage — AI-drafted email, Slack notification to a rep, or automated sequence — the personalization engine draws directly from the knowledge base. It knows your differentiators, your win conditions, the objections your buyers raise. The output mirrors what your best SDR already knows intuitively, at scale, on the first day.

Differentiator-aware messaging
Objection-preemptive framing
Win condition signal matching
Competitive context awareness
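One way to picture how the knowledge base feeds outreach: assemble the draft prompt directly from differentiators, win conditions, and known objections. The shape below is an assumption for illustration, not Knock2's actual prompt.

```typescript
// Illustrative slice of the knowledge base used for personalization.
interface OutreachContext {
  differentiators: string[];
  winConditions: string[];
  objections: string[];
}

// Assemble an LLM prompt grounded in the customer's own positioning,
// so the draft leads with what actually wins deals for this business.
function buildOutreachPrompt(ctx: OutreachContext, prospect: string): string {
  return [
    `Draft a short outreach email to ${prospect}.`,
    `Lead with one differentiator: ${ctx.differentiators.join("; ")}.`,
    `Tie it to a win condition: ${ctx.winConditions.join("; ")}.`,
    `Pre-empt a likely objection: ${ctx.objections.join("; ")}.`,
  ].join("\n");
}
```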

Why Neural Search — Not Scraping or Standard APIs

We evaluated three approaches before landing on the architecture we use today. The distinction matters for anyone building context pipelines for sales agents.

Approach              | The Problem                                              | Why We Moved On
Web Scraping          | Raw HTML, breaks constantly, no semantic understanding   | You get the text of a page, not its meaning
Standard Search APIs  | Optimized for human readers, not LLM synthesis           | Heavy post-processing, poor signal-to-noise ratio
Neural Search (Exa)   | Returns semantically relevant, pre-filtered snippets     | Ready for LLM synthesis without cleanup

The key capability is domain-scoped neural search. We can ask "find competitive positioning content on this specific company's website" and retrieve their own "why us" blog posts and comparison pages — the exact content a good SDR would find manually. That's not replicable with keyword-based search.

We use Exa for this layer. Their API is purpose-built for AI applications — structured output, highlight extraction, domain scoping — which removes the post-processing overhead that makes other approaches slow and brittle.
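A domain-scoped call looks roughly like this. The client is stubbed and the option names are assumptions modeled on Exa-style APIs rather than verbatim SDK calls:

```typescript
// Minimal sketch of a domain-scoped neural search. Option names are
// assumptions; a real client would POST these to the search API.
type SearchOpts = { includeDomains?: string[]; numResults?: number };

async function neuralSearch(query: string, opts: SearchOpts = {}) {
  return {
    query,
    scopedTo: opts.includeDomains ?? ["*"],
    results: [] as { url: string; highlight: string }[],
  };
}

// "Find competitive positioning content on this specific company's site."
const positioning = await neuralSearch(
  'competitive positioning "why us" comparison',
  { includeDomains: ["example-customer.com"], numResults: 5 },
);
```

Scoping the semantic query to one domain is the part keyword search can't replicate: the intent ("positioning content") is matched by meaning, while the scope pins it to the customer's own site.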

What We Learned Building This

Targeted queries beat broad ones

Specific search intents produce dramatically better synthesized output. The quality delta between a broad "tell me about this company" query and intent-scoped searches is significant. Focused input → focused output.

Parallel execution is non-negotiable

Sequential API calls produce unacceptable latency for an onboarding flow. Async parallelism keeps the entire research pass fast enough that customers don't notice it's happening.

Structured output beats freeform

Narrative summaries are fluent but hard to consume programmatically. Specific questions with JSON-structured answers are directly usable by downstream scoring and personalization systems.

Human overrides are essential

AI research gets you 80–90% of the way. Every business has nuances that don't live in public content. The override layer fills gaps without requiring customers to start from scratch.

What's Next

Right now, the research pipeline runs on the customer's own business — building the foundational context that makes scoring and personalization work from day one.

We're actively building the next layer: running the same pipeline outward on your prospects' businesses. Every visitor who hits your site, automatically researched — their tech stack, recent funding, competitive landscape, hiring signals — synthesized and available to your sales team at the point of action.

The infrastructure is already there. The research pipeline, the synthesis layer, the structured knowledge base — it all generalizes. We just need to point it outward.

Stay tuned.

Compare Notes
Building AI-assisted sales workflows?

If you're working on context pipelines, agent architecture, or making AI sales tools work without long ramp periods — we're happy to share what's working.

Reach out →

John DiLoreto is the founder & CEO of Knock2
