AI Visibility

How AI Search Really Works (The Black Box, Opened)

We reverse-engineered the AI answer pipeline across Claude, ChatGPT, Gemini, and Perplexity into one seven-station model. Here is what is actually happening when an AI decides whether to name your brand.

When you ask an AI assistant a question, it rarely runs one search and reads one page. Modern answer engines run a pipeline: they decide whether to search at all, break your question into many smaller queries, fetch and re-rank candidate pages, pack the best passages into the model's context, then write an answer and attach citations. Understanding each stage explains the single most important fact for any brand: being named in the answer and being cited as a source are two separate outcomes, and you can win one without the other. This is the documented machinery behind the "black box," with the inferred industry practice flagged as such.

Figure: How AI Search Really Works. From a user query to a cited, named answer in seven steps: 1. Query (user asks), 2. Rewrite (fan-out terms), 3. Retrieve (search index), 4. Rank (score sources), 5. Read (extract facts), 6. Synthesize (draft answer), 7. Answer (output result). Two separate gates follow: NAMED (brand mentioned) and CITED (linked as source). Being named is not the same as being cited; win both gates to show up in AI answers.
A seven-step flow of AI search, from query through rewrite, retrieval, ranking, reading, synthesis, and answer, ending at two distinct visibility gates: NAMED (your brand mentioned) and CITED (linked as a source).

Stage one: the engine decides whether to search

Before any retrieval happens, the model makes a routing decision: answer from its training data, or go fetch live information. In practice, engines lean toward searching for time-sensitive, comparative, or factual queries (prices, news, reviews) and lean toward internal knowledge for stable, timeless questions. This decision is situational and varies by platform: Perplexity grounds nearly every response in live sources, while ChatGPT searches more selectively. The takeaway is that if the engine never searches, your content cannot be cited no matter how good it is.
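To make the routing decision concrete, here is a minimal sketch of a search-or-not rule. It is only an illustration: the cue-word list and the `should_search` function are hypothetical, and real engines almost certainly rely on learned models rather than keyword lists.

```python
import re

# Hypothetical cue words (assumption): signals that a query is
# time-sensitive, comparative, or factual.
FRESHNESS_CUES = {"price", "prices", "news", "review", "reviews",
                  "latest", "today", "best", "vs"}

def should_search(query: str) -> bool:
    """Toy routing rule: return True when the query looks like it needs live data."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    # A four-digit year such as 2025 is treated as a freshness signal.
    has_year = any(re.fullmatch(r"20\d{2}", w) for w in words)
    return has_year or bool(words & FRESHNESS_CUES)
```

Under this toy rule, a query like "best crm for startups 2025" triggers a search, while "why is the sky blue" is answered from internal knowledge.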

Stage two: fan-out turns one question into many

Rather than searching your exact words, the engine decomposes the prompt into multiple related sub-queries and runs them in parallel. Google publicly calls this technique "query fan-out" in its AI Mode announcement and says it is powered by a custom version of Gemini. Different sub-queries can be routed to different sources: the open web, a knowledge graph, shopping or local data, and structured feeds. The practical consequence is that you are no longer competing for one keyword. You are competing across a spread of sub-questions the user never typed.
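A rough sketch of fan-out, under stated assumptions: the sub-query templates and the `fake_search` backend below are hypothetical stand-ins. A production engine would generate sub-queries with a model and call real search services; only the overall shape (one prompt in, several parallel lookups out) reflects the description above.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(prompt: str) -> list[str]:
    # Hypothetical templates (assumption); real engines generate these with a model.
    templates = ["{q}", "{q} reviews", "{q} pricing",
                 "{q} alternatives", "{q} comparison"]
    return [t.format(q=prompt) for t in templates]

def fake_search(sub_query: str) -> list[str]:
    # Stand-in for a real search backend (assumption): returns made-up URLs.
    slug = sub_query.replace(" ", "-")
    return [f"https://example.com/{slug}/{i}" for i in range(2)]

def retrieve(prompt: str) -> dict[str, list[str]]:
    """Run every sub-query in parallel and collect candidates per sub-query."""
    sub_queries = fan_out(prompt)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(fake_search, sub_queries))
    return dict(zip(sub_queries, results))
```

Calling `retrieve("project management software")` returns candidates for five sub-queries, four of which the user never typed.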

Stage three: fetch, re-rank, and pack the context

Each sub-query returns a set of candidate pages. The engine then re-ranks those candidates by relevance and quality and keeps only a small subset, discarding most of what it retrieved. The surviving passages, not whole pages, get packed into the model's context window as the evidence it will read. Industry analyses consistently find that the large majority of retrieved pages are never used, so simply being retrievable is necessary but far from sufficient.
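The re-rank and pack step can be sketched as follows. This is a simplified illustration, not any engine's actual scorer: word overlap stands in for a learned relevance model, and the `budget_words` limit is a hypothetical proxy for the context window.

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query: str, passage: str) -> float:
    """Share of query words that appear in the passage (toy relevance score)."""
    q = tokens(query)
    return len(q & tokens(passage)) / len(q) if q else 0.0

def pack_context(query: str, passages: list[str], budget_words: int = 40) -> list[str]:
    """Keep the highest-scoring passages that fit the word budget; drop the rest."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    packed, used = [], 0
    for p in ranked:
        n = len(p.split())
        if score(query, p) == 0 or used + n > budget_words:
            continue  # irrelevant or over budget: never reaches the model
        packed.append(p)
        used += n
    return packed
```

Even in this toy version, a passage can be retrieved and still never reach the model, either because it scores zero or because better passages have already used up the budget.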

Stage four: generate the answer, then attach citations

The model writes the answer from the packed passages, and citation can happen in one of two ways. Some systems generate the answer and its sources together, so the text is tied to evidence as it is written. Others generate the answer first and attach supporting links afterward, an approach documented in research on systems like RARR. The second method, sometimes called post-hoc attribution, can produce a citation that supports a claim without being the true origin of the wording. This is why a cited link does not always mean that page is where the assistant "learned" the fact.
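Post-hoc attribution can be illustrated with a small sketch. It is an overlap-based stand-in, not the RARR method or any vendor's implementation: the `attach_citations` function and its `threshold` are hypothetical. For each finished sentence, it looks for the source passage that best supports it and attaches that URL.

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def attach_citations(answer_sentences, sources, threshold=0.5):
    """Map each already-written sentence to its best-supporting source URL.

    sources: dict of url -> passage text. Returns (sentence, url or None) pairs.
    """
    cited = []
    for sentence in answer_sentences:
        words = tokens(sentence)
        best_url, best = None, 0.0
        for url, passage in sources.items():
            overlap = len(words & tokens(passage)) / len(words) if words else 0.0
            if overlap > best:
                best_url, best = url, overlap
        # Only attach a link when the support is strong enough.
        cited.append((sentence, best_url if best >= threshold else None))
    return cited
```

Note that the link is chosen after the sentence exists, so the cited page is the best available support, which is not necessarily where the wording came from.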

The key insight: named is not the same as cited

Two distinct things can happen to your brand in an AI answer. You can be mentioned, where the model names you in the recommendation itself, or you can be cited, where your URL appears as a linked source. These are decided at different stages by different signals, so they do not move together. Semrush, which labeled this gap the "Mention-Source Divide" in 2025, reported that fewer than one in five brands earn both consistently. In practice, your research can inform an answer that then recommends a competitor by name.
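Tracking the two outcomes separately is straightforward to sketch. The `visibility` function below is a hypothetical helper, and it assumes you have already collected an answer's text and its list of cited URLs from whichever engine you are monitoring.

```python
import re
from urllib.parse import urlparse

def visibility(answer_text: str, cited_urls: list[str], brand: str, domain: str) -> dict:
    """Check the two gates independently: NAMED in the text, CITED in the links."""
    named = re.search(rf"\b{re.escape(brand)}\b", answer_text, re.IGNORECASE) is not None
    cited = False
    for url in cited_urls:
        host = urlparse(url).hostname or ""
        # Match the bare domain or any subdomain such as www.
        if host == domain or host.endswith("." + domain):
            cited = True
            break
    return {"named": named, "cited": cited}
```

Logging both flags per prompt over time shows whether you are winning one gate while losing the other.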

Key takeaways
  • AI answers come from a pipeline, not a single search: decide-to-search, fan-out into sub-queries, fetch, re-rank, pack, generate, then attribute. Optimize for the whole chain, not one keyword.
  • Fan-out means you are judged across many sub-questions the user never typed. Google documents this as "query fan-out" in AI Mode, run on a custom Gemini model.
  • Most retrieved pages are never used. Being findable gets you into the candidate pool; surviving the re-rank and making it into the packed context is the harder bar.
  • Citations are not always proof of origin. Some engines attach sources after writing the answer, so a cited link may support a claim without being where the wording came from.
  • Being NAMED in the answer and being CITED as a source are separate outcomes governed by different signals. Track both, because you can win one and lose the other, and the named brand usually captures the buyer.

FAQ

Questions, answered.

Named and cited are two separate gates. CITED means a page from your site survived retrieval and was attached as a source. NAMED means your brand name actually appears in the written answer. Because some engines generate the answer first and map citations on afterward, you can be cited but not named, or named but not cited. Winning AI visibility means chasing both.

Keywords do not work the way they did in classic search. AI engines convert your content into embeddings (vectors that capture meaning) and match them, chunk by chunk, against what an ideal answer would say rather than against the user's exact words. You win by covering the concepts a perfect answer would contain in short, self-contained, front-loaded chunks, not by repeating the query terms.
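A small sketch shows why concept coverage beats keyword repetition under this kind of matching. It makes two loud assumptions: word counts stand in for a learned embedding, and the "ideal answer" is supplied by hand rather than imagined by a model. The function names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Bag-of-words counts as a crude stand-in for a learned embedding (assumption).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def best_chunk(ideal_answer: str, chunks: list[str]) -> str:
    """Return the chunk whose vector is closest to the ideal answer's vector."""
    target = embed(ideal_answer)
    return max(chunks, key=lambda c: cosine(target, embed(c)))
```

In this sketch, a chunk that states the price and plan details an ideal answer would contain outscores a chunk that simply repeats "crm" five times.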

To get named, cover the ideal answer's concepts so your chunk survives retrieval and re-ranking, make sure your key fact lands in the first result set, and seed the same canonical fact across multiple independent sources. When several independent sources agree, the model's probability for your brand's name sharpens into a confident peak, and the model writes it down.
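One way to audit source agreement is to count how many independent domains back each brand for a given claim. The sketch below is a counting analogy for your own monitoring, not a description of how any model computes probabilities. The `agreement` function and its input format are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

def agreement(claims: list[tuple[str, str]]) -> dict[str, float]:
    """claims: (url, brand) pairs. Each domain gets one vote per brand it supports."""
    # Deduplicate so ten pages on one site do not count as ten independent sources.
    votes = {(urlparse(url).hostname, brand) for url, brand in claims}
    counts = Counter(brand for _, brand in votes)
    total = sum(counts.values())
    return {brand: n / total for brand, n in counts.items()}
```

A brand backed by many pages on a single site scores no higher here than one backed by a single page, which mirrors the point about independent sources.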
