An AI engine answers a query by running it through four steps: query understanding, retrieval, ranking, and answer composition. Knowing where each step happens makes it clear where AEO has leverage and where it doesn’t.
Step 1: Query understanding
The engine parses the user’s prompt, identifies entities, infers intent, and rewrites the query into one or more retrieval queries. A user asking “what’s the best CRM for a 10-person sales team” might trigger several internal queries about CRM features, small-team CRMs, and pricing.
This step is opaque to publishers: the engine rewrites queries internally, and nothing on your site changes how it does so. AEO has no direct lever here.
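To make the fan-out concrete, here is a hypothetical sketch of the rewriting step. The function name and the hard-coded rewrites are illustrative assumptions; real engines perform this rewriting with a language model, not a lookup table.

```python
def rewrite_query(prompt: str) -> list[str]:
    """Fan a conversational prompt out into focused retrieval queries."""
    # A real engine infers entities and intent with a model; the
    # rewrites below are hard-coded purely for illustration.
    rewrites = {
        "what's the best CRM for a 10-person sales team": [
            "best CRM small sales team",
            "CRM features comparison small business",
            "CRM pricing per seat",
        ],
    }
    return rewrites.get(prompt, [prompt])  # fall back to the raw prompt

queries = rewrite_query("what's the best CRM for a 10-person sales team")
print(queries)
```

The practical point for AEO: one user prompt can become several retrieval queries, so a page may be matched against a query the user never typed.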
Step 2: Retrieval
The engine pulls candidate sources. Depending on the engine, that means:
- A live web search through a partner index (Bing for ChatGPT, Google for Gemini grounding, an internal index for Perplexity).
- A vector lookup against an internal embedding store.
- A direct API call to specific data sources.
- A combination of the above.
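The vector-lookup path can be sketched in a few lines. This is a toy model: the three-dimensional vectors and page URLs below are illustrative assumptions standing in for real model embeddings and a real index.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embedding store: page URL -> vector (real stores hold model embeddings).
store = {
    "example.com/crm-guide": [0.9, 0.1, 0.3],
    "example.com/pricing":   [0.2, 0.8, 0.1],
    "example.com/history":   [0.1, 0.1, 0.9],
}

query_vec = [0.85, 0.2, 0.25]  # embedding of a rewritten query

# Rank candidate pages by similarity to the query embedding.
candidates = sorted(store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
print(candidates[0][0])
```

Whichever retrieval path an engine uses, the mechanics are the same in spirit: candidates are pulled by similarity to the rewritten query, not to the user's original phrasing.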
This is where AEO has leverage. To be retrieved, content must be:
- Reachable by the engine’s crawlers — not blocked by robots.txt, paywalls, or rendering issues.
- Indexed in the search engine that the AI engine relies on.
- Structured so retrieval recognizes the page as a strong match for the rewritten query.
A site that ranks well in Google has a head start with engines that ground in Google. A site invisible to Bing has a hard time being cited by ChatGPT and Copilot.
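Reachability is the easiest of these conditions to verify yourself. The sketch below uses Python's standard-library robots.txt parser with the real crawler tokens GPTBot (OpenAI's crawler) and Google-Extended (Google's AI-training control); the robots.txt content is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt that blocks one AI crawler and allows another.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow:
""".splitlines()

rp = RobotFileParser()
rp.parse(robots_txt)

print(rp.can_fetch("GPTBot", "https://example.com/guide"))           # False
print(rp.can_fetch("Google-Extended", "https://example.com/guide"))  # True
```

Running this kind of check against your own robots.txt, per crawler token, catches accidental blocks before they cost you citations.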
Step 3: Ranking
Retrieved candidates are scored. The signals overlap with classic search ranking — relevance, authority, freshness, structured data — but with two differences:
- Source preference matters more. Engines bias toward sources they trust to produce factual answers (encyclopedias, news outlets, official documentation, well-established industry sites). See preferred AI sources.
- Passage-level scoring matters more than page-level scoring. The engine wants the specific paragraph that answers the question, not the page that broadly covers the topic. See passage ranking.
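Passage-level scoring can be illustrated with a deliberately crude heuristic. Real engines use learned relevance models; the lexical-overlap scorer and sample page below are illustrative assumptions, but the shape of the step — score each passage, lift the best one — is the same.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score_passage(query: str, passage: str) -> float:
    """Fraction of query tokens that appear in the passage."""
    q, p = tokens(query), tokens(passage)
    return len(q & p) / len(q)

# A page as a list of passages; only one directly answers the query.
page = [
    "Our company was founded in 2004 and has offices worldwide.",
    "For warehouse shifts, choose shoes with slip-resistant soles "
    "and cushioned insoles rated for 8+ hours of standing.",
    "Sign up for our newsletter to get updates.",
]

query = "best shoes for warehouse shifts"
best = max(page, key=lambda passage: score_passage(query, passage))
print(best)
```

Note that the winning passage is not the page's opening paragraph. A page can rank only as high as its best passage, which is why burying the answer mid-page costs visibility.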
Step 4: Answer composition
The engine takes the top-scored passages and writes the answer. This involves:
- Synthesizing across multiple sources rather than quoting one.
- Generating inline citations that link back to the sources.
- Optionally rewording to fit a conversational tone or format.
A page can be retrieved and ranked highly and still not be cited if its content doesn’t fit the answer the engine ends up composing. The most reliable way to be cited is to have content that already reads like an answer to the query — concise, factual, with a clear claim per paragraph.
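The citation bookkeeping in this step can be sketched as follows. A real engine synthesizes the answer text with a model; this stub only shows how top-scored passages map to inline citation markers, and the URLs and passage text are made up for illustration.

```python
def compose(passages: list[tuple[str, str]]) -> str:
    """Stitch ranked (source_url, text) pairs into a cited answer."""
    lines, sources = [], []
    for i, (url, text) in enumerate(passages, start=1):
        lines.append(f"{text} [{i}]")        # inline citation marker
        sources.append(f"[{i}] {url}")       # source list entry
    return "\n".join(lines + [""] + sources)

answer = compose([
    ("example.com/warehouse-shoes", "Slip-resistant soles are the top priority."),
    ("example.com/insoles", "Cushioned insoles reduce fatigue on long shifts."),
])
print(answer)
```

Even in this stub, a passage only earns a citation if it survives into the final answer; passages that were retrieved but not used disappear without a trace.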
What this means for AEO
- Most leverage sits at retrieval and ranking. Make sure the right pages exist, are reachable, and are structured to match the questions being asked.
- The engine’s training data also matters, but indirectly and over a long horizon. Retrieval is the daily lever.
- The composition layer rewards content that is already in answer-shape: clear claims, short paragraphs, structured lists.
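The "answer-shape" idea lends itself to a simple lint pass. The 60-word threshold below is an illustrative assumption, not a documented engine limit; the point is that checks like this can be automated across a content library.

```python
def flag_long_paragraphs(paragraphs: list[str], max_words: int = 60) -> list[int]:
    """Return indices of paragraphs too long to be lifted as a direct answer."""
    return [i for i, p in enumerate(paragraphs) if len(p.split()) > max_words]

paras = [
    "Slip-resistant soles matter most for warehouse work.",  # short, claim-first
    " ".join(["word"] * 80),                                 # rambling wall of text
]
print(flag_long_paragraphs(paras))
```

A real audit would add checks for claim-first openings and structured lists, but even a length flag surfaces the paragraphs least likely to be quoted.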
Implementation example
AwesomeShoes Co. asks why competitor pages are cited for “best shoes for warehouse shifts” even when its own guides are detailed. The AEO manager uses the four-step engine workflow to isolate whether the issue is query fit, retrieval access, ranking signals, or answer composition.
Implementation discussion: the SEO lead validates crawler access and index presence, the content strategist rewrites key sections into direct answer passages, and the analyst compares citation outcomes by engine after the changes ship. This turns a vague visibility complaint into a diagnosis path with a clear owner at each step, one that stays technical yet easy for non-engineers to follow.