These guidelines describe the quality bar AI engines apply to content they’re willing to cite. The patterns are inferred from observed citation behavior across the major engines. They overlap with Google’s existing quality guidance but go beyond it.
What engines reward
- First-hand expertise. Pages written by someone with verifiable experience in the subject are cited more reliably than aggregator content.
- Concrete claims with evidence. Numbers, dates, named entities, and references to primary sources retrieve well and survive synthesis.
- Original information. Original research, proprietary data, and first-hand reporting are cited disproportionately because they’re the sources behind every other source.
- Clear structure. Headings that match how a user would phrase the question; passages that stand alone; predictable formatting aligned with content chunking.
- Maintained pages. Up-to-date content with explicit modification dates is preferred over stale pages on the same topic; see author and date signals.
What engines penalize
- Thin content. A 200-word page that doesn’t fully answer its own question is passed over in favor of fuller pages from the same domain.
- Heavily promotional content. Pages whose primary message is “buy from us” don’t get cited as reference material. They may still appear in branded queries; they don’t earn category authority.
- Aggregated or rewritten content. Pages that repackage information already in the engine’s index without adding anything are downweighted.
- Inaccurate or outdated claims. A page caught in a factual error loses standing for that query and often for related queries.
- Auto-generated filler. Long, fluent, unsubstantiated content is the most common failure mode of generative tools. Engines detect this pattern and downgrade the source.
- Cloaking and manipulation. Pages that show different content to crawlers vs users are detected and penalized.
Specific quality requirements
Author attribution
Pages that make substantive claims should attribute an author. The author should:
- Have a real, persistent author profile page.
- Have credentials or experience visible on the profile.
- Be identifiable across the web (LinkedIn, professional bio elsewhere, named on related publications).
Anonymous content gets cited less. See author bios.
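The profile requirements above can also be expressed as structured data on the author page. The following is a minimal sketch, assuming schema.org Person markup emitted as JSON-LD. The guidelines do not say engines require this markup, and the helper name author_jsonld and every value passed to it are hypothetical placeholders.

```python
import json


def author_jsonld(name, profile_url, job_title, same_as):
    """Build a schema.org Person object for an author profile page.

    Hypothetical helper for illustration, not part of any library.
    """
    if not same_as:
        # Mirrors the "identifiable across the web" requirement.
        raise ValueError("list at least one external profile for the author")
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": profile_url,        # the persistent author profile page
        "jobTitle": job_title,     # visible credential or role
        "sameAs": list(same_as),   # external profiles for the same person
    }


# Placeholder author; replace every value with real data.
author = author_jsonld(
    name="Jane Example",
    profile_url="https://example.com/authors/jane-example",
    job_title="Footwear Product Tester",
    same_as=["https://www.linkedin.com/in/jane-example"],
)
print(json.dumps(author, indent=2))
```

The printed JSON would typically be embedded in a script element of type application/ld+json on the profile page.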
Sourcing
Substantive claims need sources. The strongest reference sources are:
- Primary sources (original research, official documentation, government data).
- Recognized authoritative sources for the topic.
- The publisher’s own primary work where the publisher is the expert.
Citations to other content the engine already trusts strengthen retrieval, because the engine can cross-reference the page’s claims against those sources.
Recency
Content should be dated. Both the publication date and the most recent meaningful update should be visible on the page and in the datePublished and dateModified fields of the page’s Article schema.
For topics where freshness matters (technology, regulation, current events), engines bias toward recent content. For evergreen topics, well-maintained older content can hold position indefinitely.
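The date fields described above might be generated as follows. This is a minimal sketch, assuming the Article schema is emitted as schema.org JSON-LD. The helper name article_jsonld, the headline, the author name, and both dates are hypothetical placeholders.

```python
import json
from datetime import date


def article_jsonld(headline, published, modified, author_name):
    """Build a schema.org Article object carrying both date fields.

    Hypothetical helper for illustration, not part of any library.
    """
    if modified < published:
        # A modification date earlier than publication is a data error.
        raise ValueError("dateModified cannot precede datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),  # ISO 8601 date string
        "dateModified": modified.isoformat(),    # last meaningful update
        "author": {"@type": "Person", "name": author_name},
    }


# Placeholder values; the dates should match what is shown on the page.
article = article_jsonld(
    headline="How to size trail running shoes",
    published=date(2024, 3, 1),
    modified=date(2025, 1, 15),
    author_name="Jane Example",
)
print(json.dumps(article, indent=2))
```

Only a meaningful update should change the modified date. Bumping it on cosmetic edits would make the visible date and the schema date less trustworthy.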
Specificity
Generic content loses to specific content. “How to set up SSO” loses to “How to set up SSO with Okta and a custom IdP.” The specific page wins for both the specific query and a meaningful share of the generic query, because retrieval can pull from the specific page when it matches better.
Honesty about scope
Pages that overpromise their coverage get penalized when the engine retrieves the page and finds it doesn’t deliver. Be honest in titles and first paragraphs about what the page covers and what it doesn’t.
Maintaining quality at scale
For sites publishing many pages:
- Editorial review is non-negotiable. AI drafts pass through human editors before publication.
- Have a clear retraction or correction policy for factual errors. Engines pick up on corrections.
- Audit existing pages periodically; update or merge thin pages rather than letting them drag the domain down.
- Track which pages get cited and which don’t. The non-cited ones reveal the quality gap.
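The audit and tracking steps in the list above can be sketched as a small report. This assumes you already collect a word count and an observed citation count per page. The Page class, the audit function, the sample URLs, and the 300-word threshold are all hypothetical, and word count is only a crude proxy for whether a page fully answers its question.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    word_count: int
    citations: int  # times the page was observed cited, from your own tracking


def audit(pages, min_words=300):
    """Return URLs to review: thin pages and pages never cited.

    min_words is an arbitrary illustrative threshold, not a known cutoff.
    """
    thin = [p.url for p in pages if p.word_count < min_words]
    uncited = [p.url for p in pages if p.citations == 0]
    return {"thin": thin, "uncited": uncited}


# Made-up sample data for illustration only.
pages = [
    Page("/guides/trail-shoe-sizing", 1450, 6),
    Page("/guides/lacing-tips", 180, 0),
    Page("/guides/waterproofing", 900, 0),
]
report = audit(pages)
print(report)
```

Pages that appear in both lists are the natural candidates to update or merge first. Pages that are long but uncited point at a quality gap other than length.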
Implementation example
AwesomeShoes Co. publishes dozens of footwear advice pages monthly and sees uneven citation performance. The editorial director needs a quality system that keeps speed high without introducing thin or generic content.
Implementation discussion: each draft gets role-based review (a subject expert for factual integrity, an editor for clarity, an SEO specialist for retrieval structure), and publication requires evidence-backed claims plus author attribution. The analytics lead tracks cited vs non-cited pages to identify recurring quality failures and feeds them back into the editorial checklist.
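The review workflow described here could be enforced with a simple publication gate. This is a minimal sketch under stated assumptions: the Draft class, the role names, and the blocking_issues function are hypothetical, and a non-empty source list stands in for the human judgment that claims are actually evidence-backed.

```python
from dataclasses import dataclass, field

# Review roles taken from the workflow above; the identifiers are illustrative.
REQUIRED_REVIEWS = ("subject_expert", "editor", "seo")


@dataclass
class Draft:
    title: str
    author: str = ""
    sources: list = field(default_factory=list)   # references backing claims
    approvals: set = field(default_factory=set)   # roles that have signed off


def blocking_issues(draft):
    """Return the reasons a draft cannot be published yet (empty if none)."""
    issues = []
    if not draft.author.strip():
        issues.append("missing author attribution")
    if not draft.sources:
        issues.append("no sources for claims")
    for role in REQUIRED_REVIEWS:
        if role not in draft.approvals:
            issues.append(f"awaiting {role} review")
    return issues


# Made-up draft: attributed and sourced, but only one review is complete.
draft = Draft(
    title="Choosing insoles for flat feet",
    author="Jane Example",
    sources=["https://example.com/internal-wear-test"],
    approvals={"subject_expert"},
)
print(blocking_issues(draft))
```

Returning the list of reasons, rather than a single pass or fail flag, gives the analytics lead something to aggregate when looking for recurring quality failures.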