How to run a technical GEO audit on your SaaS site

What is a technical GEO audit?

A technical GEO audit is a structured inspection of the parts of your site AI tools touch: how crawlers fetch it, how the HTML is rendered, how content is structured, and what machine-readable signals (schema, metadata, llms.txt) are attached. It is not a content audit and it is not a brand audit - it answers a single question: can an AI crawler arrive, read, and extract your facts cleanly?

How do I run a GEO audit, step by step?

1. Crawl your site. Run Screaming Frog or Sitebulb across your full URL set and export the report.
2. Check raw HTML. For your homepage, top 3 product pages and pricing page, open view-source: in a browser or run curl -A "GPTBot" <url>. Confirm the content is in the source, not injected by JS.
3. Validate schema. Paste each key URL into Google's Rich Results Test and the Schema.org validator. Note what's missing or invalid.
4. Check robots.txt and sitemap. Verify that GPTBot, ClaudeBot, PerplexityBot and Google-Extended aren't blocked, and that your sitemap is current. Google-Extended is a robots.txt control token rather than a separate crawler user agent, so it only shows up in robots.txt rules, not in server logs.
5. Run an AI legibility test. Paste each key URL into ChatGPT, Claude and Perplexity. Ask: "what does this page say, and what's missing?" Capture the gaps.
6. Check Core Web Vitals. Run PageSpeed Insights on the same key URLs.
7. Score and prioritise. Sort every finding into "fix this week / fix this month / nice to have" using the framework below.

Budget 3-4 hours for a first audit on a typical SaaS site. The first one is the slowest - subsequent ones run in under an hour once you have a template.
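The robots.txt check in the steps above can be scripted rather than read by eye. The following is a minimal sketch using Python's standard urllib.robotparser; the function name blocked_ai_bots, the sample rules and the example.com URL are illustrative assumptions, not part of any tool named in this article.

```python
from urllib.robotparser import RobotFileParser

# The four robots.txt tokens this audit cares about.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_ai_bots(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI tokens that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt that blocks GPTBot and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_ai_bots(sample))
```

In practice you would pass in the text of your own /robots.txt and one key URL per template (homepage, product, pricing). An empty list means none of the four tokens is blocked for that URL.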

What should I actually check?

Eight areas, in roughly the order they matter for AI citations.

| Area | What to check | The signal |
| --- | --- | --- |
| Crawlability | robots.txt, sitemap.xml, AI bot allowlist | Can GPTBot, ClaudeBot, PerplexityBot and Google-Extended actually fetch your pages? |
| Rendering | Server-rendered HTML vs client-only JS | Does the raw HTML response contain your content, or only a loading shell? |
| Indexability | Canonical tags, noindex, duplicate URLs | Is each important page reachable at one canonical URL with the right directives? |
| Structured data | Organization, Product, FAQPage, Article schema | Can AI extract entity facts (name, founders, category, pricing) without guessing? |
| Page structure | Single H1, semantic headings, lists, tables | Is the content chunkable into clean answer-shaped passages? |
| Metadata | Titles, descriptions, OG, canonical URLs | Do social and AI previews describe the page accurately? |
| Performance | Core Web Vitals, TTFB, JS payload | Do crawlers get a response fast enough to finish rendering? |
| llms.txt | Curated map of citable pages | Have you told LLMs which URLs to prefer? |

If any single row in this table is failing on your homepage or core product pages, fix it before doing anything else - including content work. A broken foundation invalidates every on-site and off-site move you make later.
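The indexability and page-structure checks above lend themselves to a quick script when you only have a handful of key pages. This is a rough sketch built on Python's standard html.parser; the class PageSignals, the function audit_page and the wording of the findings are assumptions made for illustration. It only inspects the HTML, so a noindex sent through an X-Robots-Tag response header would not be caught.

```python
from html.parser import HTMLParser


class PageSignals(HTMLParser):
    """Collects the H1 count, canonical links and the robots noindex flag."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0
        self.canonicals = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None, so normalise them to "".
        attr = {name: (value or "") for name, value in attrs}
        if tag == "h1":
            self.h1_count += 1
        elif tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonicals.append(attr.get("href", ""))
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            if "noindex" in attr.get("content", "").lower():
                self.noindex = True


def audit_page(html: str) -> list[str]:
    """Return human-readable findings for one page's raw HTML."""
    signals = PageSignals()
    signals.feed(html)
    signals.close()
    findings = []
    if signals.h1_count != 1:
        findings.append(f"expected exactly one H1, found {signals.h1_count}")
    if len(signals.canonicals) != 1:
        findings.append(
            f"expected exactly one canonical link, found {len(signals.canonicals)}"
        )
    if signals.noindex:
        findings.append("page is marked noindex")
    return findings
```

A page that returns an empty list passes these three checks; anything else goes into your findings sheet against that URL.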

Which tools should I use?

You can run a full first audit with free tools. The paid options become useful when you want ongoing monitoring or you're auditing 1000+ URLs.

| Tool | Cost | What it's best for |
| --- | --- | --- |
| Google Search Console | Free | Indexation, crawl errors, sitemap status, Core Web Vitals. |
| Screaming Frog (free up to 500 URLs) | Free / paid | Full crawl: status codes, headings, canonicals, schema, internal links. |
| Sitebulb Lite | Free | Visual crawl maps and prioritised hint reports for small sites. |
| Schema.org validator + Rich Results Test | Free | Verify Organization, Product, FAQPage, Article markup parses cleanly. |
| PageSpeed Insights / Lighthouse | Free | Core Web Vitals, render-blocking JS, mobile performance. |
| view-source: and curl -A 'GPTBot' | Free | Confirm the raw HTML AI crawlers see actually contains your content. |
| ChatGPT / Claude (with browsing) | Free / paid | Paste a URL and ask: "what does this page say, and what's missing for someone choosing a [category] tool?" |
| Perplexity | Free | Ask it to summarise your page and list its sources - reveals what AI can actually extract. |
| Ahrefs / Semrush Site Audit | Paid | Ongoing technical SEO monitoring; most checks overlap with GEO needs. |

Can AI help me run the audit itself?

Yes - and this is the part of GEO auditing that's changed most in the last twelve months. ChatGPT, Claude and Perplexity are now the fastest way to test how AI actually reads your site. Five prompts worth running on any key URL:

"Summarise the content of this page in three sentences." - tests basic legibility.
"What category of product is this, and who is it for?" - tests entity and positioning clarity.
"List every feature mentioned on this page." - exposes whether features are extractable or buried in marketing copy.
"How much does this product cost?" - exposes pricing pages hidden behind tabs or JS.
"Who founded this company and where are they based?" - tests Organization schema and About-page signal.

If the answers are wrong, vague or include "I couldn't find that on the page", you have a specific, fixable problem - usually a rendering, structure or schema issue. Paste the failing output into ChatGPT or Claude and ask: "what change to the page would have let you answer correctly?" The response is often a usable rewrite brief.
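If you run the five prompts across several URLs, it helps to generate the checklist and flag the obvious failures mechanically. The sketch below is one way to do that; build_checklist, looks_like_gap and the GAP_MARKERS phrases are hypothetical helpers, and the marker list is an assumption about how assistants tend to phrase a miss. It does not call any AI tool; you still paste the prompts in and paste the answers back.

```python
# The five legibility prompts from this section.
PROMPTS = [
    "Summarise the content of this page in three sentences.",
    "What category of product is this, and who is it for?",
    "List every feature mentioned on this page.",
    "How much does this product cost?",
    "Who founded this company and where are they based?",
]

# Assumed phrases that suggest the assistant could not extract the fact.
GAP_MARKERS = ("couldn't find", "could not find", "unable to find", "not mentioned")


def build_checklist(urls):
    """Pair every key URL with every prompt, in a stable order."""
    return [(url, prompt) for url in urls for prompt in PROMPTS]


def looks_like_gap(answer: str) -> bool:
    """Flag answers that admit the information was not on the page."""
    normalised = answer.replace("\u2019", "'").lower()  # straighten curly apostrophes
    return any(marker in normalised for marker in GAP_MARKERS)
```

This only catches explicit misses. Wrong or vague answers still need a human read against what the page is supposed to say.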

You can also paste your raw HTML or rendered DOM into Claude and ask it to flag missing schema, weak headings or ambiguous entity references. It's the closest thing to an on-demand technical reviewer.

How do I prioritise the fixes?

Sort every finding into one of three tiers. Ship tier one before you touch tiers two or three.

Tier 1: Fix this week

AI bots blocked in robots.txt (GPTBot, ClaudeBot, PerplexityBot, Google-Extended).
Key pages rendered client-side only, returning an empty HTML shell.
Missing or broken Organization schema on the homepage.
No sitemap, or a sitemap returning 404s and old URLs.

Tier 2: Fix this month

Thin or duplicate H1s, no semantic structure on key landing pages.
FAQ and pricing content trapped inside accordions or tabs that hide it from the HTML.
Missing FAQPage, Product or Article schema on pages that warrant it.
Slow TTFB or heavy JS that blocks the first render.

Tier 3: Nice to have

Publishing an llms.txt file pointing AI tools at your best pages.
Adding speakable schema and breadcrumbs.
Consolidating near-duplicate URLs with canonicals.
Tidying internal linking so canonical pages get the most internal mentions.
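The three tiers above can be encoded once so that every audit sorts findings the same way. The following is a minimal sketch; the issue keys in ISSUE_TIER are hypothetical labels invented here to stand for the twelve items listed, and the Finding type is an assumption about how you record results.

```python
from dataclasses import dataclass

TIERS = {1: "Fix this week", 2: "Fix this month", 3: "Nice to have"}

# Hypothetical issue keys mapped to the tiers described in this section.
ISSUE_TIER = {
    "ai_bots_blocked": 1,
    "client_side_only_rendering": 1,
    "organization_schema_missing": 1,
    "sitemap_missing_or_stale": 1,
    "weak_heading_structure": 2,
    "content_hidden_in_tabs": 2,
    "page_schema_missing": 2,
    "slow_ttfb_or_heavy_js": 2,
    "llms_txt_missing": 3,
    "speakable_or_breadcrumbs_missing": 3,
    "near_duplicate_urls": 3,
    "internal_linking_untidy": 3,
}


@dataclass(frozen=True)
class Finding:
    url: str
    issue: str  # must be a key of ISSUE_TIER


def prioritise(findings):
    """Group findings by tier label; unknown issue keys raise KeyError."""
    grouped = {label: [] for label in TIERS.values()}
    for finding in findings:
        grouped[TIERS[ISSUE_TIER[finding.issue]]].append(finding)
    return grouped
```

Raising on an unknown issue key is a deliberate choice in this sketch: it forces a new kind of finding to be assigned a tier explicitly instead of silently landing in "Nice to have".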

What's the single most important fix?

Almost always: making sure the raw HTML response contains your content. A surprising number of B2B SaaS sites still render homepage and pricing copy entirely client-side, which means AI crawlers that don't execute JavaScript - and most don't - see an empty shell. Fix that one issue and you've already lifted the ceiling on every other GEO move.
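One way to test this without opening view-source on every page is to save the raw response (for example, the output of the curl command mentioned earlier) and check that the phrases you expect a buyer to see are present outside script and style tags. The sketch below uses Python's standard html.parser; missing_phrases, the sample HTML and the "Acme" copy are illustrative assumptions.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that a non-JS crawler could read from raw HTML."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def missing_phrases(raw_html: str, expected: list[str]) -> list[str]:
    """Return the expected phrases that do not appear in the readable text."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in expected if phrase.lower() not in text]


# Hypothetical client-rendered shell: the copy exists only inside a script.
shell = '<html><body><div id="root"></div><script>window.__DATA__ = {"plan": "Pro plan $49"};</script></body></html>'
# Hypothetical server-rendered page with the same copy in the markup.
ssr = "<html><body><h1>Acme Pricing</h1><p>Pro plan: $49 per month</p></body></html>"
```

A non-empty result for your homepage or pricing page is the "empty shell" symptom described above. Note that a phrase split across inline tags will be reported as missing by this simple matcher, so pick short phrases that sit inside a single element.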

Second-most-important: a valid Organization schema block on your homepage. It's the single biggest entity-recognition signal you can ship in an afternoon, and it underpins every branded prompt buyers run in AI tools.
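To confirm the homepage actually ships an Organization block, you can pull the JSON-LD out of the raw HTML and look for gaps. This is a sketch under stated assumptions: organization_gaps is a hypothetical helper, and the default list of properties (name, url, logo, sameAs) is an illustrative choice, since schema.org itself does not mark any Organization property as required. It complements the Rich Results Test and the Schema.org validator rather than replacing them.

```python
import json
from html.parser import HTMLParser


class JsonLdBlocks(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def organization_gaps(html, wanted=("name", "url", "logo", "sameAs")):
    """Return missing properties of the first Organization node, or None if absent."""
    parser = JsonLdBlocks()
    parser.feed(html)
    parser.close()
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON-LD: treat as if the block were not there
        if isinstance(data, dict) and isinstance(data.get("@graph"), list):
            nodes = data["@graph"]
        elif isinstance(data, list):
            nodes = data
        else:
            nodes = [data]
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Organization" in types:
                return [prop for prop in wanted if not node.get(prop)]
    return None
```

A return value of None means no parseable Organization block was found, which is the tier-one failure; a non-empty list tells you which of the chosen properties to add.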

How often should I re-audit?

Full audit once a quarter. A lighter pass (raw HTML check, schema validation, AI legibility prompts on your top 5 pages) every month. Re-run the audit immediately after any major site redesign or framework migration - those are the moments rendering and schema break.

For the layers that sit on top of the audit, see "Schema markup that actually helps AI cite you" and "How to make your SaaS site crawlable for AI".