Crawlability (8 Checks)

Search engines can only rank what they can find. Crawlability problems silently block your best pages from ever entering Google's index. Run through these eight checks before anything else.

  1. Robots.txt is not blocking important pages. Fetch your robots.txt file at /robots.txt and verify that no Disallow directive accidentally blocks your key landing pages, category pages, or blog posts. This mistake is more common than you'd think — especially after a CMS migration.
  2. No crawl traps present. Crawl traps are URL patterns that generate infinite or near-infinite pages: infinite scroll without pagination controls, session IDs appended to URLs, calendar archives with date parameters, or faceted navigation generating millions of filter combinations. Identify these with a crawl tool like Screaming Frog or Sitebulb and address them with canonical tags or parameter exclusion rules.
  3. XML sitemap is submitted and valid. Submit your sitemap to Google Search Console and verify there are zero errors. Only include indexable URLs (no noindex pages, no 404s, no redirects). Keep each sitemap file to at most 50,000 URLs and 50MB uncompressed; if you exceed either limit, split it into several files listed in a sitemap index.
  4. Internal link depth is within 3 clicks of the homepage. Every important page should be reachable within three clicks. Pages buried deeper than this tend to be crawled less often and receive less internal link equity, which can hurt their rankings. Use your crawl data to identify deep pages and create additional internal links to them.
  5. No orphan pages. Orphan pages — pages with no internal links pointing to them — are effectively invisible to crawlers unless they appear in your sitemap. Audit for orphans using your crawl data cross-referenced against your Analytics or Search Console data.
  6. Redirect chains contain fewer than 3 hops. Each hop in a redirect chain leaks link equity and slows page delivery. Audit all redirects and update any chains (A→B→C) to point directly to the final destination (A→C).
  7. Canonical tags point to the correct URLs. A canonical tag pointing to a page that itself has a canonical pointing elsewhere creates a canonical chain and is often ignored by Google. Every canonical should point directly to the intended indexable URL.
  8. No unintended noindex directives on important pages. A single misplaced noindex in a meta robots tag or X-Robots-Tag response header will remove a page from Google's index entirely. Audit all key pages programmatically — don't rely on visual inspection.
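
Checks 1 and 6 above are easy to script. The following is a minimal sketch, not a definitive audit tool: it uses Python's standard urllib.robotparser to test a robots.txt body against a list of key URLs, and a small helper to count hops in a redirect map. The sample robots.txt, the example.com URLs, and the redirect map are hypothetical, and the helper names (blocked_urls, resolve_chain) are our own. Note that Python's parser does simple prefix matching and may not reproduce Google's wildcard handling exactly, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt body disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parses a string; no network request is made
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


def resolve_chain(redirects: dict[str, str], start: str, max_hops: int = 10) -> tuple[str, int]:
    """Follow start through a source -> target redirect map.

    Returns (final_url, hop_count) and raises ValueError on a loop
    or on a chain longer than max_hops.
    """
    seen = {start}
    current, hops = start, 0
    while current in redirects:
        current = redirects[current]
        hops += 1
        if current in seen or hops > max_hops:
            raise ValueError(f"redirect loop or overlong chain starting at {start}")
        seen.add(current)
    return current, hops


# Hypothetical sample data, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /blog/
"""

KEY_URLS = [
    "https://example.com/",
    "https://example.com/blog/technical-seo-checklist",
    "https://example.com/category/shoes",
]

REDIRECTS = {"/old-page": "/interim-page", "/interim-page": "/new-page"}
```

With this sample data, blocked_urls(SAMPLE_ROBOTS, KEY_URLS) flags the blog URL, and resolve_chain(REDIRECTS, "/old-page") reports two hops ending at /new-page, which tells you to repoint /old-page directly at /new-page.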

Indexability (6 Checks)

Getting crawled and getting indexed are two different things. Google crawls far more than it indexes. These checks ensure your valuable pages make it into the index and are represented correctly.

  1. Google Search Console indexation rate is healthy. Open the Page indexing report in GSC (Indexing > Pages, formerly Index Coverage) and compare indexed URLs against the URLs you submitted. A high number of "Not indexed" pages (labelled "Excluded" in the older report) warrants investigation: many will be legitimate (duplicate content, canonicalised pages), but some may be pages you want indexed that Google has chosen not to index.
  2. No accidental noindex on live pages. This is critical enough to warrant its own check separate from crawlability. Use a site audit tool to crawl your entire site and flag any page returning a noindex directive that isn't intentional.
  3. Self-referencing canonicals are implemented correctly. Every indexable page should have a canonical tag pointing to itself. This signals intent to Google and prevents issues when URLs are accessed with query parameters or trailing slashes.
  4. Duplicate content has been reviewed. Near-duplicate or fully-duplicate content confuses Google about which page to rank. Use tools like Copyscape, Siteliner, or your crawl data to identify duplicate content, then decide whether to consolidate (301 redirect), canonicalise, or differentiate.
  5. Hreflang tags are correct (if international). If you serve multiple languages or regions, incorrect hreflang implementation is one of the most common and costly technical errors. Validate with hreflang testing tools and ensure every alternate URL also includes a reciprocal hreflang tag.
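  6. URL parameter handling is under control. Google retired the URL Parameters tool in Search Console in 2022, so parameter handling can no longer be configured there. If your site uses URL parameters (sorting, filtering, tracking), use canonical tags, consistent internal linking to the parameter-free URL, and robots.txt rules where appropriate to stop Googlebot from crawling thousands of duplicate parameter-generated URLs.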
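
The noindex and self-referencing canonical checks above can be automated per page. The sketch below uses only Python's standard html.parser and assumes you have already fetched the page's HTML and response headers by whatever means you prefer. The function name indexability_report and the shape of its result are our own choices, and the header parsing is deliberately simple: it does not handle user-agent-scoped X-Robots-Tag values such as "googlebot: noindex".

```python
from html.parser import HTMLParser
from typing import Optional


class _RobotsMetaParser(HTMLParser):
    """Collects meta robots directives and the first rel=canonical href."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: list = []
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() in {"robots", "googlebot"}:
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            if self.canonical is None:
                self.canonical = attr.get("href")


def indexability_report(url: str, html: str, headers: dict) -> dict:
    """Summarise noindex and canonical signals for one already-fetched page."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    parser.close()

    # Header names are case-insensitive, so compare in lower case.
    header_value = next((v for k, v in headers.items() if k.lower() == "x-robots-tag"), "")
    header_directives = [d.strip().lower() for d in header_value.split(",") if d.strip()]

    # "none" is shorthand for "noindex, nofollow".
    blocking = {"noindex", "none"}
    noindex = bool(blocking & set(parser.directives)) or bool(blocking & set(header_directives))

    return {
        "noindex": noindex,
        "canonical": parser.canonical,
        "self_canonical": parser.canonical == url,
    }
```

Run it over every URL you expect to be indexed and flag any result where noindex is true or self_canonical is false. A parameterised URL such as /shoes?sort=price should report self_canonical as false while still pointing at the clean /shoes URL.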

Core Web Vitals (7 Checks)

Core Web Vitals are a confirmed Google ranking factor. More importantly, they're a measure of real user experience. Poor scores hurt both rankings and conversion rates.

  1. LCP (Largest Contentful Paint) is under 2.5 seconds. Check your field data in PageSpeed Insights (which uses real Chrome User Experience data, not synthetic lab scores). If LCP is above 2.5s, identify the LCP element and work backwards to what's delaying it.
  2. CLS (Cumulative Layout Shift) is under 0.1. Layout shifts most commonly come from images without explicit dimensions, late-loading ads, or web fonts causing FOUT. Use Chrome DevTools' Performance panel to identify specific shifting elements.
  3. INP (Interaction to Next Paint) is under 200ms. INP replaced FID in March 2024 and measures responsiveness across the entire page lifetime, not just at load. High INP is almost always caused by heavy JavaScript execution on the main thread.
  4. No render-blocking resources. Synchronous scripts and stylesheets loaded in the document <head> stop the browser from rendering the page until they have been fetched and processed. Audit for render-blocking resources in PageSpeed Insights and load non-critical scripts with defer or async.
  5. Images are served in WebP or AVIF format. Modern image formats are 25-35% smaller than JPEG/PNG at equivalent quality. Serve WebP with AVIF as a progressive enhancement for browsers that support it. Implement via your CDN, image optimisation service, or server-side format negotiation.
  6. Lazy loading is implemented for below-the-fold images. Add loading="lazy" to all images not in the initial viewport. Never lazy-load your LCP image — that will worsen your LCP score significantly.
  7. Server response time (TTFB) is under 200ms. Time to First Byte measures how quickly your server responds. Treat 200ms as a stretch target: Google's web.dev guidance rates a TTFB of up to 800ms as good. Poor TTFB cascades into poor LCP. Improvements come from caching (server-side and CDN), database query optimisation, and geographic proximity of your hosting to your users.
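
The three Core Web Vitals thresholds above are simple to encode. The sketch below classifies a metric value using Google's published boundaries: the "good" limits match the checklist (2.5s, 0.1, 200ms) and the "poor" limits are 4.0s, 0.25, and 500ms. It assumes you pass in the 75th-percentile field value, which is what Google assesses; the metric keys (lcp_s, cls, inp_ms) and the function name are our own naming.

```python
# (good_limit, poor_limit) per metric. Values at or below good_limit rate
# "good"; values above poor_limit rate "poor"; anything between needs work.
THRESHOLDS = {
    "lcp_s": (2.5, 4.0),   # Largest Contentful Paint, seconds
    "cls": (0.1, 0.25),    # Cumulative Layout Shift, unitless
    "inp_ms": (200, 500),  # Interaction to Next Paint, milliseconds
}


def rate_vital(metric: str, value: float) -> str:
    """Classify a 75th-percentile field value for one Core Web Vital."""
    good_limit, poor_limit = THRESHOLDS[metric]
    if value <= good_limit:
        return "good"
    if value <= poor_limit:
        return "needs improvement"
    return "poor"


def failing_vitals(field_data: dict) -> dict:
    """Return only the metrics that are not rated 'good'."""
    ratings = {metric: rate_vital(metric, value) for metric, value in field_data.items()}
    return {metric: rating for metric, rating in ratings.items() if rating != "good"}
```

For example, failing_vitals({"lcp_s": 3.1, "cls": 0.05, "inp_ms": 650}) returns the LCP entry as "needs improvement" and the INP entry as "poor", and omits CLS because it passes.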

On-Page Fundamentals (8 Checks)

Technical SEO isn't just about server-level issues. On-page fundamentals form the bridge between your infrastructure and your content strategy.

  1. Title tags are unique and under 60 characters. Google truncates titles in search results by pixel width rather than by a fixed character count, and roughly 60 characters is a safe working limit. More importantly, duplicate title tags are a signal of thin or duplicate content. Every page needs a unique, keyword-informed title.
  2. Meta descriptions are unique and under 155 characters. While meta descriptions are not a ranking factor, they directly influence click-through rate from search results. Treat them as ad copy: specific, benefit-driven, and with a clear call to action.
  3. One H1 tag per page. Google has said it can process pages with multiple H1 tags, but a single H1 keeps the primary topic unambiguous for crawlers and for screen reader users. Give every page exactly one H1 that includes the primary target keyword and matches the user's search intent.
  4. Heading hierarchy (H2/H3) is logical. Headings should form an outline — H2s for major sections, H3s for subsections within those sections. Don't use heading tags for visual styling; use them for semantic structure.
  5. All images have descriptive alt text. Alt text serves two purposes: accessibility for screen reader users and keyword context for crawlers. Write descriptive alt text that would make sense if read aloud, and naturally incorporate relevant keywords where appropriate.
  6. Internal links use descriptive anchor text. "Click here" and "read more" provide no context to search engines about what the linked page covers. Use descriptive anchor text that reflects the target page's primary topic.
  7. Canonical URL matches the intended indexed URL exactly. This includes trailing slashes, www vs non-www, and HTTP vs HTTPS. Pick one canonical form and enforce it consistently via redirects and canonical tags.
  8. No keyword stuffing on any page. Keyword stuffing — unnatural repetition of target keywords — is a negative quality signal. Modern SEO prioritises topical coverage and natural language over keyword density metrics.
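
Several of the on-page checks above (title length, meta description length, H1 count, duplicate titles) can be verified mechanically. The sketch below, using only Python's standard html.parser, is one possible way to do it. The issue codes, the function names, and the 60 and 155 character limits taken from this checklist are working assumptions rather than rules enforced by Google.

```python
from html.parser import HTMLParser

TITLE_MAX = 60          # working limit from the checklist, in characters
DESCRIPTION_MAX = 155   # working limit from the checklist, in characters


class _OnPageParser(HTMLParser):
    """Extracts the <title> text, meta description, and number of <h1> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_page(html: str) -> list:
    """Return a list of issue codes for one page; an empty list means it passes."""
    parser = _OnPageParser()
    parser.feed(html)
    parser.close()
    title = parser.title.strip()
    issues = []
    if not title:
        issues.append("missing_title")
    elif len(title) > TITLE_MAX:
        issues.append("title_too_long")
    if not parser.description:
        issues.append("missing_description")
    elif len(parser.description) > DESCRIPTION_MAX:
        issues.append("description_too_long")
    if parser.h1_count != 1:
        issues.append(f"h1_count={parser.h1_count}")
    return issues


def duplicate_titles(titles_by_url: dict) -> dict:
    """Group URLs that share a title (case-insensitive); keep groups of 2 or more."""
    groups: dict = {}
    for url, title in titles_by_url.items():
        groups.setdefault(title.strip().lower(), []).append(url)
    return {title: urls for title, urls in groups.items() if len(urls) > 1}
```

Feed audit_page the HTML of each crawled URL, then pass the collected titles to duplicate_titles to find pages competing under the same title. Alt text and anchor text quality still need human review; a script can only confirm that they exist.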

Structured Data (6 Checks)

Structured data helps Google understand your content and can unlock rich results in SERPs, increasing click-through rates substantially.

  1. Organization schema is present on the homepage. Include your business name, URL, logo, social profiles, and contact information in Organization schema. This helps Google build a Knowledge Panel for your brand.
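  2. FAQPage schema is implemented on pages with Q&A sections. Since August 2023 Google shows FAQ rich results mainly for well-known, authoritative government and health websites, so most sites should not expect an expanded SERP snippet from this markup. It remains valid structured data and can still help search engines interpret the content. Only mark up genuine question-and-answer content.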
  3. Article or BlogPosting schema is on all blog posts. Include headline, author, datePublished, dateModified, and publisher. The dateModified field is particularly important for timely content — keep it updated.
  4. BreadcrumbList schema is on all interior pages. Breadcrumb schema enables Google to show your site structure in search results, which improves CTR and helps Google understand your site hierarchy.
  5. Product schema includes price, availability, and reviews. Product schema with complete attributes can unlock rich product results including star ratings, price ranges, and availability status directly in search results.
  6. All structured data passes the Rich Results Test. Validate every schema implementation at search.google.com/test/rich-results. Fix any errors (which prevent rich results) and warnings (which may limit eligibility).
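
As an illustration of check 3 above, the sketch below builds BlogPosting markup with the five fields the checklist names (headline, author, datePublished, dateModified, publisher) and renders it as a JSON-LD script tag. The function names and the sample values in the usage note are hypothetical, and the output is only a starting point: validate it in the Rich Results Test and add any properties your content type requires.

```python
import json


def blog_posting_jsonld(headline: str, author: str, publisher: str,
                        date_published: str, date_modified: str) -> dict:
    """Build a schema.org BlogPosting object. Dates are ISO 8601 strings."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "publisher": {"@type": "Organization", "name": publisher},
        "datePublished": date_published,
        "dateModified": date_modified,
    }


def render_jsonld(data: dict) -> str:
    """Serialise the object into a script tag that can be placed in the page."""
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    # "\/" is a valid JSON escape, so parsers still read it as "/".
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

A call such as render_jsonld(blog_posting_jsonld("Technical SEO Checklist", "Jane Doe", "Example Agency", "2024-01-15", "2024-06-01")) yields a tag you can place in the page template. Generating dateModified from your CMS's last-edited timestamp keeps it current without manual updates.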

Site Architecture (7 Checks)

Site architecture determines how link equity flows through your domain and how clearly Google can understand your content's topical relationships.

  1. URL structure is flat and descriptive. Shorter URLs with clear keyword context outperform long, nested URL structures. Aim for: domain.com/category/page-name rather than domain.com/category/subcategory/sub-subcategory/page-name.
  2. Trailing slash usage is consistent sitewide. Inconsistent trailing slash usage creates duplicate content issues. Pick one format (with or without trailing slash) and enforce it with 301 redirects for all variations.
  3. HTTPS is enforced sitewide. Every page — including images, scripts, and stylesheets — must be served over HTTPS. Mixed content (HTTP resources on HTTPS pages) triggers browser security warnings and can impact rankings.
  4. All old URLs have 301 redirects. Any URL that has ever been linked to externally or ranked in search should have a permanent 301 redirect to its current location if the URL has changed. Lost rankings from missing redirects are notoriously difficult to recover.
  5. No mixed content warnings in the browser. Even after implementing HTTPS, legacy HTTP references in CSS, JavaScript, or HTML can cause mixed content warnings. Use a browser security scan or crawl tool to surface these.
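  6. Pagination is crawlable and self-canonical. Google announced in 2019 that it no longer uses rel=next/prev as an indexing signal, although the markup is harmless and other search engines may still read it. For paginated content (blog archives, product listings), give each page a unique URL, link the pages to one another with plain <a href> links, and let each page canonicalise to itself. Avoid pointing every page's canonical at the first page in the series, because the items listed on deeper pages may then go undiscovered.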
  7. Breadcrumb navigation is present on all interior pages. Breadcrumbs improve user experience, reinforce site structure for crawlers, and enable BreadcrumbList rich results in search. They should match your URL structure logically.
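
Checks 2 and 3 above come down to choosing one URL form and redirecting everything else to it. The sketch below shows one way to compute that form with Python's standard urllib.parse. It assumes a single-host site, and the default host www.example.com and the function names are hypothetical. Your web server or CDN would normally perform the actual 301; a helper like this is useful for auditing crawl exports to find URLs that still need one.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_form(url: str, host: str = "www.example.com", trailing_slash: bool = True) -> str:
    """Return the preferred form of url: HTTPS, one host, consistent trailing slash.

    Paths whose last segment contains a dot (e.g. /img/logo.png) are treated as
    files and never get a trailing slash added. Fragments are dropped.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    if trailing_slash:
        if not path.endswith("/") and "." not in last_segment:
            path += "/"
    elif path != "/":
        path = path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, parts.query, ""))


def needs_redirect(url: str, **policy) -> bool:
    """True when url differs from its preferred form and should 301 to it."""
    return canonical_form(url, **policy) != url
```

Under the default policy, http://example.com/blog maps to https://www.example.com/blog/, so needs_redirect returns True for it, while https://www.example.com/blog/ is already in the preferred form.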

Mobile & Security (5 Checks)

Google made mobile-first indexing the default for new sites in 2019 and has since extended it to virtually all sites, meaning the mobile version of your site is the version Google indexes and ranks. Security signals are increasingly factored into trust assessments.

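  1. Site passes a mobile usability audit. Google retired its Mobile-Friendly Test and the Mobile Usability report in Search Console in December 2023, so test with Lighthouse in Chrome DevTools and with device emulation instead. Pages that are hard to use on mobile are at a ranking disadvantage. Common failures include text too small to read, clickable elements too close together, and content wider than the screen.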
  2. No intrusive interstitials on mobile. Google penalises pages that show pop-ups, overlays, or interstitials that cover the main content on mobile immediately after arriving from search results. Age verification and legal notices have exceptions.
  3. HTTPS certificate is valid and not expiring soon. An expired SSL certificate immediately tanks user trust and can trigger browser warnings that reduce traffic to near zero. Set up auto-renewal and configure monitoring alerts for certificate expiry.
  4. Security headers are present. Key security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy, and Content-Security-Policy) are not confirmed ranking factors, but they protect your users against common attacks such as clickjacking and MIME-type sniffing. Verify with securityheaders.com.
  5. No mixed HTTP/HTTPS assets. All resources loaded by your pages (images, fonts, scripts, stylesheets) must be served over HTTPS. HTTP assets on HTTPS pages cause browser security warnings and may be blocked entirely in modern browsers.
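
Checks 4 and 5 above can be spot-checked with a short script. The sketch below assumes you already have a page's response headers and HTML in hand. It reports which of the four headers named in this checklist are missing and lists resources referenced over plain HTTP. The function names are our own, and the tag list is a reasonable subset rather than an exhaustive definition of mixed content (it ignores CSS url() references and srcset, for instance).

```python
from html.parser import HTMLParser

REQUIRED_HEADERS = (
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Referrer-Policy",
    "Content-Security-Policy",
)


def missing_security_headers(headers: dict) -> list:
    """Return the required headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]


class _InsecureAssetParser(HTMLParser):
    """Collects sub-resource URLs that are loaded over plain HTTP."""

    _SRC_TAGS = {"img", "script", "iframe", "source", "audio", "video", "embed"}

    def __init__(self) -> None:
        super().__init__()
        self.insecure: list = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in self._SRC_TAGS:
            candidate = attr.get("src")
        elif tag == "link" and "stylesheet" in (attr.get("rel") or "").lower().split():
            candidate = attr.get("href")
        else:
            candidate = None  # ordinary <a href> links are not mixed content
        if candidate and candidate.lower().startswith("http://"):
            self.insecure.append(candidate)


def insecure_assets(html: str) -> list:
    """List HTTP-only resources referenced by a page served over HTTPS."""
    parser = _InsecureAssetParser()
    parser.feed(html)
    parser.close()
    return parser.insecure
```

An empty result from both functions means the page passes these two checks as far as this sketch can tell. Browser DevTools remains the authority on what is actually blocked at load time.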
