How Google Indexing Works (Crawling, Rendering, Indexing)

June 8, 2026 · 5 min read

The short answer

Google indexing works in stages: it discovers a URL, queues it, crawls it with Googlebot, renders the page (including JavaScript), and then decides whether to store it in the index. Only after that can the page appear in search results. A page can only be indexed after it is discovered, which is why submitting URLs for indexing speeds up the first step.

Each stage in that pipeline is distinct, and a page cannot reach the index until it has been discovered and crawled. That ordering is the whole reason submitting URLs helps, and it is what URL Indexer's free indexing tool does: it pushes your pages and backlinks to Google so the discovery step happens sooner instead of waiting for Google to stumble on them.

What are the stages of the Google index process?

The Google index process has five main stages, run roughly in order but not always back to back. Google may discover a URL today, crawl it next week, and render it later still. Knowing where a URL is stuck tells you what to fix.

  1. Discovery: Google learns the URL exists, usually from links, sitemaps, or a direct indexing request.
  2. Crawl queue: the URL waits in a scheduling queue, prioritized by perceived importance and capacity.
  3. Crawling: Googlebot fetches the raw HTML over HTTP, respecting robots.txt.
  4. Rendering: Google runs the page like a browser, executing JavaScript to see the final content.
  5. Indexing: Google analyzes the rendered page and decides whether to store it, then it becomes eligible to serve in results.
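
The ordering above can be sketched as an ordered enumeration, which makes "where is this URL stuck?" a simple lookup. This is an illustration only: the `Stage` names and the `blocking_stage` helper are hypothetical labels for this article, not anything Google exposes.

```python
from __future__ import annotations

from enum import IntEnum


class Stage(IntEnum):
    """The five stages from the list above, in order (names are illustrative)."""
    DISCOVERY = 1
    CRAWL_QUEUE = 2
    CRAWLING = 3
    RENDERING = 4
    INDEXING = 5


def blocking_stage(completed: set[Stage]) -> Stage | None:
    """Return the earliest stage a URL has not completed, or None if all are done."""
    for stage in Stage:  # IntEnum iterates in definition order
        if stage not in completed:
            return stage
    return None


# A URL that was discovered and queued but never fetched is stuck at crawling.
stuck = blocking_stage({Stage.DISCOVERY, Stage.CRAWL_QUEUE})
print(stuck.name)  # CRAWLING
```

The earliest incomplete stage is the one worth fixing first, since every later stage depends on it.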

How does Google discover a URL?

Google discovers a URL by finding a reference to it somewhere it already crawls. The most common sources are internal and external links pointing to the page, XML sitemaps you submit, and explicit indexing requests. Until one of those signals exists, Google has no reason to know the page is there, so a brand-new page with no links can sit invisible indefinitely. This is the bottleneck submitting URLs solves: it hands Google the address directly instead of relying on it being linked. For a deeper look at the difference between the request and the result, see what URL indexing actually is.
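
Of the discovery sources above, an XML sitemap is the easiest to produce yourself. The following is a minimal sketch using Python's standard library; the `build_sitemap` helper and the example.com URLs are placeholders, and a real sitemap may carry optional fields that are left out here.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return a sitemap XML document with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs for illustration.
print(build_sitemap([
    "https://example.com/",
    "https://example.com/new-page",
]))
```

Listing a new page in the sitemap gives Google a reference to it even before any other page links to it.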

What does Googlebot do when it crawls a page?

When Googlebot crawls a page, it sends an HTTP request and downloads the raw HTML response, the same way a browser fetches a document before running any scripts. Before fetching, it checks the site's robots.txt to see whether crawling that path is allowed. Googlebot also reads response headers and status codes: a 200 means the page is available, a 301 sends it to a new URL, and a 404 or 410 tells Google the page is gone. How much Googlebot crawls and how often is shaped by your server's responsiveness and the site's overall importance, a topic covered in crawl budget explained.
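
The two checks described above, robots.txt first and status code second, can be approximated locally. This sketch uses Python's standard-library `urllib.robotparser`, whose rule matching may differ from Googlebot's in edge cases; the helper names and the sample robots.txt are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_googlebot_fetch(robots_txt: str, url: str) -> bool:
    """Check a robots.txt body (as text) for whether Googlebot may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


def describe_status(status: int) -> str:
    """Summarize the status codes mentioned above; others are out of scope here."""
    if status == 200:
        return "available"
    if status == 301:
        return "moved to a new URL"
    if status in (404, 410):
        return "gone"
    return "not covered in this sketch"


# Sample rules for illustration only.
robots = "User-agent: *\nDisallow: /private/\n"
print(can_googlebot_fetch(robots, "https://example.com/private/page"))  # False
print(can_googlebot_fetch(robots, "https://example.com/blog/post"))     # True
print(describe_status(410))                                             # gone
```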

robots.txt blocks crawling, not indexing

A robots.txt disallow rule stops Googlebot from fetching the page, but it does not guarantee the page stays out of the index. If other pages link to a blocked URL, Google can still index that URL based on those external signals, often showing it with no description because it never read the content. To keep a page out of the index, allow crawling and use a meta robots noindex tag on the page itself, so Google can fetch it, see the directive, and drop it.
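
The conflict described above is easy to test for: a noindex tag only works if the page carrying it can be crawled. The sketch below combines the standard-library robots.txt parser with a small HTML parser. The `noindex_status` helper and its return strings are hypothetical, and the robots.txt matching is Python's, which may not mirror Googlebot exactly.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Record whether a <meta name="robots"> (or "googlebot") tag says noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def noindex_status(robots_txt: str, url: str, html: str) -> str:
    """Say whether a noindex tag exists and whether a crawler could read it."""
    meta = MetaRobotsParser()
    meta.feed(html)
    if not meta.noindex:
        return "no noindex tag"
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if robots.can_fetch("Googlebot", url):
        return "noindex readable"
    return "noindex hidden by robots.txt"


page = '<html><head><meta name="robots" content="noindex"></head></html>'
blocked = "User-agent: *\nDisallow: /private/\n"
print(noindex_status(blocked, "https://example.com/private/a", page))
# noindex hidden by robots.txt
```

If the result is "noindex hidden by robots.txt", the disallow rule is preventing the directive from ever being seen.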

Does Google render JavaScript before indexing?

Yes, Google renders JavaScript before indexing, but rendering is a separate step that can happen after the initial crawl. Googlebot first processes the raw HTML, then queues the page for rendering, where it runs the page in a headless browser, executes the scripts, and builds the final DOM. Whatever content appears only after that JavaScript runs is invisible to Google until rendering completes. If your important text, links, or canonical tags are injected client-side, they depend entirely on a successful render.

Two practical consequences follow. First, content that needs JavaScript can be indexed more slowly than content present in the initial HTML, because it waits for the render queue. Second, if a script fails, times out, or is blocked by robots.txt, Google may index the page without that content at all. Server-side rendering, or putting critical content and links in the initial HTML, avoids the gap.
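
One way to spot render-dependent content is to look for key phrases in the raw HTML, before any script runs. This is a rough sketch, assuming you already have the HTML response as a string. The `missing_from_initial_html` helper is hypothetical, and it only approximates what a crawler sees prior to rendering.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text from the HTML, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_from_initial_html(raw_html: str, phrases: list) -> list:
    """Return the phrases that do not appear in the pre-render text."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [p for p in phrases if p.lower() not in text]


raw = (
    '<html><body><h1>Pricing</h1><div id="app"></div>'
    '<script>document.getElementById("app").textContent = '
    '"Plans start at $9";</script></body></html>'
)
print(missing_from_initial_html(raw, ["Pricing", "Plans start at $9"]))
# ['Plans start at $9']
```

Any phrase the function returns exists only after JavaScript runs, so it depends on a successful render.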

How does Google decide what to index?

Google decides what to index by analyzing the rendered page and judging whether it is worth storing and serving. It checks for indexing directives like noindex, picks a canonical URL when duplicates exist, and weighs signals of quality and uniqueness. Not every crawled page makes it in: thin, duplicate, or low-value pages are commonly crawled and then left out. Indexing is also reversible, since Google can drop a page later if it stops finding it useful.

  • Noindex directives: a meta robots or X-Robots-Tag noindex keeps the page out, even after a full crawl.
  • Canonicalization: among similar URLs, Google chooses one canonical to index and folds the rest into it.
  • Quality and uniqueness: pages that add little over what is already indexed may be skipped.
  • Accessibility: server errors, broken redirects, or blocked resources can keep a page from being indexed cleanly.
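
The first two signals in the list above can be read straight from a response. The sketch below assumes you have the response headers as a dict and the HTML as a string. The `indexing_hints` helper is hypothetical, and it reports only what the page declares, not what Google will finally decide.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Capture the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def indexing_hints(url: str, headers: dict, html: str) -> list:
    """List declared signals that could keep this exact URL out of the index."""
    notes = []
    # Header names are case-insensitive, so compare in lowercase.
    x_robots = ""
    for key, value in headers.items():
        if key.lower() == "x-robots-tag":
            x_robots = value.lower()
    if "noindex" in x_robots:
        notes.append("X-Robots-Tag header contains noindex")
    parser = CanonicalParser()
    parser.feed(html)
    if parser.canonical and parser.canonical != url:
        notes.append(f"canonical points to {parser.canonical}")
    return notes


print(indexing_hints(
    "https://example.com/a?ref=1",
    {"X-Robots-Tag": "noindex"},
    '<link rel="canonical" href="https://example.com/a">',
))
# ['X-Robots-Tag header contains noindex', 'canonical points to https://example.com/a']
```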

What happens after a page is indexed?

After a page is indexed, it becomes eligible to appear in search results, but eligibility is not the same as ranking. Indexing means Google has stored the page and can return it for a query; ranking decides where it shows up among everything else stored for that query. A page can be fully indexed and still rank on page ten, or not surface for competitive terms at all. So getting indexed is the entry ticket, and ranking is a separate problem driven by relevance, content quality, and links. You can confirm a page made it in by following the steps in how to check if a URL is indexed.

Why does submitting URLs help indexing?

Submitting URLs helps because it shortens the slowest, most uncertain stage: discovery. Instead of waiting for Google to find your page through a link it may not crawl for weeks, you hand the address over and put it in the queue now. URL Indexer sends standard indexing-request signals at scale for both your own pages and third-party backlinks, with no Search Console access required, so you can submit URLs on sites you do not own. It does not alter your pages or fake any signals; it just makes sure Google knows the URL exists, which is the prerequisite for everything that follows.

Frequently asked questions

What is the difference between crawling and indexing?

Crawling is Googlebot fetching a page's content over HTTP. Indexing is Google deciding to store and serve that page after analyzing it. A page can be crawled and still not indexed if Google judges it thin, duplicate, or marked noindex.

Does Google index JavaScript content?

Yes. Google renders pages in a headless browser and executes JavaScript before indexing, so client-side content can be indexed. But rendering happens in a separate, often later step, so JavaScript-dependent content can be indexed more slowly than content in the initial HTML.

Does robots.txt stop a page from being indexed?

No. Robots.txt blocks crawling, not indexing. A blocked URL can still be indexed from external links, usually with no description. To keep a page out of the index, allow crawling and add a meta robots noindex tag so Google can read the directive.

How long does Google take to index a new page?

There is no fixed timeframe, and Google makes the final call. After a URL is submitted, crawlers often visit within a few days, and confirmed indexing can take from a few days to a couple of weeks, depending on the site and page quality.

Does being indexed mean my page will rank?

No. Indexing only makes a page eligible to appear in results. Ranking is a separate process that decides position based on relevance, quality, and links. A page can be fully indexed and still rank poorly or not surface for competitive queries.