OpenAI's GPT bot surpasses Google's bot in indexing the web - Gizmochina


The bots that quietly map the internet -- the unseen engines behind search -- are starting to shift the balance of power online. For decades, Google's web crawler set the pace for how information was discovered and indexed. But that dominance is now being challenged by AI-focused crawlers, including those from OpenAI, Anthropic, and Meta, which are quickly expanding their reach across the open web.

According to new data from Hostinger, OpenAI's GPT bot has become the most active web crawler in the world. The study analyzed access logs from 5 million hosted websites and found that the GPT bot reached 4.4 million of them -- an 88% coverage rate. Google's crawler came in second, hitting 3.9 million sites, or about 78%.
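To make the coverage figures concrete, here is a minimal sketch of how a per-crawler coverage rate could be computed from access-log data. The record format (pairs of site and crawler name) and the function name `crawler_coverage` are assumptions made for illustration; the article does not describe how Hostinger actually processed its logs.

```python
from collections import defaultdict


def crawler_coverage(records, total_sites):
    """Return {crawler: share of sites reached} from (site, crawler) pairs.

    `records` is a hypothetical, simplified view of access logs: one
    (site, crawler) pair per logged request.
    """
    sites_by_crawler = defaultdict(set)
    for site, crawler in records:
        # A set counts each site once, however often the crawler visits it.
        sites_by_crawler[crawler].add(site)
    return {
        crawler: len(sites) / total_sites
        for crawler, sites in sites_by_crawler.items()
    }


# Toy sample of four hosted sites (made-up data, not from the study).
records = [
    ("a.example", "GPTBot"),
    ("b.example", "GPTBot"),
    ("a.example", "GPTBot"),  # repeat visit, adds no coverage
    ("a.example", "Googlebot"),
]
coverage = crawler_coverage(records, total_sites=4)

# The reported percentages follow the same arithmetic.
gpt_share = 4.4e6 / 5e6      # about 0.88
google_share = 3.9e6 / 5e6   # about 0.78
```

Coverage here measures how many distinct sites a crawler touched, not how many requests it made, which is why a crawler can rank first on coverage without necessarily sending the most traffic.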

The trend doesn't stop there. Other AI-focused crawlers, including Anthropic's ClaudeBot, Meta's in-house bots, and even TikTok's scrapers, collectively generated around 1.4 billion daily requests across the same sample set. By contrast, traditional players such as Bing, Apple, and the SEO tool Ahrefs were less active.

Hostinger notes that lower coverage doesn't necessarily mean neglect. Many crawlers rotate their targets to avoid overloading servers, achieving nearly complete coverage over time. Still, the study highlights a clear imbalance in where this activity originates: roughly 80% of all crawler traffic comes from US-based companies, with Chinese bots accounting for around 10% and the rest of the world making up a small fraction.

That concentration raises new questions about who really controls what we see -- or what AI systems learn from. As AI models rely increasingly on fresh web data, the firms behind these crawlers are gaining more influence over the content that shapes summaries, search answers, and generative outputs across the internet.

Hostinger has also developed an AI audit tool that lets website owners decide which AI bots are allowed on their sites -- and which aren't. As web crawling continues to evolve, the challenge will be finding a balance between open access, fair use, and sustainability.
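The article does not explain how Hostinger's tool works internally. A common, standard way for site owners to express this kind of per-bot preference is a robots.txt file, which crawlers that honor the convention check before fetching pages. The sketch below uses Python's standard-library `urllib.robotparser` to show how such rules are evaluated; the rules themselves and the `example.com` URLs are illustrative assumptions, not Hostinger's configuration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules (an assumption, not taken from the article):
# block GPTBot entirely, keep ClaudeBot out of /private/, allow the rest.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

gpt_allowed = parser.can_fetch("GPTBot", "https://example.com/article")
claude_allowed = parser.can_fetch("ClaudeBot", "https://example.com/article")
claude_private = parser.can_fetch("ClaudeBot", "https://example.com/private/x")
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/article")
```

Note that robots.txt is advisory: it only restrains crawlers that choose to respect it, so it expresses a site owner's preference and does not enforce it.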

The race to index the web is far from over, but it's clear that Google no longer runs it alone.
