Hope you enjoyed our last entry about Valentine’s Day predictions; we love giving you these perks from time to time 🙂
Our answer is yes, but you need to understand what really matters and which things can be disregarded.
Gary Illyes of Google wrote a blog post about crawl budget and how Googlebot actually works.
As we all know, the bot works on a budget, which means that the more clutter, spam, and irrelevant content a website has, the slower the crawling process becomes.
What we understand is simple: with a lower crawl budget, any change made to the website gets updated in the index at a slower pace (if at all). No matter how we tweak our SEO, if our budget is too low, we just won’t be crawled.
The more resources the bot has to waste (for example, on crawling irrelevant data), the slower it works.
It’s not just poor content, faceted navigation, or session identifiers that can hinder its performance; the server’s overall response time matters too.
The faster the server responds, the faster and more efficiently the site gets crawled.
Naturally, site owners can set a crawl rate limit in Search Console to tell Googlebot how fast it may crawl the site, but that limit isn’t always reached.
OK, so what affects the crawl budget?
Mostly, low-value-add URLs, which divert crawl budget away from the pages that actually matter.
These fall into the following categories (in order of significance):
- Faceted navigation and session identifiers
- On-site duplicate content
- Soft error pages
- Hacked pages
- Infinite spaces and proxies
- Low quality and spam content
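A common way to keep Googlebot away from faceted navigation and session-ID URLs is to disallow the offending query parameters in robots.txt. Here’s a minimal sketch; the parameter names (`sessionid`, `color`, `sort`) are hypothetical placeholders you’d swap for whatever your own site actually uses:

```
# robots.txt — block low-value parameterized URLs from crawling
User-agent: *
# Session identifiers produce endless duplicate URLs
Disallow: /*?sessionid=
Disallow: /*&sessionid=
# Faceted navigation filters (same content, reshuffled)
Disallow: /*?color=
Disallow: /*?sort=
```

Be careful here: only block parameters that genuinely produce duplicate or worthless pages, since anything disallowed this way won’t be crawled at all.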
If you let these time-wasters pile up, crawling suffers: it slows to a crawl (pun intended), your great new content gets picked up late or not at all, and your exposure and overall ranking take serious damage.
So what now?
In general, this isn’t something for small-time webmasters to worry about; crawling is allocated on a budget, and unless you have thousands of URLs on your servers, you are safe.
The ones who should worry about being crawled inefficiently are sites with 500 or so articles and up. Googlebot needs to allocate its resources to crawl efficiently, and if your site isn’t tidy, it will either be skipped or take far longer to crawl, which means your indexing, and the ranking that depends on it, will suffer.
If you run a giant site, this is something to think hard about, as it’s critical for exposure.