Now, you want to make this process as efficient and effective as possible. Why? Because the web has grown so large that Google can no longer crawl everything. Doing so would simply take too much time and too many resources.
Instead, Google spends a limited amount of time on your website. This concept is known as crawl budget: the amount of time Google spends on your site to discover new and updated content. To simplify this concept, compare the crawl budget to a train driver who works from 9 to 5 each day.
With all this knowledge, you want to make sure that this train driver (Google) spends his time visiting the most important stations (pages) via the fastest route on the rails (links). While he visits the stations, it's important to provide essential information about each station:
- What is the name of the station? (page title)
- Why should I visit? (meta description)
- What can I do there? (body content + headings)
- What other stations can I visit? (internal links)
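To make the four questions above concrete, here is a minimal sketch of how a crawler might read them from a page's HTML. It uses only Python's standard library; the class name `StationInfo`, the sample HTML and the `example.com` URLs are made up for illustration, and a real crawler does far more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class StationInfo(HTMLParser):
    """Collects title, meta description, headings and internal links from one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.title = ""             # name of the station
        self.meta_description = ""  # why should I visit?
        self.headings = []          # what can I do there?
        self.internal_links = []    # what other stations can I visit?
        self._current = None        # tag whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._current = tag
        elif tag in ("h1", "h2", "h3"):
            self._current = tag
            self.headings.append("")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            target = urljoin(self.page_url, attrs["href"])
            # Keep only links that stay on the same host (internal links)
            if urlparse(target).netloc == urlparse(self.page_url).netloc:
                self.internal_links.append(target)

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current is not None:
            self.headings[-1] += data


# Hypothetical page used only for this example
sample = """<html><head><title>Central Station</title>
<meta name="description" content="Trains to everywhere.">
</head><body><h1>Welcome</h1><p>Body text</p>
<a href="/tickets">Tickets</a>
<a href="https://other.example/x">Elsewhere</a>
</body></html>"""

info = StationInfo("https://example.com/")
info.feed(sample)
print(info.title, info.meta_description, info.headings, info.internal_links)
```

If any of these fields comes back empty for one of your own pages, the train driver arrives at a station with no sign on the platform.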
As your website grows, you might delete old pages. Suddenly, Google reaches a dead end because you are still linking to that old page somewhere on your website. Or you redirect a page to a new one. Now Google first has to visit the old page to find out that you have moved the content to a new destination. What a waste of time!
But if you fix both of these errors, Google is able to visit all of your pages again in the most efficient manner.
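Dead ends and unnecessary redirects can be spotted with a simple audit of your internal links. The sketch below assumes you already have crawl results in memory; the `pages` dictionary, its URLs and the `audit_links` function are hypothetical and stand in for whatever your crawling tool exports.

```python
# Hypothetical crawl results:
# URL -> (HTTP status, redirect target or None, outgoing internal links)
pages = {
    "/": (200, None, ["/blog", "/old-page", "/pricing-2019"]),
    "/blog": (200, None, ["/"]),
    "/pricing-2019": (301, "/pricing", []),
    "/pricing": (200, None, ["/"]),
    # "/old-page" was deleted, so it is missing from the crawl results
}

REDIRECT_CODES = (301, 302, 307, 308)


def audit_links(pages):
    """Return (source, target, problem) for every link that wastes crawl time."""
    issues = []
    for source, (_status, _redirect, links) in pages.items():
        for target in links:
            if target not in pages or pages[target][0] == 404:
                issues.append((source, target, "dead end"))
            elif pages[target][0] in REDIRECT_CODES:
                issues.append((source, target, f"redirects to {pages[target][1]}"))
    return issues


issues = audit_links(pages)
for source, target, problem in issues:
    print(f"{source} links to {target}: {problem}")
```

The fix for each reported link is the same: remove the link to the deleted page, or point the link directly at the redirect's final destination.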
You see? Understanding crawling doesn’t have to be hard 😄.
Indexation is the process in which Google adds your page to its massive database. When someone performs a search, all relevant pages are retrieved from this database and ranked from most to least relevant by Google's algorithm.
To simplify this concept, think of the Google index as a library. Here you have different sections, bookcases, shelves, books, chapters and pages. The books refer to websites. The pages in these books refer to webpages.
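For readers who like to see the idea in code, here is a toy version of such an index. It is only a sketch: the pages and their text are invented, and ranking by word count is a stand-in chosen for simplicity, not a description of how Google actually ranks results.

```python
from collections import Counter, defaultdict

# Hypothetical pages and their text
documents = {
    "example.com/trains": "train timetable and train tickets",
    "example.com/stations": "station map with train platforms",
    "example.com/contact": "contact the station office",
}

# Build the index: word -> {page: number of times the word appears}
index = defaultdict(Counter)
for url, text in documents.items():
    for word in text.lower().split():
        index[word][url] += 1


def search(query):
    """Return pages containing any query word, highest total word count first."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(index.get(word, {}))
    return [url for url, _ in scores.most_common()]


print(search("train"))
```

The point of the index is that a search never has to reread every page. It looks up the word, like checking the library catalogue, and goes straight to the right shelf.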