Have you ever wondered what a Googlebot is? I sometimes find it mind-boggling to think about how the web really works behind the scenes, and it can be hard to visualize bots crawling the web.
A little more information, plus an infographic, can make it easier to understand why SEO is integral to your blog today.
But don’t stop there! Read on for more facts about Googlebot and the vast network of computers that keeps crawling the web.
What Is A Googlebot?
A Googlebot crawls the web looking for new and updated pages to index. Google uses a huge set of computers to crawl the web constantly. There are 17 different types of Googlebots today, and the number seems to grow each year with newer technology. For example, years ago there were no mobile or video Googlebots. (The varieties have grown significantly since this blog post was first written back in 2014!)
Google uses an algorithmic process, run by computer programs, to determine which websites to crawl, how often, and how many pages to fetch from each website.
However, this Googlebot process can change from time to time via Google updates as well.
The Search Engine Crawl Process
Google’s crawl process begins with a list of web page URLs generated from previous crawls. That list is then augmented with Sitemap data provided by webmasters. (That is why a sitemap is important for your blog to have.)
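If you are curious what that Sitemap data looks like, here is a minimal sketch in Python that builds one. The URLs and dates are made up for illustration, and most blogs would let a plugin generate the file rather than write it by hand.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs, lastmod as 'YYYY-MM-DD'."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical blog pages, for illustration only.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2014-09-01"),
    ("https://example.com/what-is-a-googlebot", "2014-09-02"),
])
print(sitemap_xml)
```

The `lastmod` value is how a sitemap hints that a page has changed since the last crawl.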
As a Googlebot visits each of these websites it detects links on each page and adds them to its list of pages to crawl. New websites, changes to existing websites, and dead links are noted and used to update the Google index.
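To make that loop a little easier to picture, here is a toy sketch in Python. It is not how Google actually implements crawling; the `crawl` function, the `SITE` dictionary, and the example.com pages are all invented for illustration, and the "fetch" step reads from a dictionary instead of the real web.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch):
    """Visit pages breadth-first. fetch(url) returns HTML, or None for a dead link."""
    frontier = deque(seed_urls)   # the list of pages still to crawl
    seen = set(seed_urls)         # every URL discovered so far
    pages, dead = {}, []
    while frontier:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            dead.append(url)      # note the dead link and move on
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)   # resolve relative links
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages, dead


# A made-up two-page site; "/old" is linked but no longer exists.
SITE = {
    "https://example.com/": '<a href="/about">About</a> <a href="/old">Old</a>',
    "https://example.com/about": '<a href="/">Home</a>',
}
pages, dead = crawl(["https://example.com/"], SITE.get)
print(sorted(pages))
print(dead)
```

The `seen` set is what keeps the crawler from going around in circles when pages link back to each other.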
Now do you see why publishing fresh content on your blog is so important? Do note, however, that updating for the sake of updating may look like spam. Be sure the updated content provides new information.
Google doesn’t accept payment to crawl a site more frequently, and it says crawling is not influenced by how much a site spends on Google AdWords.
After a Googlebot crawls a page, it indexes it by compiling an index of the words it sees and their location on the page. Title tags and alt attributes are definitely a plus in this process. Many rich media files do not contain them and do not get indexed.
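An "index of words and their location" is easier to grasp with a small example. The sketch below is a simplified illustration in Python, not Google's actual index; `build_index` and the two sample pages are invented for this post.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to a list of (url, position) pairs where it appears."""
    index = defaultdict(list)
    for url, text in pages.items():
        words = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(words):
            index[word].append((url, position))
    return index


# Made-up page text, keyed by a made-up URL path.
sample_pages = {
    "/fresh": "Fresh content wins",
    "/king": "Content is king",
}
index = build_index(sample_pages)
print(index["content"])
```

Looking up a word then returns every page that contains it, along with where on the page it sits. A real search index stores far more than this, but the lookup idea is the same.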
The Results In Search
The Google search engine crawls the web in order to find websites that are relevant when you do a search. The relevancy of these sites depends on more than 200 ranking factors, so it’s important for your site to be found by this bot and indexed properly if you want to rank well.
You can always check with your webmaster tools in the Google Search Console for some insights into how things may look from their perspective.
Furthermore, you can see the new video page indexing for your blog as well. According to Google, it will show you:
- On how many pages has Google identified a video?
- Which videos were indexed successfully?
- What are the issues preventing videos from being indexed?
This will help you decide which videos to use in your blog posts. It will also show you the issues preventing Google from crawling your videos.
How To Help The Googlebot On Your Blog
- Use internal links to guide the Googlebot.
- Have a valid sitemap.
- Check your blog for speed, a major factor for visitors and search engines.
- Have a clean URL structure.
- Use image optimization for all your images.
- Quality content, of course!
- Check your canonical tags to avoid duplicate pages.
- Review your indexing in Search Console, then submit updated content if needed.
- Look for blocked URLs and fix them. The robots.txt file may not be working properly and may need adjustments.
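For the blocked-URL check in the list above, here is a rough way to test a robots.txt file yourself using Python's standard library. The rules and URLs are hypothetical, and Python's parser is simpler than Google's (it handles wildcards differently, for example), so treat the result as a first pass and confirm it in Search Console.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

User-agent: Googlebot
Disallow: /drafts/
"""


def blocked_urls(robots_txt, user_agent, urls):
    """Return the URLs that the given user agent is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


urls_to_check = [
    "https://example.com/drafts/post",
    "https://example.com/wp-admin/",
    "https://example.com/hello",
]
print(blocked_urls(ROBOTS_TXT, "Googlebot", urls_to_check))
print(blocked_urls(ROBOTS_TXT, "OtherBot", urls_to_check))
```

Notice that once a group names Googlebot specifically, only that group applies to it, so in this example the `/wp-admin/` rule from the `*` group no longer covers Googlebot. That is an easy way to end up with rules that do not do what you expected.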
The Nasty Googlebots
Did you know that one in every 24 Googlebots that visit your site is fake? Those are the nasty spammers and scrapers! According to Search Engine Watch, these nasty Googlebots are up 61% this year.
If you know for sure you have them you can block them via your Google Webmaster tools but be sure not to block Google or the other major search engines as well.
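How do you know for sure? The check Google itself describes is a reverse DNS lookup on the visitor's IP address, followed by a forward lookup on the resulting hostname to confirm it points back to the same IP. Below is a minimal sketch of that check in Python. The function name and the injectable lookup parameters are my own, the default lookups only cover IPv4, and the suffix list is an assumption limited to the googlebot.com and google.com domains.

```python
import socket

# Assumption: genuine Googlebot hostnames end in one of these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False          # hostname is not in a Google domain
        return ip in forward_lookup(host)   # must map back to the same IP
    except OSError:
        return False              # lookup failed, so treat as unverified


# Example with stubbed lookups, so nothing touches the network here.
fake_dns = {"crawl-66-249-66-1.googlebot.com": ["66.249.66.1"]}
print(is_real_googlebot(
    "66.249.66.1",
    reverse_lookup=lambda addr: "crawl-66-249-66-1.googlebot.com",
    forward_lookup=lambda host: fake_dns.get(host, []),
))
```

The forward lookup matters because anyone who controls reverse DNS for their own IP range can make it claim to be googlebot.com. Checking the user-agent string alone is not enough, since that is the part the fake bots copy.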
Speaking of nasty Googlebots and spammers – there are some now impersonating bloggers. I’ve had this happen several times now and want to share it so you can be on the lookout.
@Lisapatb Hey Lisa, someone left a comment on two of my blogs as you, but I know it’s not you. So weird..
— Mitch Mitchell (@Mitch_M) September 1, 2014
So be extra careful approving blog comments out there! Thanks to Mitch Mitchell for pointing this one out.
Your Turn on Googlebot
Does your blog or website have a sitemap? How often do you update your blog? I’d love to hear about your experience with it in the comments below.
Nice post Lisa.
As you said, around 1 or 2 out of about 25 organic visits are fake. For those fake visits there are no organic keywords, not even “unknown keywords”. They have a bounce rate of 00.00% and an average visit duration of 00:00 seconds!
I’m unsure whether these fake visits hurt SEO or not. 🙄
Hi Akshay, welcome to Inspire to Thrive. I would have to say they would surely hurt SEO efforts these days. Thanks for coming by and have a great day!
Thank you very much for the link to Adrienne’s article… 🙂
Wow… Spammers are masking Google Bots…. That is a new to me… thanks for sharing this great info… 🙂
You are welcome Karmakar. Oh yes, they are everywhere it seems. Even on Twitter I’ve noticed many offering more followers for money, really? DO people still do that? I hope you have a good weekend!
I definitely had a problem with understanding these Google bots and indexing. I just recently went to clean up broken URLs and I was wondering how Google even found these links ,many of which were broken or non-existent. I’ve been checking every 4-5 days now to see if I can keep it clean. Thanks for explaining this. It helped clear up a lot for me.
I didn’t know how much the spammers were out there. I didn’t know the numbers were so high and that upsets me as I do get a lot of spam and although it gets caught before it gets to me, it is still annoying.
Thanks for the info. Great post.
Hi Barbara – nice to see you again. Me too, Googlebots were not easy to understand. That’s a lot of work cleaning those up, been there and done that many times. I now have a plugin for it that makes it a bit easier. Yes, spam can be a pain. I don’t have as much as I used to except for someone impersonating me at other blogs. You are welcome. Have a fantastic Friday and weekend ahead.
Great explanation of Googlebots and how they crawl a website to index it and then rank it according to their factors.
I think I know exactly what was happening in Adrienne’s case and called her to speak about it. It’s very deceptive and you personally shouldn’t respond in my opinion. Just ask the source it was posted on to remove it via private message. It’s basically a way for a hacker to find out if you’re legitimate and real. If you respond and give them feedback that you are indeed real, you’ve given them a reason to investigate further if you may be a good source to hack and steal financial data from.
As a web developer, I’ve seen this practice several times in the last year. Hackers and thieves will do anything they think they can in order to get access to your wallet and personal information. We do have to be careful out there, don’t we?
With that cheery note, I hope you have an amazing evening!!!
~ Don Purdum
Thank you Don. You are speaking of comments, correct? Wasn’t sure if email or private message would give them more information too. That’s pretty scary Don. Somehow I feel like it’s Halloween 🙂 Thanks for your bit of advice though, enjoy the rest of your evening.
Hi Lisa, you have given very good information about Googlebot, and I like your writing style. Your post made a strong impression on me and I liked the infographic. I found your post while looking for more information about “the nasty Googlebot”. Thanks 😉
Hi Vaidhegi, welcome to Inspire to Thrive. Thank you, I’m glad it helped you to understand those googlebots better. Have a great day!
Google bots and everything else is so interesting but kinda frustrating at the same time. You know they are not all bad but trying to stop the small percentage of bad could cost you the good bots if you’re not careful.
It’s hard to believe that people will go around impersonating someone else. Well let’s not say hard to believe, they been doing it on Social Media for years so I guess lets say ‘Sad’ for lack of a better term.
The only bots I normally pay attention to are the spam comment bots. So annoying!
This is new to me thanks! Have a good rest of the weekend Lisa!
Hi Steven, I would agree with frustrating too 🙂 I find it hard to believe they are still trying to do that. Just like on Twitter I noticed some new followers that are offering to pay for followers, are you kidding me? They are still doing that too 🙁 I don’t have too many spam ones anymore since Andy’s backlink protector with CommentLuv, it’s really helped. Glad you were able to get something new out of it. It’s been a pretty good weekend. Hope yours is good as well!
I noticed you didn’t mention that the other thing Googlebot does is assess your page speed. That’s why it’s smart to make sure you cache and minify all your blog resources.
The easiest explanation about Googlebot that I’ve seen, Lisa. I wasn’t aware that spammers were faking Googlebot but I’m not surprised. Back when I had a forum, Googlebot seemed to live there. The forum listed site visitors on the front page and often it was just me and Googlebot. lol
I knew that wasn’t you right away. They’re going to be a lot smarter if they’re going to trick us into thinking they are you.:)
Hi Brian, me too. I thought the infograph made it easier to understand too. Oh no even in a forum? Thanks for letting me know Brian about that, they were at it again yesterday at Barry Wells blog, I’m not sure why they think they can get link juice out of it. Very odd. Happy Friday and have a great weekend ahead Brian. Thanks for coming by and commenting too.
It’s mind boggling, for sure. I’m stunned at how Google works behind the scenes. Still trying to process all of the programs, and bots, doing so much work in the background.
As for Adrienne, and now you, being mimicked by imposters, I have no clue why these fools would do this. Well I have a clue as they are desperate and lost, and are trying to capitalize on your good name, but I just can’t embrace the idea. Why? Hopefully they’ll cut it out soon as both you and Adrienne have outed them, along with Mitch, who made the spot right away.
The 60% crawl stat is oh so telling. Google dominates search like no other site on earth, so we should gear most of our SEO strategy toward pleasing the Big G…..although Bing is making some gains and positive changes it seems like Google is still running away with the lion’s share of search traffic.
Thanks so much for sharing Lisa.
I’ll tweet this in a bit.
Hi Ryan, me too, it is hard to fathom at times. I thought the infograph was helpful. I don’t get it either Ryan, seems so odd. What are they gaining from it?
It’s too bad Bing can’t get a greater share of the market. Bing even gives rewards. Maybe the MSN side of it is weaker and Google has so many products.
You are welcome Ryan and have a great day there in paradise Ryan.
You know I block fake google crawlers through Wordfence. Also, I block the whole country such as China and Ukraine since they are the most malicious country of all which I suspect does all the fake google crawling. I think it’s impossible just to block IPs since spammers and intruders oftentimes have bizillon different IPs.
As far as these impostors are concerned, oh gosh I hate them. I don’t know why they are doing this and what good will it bring them? Anyway, I hope you have a great evening.
Hi Angela, I love WordFence though I don’t have premium yet but saw your post on that one. I was surprised to see how many from the US as well. I don’t understand why they are doing this either Angela, very odd. Thanks for coming by and I hope you having a great week to start September.
I didn’t really get the concept of Googlebots until I read this post. Nice writeup. But lately, I’ve been getting error messages saying Googlebot can’t access my site due to some DNS errors.
Hi Gracia, Thank you. Welcome to Inspire To Thrive. It is hard to wrap our heads around it but I thought this infograph would help 🙂 I see you have a blogger blog, I suggest you contact Mayura at https://getsatisfaction.com/Mayura4Ever – he is a whizz with that. It could be your domain is not set up right too. I remember having an issue like that but since you have .blogspot that is probably not it. Good luck!
Today you have brought a remarkable stuff to know about.:)
Most bloggers, including me, always think about these Googlebots. I mean, isn’t it amazing how effectively they work? There are millions of websites and these Googlebots crawl them at regular intervals.
As you have mentioned, Google owns a huge number of computers, and they are the reason people can find so much information on the World Wide Web.
Thanks for sharing.:)
Hope you are having a nice week.:)
Hi Ravi, thanks. I always wondered about those bots and how Google really “crawls” our sites. It is an amazing process. You are welcome. I hope you are having a great week Ravi.
Those Google Bots are a royal pain in me bum! I can’t stand them! I like to watch my live traffic sites and crack em as they come! I remember that comment I received on MGP using your email address but it was nowhere near anything you would have said, PLUS you had already commented! Silly bots! Thanks for the heads up!
Hi Bren, that’s funny that they commented after I already had. That is quite silly. I still don’t understand why but hopefully they will stop soon. Thanks for coming by Bren and have a great new day ahead.
Thanks for explaining how Googlebots operate, Lisa. How aggravating to fall prey to those bad ones! Here’s hoping that gets resolved soon. I noticed the new popup ad at the bottom of your page here. Can’t say I’m a fan of those, but they certainly will get your attention more than banner ads.
Love the new avatar and congratulations on becoming a grandma! 🙂
Hi Debbie, you are welcome. It sure is aggravating…Oh yes – Infolinks – I am experimenting with them. I hope to do a post soon on it. Thank you on the avatar, after almost 4 years it was about time 🙂 Thanks – I can’t wait to go and see her soon as they live across the country. Thankfully I can see pics daily on FB. Have a great day Debbie and I appreciate you coming by.
So this is why spam comments have gone up significantly on my blog? Akismet catches them so it is not really a problem, but the spam comments nearly doubled in August from July.
Mark I see you have Livefyre – is there something happening because of that system? I know with CommentLuv many had issues but it wasn’t set up right so that’s why I ask. Something to check on. If I see anything on it I will be back to let you know. Thanks for coming by Mark.
I hadn’t thought of that. They show up in my WordPress comment spam folder, but they do not show up in Livefyre, so I don’t think that Livefyre is the culprit, but I could be wrong. As I said, it is not really a problem since Akismet catches them. If I figure anything out, I will let you know.
Hi Mark, you may want to check this link out about Livefyre and spam – http://www.wpbeginner.com/opinion/reasons-why-we-switched-away-livefyre/ Interesting read on the spam comments there.
Very interesting article. I never knew that there were fake Google Bots as well. Just a question in my mind related to this…are these bots created by Google themselves for some purpose? Or are they just pieces of codes to resemble Google bots and used for hacking purpose?
Also, spam comments are a major cause of disappointment. But, I have never heard of bots resembling bloggers. This is absolutely something new to me.
Great write-up. 😕
Understanding Google is a tough task and I really didn’t know until now how Googlebot thing works. I am sure there still remain unsolved mysteries of the working of Googlebots.
The only way to get as close as possible to what Google thinks of your blog or website is by using Google Webmaster Tools. I use Google XML Sitemaps, which updates my blog’s sitemap and submits it to all the webmaster accounts as soon as my blog is updated. However, the rest of the information, like broken links, duplicate descriptions, etc., is accessible only through Google Webmaster Tools.
Well, it might seem a bit funny but I have got best results from Google at the times when I didn’t care about it(I hope there are no bots around right now 😛 🙄 ). During those times I just thought of doing my natural blogging and connecting with other bloggers on social media or via blog commenting. But at present, I am paying attention to it and the results are fine too.
Since I learned quite a few things here, I would have to make a few changes. Thanks a lot for sharing this with us. It is always a treat to come here and read such tips.
I was thinking that there’s nothing we can do about it. So far, this is the best post on googlebot I’ve read so far. Thanks for the post. Wish you have a great week
Yes, I do have a sitemap, but I should probably be checking it regularly. I don’t check if Google is crawling my site and I don’t check if everything is ok. I just assume everything is great. How often do you check it?
This is a great lesson about what Googlebots are used for and how they collect the data from each of your blog posts to get them indexed in Google! It shows how relevant SEO still is in this day and age of blogging.
The scary thing is that some of those bots are fake. I can only imagine the harm they can do to you.
Is this why you changed your profile picture?
If so, I hope everything is ok now. We all definitely need to be careful and cautious!
Thanks for sharing and the heads up on the fake bots! I hope you’re having a great week!
I believe it was last month I had Google crawling my site more times than I wanted but they weren’t the only ones so I had to slow down their time on my site because they were really making things worse for me. The good and the bad ones, you are absolutely right about those. Great definition though of what they are including the ones we don’t want crawling our sites.
Mitch had told me that they are still impersonating you. Man, I can’t believe that. If they are still impersonating me, no one has let me know. Maybe I outed them and they quit, I like to think that’s the case at least.
I’m still on the lookout for everyone now more than ever. We have to help each other out as they try to slip something by us all.
Congratulations Grandma and hope you’ve been enjoying the new baby. Have a great week Lisa.
Hi Adrienne, How did you slow down their time? That sounds interesting. I could not believe it too Adrienne, guess they won’t be giving up and will keep on trying. Me too, I’m always double checking and triple checking new commenters. It’s great that we as bloggers do look out for one another. Great community! Thank you, I was so surprised to have her arrive 3 weeks early 🙂 Have a great day ahead Adrienne and hope it’s cooled down for you there. Maybe that grass will finally come up.
Absolutely, there is a setting to tell Google how often they can crawl your site. On the Webmaster Tools Home page, click the site you want then click the gear icon in the top right hand corner. Then click Site Settings. In the Crawl rate section, select the option you want. That’s all you do and it will just slow them down but only for 90 days. When they’re crawling along with the ones you don’t want, they can all wreak havoc on your site.
I hope they quit soon Lisa but you’re being rescued so that’s the important thing. They’re not getting through.
Enjoy that baby and your week too.
Hi Adrienne, thank you. I remember now doing that for my retail sites years ago. I haven’t been doing it since with this blog but great idea. Thanks to people like you Adrienne they are being rescued 🙂 I can’t wait to go see the baby soon as they live across the country. Hope you have a terrific Thursday.
Glad to let you know about the issue and it was so easy to tell it wasn’t you. These folks are so sneaky and it’s a shame that they have to rely on nefarious means to try to make a buck.
Thanks Mitch, appreciate the heads up you gave me. I still don’t quite understand why except to get a link. Seems awful silly to go through that for it though. Thanks for coming by and have a wonderful Wednesday Mitch.
I’ve often wondered about the behind the scenes process too. Thanks for breaking it down and sharing it with us. The infographic you shared had some interesting stats.
I’m sorry you had someone impersonate you online. I’m sure that was a scary situation and I hope you don’t have to go through that again. Thanks for the reminder! I always make sure to double check the comments from people I don’t know.
Have a great day and congratulations on becoming a grandma! It’s the best feeling ever!
Hi Corina, yes it can be hard to understand at times. You are welcome. I’m sure it is still going on. I’m sure I’ll get another tweet about it soon. I hope they realize they are not getting away with it. Thanks, I was so surprised she came 3 weeks early! I can’t wait to go and see her. Have a wonderful Wednesday!
Thanks for sharing. Oh this is odd. Google Bot does that? Hmm! Thanks for the heads up girl. Going to share this around 🙂
By the way, noticed you use Infolinks now?
Hi Reginald, LOL, I’m sure you know about that one too. I am, what do you think of them? I’m testing it out. Thanks for noticing and have a great day ahead.
Is there any specific way to identify those nasty, fake Googlebots through webmaster tools or any other 3rd party tools? How does one identify such a thing and save some bandwidth?
I would really love to know if you could shed some light on that process.
Talking about impersonating people, I have seen that happening and it is one of those things that is very hard to deal with, just as it is hard to fight impersonators in the real world if they are passing themselves off as you or me somewhere else. It is up to the people approving those comments to decide whether the person is an impersonator or the real person. It just puts some extra work on the blogger who receives such a comment.
Hi Lisa & Kumar 🙂
@ Kumar: I don’t think that you can save some bandwidth. Actually I have to rephrase it: you can do it, but you won’t be willing to pay the price…
Banning the IP won’t help you. The same spambot doesn’t always use the same IP. You have to ban countries (all IPs). Would you do it? 😉
@Lisa: Those infolinks are funny. Never used them, so I don’t know whether they are set up correctly or not. Kumar’s comment includes such a link for the wording “one of those” – I don’t think there is anyone willing to kill their time and click on such a keyword. Actually that wording isn’t a keyword, and that’s the problem 😉 The article itself includes such fake keywords, such as “check your” or “what”. You may want to check what’s wrong with these infolinks.
That’s me again. I see that some infolinks change when the page is refreshed, so you may not see the infolinks I referred to. I no longer see some of them myself. Now I see some new funny “keywords” such as “looking for” 😉
Thanks again Adrian, I can always count on you for being so observant 🙂 Have a great day.
Thanks Adrian, I will have to check that out, been offline a bit more than I’d like this past week so hadn’t noticed. I appreciate the feedback and will be an upcoming post about them as well.
Hi Kumar, here is what I found to answer your question: Test that your robots.txt is working as expected. The Test robots.txt tool on the Blocked URLs page (under Health) lets you see exactly how Googlebot will interpret the contents of your robots.txt file. The Google user-agent is (appropriately enough) Googlebot.
The Fetch as Google tool in Webmaster Tools helps you understand exactly how your site appears to Googlebot. This can be very useful when troubleshooting problems with your site’s content or discoverability in search results.
It sure can be hard when approving comments from first time folks at the blog. I always check out their websites, their email address, etc. and see how it all matches up or not. If not, it goes right to spam.
Thank you for coming by Kumar and for asking a great question. Have a wonderful rest of the week there.
You can lock out the fake bots away from your sites if you are using the Wordfence plugin. The option is available for both free and premium versions. Not sure of any other third party applications or Google service though.
Interesting infographic, though I wasn’t aware that Googlebots only crawl 4 pages on each visit. I am from India and I’m disappointed to know that fake bots come from India too. 🙁
Pankaj, I was disappointed about so many bots from the United States 🙁 I guess every country has some. Thank you for coming by Pankaj and have a great rest of the week there.