Bots have become an integral part of today’s digital space. They help us order groceries, play music for us, and pay colleagues back for the delicious smoothies they bought us. Bots also roam the internet performing the tasks they were designed for. But what does all this mean for webmasters? And perhaps most importantly, what does it mean for the environment? Read on to find out what you need to know about bot traffic and why you should care!
What is a traffic bot?
First of all, a bot is a program created to perform automated tasks over the Internet. Bots can mimic or replace the behavior of a real user. They are very good at performing repetitive and mundane tasks. They are also fast and efficient, which makes them a great option when you need to do something at scale.
Bot traffic refers to non-human traffic to a website or application. If you own a website, chances are it has been visited by a bot. Bot traffic accounted for over 40% of all internet traffic in 2022. That share has grown in recent years, and the trend looks set to continue for the foreseeable future.
Bot traffic sometimes gets a bad rap, and in many cases it is deserved. But there are also good, legitimate bots; it all depends on their purpose. Some bots are necessary to run digital services, such as search engines or personal assistants. Others try to infiltrate your website and steal sensitive information. So what counts as “good” and “bad” bot activity? Let’s delve a little deeper into these two types of bots.
“Good” bots perform specific functions that do not harm your site or server. They announce themselves and let you know what they are doing on your website. Search engine crawlers are probably the best-known bots of this type. Without crawlers visiting your website to discover its content, search engines would have no way of serving up information when people search for something. When we talk about “good” bot traffic, we mean these bots. It is quite normal for a site to receive a small percentage of its traffic from “good” bots.
Aside from search engine crawlers, some other good web bots include:
SEO crawlers
If you’re in the SEO space, you’ve probably used tools like Semrush or Ahrefs to do keyword research or gain insight into your competition. For these tools to provide you with that information, they, too, must send out bots to crawl the web and collect data.
Commercial bots
Businesses send these bots to crawl the web and collect information. For example, research companies use them to monitor market news, advertising networks need them to monitor and optimize display ads, and coupon sites collect discount codes and sales to offer to the users of their websites.
Website monitoring bots
They help you monitor your website’s uptime and other website metrics. They periodically check and report data like server status and uptime so you can take action if your site is in trouble.
Feed bots
They collect and aggregate newsworthy content to present to your website visitors or email subscribers.
“Bad” bots are created with malicious goals in mind. You are probably familiar with spam bots that flood your website with inconsequential comments, irrelevant backlinks, and disreputable ads. You may also have heard of bots that enter online lotteries in place of real people or snap up the good seats at concerts. It is because of these malicious bots that bot traffic gets its bad reputation, and unfortunately a large share of bot traffic comes from them. Bad bots were estimated to account for 27.7% of all internet traffic in 2022.
Here are some bots you don’t want on your site:
Email scrapers
They harvest email addresses and send malicious emails to the addresses they collect.
Comment spam bots
They spam your website with comments and links that redirect people to a malicious website. In many cases, they spam your website to advertise or to try to get backlinks to their own site.
Scraper bots
These bots come to your website and download anything they can find: text, images, HTML files, and even videos. Bot operators then reuse your content without permission.
Credential stuffing and brute-force bots
These bots try to break into your website to steal sensitive information, typically by attempting to log in as if they were a real user.
Botnets, zombie computers
Networks of infected devices are used to carry out DDoS (distributed denial of service) attacks. During a DDoS attack, the attacker uses a network of devices to direct a flood of bot traffic at a website, overwhelming the web server with requests and leaving the site slow or completely unusable.
Inventory and ticket bots
They visit websites to buy up tickets to entertainment events or newly released products in bulk. Scalpers use them to resell those tickets or products at a higher price for a profit.
Why should you care about bot traffic?
Now that you know what bot traffic is, let’s talk about why you should care about it.
For the security and performance of your website
We have discussed several types of bad bots and their functions. You don’t want malicious bots lurking around your website; they will undoubtedly degrade its performance and security. Malicious bots often disguise themselves as normal human traffic, so they may not be visible when you check your website’s traffic statistics. That can hurt your business decisions because you’re not working with accurate data. You may see random spikes in traffic and not know why, or be confused about why you are getting traffic but no conversions.
In addition, malicious bot traffic will stress your web server and may overload it at times. These bots eat up your server’s bandwidth with their requests, making your website slow or, in the event of a DDoS attack, completely unavailable. Meanwhile, you may lose traffic and sales to competitors.
Malicious bots also harm your site’s security. They try to log into your website using different username/password combinations, or look for login vulnerabilities and report them to their operators. If you have security holes, these operators may try to install malware on your website and spread it to your users. And if you run an online store, you handle sensitive information like credit card details, which hackers love to steal.
For the sake of the environment
Let’s go back to the question at the beginning of the post. You should care about bot traffic because it affects the environment more than you might think. When a bot visits your site, it sends an HTTP request to your server asking for information. Your server has to respond and return the requested information, expending a small amount of energy to complete each request. Multiply that by all the bots on the internet, and the amount of energy spent on bot traffic is huge.
In this sense, it doesn’t matter whether a good or a bad bot visits your site, because the process is the same. Both use energy to perform their functions, and both have consequences for the environment. And although search engines are an essential part of the internet, they are also responsible for their share of this waste.
Now that you know the basics: search engines send crawlers to your site to discover new content and update what they already have. But they can visit your site many times without picking up any meaningful changes. We recommend checking your server logs to see how often crawlers and bots visit your site. The Crawl Stats report in Google Search Console also tells you how often Google crawls your site. You might be surprised at some of the numbers.
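If you want a quick look at this yourself, a short shell pipeline over your access log can count requests per crawler. This is a rough sketch that assumes the common combined log format; the sample log lines below are made up for illustration, and the real log path (for example /var/log/nginx/access.log) depends on your server setup.

```shell
# Sample combined-format log lines standing in for a real access log.
cat > access.log <<'EOF'
66.249.66.1 - - [10/May/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.1 - - [10/May/2024:10:05:00 +0000] "GET /blog HTTP/1.1" 200 812 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
203.0.113.9 - - [10/May/2024:10:06:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"
EOF

# Count requests per bot user agent. Splitting each line on double
# quotes puts the user-agent string in field 6; grep keeps only lines
# from a few well-known crawlers (extend the pattern as needed).
grep -iE 'googlebot|bingbot|ahrefsbot|semrushbot' access.log \
  | awk -F'"' '{print $6}' \
  | sort | uniq -c | sort -rn
```

Run against a real log, this prints each matching bot’s user-agent string with the number of requests it made, which is often an eye-opening starting point.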
But that is not all. Google’s bots aren’t the only ones visiting. There are bots from other search engines, from digital services, and even bad bots. All this unnecessary bot traffic stresses your web server and wastes energy that could be put to more valuable use.
What should be done against “bad” bots?
You can try to identify bad bots and prevent them from entering your site. This saves a lot of bandwidth and reduces stress on your server, which in turn saves energy.
The most basic way to do this is to block individual IP addresses or ranges of them. If you detect irregular traffic from a source, you can block that IP address. This approach works, but it is labor-intensive and time-consuming. Alternatively, you can use a bot management solution from a provider like Cloudflare. These companies maintain extensive databases of good and bad bots, and they use artificial intelligence and machine learning to identify and block malicious bots before they can harm your site.
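For instance, if your site runs on Nginx, blocking a troublesome address can be as simple as a couple of deny directives. This is a minimal sketch with made-up example addresses; replace them with the actual sources of irregular traffic you see in your logs:

```nginx
server {
    # Hypothetical example addresses -- substitute the offenders
    # you identified in your own access logs.
    deny 203.0.113.9;        # block a single IP address
    deny 198.51.100.0/24;    # block a whole range (CIDR notation)
    allow all;               # everyone else is still allowed
}
```

Apache’s mod_authz_core offers the equivalent with `Require not ip` rules. Either way, remember that determined bad bots rotate IPs, which is why manual blocking doesn’t scale.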
If you have a WordPress website, you should install a security plugin. Some of the most popular ones (such as Sucuri Security or Wordfence) are maintained by companies that employ security researchers to monitor and patch problems. Some security plugins automatically block certain “bad” bots for you; others let you see where unusual traffic is coming from and decide how to handle it.
What about “good” bots?
As mentioned earlier, “good” bots are good because they are essential and transparent about what they do. But they can still consume a lot of energy while performing their tasks, which affects the environment. And these well-behaved bots may not even be of any use to you. So even if what they do can be considered “good”, it can still cost your website and, ultimately, the environment. So what can you do about good bots?
1. Block them if they are not helpful
You have to think about whether you want these “good” bots to crawl your site at all. Is it beneficial for you to have them crawl your website? And most importantly, does the value their crawling brings you outweigh the cost to your server, their servers, and the environment?
Take search engine bots, for example. Google is not the only search engine out there, and crawlers from other search engines may have visited you too. Let’s say you check your server logs and notice that a particular search engine has crawled your site 500 times today but only brought you ten visitors. Is it worth letting that search engine’s bots keep crawling your site? Or should you block them because you are not getting much value from that search engine?
2. Limit the bots’ crawl rate
For bots that support the crawl-delay directive in robots.txt, you can limit their crawl rate so they don’t come back every 20 seconds to crawl the same links over and over. This is especially useful for medium to large sites that crawlers visit frequently, but small sites benefit from a crawl delay too. Even on a larger website, you probably won’t be updating the content 100 times a day. And if copyright bots visit your site to check for infringement, do they really need to come every few hours?
You can experiment with the crawl rate and monitor its impact on your website, and you can set a different crawl-delay value for crawlers from different sources. Start with a small delay and increase it once you are sure there are no negative consequences. One caveat: Google does not support crawl-delay, so setting it for Google’s bots has no effect.
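As a sketch, asking specific crawlers to slow down looks like this in robots.txt. The bot names and delay values here are just examples; check your logs for the crawlers that actually visit you, and keep in mind that crawl-delay is honored by some crawlers but ignored by Google:

```text
# robots.txt -- example values; adjust the bot names and delays
# to match the crawlers you see in your own logs.
User-agent: AhrefsBot
Crawl-delay: 10

User-agent: Bingbot
Crawl-delay: 5
```

The delay is interpreted as the number of seconds a crawler should wait between requests, so even a modest value cuts repeat visits dramatically.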
3. Help them crawl more efficiently
You can use a robots.txt file to specify which parts of your site you don’t want robots to crawl. This not only saves energy but also helps your crawl budget. There are plenty of places on your website where crawlers have no business. For example, your internal search result pages: nobody wants to find those in a search engine. And if you have a staging website, you probably don’t want people to find that either.
You can also help bots crawl your site more efficiently by removing unnecessary links that your CMS and plugins generate automatically. For example, WordPress automatically generates an RSS feed for your website’s comments. That feed has its own URL, but hardly anyone will ever look at it, especially if you don’t get many comments. Its existence may be of no value to you; it just creates another link for crawlers to fetch over and over, wasting energy in the process.
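Putting those ideas into robots.txt might look like the following. The paths are illustrative WordPress-style defaults, so check the actual URLs your own CMS generates before copying them:

```text
User-agent: *
# Keep crawlers out of internal search results
Disallow: /?s=
Disallow: /search/
# Keep crawlers away from the auto-generated comments feed
Disallow: /comments/feed/
```

Note that Disallow only asks crawlers not to fetch these URLs; well-behaved bots comply, while bad bots typically ignore robots.txt entirely, which is why it pairs with the blocking measures discussed above.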