How to Prevent Unnecessary or Harmful Bot Traffic from Your Site

Not all traffic on your website comes from humans; sites are also frequented by bots. For example, a web crawler (also called a spider) crawls your site and indexes its contents so that the site can be categorized and made more visible in search engines. Problems can occur, however, if several bots arrive to scour the site simultaneously.

GoAccess reports, which can be read using the Seravo Plugin, count all HTTP requests made to the website. Whereas typical visitor analytics tools (such as Google Analytics) show more detailed information about human visitors and their numbers, GoAccess reports include all web traffic, bots included. Monthly reports can be found in the WordPress admin panel under Tools > Site Status > HTTP Traffic Statistics.

Blocking traffic with nginx configurations

If you want to add a custom nginx configuration, also known as an nginx conf, for your site, create a new file with the .conf extension in the /data/wordpress/nginx/ directory on the server. To make the settings take effect, restart nginx by running the wp-restart-nginx command on the command line.

Blocking individual IP addresses

If you want to block IP addresses from your website, you can do so with the following nginx configuration:

# IP block (example address)
if ($remote_addr = 12.123.12.123) {
    set $block_requests "1";
}

if ($block_requests = "1") {
    return 403;
}
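If you need to block several addresses, one way is to match $remote_addr against a regular expression instead of listing a separate if block per address. A minimal sketch (the addresses are placeholders; note that the dots must be escaped in the regex):

# Block several individual addresses (example addresses)
if ($remote_addr ~ ^(12\.123\.12\.123|98\.76\.54\.32)$) {
    set $block_requests "1";
}

if ($block_requests = "1") {
    return 403;
}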

User-agent-based blocking

Individual bots can be blocked using the following nginx configuration:

if ($http_user_agent ~* (badbot|maliciousbot)) {
    return 403;
}
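Instead of returning a 403 error page, nginx can also close the connection without sending any response at all by using its special status code 444. This saves a little bandwidth on abusive traffic. A sketch, with the same placeholder bot names as above:

# Drop the connection outright instead of serving an error page
if ($http_user_agent ~* (badbot|maliciousbot)) {
    return 444;
}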

Country-specific block

If you want to block traffic from an entire country, you can also do this with nginx conf. However, consider the potential disadvantages first: blocking an entire country may also prevent search engine crawlers from reaching your website, which can hurt its visibility in search results. When blocking countries, you should still allow the IP addresses used by Seravo's monitoring.

Countries can be blocked with the following nginx configuration:

# Block traffic from geo country code ES
if ($http_x_seravo_geo_country_code ~* "(ES)") {
    set $block_requests "1";
}

# Seravo monitoring
if ($remote_addr = 2a04:3542:1000:910:7c25:3fff:fe79:23da) {
    set $block_requests "0";
}

# Seravo monitoring
if ($remote_addr = 94.237.85.150) {
    set $block_requests "0";
}

if ($block_requests = "1") {
    return 403;
}

Restricting traffic with the robots.txt file

You can also try to influence the behavior of bots by making changes to the robots.txt file, but unfortunately, there are also bots that do not obey the rules. Even useful bots may react quite slowly to rule changes and will only examine the updated robots.txt contents on their next visit.

If you do not want to block a bot completely with a robots.txt rule, you can set a Crawl-delay value for it, which makes the bot crawl the site at a slower pace, provided that the bot obeys the directive.
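As a sketch, a robots.txt might combine both approaches like this. The bot names are placeholders, and note that Crawl-delay is a non-standard directive that some major crawlers (for example Googlebot) ignore:

# Slow down a well-behaved but aggressive bot
User-agent: SlowBot
Crawl-delay: 10

# Ask another bot to stay away entirely
User-agent: BadBot
Disallow: /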

Need help or more information?

Contact our customer support by sending a message to [email protected].
