From 52d46196ddafb4239797a2ef991791e3a5aac0f9 Mon Sep 17 00:00:00 2001
From: Earl Warren
Date: Wed, 18 Sep 2024 15:50:50 +0200
Subject: [PATCH] nginx configuration for rate limiting crawlers

Fixes: #8
---
 README.md | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/README.md b/README.md
index 533b570..54844fb 100644
--- a/README.md
+++ b/README.md
@@ -101,6 +101,31 @@ Forwarding TCP streams (useful for ssh) requires installing the module:
 sudo apt-get install libnginx-mod-stream
 ```
 
+Rate limiting crawlers is done by adding the following to `/etc/nginx/conf.d/limit.conf`:
+
+```
+# http://nginx.org/en/docs/http/ngx_http_limit_req_module.html
+# https://blog.nginx.org/blog/rate-limiting-nginx
+map $http_user_agent $isbot_ua {
+    default 0;
+    ~*(GoogleBot|GoogleOther|bingbot|YandexBot) 1;
+}
+map $isbot_ua $limit_bot {
+    0 "";
+    1 $binary_remote_addr;
+}
+limit_req_zone $limit_bot zone=bots:10m rate=1r/m;
+limit_req_status 429;
+```
+
+and the following in the location to be rate limited:
+
+```
+    location / {
+        limit_req zone=bots burst=2 nodelay;
+        ...
+```
+
 ## Host wakeup-on-logs
 
 https://code.forgejo.org/infrastructure/wakeup-on-logs
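
Note on the patch above: the two `map` blocks give `$limit_bot` an empty value for ordinary clients and the client address for matching crawlers, and `limit_req_zone` does not account requests whose key is empty, so only crawlers are throttled. As a minimal sketch of where the `limit_req` line could sit, assuming a reverse-proxy setup: the `server_name` and the `proxy_pass` target below are hypothetical placeholders and are not taken from the README.

```
# Hypothetical server block: server_name and the proxy_pass target are
# placeholders chosen for illustration only.
server {
    listen 80;
    server_name example.com;

    location / {
        # Only requests with a non-empty $limit_bot key (matching crawlers)
        # are counted against the "bots" zone defined in limit.conf.
        limit_req zone=bots burst=2 nodelay;
        proxy_pass http://127.0.0.1:3000;
    }
}
```

With `rate=1r/m` and `burst=2 nodelay`, a crawler coming from a single address should get roughly three requests served back to back (one plus the burst of two). Further requests receive the 429 status set by `limit_req_status` until the burst allowance frees up again at one request per minute.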