How to Detect and Stop Bad Bots

A bot, short for “robot,” is a software program that can perform tasks automatically, quickly, and efficiently. Both good bots and bad bots exist; Googlebot facilitates web page indexing, but LizardStresser orchestrates DDoS attacks. Because good and bad bots share certain traits, distinguishing between them can be tricky unless the correct bot detection techniques are used. In this article, we examine the evolution of bot detection techniques in response to the ever-changing threat landscape and discuss how bots can be detected and, when desirable, stopped.

What Is Bot Detection?

Bot detection is the process of identifying and distinguishing between legitimate human users, good bots, and bad bots. Because bots can mimic certain legitimate user behaviors, such as mouse movements and keystrokes, cybersecurity professionals and business leaders should implement bot detection as an integral component of their security strategy. Without it, they risk misleading analytics, compromised user experiences, and security breaches that can harm their organization’s reputation and bottom line.

Bot detection helps to mitigate malicious bot activities such as unethical web scraping, spamming, account takeover, click fraud, and DDoS attacks, without interfering with good bots such as website uptime monitors. Effective bot detection enhances cybersecurity and improves the web user’s overall experience.

Botnet Detection Techniques

Over the decades, different botnet mitigation techniques have been developed to deal with the challenges of stopping bad bots while allowing good bots to continue their activities. These techniques typically involve identifying the command-and-control infrastructure coordinating the botnet activities. However, since botnets keep evolving to bypass mitigation measures, new and better botnet detection and mitigation strategies are continuously being developed.

Let’s examine botnet detection techniques. We’ll start with the oldest and then look at contemporary techniques. However, new techniques build on the old, and all these techniques still play a part in botnet detection today.

Intrusion Detection Systems

Figure 1: How a basic intrusion detection system works

Intrusion detection systems (IDS) emerged in the late 1980s to monitor and analyze network traffic for security incidents like unauthorized access and policy violations. IDS can detect threats, such as botnets, and alert security teams. Intrusion prevention systems (IPS) can proactively mitigate detected threats. Modern IDPS (intrusion detection and prevention systems) combine IDS and IPS functions.

IDS is trained on data from sources like network traffic, system logs, and application activity. Botnet-focused IDS can be anomaly-based (monitoring abnormal behaviors) or signature-based (matching patterns with known botnets).
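The difference between the two approaches can be illustrated with a minimal Python sketch. It is not how any particular IDS product works: the byte patterns in `KNOWN_SIGNATURES`, the function names, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev
from typing import Optional

# Hypothetical signatures: byte patterns assumed to mark known botnet traffic.
KNOWN_SIGNATURES = {
    b"JOIN #botnet-c2": "irc-c2-join",
    b"!ddos start": "ddos-command",
}

def match_signature(payload: bytes) -> Optional[str]:
    """Signature-based check: return the name of the first known pattern found."""
    for pattern, name in KNOWN_SIGNATURES.items():
        if pattern in payload:
            return name
    return None

def is_anomalous(baseline: list, observed: float, threshold: float = 3.0) -> bool:
    """Anomaly-based check: flag a value that lies more than `threshold`
    standard deviations away from the mean of the baseline samples."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is unusual.
        return observed != mu
    return abs(observed - mu) / sigma > threshold
```

For example, `match_signature` only catches traffic that contains a pattern already on the list, while `is_anomalous` can flag a host whose requests per minute jump far above its usual level even if the payload has never been seen before. That trade-off is why many systems use both.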

When a potential botnet is detected, the IDS generates alerts or notifications based on severity. Depending on cybersecurity policies, the IDS may block traffic, isolate systems, or alert security teams. IDS also generates incident logs and reports, detailing the time of incidents, detected threats, countermeasures, and recommendations for improvement.

Intrusion detection systems can be grouped into six types:

  • Network-Based Intrusion Detection Systems (NIDS): These monitor real-time network traffic and analyze packets on network segments or devices to detect attacks like DoS, port scanning, and reconnaissance.
  • Protocol-Based Intrusion Detection Systems (PIDS): A type of NIDS that targets specific network communication protocols (e.g., P2P, HTTP, IRC) to protect against intrusion and policy violations. PIDS is limited in scope.
  • Machine Learning-Based Intrusion Detection Systems (ML-IDS): A subset of NIDS that uses machine learning algorithms to detect network intrusions and malicious activities by learning from historical data. ML-IDS is more efficient than traditional rule-based systems but requires fine-tuning to minimize false positives.
  • Host-Based Intrusion Detection Systems (HIDS): These monitor the computer infrastructure they are installed on (e.g., computers, servers) to safeguard against attacks. They gather data, analyze traffic, and log suspicious behavior, providing insights into system health and security. HIDS is most suitable for small teams with lean overheads.
  • Hybrid Intrusion Detection Systems: These combine different detection techniques (e.g., NIDS, HIDS, anomaly-based, signature-based) in a single framework to effectively detect botnet activity and provide insightful data. However, they create a single point of failure and are complex to troubleshoot.
  • Multi-Layered Intrusion Detection Systems: These systems combine different detection techniques (e.g., NIDS, HIDS, anomaly-based, signature-based) in a layered approach, with each IDS as a separate component. They eliminate a single point of failure and simplify troubleshooting but complicate setup, management, and reporting.

To summarize, intrusion detection systems (IDS) enhance network security by monitoring and analyzing traffic to detect potential threats, providing valuable insights and real-time response capabilities. However, they can produce false positives, require ongoing maintenance and fine-tuning, and may be complex to manage and integrate into existing security frameworks.

Honeynet

First used around the year 2000, a honeynet is a network of decoy systems (honeypots) set up with built-in vulnerabilities to attract cyberattacks. A typical honeynet comprises two or more honeypots. Because the vulnerabilities are exposed deliberately, any traffic that reaches the honeynet is likely to be malicious, which makes botnet activity easy to spot. This deception technique allows botnet attacks to be studied in a controlled environment or managed and stopped, as needed.

Figure 2: Honeynet setup

Accordingly, there are two main types of honeynets: research honeynets and production honeynets. Research honeynets are primarily set up to study attackers’ tactics, techniques, and procedures, while production honeynets are deployed within production environments to detect and deflect live attacks.

Despite their effectiveness, honeynets have limitations, such as setup complexity, limited network coverage, and high maintenance overhead, especially for high-capacity setups. Additionally, attackers can sometimes detect and bypass a honeynet, or even compromise it and use it as a launchpad against the production network itself.

DNS-Based Botnet Detection

Figure 3: DNS-based botnet detection

Around 2005, the DNS-based botnet detection technique started to gain popularity. DNS-based botnet detection works by monitoring the way computers use the Domain Name System (DNS) to find websites. When you enter a website address into your browser, your computer uses DNS to find the numerical IP address that corresponds to that website. Botnets, which are networks of infected computers controlled by cybercriminals, often need to communicate with the attackers’ servers to receive instructions. They use DNS to find these servers.

A DNS-based botnet detection system monitors all DNS requests made by computers on the network. It analyzes which domain names are being requested and how often. Since botnets often use unusual domain names that people don’t typically visit, the system looks for patterns that indicate suspicious activity, such as frequent requests to strange or newly created domains. It can then block requests to these malicious domains, preventing the infected computers from communicating with the cybercriminals.
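As a rough illustration of this kind of analysis, the Python sketch below counts DNS queries per domain and flags domains that are queried repeatedly, are not on an allowlist, and have a random-looking first label. Every name and threshold here (`suspicious_domains`, `known_domains`, `min_queries`, `min_entropy`) is an assumption made for the example. Real detectors typically combine many more signals, such as domain age and reputation feeds.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Bits of entropy per character; random-looking strings score higher."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def suspicious_domains(queries, known_domains, min_queries=5, min_entropy=3.5):
    """Return a sorted list of domains that look like botnet rendezvous points.

    queries: iterable of (client_ip, domain) pairs taken from DNS logs.
    known_domains: set of domains considered normal for this network.
    """
    counts = Counter(domain.lower().rstrip(".") for _, domain in queries)
    flagged = []
    for domain, n in counts.items():
        # Skip allowlisted domains and domains that are rarely requested.
        if domain in known_domains or n < min_queries:
            continue
        label = domain.split(".")[0]
        if label and shannon_entropy(label) >= min_entropy:
            flagged.append(domain)
    return sorted(flagged)
```

The entropy check is a simple stand-in for spotting algorithmically generated names. It will miss botnets that use dictionary-style domains, which is one reason reputation data is usually needed as well.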

Although DNS-based detectors provide real-time detection, network-wide coverage, low false-positive rates, and threat intelligence gathering, they are prone to evasion techniques and are limited by their reliance on external threat intelligence sources for domain reputation data.

Comparison of Botnet Detection Techniques

Here’s how these three botnet detection techniques compare.

| Feature | Intrusion Detection Systems (IDS) | Honeynet | DNS-Based Botnet Detection |
|---|---|---|---|
| Definition | Network security tools monitor and analyze network traffic for potential threats | Network of traps or decoy networks designed to attract cyberattacks | Technique monitoring and analyzing DNS traffic for botnet activity |
| Detection focus | Network traffic, system logs, and application activity | Cyberattackers’ behavior and tactics | DNS traffic patterns, requests, and responses |
| Detection methods | Signature-based, anomaly-based, machine learning | Deception through vulnerabilities | Domain reputation checks, anomaly detection |
| Data collected | Network traffic, system logs, application activity | Attack interactions with honeypots | DNS traffic, requests, responses |
| Alerting and response | Generates alerts, blocks traffic, isolates systems | Studies attacks, handles malicious interactions | Blocks connections, redirects to sinkholes, alerts |
| Use cases | Prevents unauthorized access, breaches, policy violations | Studies attack tactics, gathers threat intelligence | Real-time botnet detection, low false positives |
| Complexity | Varies based on IDS type (NIDS, HIDS, hybrid, multi-layered) | Moderate to high due to setup and maintenance | Moderate, relies on DNS traffic analysis |
| Effectiveness | Effective for detecting network-based threats | Effective for studying attacks, gathering threat intel | Effective for real-time botnet detection |
| Limitations | Can be bypassed by sophisticated attacks | Setup complexity, limited network coverage | Prone to evasion techniques, reliance on external data |
| Deployment | Network-wide, host-based, hybrid, multi-layered | Controlled environment, production networks | DNS infrastructure monitoring |
| Popularity | Widely used in cybersecurity | Less common due to complexity | Increasing popularity |
| Future evolution | Evolving to integrate AI, threat intelligence | Evolving to address evasion techniques | Evolving to handle DNS tunneling |
| Management overhead | Varied based on IDS type and deployment | High for setup, maintenance, and monitoring | Moderate for DNS traffic analysis |

How to Stop Botnets

Now that we know how undesirable botnets are detected, let’s turn to how they can be stopped. Three main options exist: CAPTCHA, rate limiting, and bot protection.

A. JS Challenges/CAPTCHA

One way to stop bad bot activity is by implementing JS Challenges and CAPTCHA on your websites or web applications. Both are effective security mechanisms used to protect against malicious bots, automated scripts, and other unauthorized automated activities, such as web scraping.

Figure 4: CAPTCHA

Gcore provides JS Challenge and JS CAPTCHA solutions as part of Gcore WAAP. First, a JS challenge runs a small piece of JavaScript code in the user’s browser, which simple bots that lack a full browser environment typically cannot execute. This code checks for typical human behavior and browser characteristics to ensure the request comes from a legitimate user. Next, a CAPTCHA presents a task that is easy for humans but difficult for bots, such as identifying objects in images or solving simple puzzles. By completing these tasks, users prove they are human, thereby preventing automated systems from accessing or abusing web services.

But there’s a downside: CAPTCHAs do not distinguish between beneficial bots (such as search engine crawlers or monitoring tools) and malicious bots. They can impede good bots from performing their intended functions. To allow good bots while still protecting against malicious ones, website administrators need to create exceptions or use alternative verification methods that can recognize and permit trusted bots. Gcore manages this process with our WAAP customers to ensure good bots continue to function effectively.
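One widely used way to recognize a trusted crawler is a reverse DNS lookup followed by a forward confirmation, an approach Google documents for verifying Googlebot. The Python sketch below is an illustration only and does not describe how Gcore WAAP handles exceptions. The function name `is_verified_crawler`, the injectable `reverse_lookup` and `forward_lookup` parameters, and the contents of `TRUSTED_SUFFIXES` are assumptions for the example.

```python
import socket

# Assumption: hostname suffixes of crawlers you have decided to trust.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")

def is_verified_crawler(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a trusted hostname that
    resolves back to the same IP (forward-confirmed reverse DNS).

    The default lookups use the standard library and handle IPv4 only.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
        if not host.endswith(TRUSTED_SUFFIXES):
            return False
        # The forward lookup stops an attacker who controls only the PTR record.
        return ip in forward_lookup(host)
    except OSError:
        # Failed lookups (socket.herror, socket.gaierror) mean "not verified".
        return False
```

Checking the User-Agent header alone is not enough, because any client can claim to be a search engine crawler. A request that passes this check could be exempted from the CAPTCHA, while everything else follows the normal challenge flow.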

B. Rate Limiting

Figure 5: Rate Limiting

A key characteristic of bots is their ability to automate and rapidly scale tasks. For example, bots can fill and submit forms much faster than humans, sending a large number of requests to the server and receiving an equally large number of responses. This can drain server resources and degrade site performance.

Rate limiting controls the number of requests an IP address or IP range can make to a resource within a certain timeframe. This method mitigates bad bot activity on websites or web applications. Good bots don’t engage in this kind of behavior, so there’s not much risk of stopping their activity with a rate limiter.

Gcore Rate Limiter protects your websites and web applications from excessive requests that signal bad bot activity. You can specify a set of rules dictating how many requests are allowed per IP address per second. Once this limit is exceeded, the requester will receive an HTTP 429 (Too Many Requests) error message.
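To make the idea of a per-IP limit concrete, here is a minimal sliding-window limiter in Python. It is a generic sketch, not the Gcore Rate Limiter or its configuration format. The class name `SlidingWindowLimiter`, its parameters, and the in-memory storage are assumptions. A production setup would normally use shared storage and evict idle IP addresses.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` per IP within any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds=1.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def check(self, ip):
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        now = self.clock()
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return 429  # Too Many Requests
        hits.append(now)
        return 200
```

With `SlidingWindowLimiter(max_requests=3)`, a fourth request from the same IP address within one second is answered with 429, while requests from other addresses are unaffected. Rejected requests are not recorded, so a blocked client regains access as soon as its earlier requests leave the window.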

Stop Bad Bots with Gcore WAAP

While bot detection techniques such as honeynets, DNS-based bot detectors, and intrusion detection systems (IDSs) are effective in their own right, a hybrid or multi-layered bot detection approach is the most accurate way to detect bot activity. Gcore WAAP (Web Application Firewall + API Protection) is the ultimate all-in-one bot detection and protection solution for your websites and web applications. Gcore WAAP incorporates bot protection with a web application firewall, API security, and advanced DDoS protection to offer enhanced enterprise-grade security.

We protect against threats including and beyond the OWASP Top 10, addressing unpatched vulnerabilities and zero-day attacks by leveraging machine learning technologies. With Gcore WAAP, you enjoy API-specific protection and security against credential stuffing, account takeover, brute force attacks, and L7 DDoS attacks.

Gcore WAAP is scalable to meet your needs, regardless of industry. It is also easy to deploy—no additional hardware, software, or changes in the code are required on your part. Once you send a request, Gcore will start protecting your web resources immediately. Request Gcore WAAP today and enjoy bot-free websites and web applications.

Conclusion

Detecting and stopping bad bots involves a combination of advanced techniques tailored to identify and mitigate malicious activities while allowing beneficial bots to operate. Implementing a multi-layered bot detection strategy, such as Gcore WAAP, ensures comprehensive protection against various threats while maintaining website performance and user experience.

Gcore WAAP is integrated into Gcore’s global infrastructure, operating on 180+ global points of presence in Tier III and IV data centers, ensuring optimal performance, low latency worldwide, and outstanding security at the network’s edge. Secure your web applications and APIs against the most sophisticated cyber threats to safeguard your business’s reputation.

Discover Gcore WAAP
