When your app experiences a sudden flood of requests, it can degrade performance, increase latency, or even cause downtime. This situation is common for apps hosted on public endpoints with infrastructure scaled for low traffic, such as MVPs or apps in the early stages of product development. This guide outlines steps to detect, analyze, and mitigate such floods of requests on the Aptible platform, along with strategies for long-term preparation.
Send logs to a third-party service (e.g., Papertrail, LogDNA, Datadog) using a Log Drain. Depending on the features each provider offers, these services let you:
Chart the volume of requests over time.
Analyze patterns such as bursts of requests targeting specific endpoints.
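If your log provider lets you export request records, the same two views (volume over time and per-endpoint bursts) can be approximated offline. The following is a minimal sketch, not a provider API: the `(timestamp, path)` record shape, the function names, and the "3x the median" burst threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime


def summarize_requests(records):
    """Count requests per minute and per path.

    `records` is an iterable of (iso_timestamp, path) pairs. This shape is
    an assumption for illustration, not a format any log provider emits.
    """
    per_minute = Counter()
    per_path = Counter()
    for timestamp, path in records:
        # Truncate each timestamp to the start of its minute.
        minute = datetime.fromisoformat(timestamp).replace(second=0, microsecond=0)
        per_minute[minute] += 1
        per_path[path] += 1
    return per_minute, per_path


def find_bursts(per_minute, factor=3.0):
    """Return the minutes whose request count exceeds `factor` times the median.

    The threshold is a hypothetical heuristic; tune it to your own traffic.
    """
    if not per_minute:
        return []
    counts = sorted(per_minute.values())
    median = counts[len(counts) // 2]
    return sorted(m for m, c in per_minute.items() if c > factor * median)
```

`per_path.most_common(5)` then lists the five most requested paths, and `find_bursts(per_minute)` lists the minutes worth a closer look.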
Use APM Tools to identify bottlenecks:
Purpose: Application Performance Monitoring (APM) tools provide insight into performance bottlenecks.
Key Metrics:
Endpoints with the highest request volumes.
Endpoints with the longest processing times.
Database queries or backend processes that become bottlenecks as request volume increases.
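An APM tool computes these rankings for you. As a rough illustration of what the first two metrics mean, the sketch below ranks endpoints by request count and by mean processing time. The `(endpoint, duration_ms)` sample shape and the function name are assumptions, not part of any APM product.

```python
from collections import defaultdict


def rank_endpoints(samples, top=3):
    """Rank endpoints by request volume and by mean processing time.

    `samples` is an iterable of (endpoint, duration_ms) pairs, a
    hypothetical shape used only for this example.
    """
    durations = defaultdict(list)
    for endpoint, duration_ms in samples:
        durations[endpoint].append(duration_ms)

    # Endpoints with the highest request volumes.
    by_volume = sorted(durations, key=lambda e: len(durations[e]), reverse=True)[:top]
    # Endpoints with the longest mean processing times.
    by_latency = sorted(
        durations,
        key=lambda e: sum(durations[e]) / len(durations[e]),
        reverse=True,
    )[:top]
    return by_volume, by_latency
```

An endpoint that appears near the top of both lists is usually the first candidate for caching or optimization.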
Determine whether endpoints or resources should be public:
If the app is not yet in production, consider implementing IP Filtering to allow traffic only from known IPs or networks.
Consider whether all or parts of the app should be protected by authentication that you control.
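On Aptible, IP Filtering is configured on the Endpoint rather than in application code, so the sketch below only illustrates the allowlist rule itself: a request is accepted if its source address falls inside one of the known networks. The CIDR ranges are reserved documentation addresses and the function name is hypothetical.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks; replace them
# with the networks you actually trust.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if `client_ip` belongs to any allowlisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)
```

Writing the allowlist out this way before you configure the filter makes it easier to check that office, VPN, and monitoring addresses are all covered.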
Investigate Traffic Source:
Authenticated Users: If requests originate from authenticated users, verify their legitimacy and source.
Public Activity: Focus on high-traffic endpoints/pages and optimize their performance.
Monitor App and Database Metrics:
Use Aptible Metric Drains or the in-app Aptible Metrics to observe the CPU and memory usage of apps and databases during the event.
Scale Resources Temporarily:
Based on the metrics you observe, scale app or database containers via the Aptible dashboard or CLI to handle the increased traffic.
Specifically, if you see the "worker_connections are not enough" error message in your logs, horizontal scaling will help address it. See more about this error here.
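To judge how often that error occurs before and after scaling, you can count its occurrences in exported log lines. This is a minimal sketch that assumes the logs are available as an iterable of plain strings; the function name is hypothetical, and only the error text itself comes from the message quoted above.

```python
ERROR_MARKER = "worker_connections are not enough"


def count_worker_connection_errors(log_lines):
    """Count log lines that contain the worker_connections error text."""
    return sum(1 for line in log_lines if ERROR_MARKER in line)
```

A count that drops toward zero after adding containers suggests the horizontal scaling took effect.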
Validate Performance of Custom Error Pages:
Ensure error pages (e.g., 404, 500) are lightweight and avoid backend processing or serving large or uncached assets.
A flood of requests doesn’t have to bring your app down. By proactively monitoring traffic, optimizing performance, and having a well-rehearsed response plan, you can ensure that your app remains stable during unexpected surges.