Investigating increased web request error rates
Incident Report for
We experienced a brief increase in web request errors for about two minutes.
This was caused by a sudden increase in traffic that appeared to be designed to circumvent our first layer of page caching.

In total, our servers returned 165k 503 errors and successfully handled 302k requests within the two-minute span. Before the surge we were serving about 61k web requests an hour, or about 1k a minute.
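
To put those figures in proportion, here is a rough back-of-the-envelope calculation. It uses only the rounded numbers quoted in this report, so the results are approximate; the variable names are ours and purely illustrative.

```python
# Rounded figures as quoted in this report.
errors_503 = 165_000        # 503 responses during the incident
handled_ok = 302_000        # successfully handled requests during the incident
incident_minutes = 2        # approximate duration of the surge
baseline_per_hour = 61_000  # typical request volume before the surge

total_requests = errors_503 + handled_ok
surge_per_minute = total_requests / incident_minutes
baseline_per_minute = baseline_per_hour / 60

# How many times larger the surge was than normal traffic.
surge_factor = surge_per_minute / baseline_per_minute

# Share of requests during the surge that received a 503.
error_share = errors_503 / total_requests

print(f"Requests during incident: {total_requests:,}")
print(f"Surge rate: {surge_per_minute:,.0f}/min vs baseline {baseline_per_minute:,.0f}/min")
print(f"Surge factor: about {surge_factor:.0f}x")
print(f"Error share: about {error_share:.0%}")
```

On these rounded numbers, the surge works out to roughly 230 times our normal per-minute traffic, with about 35% of requests in that window receiving a 503.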

Our server auto-scaling for handling increased load cannot respond that fast.

In this case there is not much we can do in the short term to prevent this type of abuse. The requests circumvented the cache, forcing our web servers to build a page for each request, and that cannot be fixed quickly. It is now on our radar for future development.

In the meantime, we have left our web cluster scaled higher than normal to help handle similar request storms if they occur.
Posted Sep 26, 2022 - 22:16 UTC
We are investigating increased web request error rates.
Posted Sep 26, 2022 - 21:25 UTC
This incident affected: Power Outage Systems (Power Outage Site, Power Outage API).