Customers are experiencing issues logging in to the portal
Incident Report for Check Point Services Status
Postmortem

Post-incident Report: Infinity Portal Failure on September 5, 2021

Incident information:

Incident ID: CLOIS-3191
Start Date: Sunday, September 5, 2021, at 00:06 UTC
End Date: Sunday, September 5, 2021, at 00:55 UTC
Consequences: Infinity Portal EU/US partial outage. Specific flows involving specific services were not functioning.

Summary

Between 00:06 and 00:55 UTC on September 5, 2021, users in both the EU and US data residencies could not log in to the Infinity Portal.

The event was triggered by an alert (#28252) at 00:15 UTC that was acknowledged immediately by the on-call engineer. This alert indicates a failure to log in to the portal and reports the number of unreachable data residencies.
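
As an illustration of what such an alert evaluates, the following is a minimal sketch and not the actual monitoring rule. It assumes a hypothetical synthetic login probe per data residency; the `ProbeResult` type and the `evaluate_login_alert` function are invented names for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of one hypothetical synthetic login probe."""
    residency: str   # data residency label, e.g. "EU" or "US"
    login_ok: bool   # True if the probe logged in successfully


def evaluate_login_alert(results):
    """Return (should_alert, unreachable_residencies) for a batch of probes.

    The alert fires when at least one residency fails the login probe;
    the second value lists the unreachable residencies, sorted by name.
    """
    unreachable = sorted({r.residency for r in results if not r.login_ok})
    return (len(unreachable) > 0, unreachable)


# Both residencies failing, as in this incident.
should_alert, unreachable = evaluate_login_alert(
    [ProbeResult("EU", False), ProbeResult("US", False)]
)
print(should_alert, len(unreachable), unreachable)
```

Reporting the count of unreachable residencies alongside the failure lets the on-call engineer tell a single-region problem from a global one at a glance.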

Once the event was acknowledged, a software engineer, a DevOps engineer, and an on-duty manager (Group Manager) worked together to resolve the problem.

The issue was resolved once we restarted the 100 active instances of the gateway service in the EU data center and performed a rollout restart of the users service in both the EU and US data centers.

Root Cause Analysis

The root cause of the issue was a networking failure between our gateway and some specific services (users, geo-discovery, etc.): multiple requests received 500 or 504 status code responses for no apparent reason. Most responses took less than 50 ms, so receiving a 504 (Gateway Timeout) is very unusual behavior. After we restarted our EU gateway instances, there were no further 500 or 504 responses.
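
To make the anomaly concrete, here is a minimal sketch of how such responses could be counted from gateway logs. It assumes each log record is reduced to a hypothetical `(status, duration_ms)` pair; the record shape, the `summarize_errors` function, and the `"fast_504"` label are assumptions for illustration, not the tooling used during the incident.

```python
from collections import Counter

# Threshold taken from the report: most responses took less than 50 ms.
FAST_MS = 50


def summarize_errors(records):
    """Count 500 and 504 responses, and 504s that returned in under FAST_MS.

    `records` is an iterable of (status, duration_ms) pairs. A 504 that
    returns this quickly cannot be explained by an upstream timeout.
    """
    counts = Counter()
    for status, duration_ms in records:
        if status in (500, 504):
            counts[status] += 1
            if status == 504 and duration_ms < FAST_MS:
                counts["fast_504"] += 1
    return counts


sample = [(200, 12), (504, 30), (504, 60000), (500, 8)]
summary = summarize_errors(sample)
print(summary[500], summary[504], summary["fast_504"])
```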

As a result of this incident, a total of 208 requests failed between 03:00 IST and 04:00 IST.

Actions Taken

00:18 UTC – Alert created

00:22 UTC – Alert acknowledged

00:50 UTC – EU Gateway instances restarted

00:55 UTC – Issue resolved
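
The intervals implied by this timeline can be worked out with a short sketch. The timestamps are taken from this report (the 00:06 UTC start comes from the incident information); the dictionary keys and helper names are invented for the example.

```python
from datetime import datetime, timezone


def at(hhmm):
    """Build a UTC timestamp on September 5, 2021 from an 'HH:MM' string."""
    parsed = datetime.strptime(f"2021-09-05 {hhmm}", "%Y-%m-%d %H:%M")
    return parsed.replace(tzinfo=timezone.utc)


# Times as stated in the report; key names are illustrative only.
timeline = {
    "incident_start": at("00:06"),
    "alert_created": at("00:18"),
    "alert_acknowledged": at("00:22"),
    "gateways_restarted": at("00:50"),
    "resolved": at("00:55"),
}


def minutes_between(start, end):
    """Whole minutes elapsed between two named timeline events."""
    delta = timeline[end] - timeline[start]
    return int(delta.total_seconds() // 60)


print(minutes_between("incident_start", "alert_created"))       # time to detect
print(minutes_between("alert_created", "alert_acknowledged"))   # time to acknowledge
print(minutes_between("incident_start", "resolved"))            # total duration
```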

Next Steps

Action: Contact AWS support for help investigating the networking issue between the gateway and our services – DevOps team
Completion Date: Immediately
Posted Sep 05, 2021 - 12:28 UTC

Resolved
The outage is finally resolved, and Infinity Portal is coming back online.

We deeply apologize for any inconvenience.
Posted Sep 05, 2021 - 01:32 UTC
Monitoring
We have just applied a fix, and now we are monitoring the results. We'll let you know as soon as the problem is finally solved.
Posted Sep 05, 2021 - 01:11 UTC
Investigating
Some customers are experiencing issues logging in. We’re aware of the issue and are working to resolve it.

Please know our teams are doing their best to identify the root cause and implement a solution, and we will update you with the latest information as soon as possible.
Posted Sep 05, 2021 - 00:31 UTC
This incident affected: Infinity Portal (Infinity Portal EU Region, Infinity Portal US Region).