Status | Blacksmith - GitHub Actions is having an outage – Incident details

GitHub Actions is having an outage

Resolved
Major outage
Started 2 months ago
Lasted about 3 hours

Affected

Blacksmith Managed Runners

Operational from 5:32 PM to 8:32 PM

Incremental Docker Builders

Operational from 5:32 PM to 8:32 PM

API

Operational from 5:32 PM to 8:32 PM

Website

Operational from 5:32 PM to 8:32 PM

Updates
  • Resolved

    This incident has been resolved, and queue times are back to normal.

  • Monitoring

    GitHub has resolved the incident, and job dispatch has returned to normal. Our infrastructure is fully operational and new jobs are being picked up without delay. Earlier, you may have noticed some queued jobs being canceled as part of our mitigation efforts to prevent long backlogs; this ensured new jobs could start processing quickly. Canceled jobs should run normally when retried. We’ll continue to keep an eye on metrics, but no further customer impact is expected. We apologize for the inconvenience.

  • Identified

    We are about to cancel a subset of queued jobs across organizations. This is an effort to bring queue times back to normal by trimming the backlog. Customers can retry the canceled jobs and expect normal operation thereafter.

  • Update
    Even though GitHub has reported its incident as resolved, we’re still seeing delayed starts for a fraction of jobs. Things are steadily improving as the backlog drains, but you may continue to see slower job starts until recovery is complete.
  • Update

    The outage has left a massive backlog of queued jobs. The backlog is now slowly draining, so jobs may be slower to get picked up for the next few minutes.
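
For customers who want to retry the jobs canceled during the mitigation, the following is a minimal sketch of one way to pick out the affected workflow runs. It assumes you have already fetched your runs from GitHub's "list workflow runs" REST endpoint as a list of dictionaries with `id`, `conclusion`, and `created_at` fields, and it only builds the URLs for GitHub's "re-run a workflow" endpoint. Sending the authenticated POST requests is left out. The helper names, the owner and repository, and the time window are hypothetical placeholders, not values from this incident.

```python
from datetime import datetime, timezone

# GitHub REST endpoint for re-running a workflow run (expects an authenticated POST).
RERUN_URL = "https://api.github.com/repos/{owner}/{repo}/actions/runs/{run_id}/rerun"


def parse_ts(value: str) -> datetime:
    """Parse a GitHub-style UTC timestamp such as 2024-01-01T17:45:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def rerun_urls(runs, owner, repo, start, end):
    """Return re-run URLs for runs that were cancelled and created in [start, end].

    `runs` is assumed to be a list of dicts shaped like the entries returned by
    GitHub's "list workflow runs" endpoint. `start` and `end` are hypothetical
    bounds that you choose to cover the incident window in UTC.
    """
    urls = []
    for run in runs:
        # GitHub reports this conclusion with the spelling "cancelled".
        if run.get("conclusion") != "cancelled":
            continue
        if start <= parse_ts(run["created_at"]) <= end:
            urls.append(RERUN_URL.format(owner=owner, repo=repo, run_id=run["id"]))
    return urls
```

Filtering on both the conclusion and the creation time keeps the retry limited to runs that were likely canceled by the backlog trimming, and leaves alone runs that were canceled for unrelated reasons outside the window.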