Status | Blacksmith - Job adoption delays – Incident details

Job adoption delays

Resolved
Major outage
Started 3 days ago
Lasted about 1 hour

Affected

Blacksmith Managed Runners

Operational from 12:45 PM to 1:14 PM

eu-central ARM

Operational from 12:45 PM to 1:14 PM

eu-central x86

Operational from 12:45 PM to 1:14 PM

us-west ARM

Operational from 12:45 PM to 1:01 PM, Partial outage from 1:01 PM to 1:14 PM, Operational from 1:14 PM to 1:27 PM

us-west x86

Operational from 12:45 PM to 1:01 PM, Partial outage from 1:01 PM to 1:14 PM, Operational from 1:14 PM to 1:27 PM

eu-west x86

Operational from 12:45 PM to 1:01 PM, Partial outage from 1:01 PM to 1:14 PM, Operational from 1:14 PM to 1:27 PM

Updates
  • Update
    This incident has been resolved.
  • Resolved

    We implemented a fix and jobs are being picked up as normal. This incident is now closed.

  • Monitoring

    Our engineers are currently implementing a mitigation. Thank you for your patience.

  • Identified

    We have identified the issue: GitHub is sending us a high volume of malformed webhooks whose payloads are missing critical information. We are working on a patch to work around this while we wait for GitHub to fix the upstream issue.

  • Update

    We believe this is related to an issue with GitHub webhooks and are still investigating.

  • Investigating

    We are receiving reports of job adoption delays and are currently investigating.
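
The "Identified" update above describes webhooks arriving with critical fields missing from their payloads. As an illustration only, the sketch below shows one way a webhook consumer might detect such payloads before acting on them. It is not Blacksmith's actual patch. The incident report does not say which fields were missing, so the list in REQUIRED_PATHS is an assumption based on fields found in GitHub's `workflow_job` event, and the names `missing_fields` and `parse_webhook` are hypothetical.

```python
import json

# Assumed set of fields a job scheduler might need from a GitHub
# `workflow_job` webhook. The fields actually missing during this
# incident were not disclosed, so this list is illustrative only.
REQUIRED_PATHS = (
    ("action",),
    ("workflow_job", "id"),
    ("workflow_job", "run_id"),
    ("workflow_job", "labels"),
    ("repository", "full_name"),
)


def missing_fields(payload):
    """Return dotted paths of required fields that are absent or null."""
    missing = []
    for path in REQUIRED_PATHS:
        node = payload
        for key in path:
            # A non-dict parent or a null/absent key means the field is missing.
            if not isinstance(node, dict) or node.get(key) is None:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


def parse_webhook(body):
    """Parse a raw webhook body (hypothetical helper).

    Returns (payload, missing). payload is None when the body is not a
    JSON object; missing lists the problems found.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None, ["<invalid JSON>"]
    if not isinstance(payload, dict):
        return None, ["<not a JSON object>"]
    return payload, missing_fields(payload)
```

A consumer could log and skip (or queue for later reconciliation) any delivery for which the second return value is non-empty, rather than letting a malformed payload stall processing of well-formed ones.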