Network issue
Incident Report for The Linux Foundation
Postmortem

Vexxhost's public cloud experienced internal network issues that affected storage and internal networks for all systems. An internal spine switch was failing intermittently, passing some packets and dropping others, which left the internal network in a failed state. Vexxhost replaced each part of the network spine in turn until the faulty switch was identified, after which the network was restored. LF then had to power on or reboot some compute instances to bring them back online.

Posted Nov 16, 2018 - 21:41 UTC

Resolved
This incident has been resolved.
Posted Nov 16, 2018 - 19:55 UTC
Update
We are continuing to monitor for any further issues.
Posted Nov 16, 2018 - 18:32 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 16, 2018 - 17:24 UTC
Update
Critical internal network infrastructure has failed and is being replaced now.
Posted Nov 16, 2018 - 17:08 UTC
Update
The upstream provider reported that they are having issues with their local router and are trying to reload it. The ETA is under 30 minutes if things go smoothly, but it might extend to several hours.
Posted Nov 16, 2018 - 15:30 UTC
Identified
An issue with an upstream network provider is impacting our CI services.
Posted Nov 16, 2018 - 13:34 UTC