Crashdoom (he/him) to Pawb.Social Announcements · English · 2 years ago

[Unexpected Downtime Postmortem] Oct 6th

One of the data storage systems (Ceph) encountered a critical failure when Proxmox lost connection to several of its nodes, ultimately resulting in the Ceph configuration being cleared by the Proxmox cluster consensus mechanism. Nothing other than the Elasticsearch data was stored on Ceph.

When the connection to the other nodes was lost, a split-brain occurred (a state in which nodes disagree on which changes are authoritative and which should be dropped). As we tried to recluster all of the nodes, the conflict was resolved in a way that wiped the ceph.conf file and left the data on Ceph unrecoverable.
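
To illustrate the split-brain idea described above, here is a minimal sketch of a majority-quorum check, the general technique clusters use to decide which side of a network partition stays authoritative. It is not the actual Proxmox or Ceph logic; the function name `has_quorum` and the 5-node cluster are hypothetical and chosen only for illustration.

```python
def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """Return True if this partition holds a strict majority of the cluster."""
    if total_nodes <= 0 or not 0 <= reachable_nodes <= total_nodes:
        raise ValueError("invalid node counts")
    return reachable_nodes > total_nodes // 2


# Hypothetical 5-node cluster that has split into partitions of 3 and 2 nodes.
total = 5
for partition_size in (3, 2):
    if has_quorum(partition_size, total):
        state = "may accept changes"
    else:
        state = "must not accept changes"
    print(f"partition of {partition_size}/{total} nodes {state}")
```

With an even split (for example 2 of 4 nodes), neither side holds a strict majority, so under this rule both sides stop accepting changes rather than diverge.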

Thankfully, we’ve suffered no significant data loss, with the exception of having to rebuild the Mastodon Elasticsearch indexes from 6 AM this morning to present.

I’d like to profusely apologize for the inconvenience, but we felt it necessary at the time to take all services offline as part of our disaster recovery plan, to ensure no damage occurred to the containers while we investigated.

phantomkitty · 2 years ago
Downtime is better than data loss. It was the right decision to make.

Stefen Auris · 2 years ago
I feel very reassured that you have plans for things like this and you do the right thing even if it means a little downtime.
