So, cool story while we wait, and it's definitely relatable since something similar happened to us a few weekends ago.
A few Saturdays back, I started getting flooded with PagerDuty alerts right after my gym workout. Got back to the house, got on the work laptop, and we had a complete outage of all production systems. Got on the horn and it was all hands on deck. The primary datacenter was frozen, and the secondary was active but in a very weird state.
Our storage guys started digging, and it turned out their super expensive cross-DC synchronous storage devices had gotten into an argument over who was in charge of the LUNs, and the primary datacenter system decided to force all storage pools offline. That's functionally the same as ripping your hard drive out while your computer is still running, only worse, since the hypervisor thought the world was ending.

Immediately got Dell (they owned the hardware) on the call as a P1, and they were able to remote in and see the arrays frozen. Dug deeper and it was the result of a known bug that was supposed to have been fixed, but the patch had kept getting kicked down the road. It took Dell plus our storage folks over six hours to unfreeze the storage devices, apply the update, and declare the primary DC as master, forcing the secondary to discard all storage changes made since the problem began. Then we had to bring everything back online in sequence, which took several more hours. Finally we smoke tested and validated everything, declared the incident over, and went to bed.

The storage guys had to write a very lengthy report that Monday, and our CIO and Ops Director chewed out the Dell operations manager who had been the one to say the bug wouldn't affect us and that we could wait to apply the fix.