DDO unavailable: Saturday March 30th

Kessaran

Well-known member
I have to laugh at all of you who are all in on the "a good data center is foolproof! They must have cheaped out!" argument. Two real-world examples, both Microsoft Azure: One time they were doing cleanup and had a script to delete old SQL VM instance backups. Unfortunately the script was written wrong, and instead of deleting "old backups" it deleted the live customer VMs. Every SQL instance running on the whole East Coast node went poof. Oopsie. That was a three-hour outage for my company.

Another real case: they somehow had a whole rack of management servers that was totally outside all the fault tolerance mechanisms. No backups, no secondary power supply, not even scripted into the VM management systems. And then they had a power failure, and all the actual customer machines were fine, they failed over fine, but this rack of network routers, VM managers, etc. went down hard. When it got back on power and rebooted, there was nothing to tell the master VM manager that these servers were supposed to be core infrastructure, so it just saw a bunch of high-power servers and started spinning up high-price-tier customer VMs on them, overwriting all the critical VMs that weren't backed up. That one didn't hit my company, but it was a 24+ hour outage for a lot of Azure customers.
By that logic, Microsoft Azure was operational ~3 hours after the incident. Even the 2nd outage you described was resolved ~24 hours later. It's been nearly 36 hours and a video game data center still isn't operational. This is part of their job and they aren't doing it. When your company's entire livelihood depends on the games being operational and making money by keeping customers happy, a critical failure like this should be addressed immediately with the full workforce.
 

Sylvado

Well-known member
People that pay a subscription for this game should really reconsider it. Downtime of ~2 days is mind-boggling. No rollback, no contingency plan, no mitigation plan, no escalation path for issues: how is their issue management process this bad? Why are you paying a sub fee if they can't intelligently manage a game?
I am VIP and will continue VIP. It is not always just roll back and restart. I support applications for one of the largest and most heavily regulated companies in the world. I have been on command center calls that went on for 48 hours with dozens of people working the problem. No one on this forum knows the scope of the issue so stop playing IT hero.
 

rohmer

Well-known member
I have to laugh at all of you who are all in on the "a good data center is foolproof! They must have cheaped out!" argument. Two real-world examples, both Microsoft Azure: One time they were doing cleanup and had a script to delete old SQL VM instance backups. Unfortunately the script was written wrong, and instead of deleting "old backups" it deleted the live customer VMs. Every SQL instance running on the whole East Coast node went poof. Oopsie. That was a three-hour outage for my company.

Another real case: they somehow had a whole rack of management servers that was totally outside all the fault tolerance mechanisms. No backups, no secondary power supply, not even scripted into the VM management systems. And then they had a power failure, and all the actual customer machines were fine, they failed over fine, but this rack of network routers, VM managers, etc. went down hard. When it got back on power and rebooted, there was nothing to tell the master VM manager that these servers were supposed to be core infrastructure, so it just saw a bunch of high-power servers and started spinning up high-price-tier customer VMs on them, overwriting all the critical VMs that weren't backed up. That one didn't hit my company, but it was a 24+ hour outage for a lot of Azure customers.
Ridiculous.

The number of outages at cloud services can be counted on one hand.
 

New friend

New member
I am VIP and will continue VIP. It is not always just roll back and restart. I support applications for one of the largest and most heavily regulated companies in the world. I have been on command center calls that went on for 48 hours with dozens of people working the problem. No one on this forum knows the scope of the issue so stop playing IT hero.
oh cool
 

Jack Jarvis Esquire

Well-known member
I am VIP and will continue VIP. It is not always just roll back and restart. I support applications for one of the largest and most heavily regulated companies in the world. I have been on command center calls that went on for 48 hours with dozens of people working the problem. No one on this forum knows the scope of the issue so stop playing IT hero.
You forgot to shout "FORE!" 🙄👍
 

Dandonk

This is not the title you're looking for
O that we now had here but one ten thousand of those men in England that do no work today!
 

Kalsang

New member
I want to hear the spilling of blood, bile, and the lamentation of the women. I want to see blood, gore and guts, see veins in my teeth, eat dead burnt bodies... you know, play DDO!
That's the answer! We should all go to Alice's Restaurant and chill out for a while. So, how many of you old-fogey nerds get that reference?
 

Kessaran

Well-known member
Full workforce? How would you expect the marketing team to help with an issue like this?
Full workforce as in everyone in the relevant department working on it. Sadly, I just remembered that it's SSG we're talking about, and they probably have 2 people with under 5 years of experience staffing their entire networking department.
 

Col Kurtz

Well-known member
Can I just point out, I don't have any of these problems with golf.

Those are MUCH worse! 🙄👍
Just like skiing... we can get weather delays and cancellations. I should prob be skiing right now, but visibility looks pretty bad up in the local mountains.

On a side note: I may have to start yelling "FORE!" when I drop off a cornice in your honor ;)
 

Toede

Well-known member
By that logic, Microsoft Azure was operational ~3 hours after the incident. Even the 2nd outage you described was resolved ~24 hours later. It's been nearly 36 hours and a video game data center still isn't operational. This is part of their job and they aren't doing it. When your company's entire livelihood depends on the games being operational and making money by keeping customers happy, a critical failure like this should be addressed immediately with the full workforce.
Sorry, fiscally irresponsible.
 

Episkopos

Lawful Good Never Looked This Evil
This image was (allegedly) smuggled from the datacenter.

[attached image]


Things look bleak.
 