DDO unavailable: Saturday March 30th

Xaerxiessia

Lost in Translation
Database backups come in different flavours for different purposes. I work with MS SQL Server databases and can't speak for other systems, but there are three different backup types: Full, differential/incremental and log.

It is customary to make a full backup only once in a while (days, weeks, months) depending on the size and activity of the database and then take differential/incremental backups with a higher frequency to create restore points where you only need to restore the latest full and the latest differential.........
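To make the "latest full plus latest differential" idea concrete, here is a minimal sketch in Python. It is not anything SQL Server itself provides: the `Backup` record and the `restore_chain` helper are hypothetical names made up for illustration. It also assumes each differential is cumulative since the most recent full backup, and it leaves log backups out entirely.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    """Hypothetical record describing one backup file."""
    kind: str            # "full" or "diff"
    taken_at: datetime   # when the backup was taken


def restore_chain(backups, target):
    """Pick the backups to restore, in order, for the newest restore
    point at or before `target`.

    Assumption: a differential holds every change since the most recent
    full backup, so only the latest differential is needed.
    """
    usable = [b for b in backups if b.taken_at <= target]
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the target time")

    # Start from the most recent full backup.
    base = max(fulls, key=lambda b: b.taken_at)

    # Only differentials taken after that full backup belong to it.
    diffs = [b for b in usable
             if b.kind == "diff" and b.taken_at > base.taken_at]

    chain = [base]
    if diffs:
        chain.append(max(diffs, key=lambda b: b.taken_at))
    return chain
```

With a weekly full and a differential every couple of days, `restore_chain` returns two entries at most, which is the point of the scheme: restore time does not grow with the number of differentials taken since the last full.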
This kind of infra runs on UNIX, and if the architecture is seriously designed, the DBs are deployed on a SAN.
With a SAN you can synchronize spare volumes across the RAID layers live, for real-time backup. Volume managers are designed for this purpose. In that case the backup obviously needs the time it takes to stop and restart the DB engine.

And if you're really skilled, you put not only the DB on the SAN, but the systems too (not on the same volumes, of course). I designed such an infra 22 years ago.
 

Evangelon

Member
We are continuing to work overnight to address issues preventing the game services from being available. If something changes overnight we will let you know, otherwise we will be back in touch in the morning. As to the cause, we have significant hardware that is not able to communicate properly, and we've engaged technical operations folks all day and night here to work on it at the data center.
Thanks for the update, I don't envy your tech team the pressure and panic.

It may make things easier if you forget all that fibre and flexfabric modern-day nonsense and embrace a true DnD solution: https://www.bbc.com/news/technology-42338067 , wet string! Well, rope. 3.5 Mbps!
 

Sarandra

Well-known member
Based on what I read, my 'guess' is an authentication issue, not simply a matter of restoring backups.
The guessing game, OK.

1. The key people that know what's what are on holiday. They can't be arsed (rightfully) to return to office/desktop, since they took a plane and don't have their work environment near them.
2. Some database or settings file(s) or whatever got corrupted, making restarting the servers not an option.
3. Whatever, they should have safeguards in place to prevent this situation, making quick startups in case of disaster an option. Protocols need to be set up beforehand. It all boils down to being prepared for disasters. This requires work before stuff happens... and possibly they haven't done said work.

In any case, I've resigned myself to not playing this weekend, and possibly into next week. I hope this turns into a learning event.
 

raesene

Member
If we're playing the guessing game, I'd say this "we have significant hardware that is not able to communicate properly" is the key phrase, and my guess would be that it points to a loss of networking, which could be NICs, switches, or routers (or some combination of the three).

Without knowing the hardware in use, it's tricky to say how hard that would be to replace. Have there ever been any details provided on the hardware/software stack that DDO uses?
 

Micki

Thazara of Orien
I find myself mildly concerned that it could be an issue that will cause loss of data. But seeing as they said that it's an issue with communication, it's probably a networking issue in one way or another, which should have no effect on the data.
 

Shardrena

Well-known member
The key is that us keyboard warriors don't know what's going on. None of us are part of the IT team, we don't have access to the hardware, and any guesses as to what's going on are blind guesses. Educated blind guesses for some, but still blind guesses.
 

Sarandra

Well-known member
I find myself mildly concerned that it could be an issue that will cause loss of data. But seeing as they said that it's an issue with communication, it's probably a networking issue in one way or another, which should have no effect on the data.
How long could it possibly take to solve a networking issue? Shouldn't a datacenter have failover hardware in case stuff breaks down?

I'm betting something got corrupted.
 

Fnordian

Member
Overheard from the server room cleaning staff:
1. "Looks like someone accidentally spilled a potion of server shutdown. Don't worry, I've got my mop ready!"
2. "Well, seems like the servers decided to take a coffee break without informing anyone. Classic!"
3. "I swear, if I find another kobold chewing on the network cables, I'm going to start charging them rent!"
4. "Whoops! I think I mistook the 'Clean Server Room' button for the 'Shut Down Game' button. My bad!"
5. "I heard the servers went down because a wizard tried to cast 'Serverus Crashtotum.' Needless to say, it backfired!"
6. "You know, I always suspected the server room doubled as a secret disco, but this is taking it a bit too far!"
7. "I blame the gnomes. They're always tinkering with things they shouldn't be!"
8. "I swear, if I find another gelatinous cube trying to devour the servers, I'm going to switch to ghostbusting instead of cleaning!"
9. "I think the servers are just trying to unionize for better working conditions. Can't blame them, really!"
10. "Looks like the hamster powering the servers went on strike again. Time to give him a pep talk and some extra treats!"
 