
I think one reason is that people are just bad at statistics. Chance of materialization * impact = small. Sure. Over a short enough time that's true for any kind of risk. But companies tend to live for years, decades even, and sometimes longer than that. If we're going to put all of those precious eggs in one basket, then as long as the basket is substantially stronger than the eggs we're fine, right? Until the day someone drops the basket. And over a long enough time span, any risk with a non-zero chance eventually materializes. So we're playing this game, and usually we come out ahead.

But trust me, 10 seconds after this outage is solved, everybody will have forgotten about the possibility.
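
To put rough numbers on the "long enough time span" point: a minimal sketch, assuming the risk has a fixed and independent chance of materializing each year. The 1% annual figure and the function name are made up for illustration, and real outages are neither independent nor constant in probability.

```python
def prob_at_least_once(p_per_year: float, years: int) -> float:
    """Chance that an event with independent annual probability
    p_per_year happens at least once within `years` years."""
    # The chance of avoiding it every single year is (1 - p) ** years;
    # the complement is the chance it happens at least once.
    return 1.0 - (1.0 - p_per_year) ** years


if __name__ == "__main__":
    # Hypothetical 1% annual risk, viewed over increasingly long horizons.
    for years in (1, 10, 30, 100):
        print(f"{years:>3} years: {prob_at_least_once(0.01, years):.1%}")
```

The per-year number looks negligible, but the cumulative one keeps climbing toward 1 as the horizon grows, which is the basket getting dropped eventually.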



Absolutely, but the cost of perfection (100% uptime in this case) is infinite.

As long as the outages are rare enough and you automatically fail over to a different region, what's the problem?


Often simply the lack of a backup outside of the main cloud account.


Sure, but on a typical outage how likely is it that you'll have that all up and running before the outage is resolved?

And secondly, how often do you create that backup, and are you willing to lose the writes since the last backup?

That backup is absolutely something people should have, but I doubt such backups are ever used to bring a service back up. That would take a monumental failure of your hosting provider (colo/cloud/whatever).


> Sure, but on a typical outage how likely is it that you'll have that all up and running before the outage is resolved?

Not likely, but if some Amazon flunky decides to kill your account to protect the Amazon brand, then you will at least survive, even if you lose some data.



