
Do you monitor your product closely enough to know that there weren't other brief outages? E.g. something on the scale of unscheduled server restarts, and minute-long network outages?


I personally do, through status monitors at larger cloud providers at 30-second resolution, and I've never noticed any downtime. They will sometimes drop ICMP, though, even when the host is alive and kicking.
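Since dropped ICMP can look like an outage when it isn't, a TCP-level check is one way to avoid those false alarms. A rough sketch of the idea, not what any particular status monitor actually does: the names `tcp_alive` and `watch`, the 3-second timeout, and the choice of port are all assumptions for illustration.

```python
import socket
import time


def tcp_alive(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, DNS failure: all count as "down".
        return False


def watch(host, port, interval=30.0, checks=None, probe=tcp_alive, sleep=time.sleep):
    """Probe every `interval` seconds and yield (timestamp, ok) per check.

    `checks=None` runs forever; `probe` and `sleep` are injectable so the
    loop can be exercised without a real network or real waiting.
    """
    done = 0
    while checks is None or done < checks:
        yield time.time(), probe(host, port)
        done += 1
        if checks is None or done < checks:
            sleep(interval)
```

Something like `for ts, ok in watch("example.com", 443): ...` would then log one result every 30 seconds. Anything shorter than the interval can still slip through between probes, so the interval sets the smallest outage you can expect to see.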


Surprised they allow ICMP at all


why does this surprise you?

actually, why do people block ICMP? I remember in 1997-1998 there were some Cisco ICMP vulnerabilities and people started blocking ICMP then and mostly never stopped, and I never understood why. ICMP is so valuable for troubleshooting in certain situations.


Security through obscurity, mostly. I don't know who continues to push the advice to block ICMP without a valid technical reason. At best, if you tilt your head and squint, you could almost see a (very new) script kiddie being defeated by it.

I've rarely actually seen that advice anywhere, and more so 20 years ago than now, but people are clearly still getting it from circles I don't run in.


I don’t disagree. I am used to highly regulated industries where ping is blocked across the WAN


I do. Routers, switches, and power redundancy are solved problems in datacenter hardware. Network outages rarely occur because of these systems, and if any component goes down, there's usually an automatic failover. The only thing you might notice is TCP connections resetting and reconnecting, which typically lasts just a few seconds.


Of course. It's a production SaaS, after all. But I don't monitor with sub-minute resolution.


I have for some time now, on the scale of around 20 hosts in their cloud offering. No restarts or network outages. I do see "migrations" from time to time (a VM migrating to different hardware, I presume), but without impact on metrics.


Having run bare-metal servers for a client + plenty of VMs pre-cloud, you'd be surprised how bloody obvious that sort of thing is when it happens.

All sorts of monitoring gets tripped.

And no, there generally aren't brief outages in normal servers unless you did it.

I did have someone accidentally shut down one of the servers once though.


To stick to the above point, this wasn't a minute-long outage. If you care about seconds- or minutes-long outages, you monitor. Running on AWS, Hetzner, OVH, or a Raspberry Pi in a shoe box makes no difference.



