What is not realistic? To do simple input validation on data that has the potent...

raxxorraxor · 2025-11-19T13:06:16 1763557576

Cloudflare's success came from the simplicity of building a distributed system across data centers around the world, one that third-party IT workers could implement while Cloudflare itself was only a few people. There are probably a lot of shitty iPhone apps that do less important work and are vastly more complex than the former Cloudflare server node configuration.

Every system has a non-reducible risk and no data rollback is trivial, especially for a CDN.

aquariusDue · 2025-11-19T12:44:37 1763556277

Yeah, I don't quite understand the people cutting Cloudflare massive slack. It's not about nailing blame on a single person or a team, it's about keeping a company that is THE closest thing to a public utility for the web accountable. They more or less did a Press Release with a call to action to buy or use their services at the end and everybody is going "Yep, that's totally fine. Who hasn't sent a bug to prod, amirite?".

It goes over my head why Cloudflare is HN's darling while others like Google, Microsoft and AWS don't usually enjoy the same treatment.

miyuru · 2025-11-19T13:40:39 1763559639

>It goes over my head why Cloudflare is HN's darling while others like Google, Microsoft and AWS don't usually enjoy the same treatment.

Do the others you mentioned provide such detailed outage reports within 24 hours of an incident? I’ve never seen others share the actual code related to the incident.

Or the CEO or CTO replying to comments here?

>Press Release

This is not a press release; they have done these outage posts since the start of the company.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

aquariusDue · 2025-11-19T16:43:23 1763570603

> Do the others you mentioned provide such detailed outage reports within 24 hours of an incident? I’ve never seen others share the actual code related to the incident.

Azure (albeit pretty old): https://devblogs.microsoft.com/devopsservice/?p=17665

AWS: https://aws.amazon.com/message/101925/

GCP: https://status.cloud.google.com/incidents/ow5i3PPK96RduMcb1S...

The code sample might as well be COBOL for people not familiar with Rust and its error handling semantics.
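For readers in that position, here is a minimal sketch of the error handling semantics in question. It is not the code from the outage report; `parse_count`, `doubled` and `doubled_or_panic` are hypothetical names invented for illustration. It only contrasts propagating an error with `?` against calling `unwrap()`.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a count from text. Failure is returned as a value.
fn parse_count(raw: &str) -> Result<usize, ParseIntError> {
    raw.trim().parse::<usize>()
}

// `?` returns early with the error, so the caller decides what to do with it.
fn doubled(raw: &str) -> Result<usize, ParseIntError> {
    let n = parse_count(raw)?;
    Ok(n * 2)
}

// `unwrap()` yields the value on Ok and panics on Err.
// By default a panic unwinds the current thread.
fn doubled_or_panic(raw: &str) -> usize {
    parse_count(raw).unwrap() * 2
}
```

With bad input, `doubled` hands back an `Err` that the caller can log or replace with a default, while `doubled_or_panic` takes the thread down.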

> Or the CEO or CTO replying to comments here?

I've looked around the thread and I haven't seen the CTO or the CEO here. I'm probably not familiar with their usernames, and that's on me.

> This is not a press release; they have done these outage posts since the start of the company.

My mistake calling them press releases. Newspapers and online publications also skim this outage report to inform their news stories.

I wasn't clear enough in my previous comment. I'd like all major players in internet and web infrastructure to be held to higher standards. As it stands, if you compare them with the tech department of a retail store, the retail store must answer to more laws once the surface area of their combined activities is taken into account.

Yes, Cloudflare excels where others don't or barely bother. I too enjoyed the pretty graphs and diagrams, and I learned some nifty Rust tricks.

EDIT: I've removed some unwarranted snark from my comment which I apologize for.

dspillett · 2025-11-19T12:24:33 1763555073

> To do simple input validation on data that has the potential to break 20% of the internet?

There will always be bugs in code, even simple code, and sometimes those things don't get caught before they cause significant trouble.

The failing here was not having a quick rollback option, or having it and not hitting the button soon enough (even if they thought the problem was probably something else, I think my paranoia about my own code quality is such that I would have been rolling back much sooner just in case I was wrong about the “something else”).
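To make the two ideas in this exchange concrete (validate input before it goes live, and keep a quick rollback option), here is a minimal sketch. Every name in it (`FeatureConfig`, `ConfigStore`, `MAX_FEATURES`) is hypothetical and the limit is arbitrary; it is not Cloudflare's code, just one way such a guard could look.

```rust
// Arbitrary limit chosen for illustration only.
const MAX_FEATURES: usize = 200;

#[derive(Debug, Clone, PartialEq)]
struct FeatureConfig {
    features: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum ConfigError {
    Empty,
    TooManyFeatures { found: usize, max: usize },
}

// Reject a candidate config before it is allowed to replace the active one.
fn validate(candidate: &FeatureConfig) -> Result<(), ConfigError> {
    if candidate.features.is_empty() {
        return Err(ConfigError::Empty);
    }
    if candidate.features.len() > MAX_FEATURES {
        return Err(ConfigError::TooManyFeatures {
            found: candidate.features.len(),
            max: MAX_FEATURES,
        });
    }
    Ok(())
}

struct ConfigStore {
    active: FeatureConfig,
    previous: Option<FeatureConfig>,
}

impl ConfigStore {
    fn new(initial: FeatureConfig) -> Self {
        ConfigStore { active: initial, previous: None }
    }

    // Swap in the candidate only if it validates; on error the active config is untouched.
    fn apply(&mut self, candidate: FeatureConfig) -> Result<(), ConfigError> {
        validate(&candidate)?;
        let old = std::mem::replace(&mut self.active, candidate);
        self.previous = Some(old);
        Ok(())
    }

    // The "rollback button": restore the previous config if one was kept.
    fn rollback(&mut self) -> bool {
        match self.previous.take() {
            Some(prev) => {
                self.active = prev;
                true
            }
            None => false,
        }
    }
}
```

A bad candidate leaves the node serving the last known good config, and `rollback` covers the case where a config passes validation but still turns out to be wrong. As noted upthread, rolling back data across a real CDN is far less trivial than this single-process sketch.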