I can’t believe this isn’t satire. I hope these incompetent fuckers get sued into bankruptcy
“All of CrowdStrike understands the gravity and impact of the situation”
Here’s $10.
Or: next time go Linux instead
You’re not safe there either; they had almost the same issue on the Linux version of the product a few months ago.
Concerning Linux: yesterday I was watching this Computerphile video on the CrowdStrike incident. https://www.youtube.com/watch?v=rlaNMJeA1EA (*)
What is interesting is the comment made in the video on how Chromebooks do software upgrades with dual “OS” disk partitions and the ability to roll back to the previous OS partition.
Question: is something like this also possible on one of the major Linux distros (Debian, Ubuntu, Rocky, …)? What would be the procedure to do this kind of “dual partition” system upgrade?
(*) A great video that explains some of the technical details very clearly, including some very interesting ‘lessons learned’ and “what if”s. If you ever need to explain CrowdStrike to your manager, this video is a good start.
If I’m understanding the question right, this is what immutable Linux distros do, such as NixOS, Fedora Silverblue, and Vanilla OS.
I use NixOS myself, but it’s quite different from most distros in the way you configure it and install packages. For the better, in my opinion.
Something like Silverblue works pretty much the same as normal Fedora, except you can’t install packages like you normally would because the system files can’t be edited. You mostly use Flatpak for everything except system updates, which need a reboot to switch to the new updated image. Past images are saved, though, so you can roll back if needed.
From what I understand, ChromeOS is an immutable Linux distro, same as the ones I mentioned, just with Google built in.
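To make the image-and-rollback idea concrete, here is a toy model in Python. It is only a sketch: the `ImageStore` class and its method names are made up for illustration, and this is not how Silverblue, NixOS, or ChromeOS actually implement it. It just shows the pattern described above: an update is staged as a new image, it becomes active on reboot, and the previous image stays around so you can roll back.

```python
class ImageStore:
    """Toy model of image-based updates with rollback (hypothetical, for illustration)."""

    def __init__(self, initial_image: str) -> None:
        self.images = [initial_image]  # every image deployed so far, oldest first
        self.active = 0                # index of the image currently running
        self.pending = None            # index of an image staged for the next boot

    def stage_update(self, new_image: str) -> None:
        # The running image is never modified; the update becomes a new entry.
        self.images.append(new_image)
        self.pending = len(self.images) - 1

    def reboot(self) -> str:
        # A staged image only takes effect on reboot.
        if self.pending is not None:
            self.active = self.pending
            self.pending = None
        return self.images[self.active]

    def rollback(self) -> str:
        # Switch back to the previous image (immediately, in this toy model).
        if self.active == 0:
            raise RuntimeError("no older image to roll back to")
        self.active -= 1
        self.pending = None
        return self.images[self.active]


store = ImageStore("os-v1")
store.stage_update("os-v2")
print(store.images[store.active])  # os-v1 (still running the old image until reboot)
print(store.reboot())              # os-v2
print(store.rollback())            # os-v1
```

The point of the design is that a bad update never overwrites the last known-good system; it only adds a new image next to it.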
Yes, that was indeed the question.
If I read it correctly, you need a specialised distro for this. You cannot do this on an off-the-shelf Debian or Ubuntu?
I’ll do some searching on ‘immutable Linux’. Thanks for the (very quick) answer! 😀
I think the answers given here don’t quite fit the question.
Android and Windows have dedicated recovery partitions sectioned off on the disk that the OS never boots to and does not interact with during normal system operation.
If something goes wrong with the OS, then a signal is sent to the BIOS or other non-OS system saying “hey, recover from this partition”.
Btrfs, NixOS, Guix, and other immutable (file-)systems implement this by having a file system hierarchy protected by various permissions and softlinks to create a checkpoint of sorts, managed by a dedicated service that runs with the OS during normal system operation.
The drawback of these systems is that if something does go wrong with the OS, it cannot fall back on the BIOS to save it. The OS has to somehow signal to itself that it needs to restore from an earlier checkpoint.
Just watched some videos on Btrfs. I’m starting to understand the concepts. Perhaps I should also look into how exactly
On Windows and the “recovery partition”: I guess what you’re saying is that it should always be possible to boot into some kind of system, but it will not happen automatically, as there is no way for a system to detect that it has completely hung.
Thinking about it, it’s kind of strange. Embedded systems have watchdog interrupts that fire if the system hangs (i.e. if it does not provide a “yes, I still live” signal every “x” milliseconds). Does a PC not have something similar?
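For what it’s worth, many PCs and servers do have a hardware watchdog timer, and Linux exposes one through /dev/watchdog when a driver for it is loaded. The sketch below does not touch real hardware; it is a toy Python model of the “yes, I still live” idea, with a made-up `Watchdog` class and a fake clock so it runs without sleeping.

```python
import time
from typing import Callable


class Watchdog:
    """Toy software watchdog (hypothetical, for illustration only)."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_kick = clock()

    def kick(self) -> None:
        # The "yes, I still live" signal.
        self.last_kick = self.clock()

    def expired(self) -> bool:
        # True once no kick has arrived within the timeout.
        return self.clock() - self.last_kick > self.timeout_s


# Fake clock so the example does not need to sleep.
now = [0.0]
dog = Watchdog(timeout_s=1.0, clock=lambda: now[0])

now[0] = 0.5
dog.kick()
now[0] = 1.2
print(dog.expired())  # False: only 0.7 s since the last kick
now[0] = 2.0
print(dog.expired())  # True: 1.5 s without a kick, a real watchdog would reset here
```

A real hardware watchdog works the same way in spirit, except that the timer lives outside the OS, so it can still fire when the OS is completely hung.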
This is very misleading!
CrowdStrike did not send gift cards to customers or clients. We did send these to our teammates and partners who have been helping customers through this situation. Uber flagged it as fraud because of high usage rates.
I mean, it makes it a little better, but I’d still be annoyed by it just being 10 bucks.
They might as well not do it. I’d be more insulted than a boss throwing a pizza party
I lost a day’s holiday, and our team spent 8 man days on this entirely preventable mistake.
$10? Try extending our licence by another year for free, that might start going towards it.
Why would you want another year of their software for free? This is their second screw-up (apparently they sent out a bad update that affected some Debian and RHEL machines a few months ago). I’d be transitioning to a competitor at the first opportunity. It seems they aren’t testing releases before pushing them out to customers, which is about as crazy to me as running alpha software on a production system.
I’m sure you have reasons, and this isn’t really meant to be directed at you personally, it’s just boggling to me that the IT sector as a whole hasn’t looked at this situation and collectively said “fuck that.”
[This comment has been deleted by an automated system]
Nah, I don’t buy that. When you’re in critical infrastructure like that it’s your job to anticipate things like people being above or below versions. This isn’t the latest version of flappy bird, this is kernel level code that needs to be space station level accurate, that they’re pushing remotely to massive amounts of critical infrastructure.
I won’t say this was one guy, and I definitely don’t think it was malicious. This is just standard corporate software engineering, where deadlines are pushed to the max and QA is seen as an expense, not an investment. They’re learning the harsh realities of cutting QA processes right now, and I say good. There is zero reason a bug of this magnitude should have gone out. I mean, it was an empty file of zeroes. How did they not have any pipelines to check that file, code in the kernel itself to validate the file, or anyone putting eyes on the file before pushing it?
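As a rough idea of how cheap such a check can be, here is a minimal Python sketch of a pre-release sanity check that rejects an empty or all-zero file. The function name and the `CHNL` header are invented for the example; this is not CrowdStrike’s actual file format or pipeline.

```python
def looks_like_valid_content(data: bytes, magic: bytes = b"CHNL") -> bool:
    """Reject obviously broken files. `magic` is a made-up header, not a real format."""
    if len(data) == 0:
        return False               # empty file
    if not any(data):
        return False               # nothing but zero bytes
    return data.startswith(magic)  # must begin with the expected header


print(looks_like_valid_content(bytes(1024)))          # False (all zeroes)
print(looks_like_valid_content(b"CHNL" + bytes(16)))  # True
print(looks_like_valid_content(b"junk"))              # False (wrong header)
```

A check like this would not catch every bad file, but it is the kind of gate that could run both in the release pipeline and again in the code that loads the file.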
This is a massive company wide fuckup they had, and it’s going to end up with them reporting to Congress and many, many courts on what happened.