Yeah I tried it too. Killed at 65G. Disappointed that Linux killed Chrome first.
Oct 12 15:47:52 x99 kernel: [552390.074468] Out of memory: Kill process 7898 (git) score 956 or sacrifice child
Oct 12 15:47:52 x99 kernel: [552390.074471] Killed process 7898 (git) total-vm:65304212kB, anon-rss:63789568kB, file-rss:1384kB, shmem-rss:0kB
Interesting. Linux didn't kill Chrome, it died on its own.
Oct 12 15:42:21 x99 kernel: [552060.423448] TaskSchedulerFo: segfault at 0 ip 000055618c430740 sp 00007f344cc093f0 error 6 in chrome[556188a1d000+55d1000]
Oct 12 15:42:21 x99 kernel: [552060.439116] Core dump to |/usr/share/apport/apport 16093 11 0 16093 pipe failed
Oct 12 15:42:21 x99 kernel: [552060.450561] traps: chrome trap invalid opcode ip:55af00f34b4c sp:7ffee985fb20 error:0
Oct 12 15:42:21 x99 kernel: [552060.450564] in chrome[55aeffb76000+55d1000]
Oct 12 15:47:52 x99 kernel: [552390.074289] syncthing invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=0, order=0, oom_score_adj=0
Seems Chrome faulted first, but it was probably catching all signals and didn't handle the OOM condition. Then syncthing's allocation failed, which invoked the oom-killer, and it correctly selected 'git' to kill.
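If you're curious how it picks, the kernel exposes the score it uses; something like this lists who's next in line (standard procfs, nothing exotic):

    # highest oom_score gets killed first; oom_score_adj biases the choice
    for p in /proc/[0-9]*; do
      printf '%s\t%s\n' "$(cat "$p/oom_score" 2>/dev/null)" "$(cat "$p/comm" 2>/dev/null)"
    done | sort -rn | head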
I'm setting up my last machine for my wife for gaming: an Athlon X4 630 and 16 GB of RAM. I loaded Windows up and it said it had ~2 GB free, and I was like "oh crap, the RAM sticks must be dead" (because the last motherboard, which I just replaced, broke some RAM slots).
I fixed my old video card, a GTX 560, and wanted to see what it could run. I loaded Steam and PUBG said "invalid platform error". It took me a moment. I hit Alt-PauseBreak and, presto: Windows 32-bit. Whoops.
Hadn't had that problem in a long time, except at clients running ancient Windows Server versions, asking why Exchange 2003 won't work with their iPhones anymore: "it used to work and we didn't change anything!" (Yeah... but the iPhone DID change, including banning your insecure Exchange 2003 protocols.)
Wow, I didn't notice just how much fluctuation there has been in RAM prices. My Newegg order history shows I paid $65 for 16 GB of DDR3/1600 at the end of 2015. Now the exact same product is sold by Newegg for $122. Crazy!
I'm curious how this was uploaded to GitHub successfully. I guess they do less actual introspection on the repo's contents than I thought. Did it wreak havoc on any systems behind the scenes (similar to big repos like Homebrew's)?
From there, you can do some operations like `git log` and `git cat-file -p HEAD` (I use the "dump" alias: `git config --global alias.dump 'cat-file -p'`), but not others like `git checkout` or `git status`.
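Concretely, a minimal sketch of what stays cheap versus what doesn't on this repo (the dump alias is just shorthand; anything object-at-a-time is fine, anything that must expand the full tree is not):

    git config --global alias.dump 'cat-file -p'
    git dump HEAD              # fine: reads one commit object
    git dump 'HEAD^{tree}'     # fine: prints the root tree's entries, one level deep
    git status                 # not fine: has to expand the whole tree first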
Thanks to Jim Weirich and Git Immersion (http://gitimmersion.com/lab_23.html). I never knew the guy, but, ~~8 yrs~~ (corrected below) 3.5 yrs after his passing, I still go back to his presentations on Git and Ruby often.
Thanks for the correction; he truly was a brilliant mind. One of my regrets was not being active and outgoing enough to go meet him myself. I lived in the Cincinnati area from 2007-2012. I first got started with Ruby in 2009 and quickly became aware of who he was (Rake, Bundler, etc.) and that he lived/worked close by. But at the time, I wasn't interested in conferences, meetups, or simply emailing someone to say thanks.
Since GitHub paid a bounty and OK'd its release, perhaps they've already patched some aspects of it. It might be impossible to recreate the issue now.
My naive question is whether the `git` CLI would need, or could benefit from, a patch. Part of me thinks it doesn't, since there are legitimate reasons for each individual aspect of creating the problematic repo. But I probably don't understand Git deeply enough to know for sure.
Visual Studio Team Services has a fundamentally different architecture, but we use some similar mechanisms despite that. (I should do some talks about it, but it's always hard to know how much to say about your defenses lest it give attackers clever new ideas!)
Tested locally on a GitLab instance: trying to push the repo results in a unicorn worker allocating ~3GB and pegging a core, then being killed on a timeout by the unicorn watchdog.
Counting objects: 18, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (17/17), done.
Writing objects: 100% (18/18), 2.13 KiB | 0 bytes/s, done.
Total 18 (delta 3), reused 0 (delta 0)
remote: GitLab: Failed to authorize your Git request: internal API unreachable
To gitlab.example.com:lloeki/git-bomb.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'firstname.lastname@example.org:lloeki/git-bomb.git'
I had "Prevent committing secrets to Git" enable though. Disabling this makes the push work. The repo first then can be browsed at the first level only from the web UI, but clicking in any folder breaks the whole thing down with multiple git processes hanging onto git rev-list.
Huh, I've tested on a bunch of devices/connections and haven't encountered that. Do you know what causes AMP to be that slow for you? I'll take a look at serving non-AMP pages by default. It will require tweaking how image inclusion works.
> You need a problem to have a solution to it. What do you consider to be the problem here?
> This is essentially something that can be expressed in relatively few bytes that expands to something much larger.
That seems to be the problem. I mean, if an object expands to something so much larger that it crashes services through the sheer volume of resources it takes... that is pretty much the definition of a denial-of-service attack vector.
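For anyone who wants to see the mechanism, a harmless miniature of it can be built with git's own plumbing. This is just a sketch with tiny fanout and depth (the real bomb uses far larger numbers, and the names f0/d0 here are made up):

    blob=$(echo hello | git hash-object -w --stdin)
    tree=$(for i in 0 1 2 3; do printf '100644 blob %s\tf%s\n' "$blob" "$i"; done | git mktree)
    for level in 1 2 3; do
      tree=$(for i in 0 1 2 3; do printf '040000 tree %s\td%s\n' "$tree" "$i"; done | git mktree)
    done
    commit=$(git commit-tree -m boom "$tree")
    # 4 levels of fanout 4: 4^4 = 256 logical files from 5 tiny objects

Every level just repeats one child SHA four times, so the object store stays a few KB while the logical tree grows geometrically.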
That's not a solution, that's sweeping the problem under the rug: "just have the OS provide storage, therefore it's not my problem any more, solved. (Never mind that with a few more layers, the tree would decompress into a structure larger than all the storage ever available to mankind)"
One option is to modify each of the utilities so that it doesn't have a full representation of the whole tree in memory. I doubt this is feasible in all cases, though for something like 'git status' it should be doable.
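You can approximate that streaming behavior today with plumbing; a sketch of the idea (the second command's argument is a placeholder for any tree SHA printed by the first):

    git ls-tree HEAD                # one level only: cheap
    git ls-tree <sha-of-an-entry>   # descend on demand, each step is cheap
    # a streamed 'git status' would walk like this lazily; memory stays bounded,
    # though the full walk is still exponential in time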
If the tree object format was required to store its own path, then you wouldn't be able to repeat the tree a bunch of times. The in-memory representation would be the same size, but you would now need that same number of objects in the repository. No more exponential fanout.
But that would kind of defeat the purpose of Git for real use cases (renaming a directory shouldn't make the size of your repo blow up).
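Rough numbers to make that concrete, with generic fanout f and depth d rather than the bomb's exact figures:

    # today: d levels of trees with fanout f, every entry pointing at one shared child
    #   objects stored: d trees + 1 blob        paths expressed: f^d
    # with the path baked into each tree, every subtree at every position hashes
    # differently, so the same layout needs (f^d - 1)/(f - 1) distinct tree objects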
If I understood correctly, this problem isn't caused by recursive copies but simply by expanding references. The example shows that the reference expansion leads to an exponential increase in resources required by the service.
Those amount to the same thing in this context: if git just expanded references one at a time while walking the tree, this wouldn't happen. The bomb works because copies of the expanded references have to be held in memory.
Directory hard links would "fix" this issue since `git checkout` could just create a directory hard link for each duplicated tree. I wonder why traditional UNIX does not support this for any filesystem.
(Yes you would need to add a loop detector for paths and resolve ".." differently but it's not like doing this is conceptually hard.)
You can't make a valid recursive tree without a pre-image attack against SHA-1. However, `git` doesn't actually verify the SHA-1s when it runs most commands. If you made a recursive tree and tried `git status`, it would segfault because the directory walk gets stuck in infinite recursion.
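For what it's worth, the one command that does recompute hashes is fsck; a quick sketch of the contrast:

    git fsck --full   # recomputes and verifies every object's hash and connectivity
    # status/checkout just trust object names, which is why a (hypothetical)
    # self-referencing tree would recurse until the stack blows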