Software Heritage Archive

(archive.softwareheritage.org)

30 points | by mxschumacher 2079 days ago

2 comments

  • dddddaviddddd 2078 days ago
    The article describing the archive [0] suggests it be used for research in computer science, as well as for preserving software culture. Commercial uses envisioned:

    > Software Heritage makes two key contributions to the IT industry that can be leveraged in software processes. First, Software Heritage intrinsic identifiers can precisely pinpoint specific software versions, independently of the original vendor or intermediate distributor. This de facto provides the equivalent of “part numbers” for FOSS components that can be referenced in quality processes and verified for correctness independently from Software Heritage (they are intrinsic, remember?).

    > Second, Software Heritage will provide an open provenance knowledge base, keeping track of which software component—at various granularities: from project releases down to individual source files—has been found where on the Internet and when. Such a base can be referenced and augmented with other software-related facts, such as license information, and used by software build tools and processes to cope with current development challenges

    [0] https://hal.archives-ouvertes.fr/hal-01590958/file/ipres-201...
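
To make "intrinsic" concrete: as far as I understand the identifier scheme Software Heritage publishes (SWHID), the identifier for a file's content is derived purely from its bytes, using the same hash Git uses for blobs. That is why it can be checked without asking Software Heritage. A minimal sketch under that assumption; `content_swhid` is a hypothetical helper name for illustration, not a function from any Software Heritage library:

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return a SWHID-style identifier for raw file content.

    Assumption: the content identifier is the Git blob hash, i.e. SHA-1
    over the header b"blob <length>\\0" followed by the file bytes.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


def verify_content(data: bytes, claimed_id: str) -> bool:
    """Recompute the identifier locally and compare it with the claimed one."""
    return content_swhid(data) == claimed_id
```

Anyone holding the bytes can call `verify_content` on them, so the "part number" does not depend on trusting the vendor, the distributor, or the archive itself.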

  • gsaga 2078 days ago
    The page says that this archive has 4,782,131,719 source files and 4,188,748,858 directories. Isn't that weird? These numbers suggest that there are fewer than 5 files for every 4 directories on average. I would expect the first number to be much higher than the second one.
    • ComputerGuru 2078 days ago
      Perhaps there are long nested paths, e.g. `/usr/local/share/foo/{many files}` with nothing in the directories before `/foo` besides the child directory?
      • randoramax 2077 days ago
        The effects of com.java.some.class.there.and.here maybe :)
      • randoramax 2077 days ago
        The effect of Com.java.some.class.name ?
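
For anyone who wants to check the arithmetic and the nesting guess from this subthread, here is a rough sketch. The two totals are the ones quoted above; `count_files_and_dirs` is a hypothetical helper that counts every ancestor directory of a list of relative paths. It is only a toy model of deep nesting and does not claim to reproduce how the archive actually counts its objects.

```python
from pathlib import PurePosixPath

# Totals quoted in the thread.
FILES = 4_782_131_719
DIRECTORIES = 4_188_748_858

# Average number of files per directory (about 1.14).
FILES_PER_DIRECTORY = FILES / DIRECTORIES


def count_files_and_dirs(paths):
    """Count distinct files and distinct ancestor directories.

    Assumes `paths` are relative POSIX-style file paths.
    """
    dirs = set()
    for p in paths:
        # .parents yields every ancestor, ending with "." for relative paths.
        for parent in PurePosixPath(p).parents:
            if str(parent) != ".":
                dirs.add(parent)
    return len(set(paths)), len(dirs)
```

A single Java-style path such as `src/com/example/some/pkg/there/Main.java` already contributes six directories for one file, so deep package layouts can pull the overall ratio toward 1 even when leaf directories hold several files each.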