Cool URIs don't change (1998)

(w3.org)

309 points | by Tomte 1041 days ago

27 comments

  • jasode 1041 days ago
    >When you change a URI on your server, you can never completely tell who will have links to the old URI. [...] When someone follows a link and it breaks, they generally lose confidence in the owner of the server.

    With 2+ decades to look back on after this was written, it turns out the author was wrong about this. Web surfers often get 404 errors or invalid deep links get automatically redirected to the main landing page -- and people do not lose confidence in the site owner. Broken links to old NYTimes articles or Microsoft blog posts don't shake the public's confidence in those well-known companies. Link rot caused by companies reorganizing their web content is just accepted as a fact of life.

    This was the 2013 deep link I had to USA Treasury rates: http://www.treasurydirect.gov/RI/OFNtebnd

    ... and the link is broken now in 2021. I don't think that broken link shakes the public's confidence in the US government. Instead, people just use Google/Bing to find it.

    • tialaramex 1041 days ago
      > people do not lose confidence in the site owner

      I don't agree. I know Microsoft will move or even outright delete important content because chasing shiny new ideas will get somebody promoted. And so I direct people away from Microsoft solutions because that's going to make my life easier.

      Any link to Microsoft documentation that doesn't both go via aka.ms and have a significant user community to ensure the aka.ms link stays correct will decay until it's useless.

      So by the time you read this https://aka.ms/trustcertpartners may point at a page which is now styled as a wiki or a blog post or a video, but it will get you the list of CAs trusted by Microsoft's products.

      However links from that page (e.g. to previous lists) are internal Microsoft links and so, unsurprisingly, they've completely decayed and are already worthless.

      For the US Treasury, just like Congress or the Air Force, I don't have some alternative choice, so it doesn't really matter whether I think they're good or bad at this. But Microsoft is, to some extent at least, actually in a competitive business and I can just direct people elsewhere.

      • jasode 1041 days ago
        >I don't agree. [...] And so I direct people away from Microsoft solutions

        I don't doubt there is a narrower tech audience (e.g. HN) that would base tech stack strategy on which company has the most 404 errors but my comment was specifically responding to the author's use of "generally".

        I interpreted "they generally lose confidence in the owner of the server" as making a claim about the psychology of general-public, non-techie websurfers. TBL had an intuition about that, but it didn't happen. Yes, people are extremely annoyed by 404 errors, but history has shown that websurfers generally accept them as a fact of life.

        In any case, to follow up on your perspective, I guess it's possible that somebody avoided C# because the Microsoft website has too many 404 errors and chose Sun Java instead. But there were lots of broken links on java.sun.com as well (before and after the Oracle acquisition), so I'm not sure it's a useful metric.

        • xmprt 1041 days ago
          Why can't it be both? They accept it as a fact of life but also lose confidence. I understand why links get broken and accept that it can happen, but when it does, I try to stay away from that website. Non-techie websurfers probably don't even understand why the link is broken, so they get even more confused. On one hand they might think it's their fault, but on the other hand, they might assume the entire website stopped working.
      • catblast01 1040 days ago
        > And so I direct people away from Microsoft solutions because that's going to make my life easier.

        Ah, so that’s why Microsoft’s financials have been tanking.

      • Spare_account 1041 days ago
        >I know Microsoft will move or even outright delete important content because chasing shiny new ideas will get somebody promoted. And so I direct people away from Microsoft

        Do the other providers that you recommend in place of Microsoft have a better record of not changing/removing URIs on their website?

        • tialaramex 1041 days ago
          Yes. You might be underestimating just how bad Microsoft are at this.

          I was originally going to say that although the links on the Microsoft page currently work, they can't be expected to stay working. They had worked when I last needed them, after all. But that was at least a year ago, so I followed one, and sure enough Microsoft had in the meantime once again re-organised everything and broken them all ...

          Even the link from this page to their current list as XML (on Microsoft's site) is broken: it gives you some XML, as promised, but I happen to know the correct answer and it's not in that XML. Fortunately the page has off-site links to the CCADB, and the CCADB can't be torched by somebody in a completely different department who wants to show their boss that they are an innovator, so the data in there is correct.

          Microsoft provides customer-facing updates for this work, by the way. The current page about that assures you that updates happen once per month except December, and tells you about January and February's changes. Were there changes in March, April, or May? Actually I know there were, because Rob Stradling (not a Microsoft employee) mirrors crucial changes to a GitHub repository. Are they documented somewhere new that I couldn't find? Maybe.

      • avipars 1041 days ago
        thank god for the wayback machine ;)
    • bartread 1041 days ago
      > and people do not lose confidence in the site owner.

      No, but some of us do get incredibly pissed off with them.

      It's unendingly tiresome to find that some piece of content has disappeared from the web, or been moved without a redirect. Often archive.org can help but there's plenty of stuff that's just ... gone.

      I don't necessarily run into this problem every day, but anything up to two or three times a week depending on what I'm looking for.

      • npteljes 1041 days ago
        I see this the same way as ads being perceived as annoying. People can complain all the time, but this annoyance, or in this case distrust, just doesn't seem to affect anything.
        • parafactual 1040 days ago
          Gwern found that placing banner ads on his site significantly decreased traffic: https://www.gwern.net/Ads
          • gwern 1040 days ago
            It's a good comparison. An effect can be real, and people just not notice it, ever. Let's round off the ad effect to 10% loss of users. How do you notice that? You can simulate out traffic, and decrease the mean 10% at some random point and draw time-series: it's actually quite hard to see, particularly with any kind of momentum or autocorrelation trends. And that's with website traffic where you can quantify it. How do you measure the impact on something more intangible? If people can not notice that effect of ads, they can certainly not notice subtler effects like the long-term harm of casually breaking links...
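
            A toy version of that simulation, in Python with entirely made-up numbers, shows how hard a 10% step is to spot by eye:

              import random

              random.seed(0)
              BASELINE = 10000.0
              drop_day = random.randrange(100, 265)   # ads (or broken links) start costing users here
              level, traffic = BASELINE, []
              for day in range(365):
                  # autocorrelated daily visits: momentum toward the baseline plus noise
                  level = 0.9 * level + 0.1 * BASELINE + random.gauss(0, 300)
                  scale = 0.9 if day >= drop_day else 1.0   # permanent 10% loss of users
                  traffic.append(level * scale)
              # Plot or eyeball `traffic`: the change point is rarely obvious.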

            Is <10% worth caring about? It's certainly not the difference between life and death for almost everyone; no successful business or website is going to shut down because they had ads, or because they broke old URLs. On the other hand, <10% is hardly trivial, people work hard for things that gain a lot less than that, and really, is defining redirects that hard?

            • sbierwagen 1039 days ago
              Speaking of noticing small effects, Mechwarrior Online was four months into its open beta before anyone noticed that an expensive mech upgrade that was supposed to make your weapons fire 5% faster actually made them fire 5% slower. http://www.howtospotapsychopath.com/2013/02/20/competitively...
            • parafactual 1039 days ago
              This principle seems like it extends far beyond site traffic, especially since something like life satisfaction is much harder to measure.
              • gwern 1039 days ago
                Yes, it's a kind of slippery slope: as typically set up, changes are biased towards degrading quality. If you run a normal significance test and your null is leaving it unchanged, then you will only ever ratchet downwards in quality: you either leave it unchanged, or you have a possibly erroneous choice which trades off quality for degradation, and your errors will accumulate over many tests in a sorites. I discuss this in the footnote about Schlitz - to stop this, you need a built-in bias towards quality, to neutralize the bias of your procedures, or to explicitly consider the long-term costs of mistakes and to also test quality improvements as well. (Then you will be the same on average, only taking tradeoffs as they genuinely pay for themselves, and potentially increasing quality rather than degrading it.)
      • gwoplock 1041 days ago
        I swear this happens every time I visit a forum. Either the one image I need has been purged or every post links to some other dead forum. I've had pretty bad luck with archive.org for these things.
    • sacado2 1041 days ago
      Yes, because the examples you give already have a solid reputation.

      You can get away with it when you are "the government .gov", but if you are "small start-up.com" and I want to invest in your risky stocks and all I get is an "Oops page" when I want to know a little more about your dividend policy, I'm gone.

      • organsnyder 1041 days ago
        For me it depends on where the link originated. If it's a 404 within your own site, that shows sloppiness that will make me question your ability to run a business. However, if I get a 404 on a link from an external site, I'll assume things have been reorganized and try to find the new location myself (while wondering why it was moved). Changing URIs without setting up forwarding is so common I wouldn't give it much thought.
    • kbenson 1041 days ago
      > Web surfers often get 404 errors or invalid deep links get automatically redirected to the main landing page -- and people do not lose confidence in the site owner.

      I do, to a degree. When I follow some old link to a site and realize they've changed their structure and provided no capability to find the original article if it's still available but at a new URI, I lose confidence.

      Not in them overall, but in their ability to run a site where it's easy to find what you want and whose usefulness over time isn't constantly undermined. I lose confidence in their ability to do that, which affects how I view their site.

      > ... and the link is broken now in 2021. I don't think that broken link shakes the public's confidence in the US government.

      Maybe not the government itself, but their ability to provide information through a website usefully, in a way that's easy to understand and retains its value? I'm not sure most people ever had the confidence required to lose it in the first place.

    • c22 1041 days ago
      Consumers will accept a lot of abuse before their faith is completely shaken, especially with large companies. There's still nothing cool about it.
    • nfriedly 1041 days ago
      I disagree. In your example, it doesn't cause me to lose faith in the US Treasury itself, but I am less confident that they can reliably and correctly run a website.
    • aqsalose 1041 days ago
      It is more like, I lose confidence in whatever claims (if any) the links were meant to support and confidence that the linked-to site can be linked to as a source.
    • nitwit005 1041 days ago
      I suspect it depends on what the existing reputation is. I recall a discussion here on Hacker News where someone asked why people weren't using Google+ for software blog posts anymore, and the response was that they'd broken all the links.

      I'm sure that wouldn't have been an issue if it was super popular and respected, but it was already fading away at that point.

    • robertlagrant 1041 days ago
      The people who have to navigate the MS documentation are not generally the same people who choose to use MS as a supplier. If you're that big and rich you will have totally separate engagements with different layers of a business. For most suppliers that's not the case, and documentation is a competitive advantage.
    • _greim_ 1041 days ago
      I don't know. If I consistently see 404s, I eventually associate that domain with "broken" and start avoiding it. Especially in search scenarios where there's lots of links to choose from.
    • seniorThrowaway 1041 days ago
      I've seen some crazy stuff driven by the fear of broken links. One place I worked had all the traffic for their main landing page hitting a redirect on a subsystem, because the subsystem had a URL that ranked higher in Google etc. I worked on the subsystem, and rather than fix things in DNS and the search engines, they preferred to expect us to keep our webservers up at all times to serve the redirect. We were on physical hardware in those days, and while we had redundancy it was all in one location. Made for some fun times.
    • bee_rider 1041 days ago
      Well, the claim is that people "generally lose confidence," which I'd interpret as a decrease in confidence, not a total destruction of confidence. Microsoft and the US Treasury have some dead links, sure, but they run big sites. The vast majority of links that you'd encounter through normal browsing lead to reasonable places.
    • la_fayette 1041 days ago
      Yes, I agree to the point that I wouldn't lose confidence in the company.

      However, it is extremely annoying, and it happened to me just recently with Microsoft when I was researching PWAs... I had bookmarked a link to their description of how PWAs are integrated into the Microsoft Store. I could only find it on archive.org...

    • npteljes 1041 days ago
      I agree. It matters a lot more if the content is (re)discoverable.
  • osobo 1041 days ago
    Ah, the digital naiveté of the nineties. Nowadays, cool URLs get changed on purpose so you have to enter the platform through the front instead of bouncing off your destination.
    • gary_0 1041 days ago
      And they get loaded down with query string fields for analytics. I kill those with https://github.com/ClearURLs/Addon
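
      Roughly, that kind of cleanup amounts to something like this Python sketch (the parameter list is illustrative and far from complete):

        from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

        def strip_tracking(url, prefixes=("utm_",), names={"fbclid", "gclid"}):
            # Drop query parameters that match known tracking prefixes or names.
            parts = urlsplit(url)
            kept = [(k, v) for k, v in parse_qsl(parts.query)
                    if not k.startswith(prefixes) and k not in names]
            return urlunsplit(parts._replace(query=urlencode(kept)))

        print(strip_tracking("https://example.com/a?id=7&utm_source=news&gclid=x"))
        # -> https://example.com/a?id=7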
      • BeFlatXIII 1041 days ago
        I wonder if there is a similar extension that leaves in the UTM parameters but fills them with curse words and other sophomoric junk data.
        • jraph 1040 days ago
          Good idea, but risky if not done right or not adopted by many people. You'd be increasing your specificity and therefore the ability to track you.
          • BeFlatXIII 1040 days ago
            Good point. If you’re the only user with utm_origin=boogers&utm_medium=poop, it’ll be trivial to connect you between websites. It’s all automated, so there’s slim to no chance of making a server admin chuckle while checking the logs, unfortunately.
      • toastal 1041 days ago
        I keep checking to see when this add-on will get approved to Firefox Android's anointed list
      • TremendousJudge 1040 days ago
        What's happening with this project? The rules repo hasn't been updated for months even with several PRs
    • slver 1041 days ago
      I'm not aware of anyone doing this on purpose. It's bad for bookmarks, it's bad for SEO, bad for internal linking, and external linking.

      URLs change because systems change.

    • chrisweekly 1041 days ago
      Characterizing TBL as naive is absurd. He was right. And the URLs you describe are anything but cool.
      • jimlikeslimes 1041 days ago
        I'm fairly sure the parent was joking and in full agreement with you. Just so this isn't purely a point-out-the-joke post: a few years ago I saw an HN post from a bunch of people who used print-to-PDF and pdfgrep as a bookmarking solution. It doesn't solve the original problem, but it does act as a coping strategy for when content goes missing. I've been using it for a good while now and it's been really useful already.
    • jaimex2 1041 days ago
      Fantastic way to lose your traffic though.

      Someone follows a link, gets a 404 or main page and they're gone.

      • bojan 1041 days ago
        It's not traffic they are after, but monetization. The traffic that does not monetize only costs money.
        • ninkendo 1041 days ago
          Sigh. I remember when people used to put up websites for fun.
      • antifa 1039 days ago
        Also when example.com/2010/10/10/article-title redirects to the homepage m.example.com
    • npteljes 1041 days ago
      Up there with the Netiquette. I'd love a timeline where this matters, but this ship has sailed long, long ago.
  • dang 1041 days ago
    Past related threads:

    Cool URIs Don't Change (1998) - https://news.ycombinator.com/item?id=23865484 - July 2020 (154 comments)

    Cool URIs Don't Change - https://news.ycombinator.com/item?id=21720496 - Dec 2019 (2 comments)

    Cool URIs don't change. (1998) - https://news.ycombinator.com/item?id=21151174 - Oct 2019 (1 comment)

    Cool URIs don't change (1998) - https://news.ycombinator.com/item?id=11712449 - May 2016 (122 comments)

    Cool URIs don't change. - https://news.ycombinator.com/item?id=4154927 - June 2012 (84 comments)

    Tim Berners-Lee: Cool URIs don't change (1998) - https://news.ycombinator.com/item?id=2492566 - April 2011 (25 comments)

    Cool URIs Don't Change - https://news.ycombinator.com/item?id=1472611 - June 2010 (1 comment)

    Cool URIs Don't change - https://news.ycombinator.com/item?id=175199 - April 2008 (9 comments)

  • mustardo 1041 days ago
    Ha, tell that to anyone at *.microsoft.com. I swear every link to an MS doc / MSDN site I have clicked 404s or goes to a home page.
    • mavhc 1041 days ago
      Yesterday I found a link on a MS document that went via a protection.outlook.com redirect, and another link on the same page linked to itself
    • prepend 1039 days ago
      This is just a trick to hide that the documentation is terrible and useless. If we get a "this content doesn't exist" page, then we think there is an answer to our question, we just can't find it. We eventually give up, blaming ourselves for not finding it, instead of finding it and discovering something stupid and not useful.
    • ape4 1041 days ago
      They can't conceive of anyone coming to a page from anywhere other than an internal recently generated page :(
    • vmateixeira 1041 days ago
      And Oracle as well. Good luck trying to find old software and/or documentation links that don't 404.
      • marcosdumay 1041 days ago
        Old software? It's difficult enough to find documentation on the current Oracle database. You only get random versions through a search engine, and can neither navigate into a different one, nor replace the version on the URL and get to a functioning page. Also, you can't start from the top and try to follow the table of contents because it's completely different.
  • Gametroleum 1041 days ago
    > What to leave out

    > [..]

    > File name extension. This is a very common one. "cgi", even ".html" is something which will change. You may not be using HTML for that page in 20 years time, but you might want today's links to it to still be valid. The canonical way of making links to the W3C site doesn't use the extension.(how?)

    And the page URL is [...]/URI.html

    • cpach 1041 days ago
      It works fine without the suffix as well: https://www.w3.org/Provider/Style/URI
    • yoursunny 1041 days ago
      Since I switched from ASP to PHP (2008), I avoided file extensions in page URI in most cases, and instead placed every page into its own folder. This is compatible with every web server without using rewrite rules.

      When I switched from PHP to static generators (2017), most URIs continued working without redirects.

    • japanuspus 1041 days ago
      Although there is no redirect, the proposed best practice URL also works: https://www.w3.org/Provider/Style/URI
    • ketozhang 1041 days ago
      To be fair, it's the OP that chose to use the URL with the extension. However, you could say W3C could've stopped their servers from automatically serving URLs with file extensions if they wanted to follow this.

      1. https://www.w3.org/Provider/Style/URI works

      2. If https://www.w3.org/Provider/Style/URI.html is ever 404 then you get a very useful 300 page (e.g., try https://www.w3.org/Provider/Style/URI.foobar)

      3. However, from (2) you can see the Spanish page is encoded as `.../URI.html.es`, which is bad because `.../URI.es` does not exist

    • usrusr 1041 days ago
      A page URL. I guess this could be considered the canonical example of hn not always linking to the coolest source?
    • slver 1041 days ago
      > You may not be using HTML for that page in 20 years time

      Yes it might be all Flash.

  • brigandish 1041 days ago
    I'm still not sure why Tag-URI[1] hasn't gained more support:

    > The tag algorithm lets people mint — create — identifiers that no one else using the same algorithm could ever mint. It is simple enough to do in your head, and the resulting identifiers can be easy to read, write, and remember.

    They would seem to solve at least part of the problem Berners-Lee opined about:

    > Now here is one I can sympathize with. I agree entirely. What you need to do is to have the web server look up a persistent URI in an instant and return the file, wherever your current crazy file system has it stored away at the moment. You would like to be able to store the URI in the file as a check…

    > You need to be able to change things like ownership, access, archive level security level, and so on, of a document in the URI space without changing the URI.

    > Make a database which maps document URN to current filename, and let the web server use that to actually retrieve files.

    Not a bad idea, if you have a good URI scheme to back it up.

    I even wrote a Ruby library for it[2] (there are others[3][4] but no JavaScript one that I can find; given that it's the language that produces the worst URLs and doesn't seem to have a community that cares about anything that happened last week, let alone 20 years ago, that's not a surprise).

    [1] http://taguri.org/

    [2] https://github.com/yb66/tag-uri/

    [3] https://gitlab.com/KitaitiMakoto/uri-tag/

    [4] https://metacpan.org/pod/URI::tag

  • tosser0001 1041 days ago
    I recall it was almost axiomatic that the more expensive the content management system, the worse the URLs it produced.

    I completely believe in the spirit of TBL's statement, and wish the internet had evolved to follow some of the ideals implicit therein. I recall that people took some pride in the formatting of their HTML so that "view source" showed a definite structure, as if the creator was crafting something.

    Now, for the most part, URLs are opaque and "view source" shows nothing but a bunch of links to impenetrable JavaScript. I actually wonder when "view source" is going to be removed from the standard right-click menu, as it barely has meaning to the average user any more.

    • pjmlp 1041 days ago
      I expect browsers to eventually be split into user and developer editions, with the tooling only available in the developer edition.

      Actually I am quite surprised that it hasn't happened yet.

      • scottlamb 1041 days ago
        Hasn't that already happened? The user browsers are on phones and tablets; the developer browsers are on laptops and desktops.
        • echelon 1041 days ago
          Which is why you can still install software you want from any source on laptops (for the time being), but have to go through Apple to get software on your phone.

          The end goal is to no longer have any thick device. Engineers will store and manage all of their code in the cloud, then pay a subscription fee to access it.

          Rent everything, own nothing.

          I bet TBL hates this just as much as the thing the web of documents and open sharing mutated into.

        • spicybright 1041 days ago
          Not wrong, the mobile market is massive.

          But lots of users still use a desktop browser for work, especially nowadays.

      • ludwigschubert 1041 days ago
        Mozilla making your point for you: Firefox Developer Edition (https://www.mozilla.org/en-US/firefox/developer/)
        • Kwpolska 1041 days ago
          Compared to the regular Firefox, there aren't any extra features (other than features that will reach the regular version in a few weeks, since Dev Edition is an alpha/beta of the next release). It's just a branding thing, a few extras by default, and a separate browser profile (might come in handy too).
      • turdnagel 1041 days ago
        Why?
        • spicybright 1041 days ago
          Yeah, seems a bit pointless to split builds for no reason.

          You always want to test on the browser users will ultimately use anyways, even if you have guarantees the code works exactly the same for dev and user editions.

        • pjmlp 1041 days ago
          Because it has been a common trend in computing to split consumer and development devices.
          • jraph 1041 days ago
            For browsers, the opposite has been true. Developer tools that you once installed as extensions (hello Firebug) are now shipped in the browser.
            • cxr 1041 days ago
              Firefox's current devtools shipped beginning with Firefox 4. It was only Firefox 3.x that shipped without any kind of tooling for web developers. Prior to Firefox 3, the original DOM Inspector was built-in by default (just like the case with Mozilla Suite and SeaMonkey).
            • pjmlp 1041 days ago
              I know, I just expect it to come full circle.
      • jtxt 1041 days ago
        It kind of did with mobile / desktop.
    • theandrewbailey 1041 days ago
      View source doesn't show anything that cURL doesn't, and browser inspectors are far more useful. Still, I would hate to see view source disappear.
      • nsomaru 1041 days ago
        Inspector has more features, but for me view source with CTRL-F is more performant
        • wongarsu 1041 days ago
          Inspector shows what the browser is currently rendering. View source might match that, or it might be something subtly (or not so subtly) different.
        • wizzwizz4 1041 days ago
          Except in Microsoft Edge, where it hangs.
    • dheera 1041 days ago
      The DOM view, on the other hand, is super useful, as it is effectively a de-obfuscated version of what that JavaScript soup creates.

      And you can remove things like popups instead of agreeing to them.

    • londons_explore 1041 days ago
      Perhaps 'view source' should link to the authors github, highlighting the repo that can be built to produce the page.

      If the repo can't reproduce the contents of the page, simply treat that as an error and don't render the page.

      • pc86 1041 days ago
        Yes because all code is available via GitHub.
        • londons_explore 1041 days ago
          (or other code repo)
          • spicybright 1041 days ago
            Barely any commercial sites have public repos though.

            I'd even say barely any sites in general outside of dev blogs + projects.

            And even then they always link to the repo.

          • jefftk 1041 days ago
            If you actually want to do this, you could have the browser check for source maps, and refuse to render any pages that did not provide them: https://developer.mozilla.org/en-US/docs/Tools/Debugger/How_...

            (Of course most pages are not willing to make their raw source available, so you would see a lot of errors.)

            • cxr 1041 days ago
              ... and a "go fuck yourself" to anybody who publishes pages that simply don't require source maps to begin with? I admit that would at least be in line with the current trend of prioritizing/encouraging the kinds of authors and publishers who engage in (the mess of) what is considered "good" modern Web development.
              • jefftk 1041 days ago
                I mean, I think the whole idea of only showing websites with unminified source available is silly, but I'm willing to think about the idea ;)

                I think a source map is a better approach than a GitHub link, which is all I was saying!

                It also probably wouldn't be too hard to make some heuristics to figure out whether scripts are unminified and not require a source map. It wouldn't be 100% accurate, and it wouldn't avoid some form of intentional obfuscation that still uses long names for things, but it would probably work pretty well.
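
                A toy heuristic along those lines (arbitrary thresholds, not a real detector) might look like:

                  import re

                  def looks_minified(source: str) -> bool:
                      # Guess from average line length and identifier length; deliberate
                      # obfuscation with long names would still slip through.
                      lines = [l for l in source.splitlines() if l.strip()]
                      avg_line = sum(map(len, lines)) / max(len(lines), 1)
                      idents = re.findall(r"[A-Za-z_$][A-Za-z0-9_$]*", source)
                      avg_ident = sum(map(len, idents)) / max(len(idents), 1)
                      return avg_line > 200 or avg_ident < 3

                  print(looks_minified("function add(first, second) {\n  return first + second;\n}\n"))   # -> False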

      • chriswarbo 1041 days ago
        I do this with http://chriswarbo.net

        Most HTML pages are generated from a corresponding Markdown file, in which case a "View Source" link is placed in the footer which goes to that Markdown file in git.

  • superkuh 1041 days ago
    Tell it to Tor. They are destroying every URI created, indexed, known, and used over the last 15 years on Oct 15th: https://blog.torproject.org/v2-deprecation-timeline . Because of this, indieweb has refused to add Tor onion service domain support for identity.
    • ______- 1041 days ago
      Onionland people had plenty of warning though. My old v2 Onion bookmarks are all discarded. The new V3 addresses are a good indicator of which .onion operators are serious and want to stay online no matter what.
      • superkuh 1041 days ago
        I consider myself pretty active on tor. I've hosted a number of services for a decade. While I heard about the new Tor v3 services a long time ago I didn't hear about Tor v2 support being completely removed until April 2021. It was quite a shock.
  • crazygringo 1041 days ago
    At the end of the day, it's simply an unrealistic expectation that URIs don't change. And declaring for 20+ years that this isn't "cool" isn't going to change a thing about it...

    The web is dynamic, not static. Sites come and go, pages come and go, content gets moved and updated and reorganized and recombined. And that's a good thing.

    If content has moved, you can often use Google to figure out the new location. And if you want history, that's what we fortunately have archive.org for.

    For any website owner, it's ultimately just a cost-benefit calculation. Maintaining old URIs winds up introducing a major cost at some point (usually as part of a tech stack migration), and at some point the benefit from that traffic isn't worth it anymore.

    There's no purist, moral, or ethical necessity to maintain URIs in perpetuity. But in reality, because website owners want the traffic and don't want to disrupt search results, they maintain them most of the time. That's good enough for me.

    • rimliu 1041 days ago
      It is very sad that this "cost-benefit" calculation ends up ruining the web. Devs too lazy to learn the basics, doing everything with React and a gazillion libs, nobody caring about accessibility, nobody caring about the URL structure. It is very possible to go through 50 CMSes and still keep the URLs intact. We just forgot the users, and UX gave up its place to DX.

      I still hope for a new web standards revolution. Zeldmans of the 21st century, where are you?

    • prepend 1039 days ago
      > it's simply an unrealistic expectation that URIs don't change.

      I disagree. If I generate urls systematically, then when I change the scheme I can easily send a 302 with the new url.

      It’s also not hard to just keep a mapping table with old and new.

      I think that it’s lazy programmers who can’t handle it. Or chaotic content creators.

      I’ve had many conversations with SharePoint people who frequently change urls by renaming files and then just expect everyone linking to it to change their links. They seem to design content without ever linking because links break so much.

      It’s the damndest thing as it makes content hard to link to and reuse. People are too young to even care about links and stuff.

      Of course if SharePoint search didn’t suck it wouldn’t be as much of an issue.

      • crazygringo 1039 days ago
        > I can easily send

        Honestly, that's just not true. If you upgrade to a new CMS/system, the way it routes URLs can be completely incompatible with the old format, and it just can't accommodate it.

        And if you're dealing with tens of millions of URLs and billions of pageviews per month, you're talking about setting up entire servers and software in front of your CMS dedicated just to handling all the redirect lookups.

        And you also easily run into conflicts where there is overlap between the old naming convention and the new one, so a URL still exists but it's to something new, while the old content is elsewhere.

        Yes it's possible. But the idea that it's "not hard" is also often very false.

        > I think that it’s lazy programmers

        No, it's the managers who decided it wasn't worth paying a programmer 2 or 4 weeks to implement, because other programming tasks were more important and they employ a finite number of programmers.

        For commercial websites, it's not laziness or "chaos". It's just simple cost-benefit.

  • tiew9Vii 1041 days ago
    This is a recommended read on the subject "Designing a URL structure for BBC programmes" https://smethur.st/posts/176135860.

    Clean, user-friendly URLs are often at odds with persistent URLs, because of the information you can't keep in a URL if it is to be truly persistent.

    • selfhoster11 1041 days ago
      It's easy to put a reverse proxy in front of your service to translate old permanent URLs into whatever you're running currently. It's a table of redirects, and perhaps a large one, but it doesn't require a lot of logic to handle.
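
      A minimal sketch of that lookup, with invented paths (a real deployment would more likely use the proxy's own map/rewrite config):

        # Hypothetical append-only table mapping old permanent URLs to current ones.
        REDIRECTS = {
            "/2013/rates/overview": "/marketable-securities/rates",
            "/blog/2010/10/10/article-title": "/articles/article-title",
        }

        def resolve(path):
            """Return (status, location) for an old path, or None to pass it through."""
            if path in REDIRECTS:
                return 301, REDIRECTS[path]   # permanent redirect to the new home
            return None                       # let the current backend handle it

        print(resolve("/2013/rates/overview"))   # -> (301, '/marketable-securities/rates')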
  • alistairjcbrown 1041 days ago
    Jeremy Keith's 11 year long bet against cool URLs (specifically on longbets.org) comes up next year -- and it looks like he might lose it https://longbets.org/601/
    • perilunar 1041 days ago
      Looks like he will lose the bet.

      Interestingly though, the original URL was: http://www.longbets.org/601

      • franze 1041 days ago
        > A 301 redirect from www.longbets.org/601 to a different URL containing that text would also fulfill those conditions.
        • perilunar 1041 days ago
          Yes. Good thing it's still an "HTML document". One could argue that a single page app loading content via json would not qualify.
  • grenoire 1041 days ago
    We've been thinking about link rot since then, and yet we're still letting it happen without much care. Kinda' sad, but at least we're resting on the shoulders of archive.org.
    • ricardo81 1041 days ago
      So much of it is unnecessary. Any stats on link rot in 2021? Last I remember it's something like 10% of all links every year.
  • franze 1041 days ago
    This is my battle-proven (tried and true) checklist for what I call targeted pages (pages that are entry pages for users, mostly via organic Google traffic): https://www.dropbox.com/s/zfwd331ehrucgw4/targeted-page-chec...

    And these are the URL rules I use. Whenever I make any compromise on them or change the priority, I regret it down the road. Nr 1 and nr 2 are taken directly from the OP article.

      URL-rules
        - URL-rule 1: unique (1 URL == 1 resource, 1 resource == 1 URL)
        - URL-rule 2: permanent (they do not change, no dependencies to anything)
        - URL-rule 3: manageable (equals measurable, 1 logic per site section, no complicated exceptions, no exceptions)
        - URL-rule 4: easily scalable logic
        - URL-rule 5: short
        - URL-rule 6: with a variation (partial) of the targeted phrase (optional)
    
    URL-rule 1 is more important than rules 2 to 6 combined, URL-rule 2 is more important than rules 3 to 6 combined, ... URL-rules 5 and 6 are always a trade-off. Nr 6 is optional. A truly search-optimized URL must fulfill all URL-rules.
  • thomond 1041 days ago
    Interestingly the National Science Foundation URLs they used as examples of URLs that probably won't work in a few years all still work.
  • whirlwin 1041 days ago
    My experience is that most developers don't even check the access logs to see what impact removing/changing an endpoint will have.

    Access logs are underrated. They are not some legacy concept; they are supported by all new platforms as well, including various Kubernetes setups.

  • yoursunny 1041 days ago
    I opened my website in 2006. Since 2009, I've kept all the links unchanged or redirected through two rebuilds.

    It's a lot of work, because I need to customize the URI structure in blog systems and write complex rewrite rules.

    To ensure the setup is more or less correct, I have a bash script that tests the redirects via curl. https://bitbucket.org/yoursunny/yoursunny-website/src/912f25...
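
    The script itself is bash and curl; a rough Python sketch of the same idea, with placeholder URL pairs, would be:

      # Verify that old URIs still land on their intended targets.
      import urllib.request

      CHECKS = [
          ("https://example.com/old/page.php", "https://example.com/page/"),
      ]

      for old, expected in CHECKS:
          req = urllib.request.Request(old, method="HEAD")
          with urllib.request.urlopen(req) as resp:
              final = resp.geturl()   # urlopen follows redirects to the final URL
              print("OK" if final == expected else "BROKEN", old, "->", final)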

    • slver 1041 days ago
      I've a bunch of small sites whose links end in ".php" even though they no longer run on PHP.
      • JadeNB 1041 days ago
        > I've a bunch of small sites whose links end in ".php" even though they no longer run on PHP.

        Kudos for maintaining that compatibility, but it seems that this kind of thing is addressed in the linked document:

        > File name extension. This is a very common one. "cgi", even ".html" is something which will change. You may not be using HTML for that page in 20 years time, but you might want today's links to it to still be valid. The canonical way of making links to the W3C site doesn't use the extension.

        • slver 1040 days ago
          Back then we didn’t have routers
      • smhenderson 1041 days ago
        I'm not sure if they still use it, but a site I redid in Perl for a company many moons ago had Apache set to treat .htm files as CGI scripts so we didn't have to change the previous, static site's URLs.

        The original site ran on IIS and was old enough that MS software at the time still used three letter extensions for backwards compatibility, around 1997-1998 IIRC.

  • graiz 1041 days ago
    Ironically, this was linked to with the .html extension, but it was nice to discover that the non-page-specific URI also works on this site: https://www.w3.org/Provider/Style/URI
  • ketzu 1041 days ago
    > There are no reasons at all in theory for people to change URIs (or stop maintaining documents), but millions of reasons in practice.

    I am not convinced that some of those reasons aren't theoretical as well. The article in general seems very dismissive of the practical reasons to change URIs, as if listing only the 'easily dismissible' ones to begin with.

    > The solution is forethought - make sure you capture with every document its acceptable distribution, its creation date and ideally its expiry date. Keep this metadata.

    What to do with an expired document?

    Don't get me wrong, it's good to strive for long lasting URIs and good design of URIs is important, but overall I believe a good archiving solution is a better approach. It keeps the current namespaces cleaner and easier to maintain. (Maintaining changes in URI philosophy and organization over tens to hundreds of years sounds like a maintenance nightmare.)

    The archiving solution should have browser support: If a URI can not be resolved as desired, it could notify the user of existing historic versions.

    • Symbiote 1041 days ago
      An expired document can be removed, and a "410 Gone" response returned. This is clearer for the user, who can expect not to find the document elsewhere on the website.
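
      A tiny sketch of that, assuming the expiry metadata from the parent comment lives in some (hypothetical) store:

        from datetime import date

        # Hypothetical expiry metadata captured when each document was published.
        EXPIRES = {"/notices/2019-maintenance": date(2020, 1, 1)}

        def status_for(path, today=None):
            today = today or date.today()
            expiry = EXPIRES.get(path)
            if expiry is not None and today >= expiry:
                return 410   # Gone: it existed, and was removed on purpose
            return 404       # Not Found: nothing is known about this path

        print(status_for("/notices/2019-maintenance"))   # -> 410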
    • eythian 1041 days ago
      > The archiving solution should have browser support: If a URI can not be resolved as desired, it could notify the user of existing historic versions.

      If you use this add-on: https://addons.mozilla.org/nl/firefox/addon/wayback-machine_... it will pop up a box when it sees a 404 offering to search the wayback machine for older versions.

    • kijin 1041 days ago
      Many APIs use versioning to keep the current namespace clean while still supporting the old version. It goes like /v1/getPosts, /v2/post/get, etc.

      This might be an argument in favor of preemptively adding another level of hierarchy to your URLs, so that when the time comes to move the entire site to a new backend, you can just proxy the entire /2019/ hierarchy to a compatibility layer.
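
      A bare-bones sketch of that kind of prefix routing, with invented prefix and backend names:

        # Requests under a legacy prefix go to a compatibility shim; everything
        # else goes to the current backend.
        LEGACY_PREFIXES = {"/v1/": "legacy-api-shim", "/2019/": "old-site-shim"}

        def route(path):
            for prefix, backend in LEGACY_PREFIXES.items():
                if path.startswith(prefix):
                    return backend
            return "current-backend"

        print(route("/2019/posts/42"))   # -> old-site-shim
        print(route("/v2/post/get"))     # -> current-backend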

      But who are we kidding, we live in a world where 12 months of support for a widely used framework is called "LTS". Nobody seems to care whether their domains will even resolve in 10 years.

      • ketzu 1041 days ago
        > Many APIs use versioning to keep the current namespace clean while still supporting the old version. It goes like /v1/getPosts, /v2/post/get, etc.

        I think that is an important thing to do (as I agree designing URIs is important!). I am just not sure it is reasonable, or even desirable, for them to be maintained indefinitely. I am not sure what the right timeframe for deprecation would be either.

        • giantrobot 1041 days ago
          URLs should be maintained even if the content is gone. At the very least you can give a useful HTTP return code, a permanent redirect or gone is more useful than a catch-all 404. You're either bridging old links to current links or telling visitors the old content has been removed.
        • kijin 1041 days ago
          I don't think it would be feasible to maintain a fully functional copy of a dynamically generated page at the old URL for any length of time. That's just a recipe for a security nightmare, not to mention the SEO penalty for duplicate content.

          301/308 redirects, on the other hand, can be maintained more or less indefinitely as long as you keep around a mapping of old URLs to new URLs. If you need to change your URLs again, you just add more entries to the mapping database. Make your 404 handler look up this database first.

          One thing you can't do, though, is repurposing an existing URL. Hence the versioning. :)
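
          A minimal sketch of that 404 fallback, with a placeholder mapping:

            # Before serving a 404, look the path up in the old-to-new mapping
            # and answer with a permanent redirect instead.
            OLD_TO_NEW = {"/v1/getPosts": "/v2/post/get"}

            def handle_miss(path):
                target = OLD_TO_NEW.get(path)
                if target is not None:
                    return 308, target   # 308 preserves the method; 301 is fine for GET
                return 404, None

            print(handle_miss("/v1/getPosts"))   # -> (308, '/v2/post/get')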

    • anoncake 1041 days ago
      > (Maintaining changes in URI philosophy and organization over tens to hundreds of years sounds like a maintenance nightmare.)

      I can't see why. You maintain a table of redirects. When you change the URI organization, which would break the URIs, you add the appropriate redirects to prevent that. Then, when you change it again, you just append the new redirects without removing the old ones. If necessary, resolve redirect chains server-side. The table may grow large, but it doesn't seem much more complicated than maintaining redirects across just two generations. Am I missing something?
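
      A small sketch of that server-side chain resolution, with invented entries:

        # Append-only redirect table spanning two reorganisations; chains are
        # collapsed server-side so the client only ever sees one hop.
        REDIRECTS = {
            "/old/a": "/2015/a",        # first reorganisation
            "/2015/a": "/articles/a",   # second reorganisation, old entry kept
        }

        def final_target(path, limit=10):
            hops = 0
            while path in REDIRECTS and hops < limit:   # limit guards against loops
                path = REDIRECTS[path]
                hops += 1
            return path

        print(final_target("/old/a"))   # -> /articles/a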

  • bofh23 1039 days ago
    Why do webmasters and web developers neglect the title element?

    To mitigate link rot I share URLs along with the title and a snippet from the page. A good title can act as an effective search key to find the page at its new URL even if the domain has changed.

    Unfortunately, a lot of web pages have useless titles. Often sites use the same title on every page even when pages have subjects that are obvious choices for inclusion in the title (make, model & part # or article title that appears in the body element but not in the title). With the rise of social media, lots of pages now have OpenGraph metadata (e.g. og:title) set but don’t propagate that to the regular page title.

    Page titles are used by search engines and by browsers to set window, tab, and bookmark titles.

    Again, why do webmasters and web developers neglect the title element?

  • kevinslin 1039 days ago
    Urls are the foundation of the internet and it always makes me sad when navigating to a page results in a 404 instead. On a somewhat related sidenote, this is why I created a note taking tool that publishes *permanent urls* - because your notes can change but your links shouldn't :)

    https://wiki.dendron.so/notes/2fe96d3a-dcf9-409b-8a09-fdaa5a...

  • diveanon 1041 days ago
    Cool URIs now return a 200, redirect you to index.html, and then show a 404 page.
  • slaymaker1907 1040 days ago
    Sometimes I think that immutability should have been a built-in part of the web; that way we wouldn't be so reliant on archivers like the Internet Archive. Someone still needs to store it, but it really shouldn't have been foisted onto one organization.
  • juloo 1041 days ago
    One of the things that hype me about IPFS is that URIs can't change. Resources are immutable and the URI is only the hash of the content.

    https://ipfs.io/

  • lolive 1041 days ago
    I wrote a small script to go through my past tweets. An awful lot of the links I mentioned in them DO NOT WORK anymore. Such a pity.
  • beders 1040 days ago
    I bet you not even 50% of your bookmarks are still working. It is a shame.
  • ChrisArchitect 1040 days ago
    tomte, why do you keep resubmitting these old things over and over?

    Recent discussion on this one's last submission was not even a year ago

  • greyface- 1041 days ago
    The original 1998 URL to this article was http, not https. It changed.
    • sleepyhead 1041 days ago
      It did not change. HTTP is still served, but if you are getting HTTPS it is due to HSTS. Regardless: redirect != changed.
      • chuckhoupt 1041 days ago
        Note that HTTP is also upgraded if the UA requests it with the Upgrade-Insecure-Requests header:

          $ curl -I -H 'Upgrade-Insecure-Requests: 1' 'http://www.w3.org/Provider/Style/URI.html'
          HTTP/1.1 307 Temporary Redirect
          location: https://www.w3.org/Provider/Style/URI.html
          vary: Upgrade-Insecure-Requests
    • ricardo81 1041 days ago
        $ curl -i 'http://www.w3.org/Provider/Style/URI.html'
        HTTP/1.1 200 OK
        date: Thu, 17 Jun 2021 10:18:14 GMT
        last-modified: Mon, 24 Feb 2014 23:09:53 GMT ...

      • greyface- 1041 days ago

          $ curl -I 'https://www.w3.org/'
          HTTP/2 200 
          [...]
          strict-transport-security: max-age=15552000; includeSubdomains; preload
          content-security-policy: upgrade-insecure-requests
    • detaro 1041 days ago
      The original URL still works, that's the key.
    • srmarm 1041 days ago
      Fair point, but I believe the article refers to URIs rather than URLs, so they're still technically correct.

      Edit - I've just had a look and URI covers the lot so my bad

    • user3939382 1041 days ago
      In the next 5-10 years, https:// will be amp:// when Google completes its process of consuming the web, like No-Face in Spirited Away.