Simple software things that are actually very complicated

(construct.net)

167 points | by AshleysBrain 702 days ago

27 comments

  • deathanatos 701 days ago
    No, text on a canvas is a somewhat obvious nightmare.

    No, we must go simpler.

    left-pad: you need to account for wide characters, such as CJK characters but also emoji; you need to account for things like combining characters; and so on. left-pad is not some trivial function.

    Of course, the infamous left-pad package was widely mocked: "why do you need a whole package for such a simple task?" So simple, indeed, that it failed to handle many of those cases. But don't worry, it's been standardized now as padStart() … and it still fails those cases:

      > 'e\u0301'.padStart(2)
      'é'
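    For contrast, a grapheme-aware pad can be sketched with Intl.Segmenter (available in Node 16+ and modern browsers); note it still ignores display width, so full-width CJK and emoji columns remain wrong:

    ```javascript
    // Sketch: pad by grapheme clusters rather than UTF-16 code units.
    // Still no display-width handling (CJK full-width, emoji, tabs).
    function padStartGraphemes(s, targetLength, padString = ' ') {
      const seg = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
      const count = [...seg.segment(s)].length;          // user-perceived characters
      const missing = Math.max(0, targetLength - count); // cells left to fill
      return padString.repeat(missing) + s;
    }

    console.log(padStartGraphemes('e\u0301', 2).length); // 3: pad + 'e' + combining accent
    ```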
    
    Okay, strings aren't simple! But numbers, numbers are fine right?

    So then, Go issue #4594, "add a Round function [to Go]"[1]. Go maintainer Russ Cox suggests that "it's a one liner" and offers … a buggy implementation. There are no fewer than 6 different implementations in the issue, and every single one of them is buggy. The issue is closed; the function is "too simple" to add. Thankfully, the wound gets reopened nearly half a decade later in #20100. Another buggy implementation is suggested in that thread too, but in the end Go gets a Round function. (A non-buggy one … I think.)
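    The classic trap can be sketched like this (the textbook failure mode, not necessarily the exact code proposed in the Go issue): rounding via floor(x + 0.5) breaks when the addition itself rounds up. JavaScript's own Math.round specifies exactly these naive semantics:

    ```javascript
    // Naive "one-liner" round-half-up, the usual first attempt.
    function naiveRound(x) {
      return Math.floor(x + 0.5);
    }

    const x = 0.49999999999999994; // largest double strictly below 0.5
    // x + 0.5 is not representable; IEEE 754 rounds the sum up to exactly 1.0,
    // so the result rounds the wrong way:
    console.log(naiveRound(x));  // 1 (the correctly rounded result is 0)
    console.log(Math.round(x));  // 1 as well: JS mandates the same naive behavior
    ```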

    [1]: https://github.com/golang/go/issues/4594

    • gernb 701 days ago
      The entire idea of padding by characters only fits a few use cases. leftPad would never have covered those cases. If you want tabular data, use a table or grid. No amount of string gymnastics will get you aligned strings across languages and fonts. So yeah, leftPad was always a bad idea, because the only useful implementation is too trivial to have been put in a 3rd party library. Unless you believe there should have been a library called 'add' that was just `export const add = (a, b) => a + b;`
    • mro_name 701 days ago
      the basics are incredibly hard. Listen to a physicist explaining time, matter or energy.
  • ronnier 701 days ago
    I've come to the conclusion that nothing is simple. Nothing. It's only "simple" because thousands of years of work were applied to get us to where we are today and we are building on top of that. It's arrogant to think anything is simple.

    Just to build a key cap for this laptop I'm typing on would be very difficult if I actually had to do it myself. I'd need to use tools that were ultimately kicked off by stone-age people thousands of years ago, who progressed by building tools with those tools until we got to the point today where we can create a key cap for a laptop.

    • mrwh 701 days ago
      I'm sure you know this story already, but nice example: https://gizmodo.com/one-mans-nearly-impossible-quest-to-make...
      • ghgr 701 days ago
        Thanks for sharing! It brought back to mind this project of mine (more like a "wish" than an actual project) to build the current technology tree (like Age of Empires' or Factorio's) as a sort of wiki. And of course the wonderful essay by Leonard E. Read, "I, Pencil".
      • amelius 701 days ago
        A better example would be someone sitting in the middle of a jungle with no access to technology trying to make a toaster from scratch.
    • meribold 701 days ago
      > I've come to the conclusion that nothing is simple.

      Surely it's more complicated than that ;)

    • dwohnitmok 701 days ago
      Oh it goes even further than that, all the way to the Big Bang. As Carl Sagan said, “If you wish to make an apple pie from scratch, you must first invent the universe.”
  • Gigachad 701 days ago
    A lot of the complexity in this article isn't so much that the task is very complex, but that if you want to support every single language, culture, format, then it becomes a very long task.

    I do wonder if we will end up just moving towards more unified formats and doing away with the need for all this complexity. I could be very wrong on this, but it seems like for small-to-mid-scale software projects, i18n is just not worth it. Throwing a bunch of YAML files over the wall to a translation company and then just dropping them in the app does not work, and the end result is so bad that native speakers would rather just use the app in correct English than in a broken version of their native language.

    Now you don't have to care about how to wrap or hyphenate a word in an obscure language which doesn't support it, and you don't have to worry about vertical text. Your problem space just got a lot smaller. And if you are building a tool at Google scale then you should have the resources to just make the problem go away anyway.

    • colordrops 701 days ago
      Supporting "every single language, culture, format" *is* the task in the article. You can decide whether you want to do it or not, but that's a different question.
      • AnimalMuppet 701 days ago
        But who thinks that's simple? That's an insanely complex task.
        • Gigachad 701 days ago
          At a high level the task may seem simple "Wrap the text when it reaches the end of the line". And if you only care about English, it actually is reasonably simple. The complexity is stuff you wouldn't even think about unless you had experience in the area.
          • AnimalMuppet 701 days ago
            Fair, to a point.

            But colordrops said "supporting every single language, culture, format". So by definition, it's more than English. And as soon as you start looking into non-English writing systems, you realize that it is very much not simple.

        • mro_name 701 days ago
          indeed, once a task includes 'every', it's never simple.

          Or 'never' :-)

    • Falkon1313 701 days ago
      We were looking at internationalization and translations recently and I was happily surprised when the guy who is normally a strict stickler for everything being perfect just said "No, we really don't want to do that."

      Our software deals with scientific stuff and safety stuff, and even a minor mistranslation there could have a big (and potentially very bad) impact.

      So even though technically we could allow translations, we don't, and there's a very good reason for that. Human languages are messy, and every time you switch from one to another, there's a good chance of something getting lost along the way. And sometimes, that kind of loss is just not acceptable. Even if it would be a little more convenient.

      Drawing the limits and knowing when to say "no", even to what might seem like a 'simple' thing, is important.

      • rob74 701 days ago
        The problem with translations is that even the best translators can make mistakes because when you just send them a list of strings they are unaware of the context where these will be used. And then you end up with "Key" being translated as "Taste" (key on a keyboard) instead of "Schlüssel" (which would have been appropriate in the Windows Registry Editor) in German. So the translator should be:

        * fluent in both languages (including domain-specific tech lingo)

        * intimately familiar with the software

        * very thorough (ideally has to check whether each and every string works in the context of the software)

        ...only then will you get a high quality translation.

      • discreteevent 701 days ago
        This certainly may work in some contexts. But it won't for others. For example if I am a heavy machinery operator and I see some warning that is not in my native language, I may just ignore it.
    • bsder 701 days ago
      > if you want to support every single language, culture, format, then it becomes a very long task.

      HAH! Even if you just say "I only support English", "text handling" is still an infinite black hole of programming time and energy.

      Rendering text sucks.

    • dotancohen 701 days ago

        > it seems like for the smaller to mid scale software projects, I18n is just not worth it
      
      If you don't mind reading from right to left and praying towards Mecca, then sure, you can disregard all the diversity of human accomplishment. But unless you are willing to have another language and culture imposed on you, suggesting that imposing your language and culture on others is "worth it", as opposed to wrapping all user-facing strings with some gettext-compatibility layer, is disingenuous.
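      The "gettext-compatibility layer" meant here can be as small as a lookup with a fallback. A toy sketch (the catalog below is made up for illustration; a real project would load .po/.mo files via a library):

      ```javascript
      // Toy gettext-style message lookup with a hypothetical German catalog.
      const catalogs = {
        de: { 'Save': 'Speichern', 'Cancel': 'Abbrechen' },
      };
      let locale = 'de';

      function _(msgid) {
        // Fall back to the source-language string when no translation exists.
        return (catalogs[locale] && catalogs[locale][msgid]) || msgid;
      }

      console.log(_('Save'));   // "Speichern"
      console.log(_('Delete')); // "Delete" (untranslated, falls back)
      ```

      The payoff is that wrapping strings in _() from day one costs almost nothing, while retrofitting it later means touching every user-facing string in the codebase.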

      I write i18n compatible software, that supports not only multiple languages but also interface directionality and culture-specific features. Not because I have to, but because I respect my users.

    • hnfong 701 days ago
      > the end result is so bad that native speakers would rather just use the app in correct English than broken native language

      Is that actually supported by evidence or did you pull it out from .. somewhere?

      This claim probably holds if the user is fluent in English as a second language, but are you sure the billions of people who can't read English would really make that choice?

      I think it's the developer's choice whether to only support users that can read English to at least some degree, but claiming that an option for broken i18n is worse than nothing seems to be rather radical to me... (that said, the number of websites that presume my language preferences upon a geoip check is infuriating...)

    • terom 701 days ago
      Agreed, it shouldn't be unreasonable to expect everyone to learn Mandarin Chinese, or perhaps Spanish?

      [0] https://en.wikipedia.org/wiki/List_of_languages_by_number_of...

      • Zababa 701 days ago
        I think it would be a better idea to focus on the total number of speakers, or even second-language speakers [1]. For example, English has 372 million first-language speakers and 1 billion second-language speakers. Mandarin Chinese has 929 million first-language speakers and 198 million second-language speakers. This means that way more people have learned English as a second language. This, I would assume, means that it's generally easier to learn, there is more infrastructure and there are more people in place to teach it, people are more used to second-language speakers, and it's in general a better language for communicating across cultures.

        [1]: https://en.wikipedia.org/wiki/List_of_languages_by_total_num...

        • potatoz2 700 days ago
          English is somewhat better than Chinese because it’s easier to learn and has more total speakers, but 1.4 billion people is still pretty far from 7+ billion people. Choosing to only offer something in English _and_ make it difficult or impossible to translate it in other languages is telling 5.6 billion people they don’t matter until they learn the language you happen to speak.
    • Aeolun 701 days ago
      I mean, if they could do the things listed in the article for the original gameboy, I’m inclined to believe we should be able to do them today.
      • _gabe_ 701 days ago
        Now I sped through the article a bit, but afaik the Gameboy did not have native support for HTML rendering and WebGL...
    • Kiro 701 days ago
      Construct is a game engine so they need to support everything.
  • zokier 702 days ago
    Are there any simple software things that are actually very simple?

    I feel like tons of the very basic things that have been core software tasks for ages still have surprising amount of complexity and depth to them when you'd expect them to be simple. Stuff like text (basically anything), math (real numbers), 2d graphics (efficient high-quality vector graphics still seem like a pipedream), sorting (we just had a thread about improving sorting in postgres!), files (infamous fsync issues), color, etc just seem to constantly come up as problems when you'd think we had solved them already 20 years ago.

    • ivraatiems 701 days ago
      The issue with many of these things is that they aren't simple, not really, it's just that our brains are super good at them and computers are almost entirely unlike our brains. Human brains are able to do things like read context clues and extrapolate from them. It just makes sense, for example, to wrap your lines to the space available on a given piece of paper. But a computer doesn't know what a "line" or "space" or a "piece of paper" is. It has to be told, via a mathematical abstraction, which a human must devise.

      That is, any adult human being carries around an absolutely massive amount of mental context which they are capable of applying to every situation nearly automatically, without realizing it. All of the things you mention are tasks any sufficiently educated human can do easily - draw a shape? sort some things? do math with fractions or complex numbers? - because you can teach a human a general concept and then assume their brain will do the work of applying it to new situations, breaking it down, etc. 2 + 2 on a TV screen is exactly the same as 2 + 2 in a notebook is exactly the same as two sheep + two sheep in the field.

      But until someone invents strong general AI (hehe), computers just aren't gonna have that skillset.

    • syntheweave 701 days ago
      Most software things become simple when you can define them down into an enumeration of possibilities.

      For example, if my text problem were "display these preformatted strings in a single resolution and layout" there would be no requirement on their encoding or processing and I could do whatever I like, even hardcode them or store them as image assets. Games have done this trick since forever because they have a tremendous capacity to soak up more and more static content, and much of the core tech in a game engine is in finding the ideal representations to efficiently load and present that content while carefully minimizing the dynamic behavior that would entail a general-purpose solution.

      The problem for software in the larger view of things is always that we've defined the problem with sufficient generality that it has to accommodate all the depth, because we don't know who is using the software or what they want from it. The medium is never "set in stone" so the tools are forever adapting to a new use case. And just when you think you've modelled every possible aspect of the data, some other way of doing it will come up.

      And it's not a right thing either to think "OK, I'll just solve the most general thing I can think of now so that I don't deal with it later" because then you just have a wide field of untested edge cases and no coverage of the thing you didn't anticipate.

      Like, take image editing software as an example. They all let you draw things freehand. Many support pressure sensitive stylus input. But the way they interpret that input is all over the map: different sensitivity curves, stabilization algorithms, brush behaviors and so on. There's no winning the battle by defining the most general engine for freehand drawing and painting, because what the user craves most of all here is the path to "just works" defaults. Thus in every sufficiently developed editor, an enumeration of possibilities appears again, but as a configuration letting you browse presets.

      • _gabe_ 701 days ago
        > Games have done this trick since forever.

        Text rendering is still a very complex task in modern games unless you're using an engine (and even then the complexity is still there, it's just hidden). I'm not aware of any recent releases that stored all their text as hard-coded images. Most games will at the very least create a predefined charset; render each character to a bitmap or SDF at load time using something like FreeType or stb_truetype; cache the bitmaps or SDFs in a hash map with a lookup ID and all that stuff; upload the images to a texture on the GPU; and perform some sort of layout scheme when rendering the text in the game world or HUD (which includes things like figuring out the kerning for different characters, line wrap, and all the other fun stuff).

        Like somebody else mentioned, nothing is simple. Even a bitmap font from the 80s was wayyy more complex than it would be today, because of all the optimizations you would have to create to make sure it was fast, and the extreme lack of the modern rendering utilities that we have today.*

        *I wasn't alive in the 80s, so this is just based on the videos I've seen of what we had then and articles/books I've read about what programming was like then.

        • __del__ 701 days ago
          Bitmapped fonts are pretty damn simple, but you can compress them by storing only the pixel distances to the next inversion (0 -> 1 or 1 -> 0) along scanlines.
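          That scheme is essentially run-length encoding per scanline; a sketch, assuming each line starts from an implicit bit of 0:

          ```javascript
          // Encode a 1-bit scanline as run lengths between bit inversions,
          // starting from an assumed initial bit of 0.
          function encodeScanline(bits) {
            const runs = [];
            let current = 0, run = 0;
            for (const b of bits) {
              if (b === current) { run++; }
              else { runs.push(run); current = b; run = 1; }
            }
            runs.push(run);
            return runs;
          }

          function decodeScanline(runs) {
            const bits = [];
            let current = 0;
            for (const run of runs) {
              for (let i = 0; i < run; i++) bits.push(current);
              current ^= 1; // each run ends at an inversion
            }
            return bits;
          }

          const line = [0, 0, 1, 1, 1, 0, 1, 0];
          console.log(encodeScanline(line)); // [ 2, 3, 1, 1, 1 ]
          ```

          Long horizontal runs of background pixels, common in glyph bitmaps, collapse to a single number each.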
    • armchairhacker 701 days ago
      Text, math, 2d graphics, sorting, files, color are simple as long as you don’t care about the most optimal version. Everything the author mentions in the article can also be done “simply”, if you use monospace, ASCII, ignore mobile, etc.

      For example, text is just a null-terminated char array if you only support ASCII, or short or int array to support other languages and more symbols. To render the text, just loop over each char and index into an array to get a bitmap of a monospace font. Your text may look a bit ugly and take up more memory than necessary, but it’s passable, and 40 years ago it’s what we did.
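      That loop really is about all there is to the ASCII-monospace version. A toy sketch with a made-up 3x5 glyph table (a real system would store one bitmap per ASCII code):

      ```javascript
      // Monospace bitmap rendering: each character indexes a glyph table of
      // fixed-size bitmaps, which are blitted side by side, row by row.
      // The two glyphs below are invented for the demo.
      const GLYPH_W = 3, GLYPH_H = 5;
      const FONT = {
        'H': ['# #', '# #', '###', '# #', '# #'],
        'I': ['###', ' # ', ' # ', ' # ', '###'],
      };

      function renderLine(text) {
        const rows = [];
        for (let y = 0; y < GLYPH_H; y++) {
          let row = '';
          for (const ch of text) {
            const glyph = FONT[ch] || Array(GLYPH_H).fill(' '.repeat(GLYPH_W));
            row += glyph[y] + ' '; // one blank column between cells
          }
          rows.push(row);
        }
        return rows.join('\n');
      }

      console.log(renderLine('HI'));
      ```

      No kerning, no shaping, no bidi, no wrapping: that's exactly the problem space reduction being described.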

      There are some problems, like TSP and chess, where there is a simple brute-force solution, but it's not "passable" because it takes impossibly long, so you need a more optimal, complex solution. And then there are the iconic "seems simple but actually really hard" problems like generalized image recognition or walking; they are "simple" for humans and animals, but a remotely half-decent solution took a long time even with giant supercomputers, and it's still only half-decent.

    • OJFord 701 days ago
      Those things only have 'complexity and depth' once you really get into them though, I think?

      And the level at which that occurs has probably got more complex/deep as time's gone on. It's almost tautological, I can't really see that that would ever not be the case for anything.

      Kicking a ball is bloody easy, but soccer's played at a high level by extremely trained, extremely well-paid people. (I'm far from a fan, fwiw, but I think it's a clear example.)

    • lbriner 701 days ago
      Isn't the issue that humans will never accept that something is finished? People have spent many years sorting many things, but at some point someone suddenly decides (for good or bad) that sorting could be better, that files could be done differently, that ext4 is no longer suitable for scenario X, and they start inventing something new.

      Most of these things probably die after a while, if they ever really become well-known, but if you work for some FANG (or whatever it is now!) company, you might have enough resources to force the new way of thinking onto everyone else, so that FOMO kicks in unless you are also using the new best thing.

      In so many cases, people get religious about differences that might affect Google's or Amazon's bandwidth usage but no one else's (like using a new image format). I wonder how many developers, particularly web developers, actually know for certain which parts of their system are unreasonably slow and actually need to change?

    • JoshCole 701 days ago
      Immutable data comes to mind as something relatively simple. There is a world of complexity for working with it in a fault-tolerant and efficient way, but from a structural perspective an immutable bit of data is one thing rather than many things and it is only that thing and can only be that thing forever. This strikes at the very core of what simplicity is meant to be, not in the sense of ease, but in the sense of not being complexity.
  • nraynaud 701 days ago
    I had a client simultaneously telling me he wanted a function, but that it couldn’t take that much time to develop because it’s only useful 5% of the time. Sorry dude, it’s all or nothing; it’s not like mechanics, where there is a range of solutions going from sheet metal to forged titanium, or from a piano hinge to magnetic bearings.

    I’m a bit unfair, because some high-tech stuff is cheap; we all use tree and hash structures whose development we don’t pay for because they are in the standard library, but that’s a few strategic, standardized places. It’s the same as using a ball bearing in mechanics: you get into a very high-tolerance world for cheap, the mechanical interface is standard, and for all I know those bearings are made by some unicorn in a chocolate factory; they might develop maps and queues in the other half of their time.

  • ivraatiems 701 days ago
    I agree with the general premise - sometimes "simple" things in software are really complex - but not the specific examples. The reason for the complexity in the author's examples is that they're using web canvases in a way they are not, as far as I know, intended to be used. Canvases are for 2D shapes/bitmap rendering, not complex text and user interactions. You're having trouble because you're reinventing the wheel. Maybe it's necessary for your use case, but that doesn't make it shocking that it's hard.
    • ec109685 701 days ago
      Unless you’ve done it before, there’s no way you would think it to be that complex to simulate the OS behavior. That was the point of the article.

      When reading that list, my thoughts turned to whether there is a good open source implementation that could be used / cross-compiled into WebAssembly, since surely somebody has done it correctly.

  • gernb 701 days ago
    Now if only Flutter would learn all of that and stop trying (and failing) to reproduce it! Guess what? OS-level and browser-level spell checking doesn't work in Flutter, because it's rendering pixels instead of using OS text input / HTML text input. And there are a ton of other issues that will likely never be fixed. It almost makes me wonder if Flutter's goal isn't good UX but rather user control. If it's just pixels, it's much harder for users to copy and paste, quote, scrape, etc....
    • ec109685 701 days ago
      Interesting they don’t use hidden native elements to simulate that. I guess kerning and layout would be subtly different between platforms.

      I wonder what Google Docs does on iOS?

      ReactNative doesn’t have this problem since the ui is expressed ultimately as native components.

  • PaulHoule 702 days ago
    It's a funny topic for me because last summer I wrote a text layout engine for Python because I thought the alternatives sucked. I struggled with loading libraries to support vertically-oriented Japanese text and then it hit me "these characters are all squares... it can't be that hard!" Then there was the gradual epiphany that: (1) if you want to use serif typefaces and have it look good you have to kern them properly (2) Word, Powerpoint, and Adobe tools don't kern properly (3) that's why sans serif typefaces are so fashionable these days

    Of course I am using this to print cards with custom software and it doesn't have to be interactive, reflow, or be useful to anyone else.

    • dhosek 701 days ago
      Here’s the secret of typesetting vertical Japanese text: You turn all the characters anticlockwise 90°. It is now exactly the same process as typesetting any L-R text.

      Regarding your epiphany: Word doesn't turn on kerning by default, but it does indeed kern properly. I don't use PowerPoint, but I'm guessing there might be a similar setting on the second tab of the font modal for turning on kerning. I don't know where you get your claim that Adobe tools don't kern properly, but they very much do. And finally, kerning is just as important for sans-serif type as it is for serifed type. The only arena where it doesn't come into play is monospace type. Which, perhaps surprisingly, Japanese type traditionally is.

    • froh 701 days ago
      Yes, and also ligatures, serif fonts love ligatures...
      • PaulHoule 701 days ago
        Another complaint I have is that if you look really close at most typefaces the letters and numbers seem to not just not come from the same typeface but not even from the same planet.
      • dhosek 701 days ago
        Which are (a) not unique to serifed faces (the original Avant Garde famously was full of discretionary ligatures) and (b) not necessarily ideal for serifed faces. Many typefaces designed for Linotype hot metal look decidedly worse with ligatures than without, because the f could not have any overhang, so if you had something like fifty, the two fs looked bizarre in conjunction.

        My go-to type for demonstrating this problem was Trump Mediaeval but sometime in the transition from Type 1 to Opentype, the type gnomes at Linotype saw the obvious issue here and replaced the ugly fi ligature with a logotype with fi without the ligature (the logotype existing more for the sake of documents which might have manually specified fi rather than fi (I’m curious to see whether the difference is visible when this gets translated into its presentation form)).

  • kccqzy 701 days ago
    Somewhat famously, Knuth (with Michael Plass) came up with a dynamic programming algorithm for breaking paragraphs into lines with full justification (both margins aligned). Later there were of course further innovations, like character protrusion and expansion, to achieve that optically aligned look without much (if any) hyphenation.
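    The core of the idea fits in a few lines; this is the simplified "minimum raggedness" textbook variant (squared-slack cost over monospace character counts, with none of TeX's glue, penalties, or hyphenation):

    ```javascript
    // DP line breaking: choose break points minimizing total squared slack.
    function breakParagraph(words, width) {
      const n = words.length;
      // Cost of putting words[i..j] on one line; Infinity if they overflow.
      function lineCost(i, j) {
        const len = words.slice(i, j + 1).join(' ').length;
        if (len > width) return Infinity;
        if (j === n - 1) return 0; // last line: trailing slack is free
        return (width - len) ** 2;
      }
      const best = Array(n + 1).fill(Infinity); // best[i]: cost of words[i..]
      const next = Array(n).fill(n);            // chosen break after position i
      best[n] = 0;
      for (let i = n - 1; i >= 0; i--) {
        for (let j = i; j < n; j++) {
          const c = lineCost(i, j) + best[j + 1];
          if (c < best[i]) { best[i] = c; next[i] = j + 1; }
        }
      }
      const lines = [];
      for (let i = 0; i < n; i = next[i]) lines.push(words.slice(i, next[i]).join(' '));
      return lines;
    }

    console.log(breakParagraph('aaa bb cc ddddd'.split(' '), 6));
    // [ 'aaa', 'bb cc', 'ddddd' ]  (greedy would emit 'aaa bb' / 'cc' / 'ddddd')
    ```

    The global optimization is what lets it trade a slightly looser first line for much better lines later, which greedy wrapping cannot do.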
    • PaulDavisThe1st 701 days ago
      I am not sure it's clear that Knuth's work is appropriate for all or even any non-Roman alphabetic languages.
      • svat 701 days ago
        The best typeset books I've seen in Devanagari (a non-Roman script, used for Sanskrit, Hindi, Marathi, Nepali) have been typeset with XeTeX (XeLaTeX). Of course this involves other system libraries like Harfbuzz to handle the kerning and measure widths of individual words, and leaves some of the complexities mentioned in the post to the user, but the Knuth–Plass algorithm from TeX is still what is at the heart of it, for breaking paragraphs into individual lines. Take a look at TUGboat and the TUG conferences or the TeX.SE website to see it being used for many languages and scripts (including Arabic, Hebrew, etc). TeX is also big in Japan it appears, and they maintain their own variant engines (ptex, eptex, uptex, euptex).
        • PaulDavisThe1st 701 days ago
          In the 80's, I did the typography design for two books with TeX and LaTeX, so it's not as if I'm unfamiliar with the stuff :) However, it is true that I haven't seen many examples of non-Roman script being well-set with TeX, so I guess I should take another look.
      • cracrecry 701 days ago
        It was used exactly for that application.

        Not that long ago, typesetting math formulas was extremely expensive, unless you used TeX. It was not very commercial and companies were not interested but certain individuals were.

        The same happened when you wanted your Hebrew or Arabic or Japanese just to look right. Japanese uses at least four scripts plus latin words in the same place.

        Over time, commercial entities added support for those, most of them just incorporating the TeX source code (as it was public domain). They added the visual interfaces and the ease of use that TeX lacks.

        • hnfong 701 days ago
          > Japanese uses at least four scripts

          Which four? I can only count to 3. My understanding is that all the characters are "fixed width", even the Latin ones. I imagine that makes typesetting easier, and presumably doesn't require the complexity of the Knuth algorithm.

          • astrange 698 days ago
            Japanese text is often not set fixed-width in real life. If it includes non-CJK characters like Latin text and punctuation, those may be proportional in a font that matches the other text, for instance.

            And when you’re not talking about a book, but an ad or something like that, it’s common to have a bunch of different text sizes or make some of the words in a sentence bigger to emphasize them.

          • kccqzy 700 days ago
            Kanji, katakana, hiragana, Arabic numerals, Latin text, and punctuation. It depends on the quality of the typesetting and the formality, but you can expect the Arabic numerals, Latin text, and punctuation to have different widths than the rest. Of course it's acceptable to make everything full-width, just like in English.
  • umvi 701 days ago
    Some of these things are only super hard if you are developing a general solution. The general solution to word wrapping across all written human languages in use on the planet is indeed difficult. But implementing basic word wrap for an HTML5 game that only English players will play isn't too hard.
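    For reference, the "isn't too hard" version really is about ten lines, if you assume space-separated English words and a monospace column width (no hyphenation, no CJK, no combining characters, no bidi):

    ```javascript
    // Greedy word wrap for space-separated text into a fixed column width.
    function wordWrap(text, width) {
      const lines = [];
      let line = '';
      for (const word of text.split(/\s+/)) {
        if (line === '') line = word;                                  // first word on the line
        else if (line.length + 1 + word.length <= width) line += ' ' + word;
        else { lines.push(line); line = word; }                        // start a new line
      }
      if (line !== '') lines.push(line);
      return lines;
    }

    console.log(wordWrap('the quick brown fox jumps over the lazy dog', 10));
    // [ 'the quick', 'brown fox', 'jumps over', 'the lazy', 'dog' ]
    ```

    Every assumption baked in above (spaces separate words, one character = one column, left-to-right) is exactly what breaks once you leave English.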
    • bo0tzz 701 days ago
      Unless you're targeting a very localized audience (eg employees in one company, kids in one school district), I don't think it's safe to assume that only English users will use your software.
      • Aeolun 701 days ago
        It may not be safe to assume only English speakers will use your software, but it is certainly reasonable to support only them.
      • adrianN 701 days ago
        If your software only supports English, most of its users will be fine with English.
  • PaulDavisThe1st 701 days ago
    I found it amusing to read this, given that in the context of Ardour (0) we still have the long-term stretch goal of replacing GTK with an expansion of our home-grown canvas-based widget system. There are 5 widgets missing; one of them is the text entry. The description in TFA regarding doing this in the context of an HTML canvas element is more or less identical to the issues that exist for a native re-implementation, and there's consequently a similar high-pressure rationale for giving up on the idea.

    (0) https://ardour.org/

  • EdSchouten 702 days ago
    Parsing and displaying a floating point number, a la strtod() and printf("%f"). I think that's always the prime example of something that looks simple but simply isn't.
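    A quick illustration of the asymmetry (in JavaScript for brevity): the stored binary value of a decimal literal is not what you typed, yet the default formatter must find the shortest string that parses back to the same value, a problem hard enough to have named algorithms (Steele & White's Dragon4, Grisu, Ryū):

    ```javascript
    // 0.1 has no exact binary double representation; the stored value is:
    console.log((0.1).toFixed(20));     // "0.10000000000000000555"
    // Yet the default formatter prints the shortest round-tripping string:
    console.log(String(0.1));           // "0.1"
    console.log(Number('0.1') === 0.1); // true: parse(print(x)) === x
    ```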
  • lelanthran 701 days ago
    I don't know of any developer that thinks text rendering is "simple".

    Even some non-devs, due to using word-processors that completely "mess-up" everything the minute a single table is added, won't think that text rendering is simple.

    The best "simple-is-very-complicated" essay to read is "Reality has a surprising amount of detail" - http://johnsalvatier.org/blog/2017/reality-has-a-surprising-...

    • ec109685 701 days ago
      That was an excellent blogpost.

      This whole discussion reminds me of Joel's "single biggest mistake" software developers make: https://www.joelonsoftware.com/2000/04/06/things-you-should-...

      Namely, rewriting software is hard because the years of subtle complexity aren’t immediately apparent, and they are hard to get right the second time (at least harder than first thought).

  • ozzythecat 701 days ago
    There are two things I absolutely do not want to implement, ever, if I can get away with it.

    Any kind of dynamic page rendering: we want to write code once and re-use everywhere, in the Web browser, mobile, iOS, Android, etc.

    This is a fool’s errand 99% of the time. You’ll end up building something super constrained if it is usable at all. Admittedly there are some viable options for this like React Native. Yeah, I just don’t want to work on this type of code.

    Client libraries: I want to write my own client, on top of a third-party networking library, to send HTTP requests and handle threading, TCP connections, connection pools, thread pools, etc.

    Even when using something as simple as libwebcurl, innocuous things fail in weird, unexpected ways that almost never come up in (my) unit tests, integration tests, and chaos tests.

    Server side client code that calls another service can be dangerous. Device code that you ship out to millions of devices is even more dangerous.

    Oh… 50% error rate because suddenly the code can’t access an SSL cert on the device. Wait… what??? I can’t even reproduce it?

    I’ve been writing code since grade school and am a decade now into my career. I’m an above average programmer on a good day, and terribly average at best on most days. So of course, I might be sounding melodramatic here.

  • userbinator 702 days ago
    I'll offer the counterpoint that it's only complicated if you want to make it so. If you only need e.g. monospace ASCII, that doesn't take much. Do you really need proportional fonts? Vertical layouts? Keming? In my experience, a lot of the time the answer is actually "no" --- and thus the simpler (and more efficient, less bug-prone) algorithm works just fine.
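    For what it's worth, under exactly that monospace-ASCII assumption (one character equals one column, no shaping, no combining marks), greedy word wrap really is trivial; Python's standard library even ships it:

```python
import textwrap

# Under the "monospace ASCII, 1 char = 1 column" assumption, word wrap is a
# solved problem. Every complication discussed in the article comes from
# dropping that assumption.
lines = textwrap.wrap("the quick brown fox jumps over the lazy dog", width=15)
for line in lines:
    print(line)
```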
    • FooBarWidget 701 days ago
      Whether you need them is up to users. If you have Asian users then ASCII will not do. No kerning? Why would users put up with ugly text rendering? Just to make our lives as developers easier?

      This attitude reminds me of what someone said about the difference between German and Japanese tech. Japanese products last forever because they are overdesigned to avoid breaking even when misused, because their makers are customer-focused. Germans are rule-focused, so they design their products according to spec. German products also last forever, provided you use the product exactly per the instructions. But if you make a mistake... good luck.

      • astrange 698 days ago
        Conversely, Japanese game engines sometimes don’t bother implementing proportional text, and then nobody can translate the game without porting it to Unity first.

        (German translations would need about 3x the UI space of Japanese ones, and that’s with a proportional font.)

      • eska 701 days ago
        As a German working in a Japanese company this kind of romanticization of these two cultures always makes me snort.
      • userbinator 701 days ago
        "ugly" is entirely subjective.
        • can16358p 701 days ago
          Well, if 90 out of 100 users think it's ugly, that would better be resolved to have more users use your library, even if you are in the 10% as the developer.
    • pdpi 701 days ago
      I think you're missing the point. These are fundamentally, but deceivingly, hard problems. The choice to only solve a subset of the problem is almost always the right way forward, but you can only make that decision if you understand how hard the problem is. A less knowledgeable developer can and will bite off more than they can chew with these things.

      Contrast that with things like interpreters, which are a lot simpler to build and work with than most engineers think, and a perfectly reasonable solution to lots of problems.

    • remus 701 days ago
      I think the issue with this approach is that people may nod along and say they're happy with these compromises, but in reality they don't necessarily understand the trade-offs and are then surprised when seemingly simple things don't work as expected. I'm thinking less of programmers here, who will generally understand what these limitations imply, and more of less technical users.
    • gitgud 701 days ago
      > Keming

      Brilliant, I've never seen kerning (ker-ning) spelt in a way that illustrates its importance!

      • GreenWatermelon 701 days ago
        There is a subreddit with the same name for this sort of topic: /r/keming
      • userbinator 701 days ago
        If you don't implement kerning, you won't get keming. ;-)
    • zokier 702 days ago
      Do you really need more than 1s and 0s? Maybe go back to blinkenlights and toggle switches, those are simpler and even less bug prone to implement. You can still perform any computation you want, but it sure will not be as pleasant as having nice ux.
  • joshvm 701 days ago
    There are plenty of variations on this, I'm surprised no one has mentioned Falsehoods programmers believe about names yet:

    https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...

    (as someone with a long, double barreled surname this is relatable)

    The common thread is usually which assumptions you can make about your use case: when are they valid, and when do they fail?

  • dhosek 701 days ago
    Grapheme segmentation in Unicode is actually kind of interesting and completely data-driven. I wrote my own implementation of the segmentation algorithm in Rust recently¹ (because the existing segmentation crate didn’t have an interface that let me get the next grapheme cluster from a char iterator). It was fascinating to dig into the specifications and see how everything fits together.

    1. It will be published to crates.io sometime in the next month or so, there’s possible optimization work to be done, or at least, an alternate interface to the iterator to see whether the speed penalty I’m paying is an algorithmic problem or just an artifact of the trade-offs I have for my use case. I also ended up writing my own character class code as well and in benchmarking against the current popular crate got a 10x speedup,² so even if my segmentation code is useless to anyone but myself, I think that this code at least will be useful.

    2. The other crate has a really inefficient implementation of looking up character classes where it ends up having to do a binary search in the data tables for many characters while I use a two-step table approach which lets me get a character’s class in O(1) time (and the tables are designed to be small enough that they’ll stick around in the CPU cache for lookups in a tight loop).
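      The two-step table idea looks roughly like this (a hypothetical Python sketch, not dhosek's actual Rust code, using a combining-mark flag as a stand-in character class): the first table maps the high bits of a code point to a page index, identical pages are stored once, and a lookup is two array indexes with no binary search.

```python
import unicodedata

PAGE_BITS = 8
PAGE_SIZE = 1 << PAGE_BITS

def build_tables(class_of, max_cp):
    """Build a two-stage lookup: stage1[cp >> 8] selects a 256-entry page
    in stage2; duplicate pages are shared, keeping the tables small."""
    seen = {}
    stage1, stage2 = [], []
    for base in range(0, max_cp + 1, PAGE_SIZE):
        page = tuple(class_of(cp) for cp in range(base, base + PAGE_SIZE))
        if page not in seen:
            seen[page] = len(stage2) // PAGE_SIZE
            stage2.extend(page)
        stage1.append(seen[page])
    return stage1, stage2

def lookup(stage1, stage2, cp):
    # O(1): two array indexes, no binary search over range tables.
    return stage2[(stage1[cp >> PAGE_BITS] << PAGE_BITS) | (cp & (PAGE_SIZE - 1))]

# Stand-in "character class": 1 for combining marks, 0 otherwise.
is_combining = lambda cp: 1 if unicodedata.combining(chr(cp)) else 0
stage1, stage2 = build_tables(is_combining, 0x2FFF)
print(lookup(stage1, stage2, 0x0301))   # U+0301 COMBINING ACUTE ACCENT
print(lookup(stage1, stage2, ord("a")))
```

      Because most 256-code-point pages are identical (all zeros, for instance), stage2 stays far smaller than one entry per code point, which is what lets the tables fit in cache.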

    • kwantam 701 days ago
      Sounds like a great improvement! Would it be possible to contribute your code to the existing crate rather than release your own? Then we could all enjoy the benefits on the next recompile :)
      • dhosek 701 days ago
        The existing crate seems to have lost its maintainer.
  • bullen 701 days ago
    I'm doing .ttf OpenGL rendering with custom line breaking for my MMO:

    The trick is to take as much advantage of the command-line as possible and remove all other GUI.

    I'm only going to support ASCII which helps a great deal.

    Chat bubble / command-line above characters is the only GUI the game will have:

    Login: /sign name pass

    Register: /join name pass pass

    Sensitivity: /sens 0.6

    etc.

  • magic_hamster 701 days ago
    You have to leave out semantic considerations when starting out. E.g. if a line break changes the meaning of a word, or hyphens aren't allowed in certain words in some languages, those are not technical challenges. They are semantic challenges, and solving them will require heuristics or specialized handling.

    Still, being aware of these considerations is helpful for designing a solution that will be able to address them later, like determining allowed break spots before doing the actual line break, and keeping that logic extensible.

  • trhway 701 days ago
    I think the complexity we handle has stayed the same. It is just that for the same amount of complexity and mental energy we get more functionality these days than decades ago. Text is a good example: the same complexity and the same amount of code handles a huge number of languages these days, whereas decades ago that complexity would give you only one.
  • dhosek 701 days ago
    >Splitting strings to graphemes is actually quite slow, so you can't do it too often during word wrap, or you'll tank performance.

    If you're splitting strings to graphemes for word wrap, you're doing it wrong. Unicode segmentation defines a word segmentation algorithm that should be used for that use case instead.
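    Sketching that point in Python (with whitespace as a crude stand-in for the real UAX #29 word-boundary rules, which a proper implementation would take from a segmentation library): segment the text into break opportunities once, then wrap by measuring whole segments, never re-splitting into graphemes inside the loop.

```python
import re

def break_opportunities(text):
    # Crude stand-in for UAX #29 word segmentation: each chunk is a word
    # plus its trailing spaces; a line break is allowed between chunks.
    return re.findall(r"\S+\s*", text)

def wrap(text, width, measure=len):
    # measure() is where grapheme/advance-width logic would plug in; it
    # runs once per candidate line, not per character of the whole string.
    lines, line = [], ""
    for chunk in break_opportunities(text):
        if line and measure((line + chunk).rstrip()) > width:
            lines.append(line.rstrip())
            line = chunk
        else:
            line += chunk
    if line.rstrip():
        lines.append(line.rstrip())
    return lines

print(wrap("the quick brown fox jumps over the lazy dog", 15))
```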

  • jimbob45 701 days ago
    Browser text isn’t measurable AFAIK at least in any consistent way. I tried doing it for a project at work and came to the conclusion that it simply wasn’t possible. The SO answer below appears to agree with me.

    https://stackoverflow.com/a/12469203

  • bribri 701 days ago
    There are many hidden complexities in seemingly simple features like text input fields and wrapping text in a canvas. It is better to let the browser or operating system handle these features, as they have been perfected over many years and are much more likely to work correctly than a custom implementation.
  • ycuser2 701 days ago
    Unrelated to the content:

    What is the reason to link to external websites like this: https://www.construct.net/out?u=https%3a%2f%2fen.wikipedia.o...

    Why not set a link to wikipedia directly?

    • leksak 701 days ago
      To track if people clicked on the link maybe?
  • orangepurple 701 days ago
    I was going to nominate mutexes and semaphores
  • buescher 701 days ago
    A whole lot of the simple functions in the standard C library are in there because a naive implementation will have bugs, often subtle ones. Implementing strcpy or strlen used to be a common interview question.