Ask HN: How would you sort HN users by quality of comments?

Suppose you wanted to generate a list of HN users who consistently provide high-quality comments.

What metrics would you use to construct your formula?

26 points | by jawns 1895 days ago

15 comments

  • Sohcahtoa82 1895 days ago
    I don't think it's possible.

    A high-voted comment isn't necessarily high-quality, just really popular. I recently had a comment get to 36 points [0]. I'm honestly not convinced of its quality, even though it's my own comment. It was a strongly worded opinion based on a high level of cynicism, not research. But people liked it, so it got a decent number of votes.

    [0] https://news.ycombinator.com/item?id=19060284

    • itd00d 1892 days ago
      Reddit still can't figure it out. "Best" sorting is clearly curated by mods, at least in the top 4-5 comments on real popular posts...the only valuable sorting on sites like this is "Top," "Bottom," and most "Controversial" (most down votes/up votes).
  • AnimalMuppet 1895 days ago
    Maybe something like (# of comments upvoted - # downvoted) / (total # of comments) or (total # of upvotes - total # of downvotes) / (# of comments)

    You could tweak this to say that downvotes get heavier weight, say.
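
    A minimal sketch of the second formula, with the downvote weight exposed as a parameter. The `Comment` record and the `downvote_weight` name are hypothetical, and the per-comment up/down counts are assumed inputs rather than something HN is known to publish:

```python
from dataclasses import dataclass


@dataclass
class Comment:
    # Hypothetical record: raw vote counts for a single comment.
    upvotes: int
    downvotes: int


def net_votes_per_comment(comments, downvote_weight=1.0):
    """(total upvotes - weight * total downvotes) / number of comments."""
    if not comments:
        return 0.0
    total = sum(c.upvotes - downvote_weight * c.downvotes for c in comments)
    return total / len(comments)


# Made-up history for one user.
history = [Comment(10, 1), Comment(3, 0), Comment(0, 4)]
plain = net_votes_per_comment(history)
weighted = net_votes_per_comment(history, downvote_weight=2.0)
```

    Raising `downvote_weight` above 1 punishes users whose comments get buried, which is one way to read "downvotes get heavier weight".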

    You could also look at number of replies, but a flamebait comment could also get a large number of replies, so that might not be a good metric.

    You could also look at the number of times their posts have been flagged.

    None of these are foolproof. You're probably going to have to experiment to find something that you think gives sane results.

    I admit that I'd like to see your results...

    • LinuxBender 1895 days ago
      I believe your idea makes sense. With time, however, people may learn a way around it. One technique I use: if I'm going to post something controversial, I just put it on a random domain and submit it as an article. Article submissions are only subject to flagging and not downvotes.
  • peterwwillis 1895 days ago
    "High quality comments" for a user in general is completely subjective. They might have "high quality comments" on C programming, and "low quality comments" on marriage equality. You'd need a custom filter for every possible topic of conversation, and even then you'd have to filter based on what you find to be "high quality".

    It's easier to just filter based on popular opinions or popular people, which is what karma is for.

    • muzani 1895 days ago
      This is a high quality comment and yet it's at the bottom for some reason.
  • krapp 1895 days ago
    Read their comments, decide if I like them.
  • Adamantcheese 1895 days ago
    Hand select comments and produce a training set. Train some model, pump every comment through it. Make a histogram of all the values you test and any user that is a statistical outlier from that dataset (towards higher values) will be added to a list of "high quality" posters.
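
    A rough sketch of the last step only, assuming some trained model has already given every comment a numeric quality score. The function name, the z-score cutoff, and the sample data are all made up for illustration:

```python
from statistics import mean, stdev


def high_quality_users(user_scores, z_threshold=2.0):
    """Return users whose average model score is a high-side outlier.

    user_scores maps a username to the list of model scores for that
    user's comments (the model itself is assumed to exist elsewhere).
    """
    averages = {u: mean(s) for u, s in user_scores.items() if s}
    if len(averages) < 2:
        return []
    values = list(averages.values())
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return sorted(u for u, avg in averages.items() if (avg - mu) / sigma > z_threshold)


# Made-up scores on a 0-10 scale.
scores = {
    "a": [5, 5], "b": [4, 6], "c": [5],
    "d": [5, 5], "e": [5], "f": [9, 10],
}
outliers = high_quality_users(scores, z_threshold=1.5)
```

    With only a handful of users the z-score can never get very large, so the cutoff would need tuning against a real population.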
  • chatmasta 1895 days ago
    "Squawk to Talk" ratio

    Squawk = number of words in your hacker news comments

    Talk = number of words in children of your hacker news comments
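
    As a quick sketch, assuming "children" means direct replies and that words are split on whitespace (both are my reading, and the function names are hypothetical):

```python
def word_count(text):
    # Whitespace-separated tokens; a deliberately crude word counter.
    return len(text.split())


def squawk_to_talk(own_comments, reply_comments):
    """Words written by the user divided by words written in reply to them."""
    squawk = sum(word_count(c) for c in own_comments)
    talk = sum(word_count(c) for c in reply_comments)
    if talk == 0:
        # No replies at all: the ratio is undefined, so report infinity
        # for a non-empty squawk and 0.0 when there is nothing on either side.
        return float("inf") if squawk else 0.0
    return squawk / talk


ratio = squawk_to_talk(
    ["short point here"],
    ["I agree with this", "me too"],
)
```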

    • diffeomorphism 1895 days ago
      That metric would seem more useful to find controversial comments, not good comments.
    • wbkang 1895 days ago
      I don't agree with this. Some people write comments that are completely false or outdated, which is followed by a series of corrections and tangential discussions.
    • yesenadam 1895 days ago
      I imagine some people perhaps prowl 'new' and mostly do ground-level comments on stories with very few or no comments; others favour stories with already many comments and leave 4th or 5th level comments mostly, as on pages with hundreds of comments, most are at that height or higher. Early commenters vs late commenters.
    • tyingq 1895 days ago
      What denotes good though? I can imagine scenarios where either high or low squawk/talk could be good...or bad.
  • CM30 1895 days ago
    Something along the lines of:

    Comment scores for all comments added together (with negative scores taken off) / number of comments. A bit simplistic, and prone to echo chamber effects (high rated may just mean they say things they know everyone will agree with), but it's probably the most objective setup here, given the lack of miraculous super smart AI to 'judge' quality on any deeper level.

    • tfehring 1895 days ago
      In addition to echo chamber effects, this would strongly favor comments in popular threads. This may or may not be appropriate, depending on how you define "high-quality."
      • zamalek 1895 days ago
        > this would strongly favor comments in popular threads

        Sum the total number of upvotes and score based on that. Track an independent score for downvotes. Display the two scores separately, in addition to the ratio between them. The MVP could be as simple as `votes_for_user_in_post/votes_in_post`. Scores would be intentionally low, you're looking for a track-record of greatness and not a once-off jackpot (or crackpot) opinion.

        There could be further improvements with population distributions: if most of the comments are receiving between 2 and 10 upvotes, a comment with 20 is truly exceptional (for that post).
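
        A small sketch of both ideas, assuming per-comment vote counts are available for a thread. `vote_share` mirrors the `votes_for_user_in_post/votes_in_post` ratio; `standout_score` is one possible take on the population-distribution idea using a plain z-score, and both names are hypothetical:

```python
from statistics import mean, pstdev


def vote_share(user_votes_in_post, votes_in_post):
    """votes_for_user_in_post / votes_in_post, or 0.0 for a thread with no votes."""
    return user_votes_in_post / votes_in_post if votes_in_post else 0.0


def standout_score(comment_votes, thread_votes):
    """How many standard deviations a comment sits above its thread's mean."""
    sigma = pstdev(thread_votes)
    if sigma == 0:
        return 0.0
    return (comment_votes - mean(thread_votes)) / sigma


# Made-up thread: most comments get 2 to 10 upvotes, one gets 20.
thread = [2, 4, 6, 8, 10, 20]
share = vote_share(20, sum(thread))
z = standout_score(20, thread)
```

        Here the 20-vote comment holds 40% of the thread's votes and lands about two standard deviations above the thread mean, which is the "truly exceptional (for that post)" case.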

        The problem that follows is that early comments that are mediocre will likely get more upvotes than late comments that are great. You're strangely aiming for the inverse of the HN comment ranking algorithm. StackOverflow has this problem: quick answers that are barely good enough typically receive more upvotes, and S/O hasn't solved that problem.

  • colvasaur 1895 days ago
    I don't think comment karma is publicly available through the site or API, but average comment karma could be calculated as (total karma of user minus sum of post karma) divided by number of comments.
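
    As a worked example of that arithmetic, assuming a user's total karma is simply post karma plus comment karma (HN may not compute it exactly that way), with a hypothetical function name and made-up numbers:

```python
def average_comment_karma(total_karma, post_karma, comment_count):
    """(total karma - sum of post karma) / number of comments."""
    if comment_count == 0:
        return 0.0
    return (total_karma - sum(post_karma)) / comment_count


# Made-up user: 1200 total karma, two posts worth 150 and 50, 400 comments.
avg = average_comment_karma(total_karma=1200, post_karma=[150, 50], comment_count=400)
```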
  • tedmiston 1895 days ago
    An aggregate metric based on comment karma scores would probably be the most useful. That said, you will need other HN users to grant you access to their accounts to be able to get this data.
  • quickthrower2 1895 days ago
    I’d do it solely by the number of times their comments have been favourited by reputable users, say over 2k rep.
  • zorronimous 1895 days ago
    I would use only my own votes since quality is merely an opinion.
  • itronitron 1895 days ago
    what do you mean by 'consistently' ?
  • angersock 1895 days ago
    Number of times they've been banned. :^)
  • writepub 1895 days ago
    Whatever metric you use, since HN as a community is effectively training the metric, you'll simply capture what HN users consider as "high-quality" comments.

    HN users do have their biases (in my opinion):

    - Anti big company (GOOG, AAPL, ... Less so for AAPL though)

    - Mostly left leaning

    - Value political correctness over accuracy

    - Anti defense and government (loathe 3 letter agencies, despite being fully cognizant of their mission statements)

    I personally do not consider any comment falling squarely in the above silos to be high-quality, but you'll see them being consistently up-voted by the community.

    • thundergolfer 1894 days ago
      I think what you've picked out as biases on HN really just reflects your biases.

      Are you by chance right-wing? I'm solidly left wing and personally more often see the opposite bias permeate HN comments for the various topics you listed.