Show HN: 40k HN comments mentioning books, extracted using deep learning

(hacker-recommended-books.vercel.app)

1359 points | by tracyhenry 948 days ago

114 comments

tracyhenry 948 days ago
Hi HN!
I built this small app in my spare time to aggregate books recommended on Hacker News. I personally find books recommended on HN to be super helpful, so I think this is the way that I can contribute back.
This book aggregation idea is not new. A bunch of sites have done similar things [1, 2, 3].
Yet one common limitation of those sites is that they have limited recall (i.e. not able to get a comprehensive set of book mentions), and thus don't paint an accurate picture of what the top books are. They're all based on insufficient rules, e.g., looking for Amazon links. As you can see from my app, people often do not include Amazon links when recommending a book.
I wondered, why can't we just match book names? Well, not so easy. Some books have pretty short names, e.g. Meditations [4], or Steve Jobs [5]. Some book names might as well be the names of movies, e.g. Ready Player One [6]. Simply matching the names of the books would produce a whole lot of irrelevant results.
This is where Deep Learning comes into play. Recent advances in large NLP models (transformers and BERT in particular) have made machine language understanding unprecedentedly accurate. This enables me to fine-tune a BERT model on a couple thousand labeled HN comments and predict accurately whether each word in a comment is part of a book title or not - a task commonly known as Named Entity Recognition (NER).
As a result, my app is able to present a whole lot more results while maintaining desirable accuracy. For example, NER works pretty well on the tough examples I mentioned ([4, 5, 6]). Compared to prior sites, my app captures 9-50X more mentions and thus presents a much more complete picture of what books are recommended on HN.
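For readers new to NER, here is a minimal sketch of the last step described above: turning per-token predictions into book-title strings. It assumes the model emits BIO-style tags. The tag names `B-BOOK` and `I-BOOK` and the helper `extract_titles` are illustrative and not taken from the app's code, and the tags are hard-coded here in place of real model output.

```python
from typing import List


def extract_titles(tokens: List[str], tags: List[str]) -> List[str]:
    """Group consecutive B-BOOK / I-BOOK tokens into title strings."""
    titles: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-BOOK":
            # A new title starts; close any title that was still open.
            if current:
                titles.append(" ".join(current))
            current = [token]
        elif tag == "I-BOOK" and current:
            # Continuation of the title that is currently open.
            current.append(token)
        else:
            # Any other tag (or a stray I-BOOK) ends the open title.
            if current:
                titles.append(" ".join(current))
            current = []
    if current:
        titles.append(" ".join(current))
    return titles


# Hypothetical model output for one comment.
tokens = ["I", "loved", "Steve", "Jobs", "by", "Isaacson", "and", "Meditations"]
tags = ["O", "O", "B-BOOK", "I-BOOK", "O", "O", "O", "B-BOOK"]
print(extract_titles(tokens, tags))  # ['Steve Jobs', 'Meditations']
```

The point of per-token tagging is that "Steve Jobs" is only picked up when the surrounding words make it look like a title, which plain string matching cannot do.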
Furthermore, I've made sure that the comments are presented well in the UI because the recommendations are just as useful as the books. I highlighted the mentioned book name, and used a custom NLP-based ranking function to sort the comments. These are non-trivial improvements over prior sites, which I hope you can find useful.
Nevertheless, this app is not without limitations: 1) matching book names would fail when two books have the same or similar names; 2) although not often, this approach would wrongly classify some short stop-word names [7]; and 3) sometimes NER fails to see that the commenter actually hates the book. These problems can be alleviated with more Deep Learning. For 1), one can use BERT to learn the authors mentioned, which can be used as a filtering criterion. 2) and 3) should be fixable with more training data (currently there are only ~4,000 hand-labeled HN comments).
Lastly, I'd like to especially thank my gf who helped me label ~1,000 comments, which boosted the model accuracy by 5 percent! I also want to thank the people who create and maintain the HackerNews big query dataset [8]. And of course, thank everyone on HN who recommends books to others.
Hope you enjoy this app! Feedback and suggestions are welcome :)
[1] https://news.ycombinator.com/item?id=15169611
[2] https://news.ycombinator.com/item?id=10924741
[3] https://news.ycombinator.com/item?id=12365693
[4] https://hacker-recommended-books.vercel.app/category/0/all-t...
[5] https://hacker-recommended-books.vercel.app/category/1/all-t...
[6] https://hacker-recommended-books.vercel.app/category/0/all-t...
[7] https://hacker-recommended-books.vercel.app/category/12/past...
[8] https://news.ycombinator.com/item?id=19304326
P.S. The Amazon links are NOT sponsored. This app is free of monetization.
guidovranken 948 days ago
Nice. The Hacker News archive contains a wealth of great information. I've previously performed similar extractions like OP but with grep and SQL. I've also looked for people who have accurately predicted the stock market (I did identify one pro investor. He's now into NFTs). I've found so much cool stuff, spending whole nights looking for interesting users and reading their entire post histories and being blown away by many insightful posts. I've been considering making a blog consisting entirely of insightful HN posts that I come across.
- saadalem 947 days ago
  I think someone should start a fund that exclusively invests in startups that get torn to shreds on HN, “Why would anyone use this, I could build it in a weekend!”.. often means the startup goes on to reach a billion dollar valuation. Here's my idea: understand the sentiment of HN comments using BOW models on successful startups already launched, invest in the next ones.
  - humanistbot 947 days ago
    > “Why would anyone use this, I could build it in a weekend!”.. often means the startup goes on to reach a billion dollar valuation.
    I'm going to take extreme issue with your use of the word "often" here.
    - puchatek 947 days ago
      In that case I shall take issue with _your_ use of the word "extreme" (even though I ultimately cannot verify the level of issue you took)
  - mikepurvis 947 days ago
    Sounds like an IFF fallacy: sure, HN hates some things that turn out to be successful, but lots of things HN hates really are terrible (or turn out to be successful for unrelated reasons, like the team was great and they managed to pivot away from the terrible idea).
    - 1cvmask 947 days ago
      It is like not seeing the potential in Dropbox.
      Anyone technically minded used FTP back in the day. So why was there a need for Dropbox (at that time)?
      That is what makes it hard to invest if you think too much or "know" too much. You get blinkers that prevent you from seeing what become obvious successes because of UX improvements for the non-technical crowd.
      - setr 947 days ago
        It’s not like… that’s precisely the scenario being referred to. The problem is that’s always the example, which suggests that the Dropbox event is likely closer to the exception than the rule. And its existence is used to deflect all criticisms, which is generally an indefensibly dumb strategy.
        Knowing the subject well may put blinders on you in some situations… but what’s the alternative? Know the subject poorly, flail about wildly and hope you land something by pure chance? Obviously knowledge isn’t the problem here; you need it to judge whether it’s a good decision or not. It’s the over-specialization, combined with the lack of empathy for the average user, that produced the Dropbox event.
        riffraff 947 days ago
        time to run a statistical analysis on "I can build it easily myself" comments!
    - Semiapies 947 days ago
      AKA, "Yes, they laughed at Columbus, but they also laughed at Bozo the clown."
- air7 948 days ago
  Please do. That sounds super interesting.
- manigandham 940 days ago
  Do you have a list of those interesting users?
- moneywoes 948 days ago
  Do you mind sharing which investor?
  - thujlife 947 days ago
    John Titor
    - tonetheman 947 days ago
      you win the internet today... pinky. but tomorrow is another day...
    - system2 947 days ago
      Thanks, now I am sucked into a YouTube rabbit hole.
hoten 947 days ago
You mention the Amazon links are not affiliate links. As a default, that's a nice move, but I believe you are within your rights to add a toggle to enable affiliate links. The money probably isn't the point but it's nice to make enough to buy even a single beer or coffee from a side project, and honestly I believe just about everyone would toggle the option if they found the tool useful.
- riffraff 947 days ago
  I would go further and suggest just having a link to wikipedia or goodreads or some other non-money-generating site and one (always with affiliate tags) to amazon.
  IMO it's not a "bad surprise" to a user that an amazon link is an affiliate one, it's just annoying when the only way you can get information on a book is through affiliate links.
  - jerbearito 947 days ago
    Agreed. My approach would be: affiliate links by default, with a disclaimer and a toggle.
- muzani 947 days ago
  Yeah, I'd love to click an affiliate link. This is a very useful site and an affiliate link is a "free" donation that doesn't cost me anything.
- dionidium 947 days ago
  > As a default, that's a nice move, but I believe you are within your rights to add a toggle to enable affiliate links.
  I'd go even further. It's a loud minority that even cares at all about this and they make it feel like everybody agrees with them, but most people don't care and you're totally within your rights to do it. You should go for it!
- password4321 947 days ago
  The downside of affiliate links that I'm aware of is that Amazon makes available the entirety of any orders credited to an affiliate, so if multiple items are ordered the affiliate is able to see the entire order.
  Clicking through an affiliate link associates future orders for 24 hours, and once an order is placed no further orders are associated. (If an item is added to the cart while associated with an affiliate link it stays associated for 90 days, but it's not clear to me what information is provided to the associate regarding other items in the order if that item is eventually purchased.)
  https://affiliate-program.amazon.com/help/node/topic/GPTZ495...
- jahller 947 days ago
  Completely agree. It's always nice to give back to the community, but nobody will bat an eye if you cover your costs.
jedwhite 948 days ago
Hey this is really awesome! Well done.
You mentioned transformers and BERT for large NLP models. I've been playing around with this too and it's a really powerful approach. Have you used spacy-transformers? [0]
The approach is pretty cool and can be used with BERT, GPT-2/Hugging Face etc.
I'm just starting to experiment with GPT-J and thinking of trying this approach also [1].
Anyway, totally awesome project and the results are really good. This stuff really is almost unreasonably effective!
[0] https://explosion.ai/blog/spacy-transformers
[1] https://6b.eleuther.ai/
- tracyhenry 948 days ago
  Thanks! I used Huggingface's pretrained BERT.
  - sillysaurusx 948 days ago
    Please write up how you did this! It may seem easy or straightforward, but I assure you it's black magic to a lot of people.
    - tracyhenry 947 days ago
      You can refer to the medium post in this comment, which does more or less what I did: https://news.ycombinator.com/item?id=28601741
    - stanislavb 947 days ago
      That'd be great.
  - jedwhite 948 days ago
    This is a really good application of it. Getting NER right for something like book titles with so much name collision with other domains and entity types is really hard, and this works great on something that most people would never realize would be so hard!
  - malshe 948 days ago
    This is really impressive! Can you please elaborate more on the way you labeled the data? I think usually there is a lot to learn from labeling methods.
    - tracyhenry 947 days ago
      I generated candidate training comments by matching book names. Roughly one in five of those comments actually contains a book mention. Then I used the Doccano labeling tool to label the tokens in the comments.
      - malshe 946 days ago
        Thanks! I will check out Doccano now
  - m3at 947 days ago
    Thanks for building this, it's useful and well presented!
    Replies to your pinned top comment seem to be disabled, so allow me to ask here (sorry for the hijack):
    > used a custom NLP-based ranking function to sort the comments
    Can you expand on this function? (I'm familiar with NLP and most SOTA models)
    - tracyhenry 947 days ago
      Ah right now ranking is really rudimentary: I just use TextBlob's SA library to get a sentiment score for the comment, and combine it with the comment length. In theory we can also use BERT to get a sentiment score, which should be more accurate I guess? Love to hear your thoughts.
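      A rough sketch of what "sentiment combined with length" could look like, for anyone curious. The actual formula is not given in the thread, so the weighting, the 200-word cap, and the name `rank_score` are all assumptions. The polarity is passed in as a number in [-1, 1]; with TextBlob it could come from `TextBlob(text).sentiment.polarity`, which is left out here so the snippet has no dependencies.

```python
import math
from typing import List, Tuple


def rank_score(polarity: float, text: str, length_weight: float = 0.5) -> float:
    """Combine sentiment polarity in [-1, 1] with a saturating length bonus."""
    n_words = len(text.split())
    # The log keeps very long comments from dominating; the bonus is
    # capped at 1.0 once a comment reaches roughly 200 words (assumed cap).
    length_term = min(math.log1p(n_words) / math.log1p(200), 1.0)
    return polarity + length_weight * length_term


# (comment text, precomputed polarity) pairs; the polarities are made up.
comments: List[Tuple[str, float]] = [
    ("Meh.", -0.2),
    ("Loved it.", 0.6),
    ("Loved it, especially the chapters on memory and the worked examples.", 0.6),
]
ranked = sorted(comments, key=lambda c: rank_score(c[1], c[0]), reverse=True)
for text, _ in ranked:
    print(text)
```

      With equal sentiment, the longer and more detailed comment ranks first, and the negative one-word comment sinks to the bottom.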
      - m3at 947 days ago
        Simple is good! Especially for ranking as the objective is hard to define. Having looked at some samples on your site, imo it's good enough :)
        Though if you wanted to try other things just for fun, maybe:
        - Count matching NER, comments with a lot of book recommendations tend not to detail why they like the specific one currently filtered
        - maybe down weight comments that are too short (after the matched NER is subtracted) as they seem to just have a title+link?
        Not ML, but while I'm here: on Android the comment section appears fine but has a horizontal scroll that seems spurious, with lots of blank space to the right (Galaxy S10, Chrome & Firefox)
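        The two ranking tweaks suggested above could be sketched roughly like this. Everything here is hypothetical: the function name `adjust_score`, the penalty sizes, and the 8-word threshold are placeholders to illustrate the idea, not tuned values.

```python
from typing import List


def adjust_score(base: float, text: str, titles: List[str], min_words: int = 8) -> float:
    """Penalize list-style comments and comments that say little beyond the title."""
    score = base
    # Tweak 1: each extra title matched in the same comment costs a little,
    # since list-style comments rarely explain why one specific book is good.
    if len(titles) > 1:
        score -= 0.1 * (len(titles) - 1)
    # Tweak 2: remove the matched titles, then check how much text is left.
    remaining = text
    for title in titles:
        remaining = remaining.replace(title, " ")
    if len(remaining.split()) < min_words:
        score -= 0.5
    return score


detailed = adjust_score(
    1.0,
    "Dune is great because the worldbuilding rewards rereading every few years",
    ["Dune"],
)
list_only = adjust_score(1.0, "Dune, Neuromancer, Snow Crash", ["Dune", "Neuromancer", "Snow Crash"])
print(detailed > list_only)  # True
```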
unmole 948 days ago
Interesting idea but not completely accurate. My own comment about how I hated Thinking, Fast and Slow seems to be counted as a recommendation.
- dang 948 days ago
  Yes. I've removed the word "recommendations" from the title because there are too many cases of negative mentions being treated as recommendations.
  Not a criticism! Sentiment analysis seems to remain an unsolved problem.
  See also
  https://news.ycombinator.com/item?id=28598341
  https://news.ycombinator.com/item?id=28596882
  Edit: oops, my title edit ("Show HN: 40k books on HN extracted using deep learning") was inaccurate. It's 40k comments, not 40k books. Fixed now.
- tracyhenry 948 days ago
  Right, the model is not perfect with the limited training dataset I have (we hand-labeled 4,000 comments, which is already tons of work for a side project). But the intention was to filter out negative ones.
  - jimmySixDOF 948 days ago
    You did a stellar job here thanks so much for this addition to the community !
    On labeling, if you have a method statement or some go-by reference I am sure you would get some support here - I know I would help! Maybe package a few blocks of 100 unlabeled comments with a readme & see what happens?
  - indigodaddy 947 days ago
    It got some titles completely wrong too. For instance, “Open: An Autobiography” by Andre Agassi was erroneously listed high in the title list with “139 comments,” but most of those comments are recommending various titles that start with “Open”, mostly titles with “Open source” in the name, and the ML is attributing them all to the Agassi book.
  - thevagrant 947 days ago
    It's not so bad if you leave the negative comments. I often find negative reviews/comments quite informative.
- FranklinMaillot 948 days ago
  Lessons: My Path to a Meaningful Life by Gisele Bündchen, the top model, is probably the most out of place recommendation :) None of the comments is about the book obviously, they just mention the word "Lessons".
  https://hacker-recommended-books.vercel.app/category/15/all-...
  - archon810 941 days ago
    Same with Primer.
- sampo 948 days ago
  > My own comment about how I hated Thinking, Fast and Slow seems to be counted as a recommendation.
  What is the level of sentiment analysis in natural language processing? Would it be easy to add the feature, to recognize whether the book was mentioned in a positive or negative light?
- therealdrag0 948 days ago
  This was an amusing "extraction":
```
    > I have not yet read the good book Atlas Shrugged but be sure to check it out based on your recommendation.

    You're delusional. Where did I ever recommend reading 
    Atlas Shrugged? Ayn Rand is nuts.
```
- munk-a 948 days ago
  If you want to see some amusing "recommendations" I'd check out The Communist Manifesto by Karl Marx and what comments it's drawn. I think the network trying to find recommendations needs to incorporate more sentiment analysis.
  E.g. "Guards Guards by Sir Terry Pratchett is a great book" vs. "I've never read anything as slow and uninteresting as The Two Towers by J.R.R. Tolkien" or "I thought Seveneves by Neal Stephenson was good - but it probably should've been two separate books with the second half actually having some meat to it."
- jgwil2 948 days ago
  Yeah, I'm seeing some issues with Code by Petzold citing comments that are talking about e.g. Code Complete or just code in general, but with such a generic name (and given the forum) it's actually pretty impressive to me that most comments are identified correctly.
  Edit: another one that is tough is Open by Agassi - seems most of these comments do not actually have anything to do with the book. I would guess most one-word titles will have similar issues.
  - tracyhenry 948 days ago
    That's a correct observation. I'm guessing it has to do with whether the words after Open are indicative enough to the model that they should be brought in together with Open. As I said in other comments, with more training data this issue will likely go away. And these tough comments are the best candidates.
- dsr_ 947 days ago
  It also seems to have trouble with books that aren't available on Amazon -- a fair number are available through other vendors or free on various websites, and if Amazon doesn't sell it, it's not being caught.
CannisterFlux 947 days ago
It works great, I've already seen a couple of things I fancy reading now.
I am amazed that the app can find any relevant comments mentioning "It" by Stephen King, for example. That seems like magic. A few IT and It's are in there too, but still impressive.
I was intrigued to see "Twilight" quite high on the list of fiction, doesn't seem like something the HN demographic would read. Turns out most comments were about the board game / video game Twilight Struggle, or Zelda Twilight Princess, and those comments mentioning the book were not doing so in a particularly positive way.
"One Hundred Years Of Solitude" is often commented as "100 Years Of Solitude" so the system can miss a few comments there.
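One way such variants could be caught is to normalize titles before comparing them. This is only a sketch under assumptions: the helper `normalize_title` and the tiny `NUMBER_PHRASES` table are made up for illustration, and a real version would need a fuller number-word mapping.

```python
import re

# Hypothetical, deliberately tiny table of spelled-out numbers.
NUMBER_PHRASES = {
    "one hundred": "100",
    "one thousand": "1000",
    "twenty": "20",
}


def normalize_title(title: str) -> str:
    """Lowercase, drop punctuation, and rewrite known number phrases as digits."""
    text = re.sub(r"[^\w\s]", "", title.lower())
    # Replace longer phrases first so a short phrase never clobbers part of a longer one.
    for phrase in sorted(NUMBER_PHRASES, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(phrase)}\b", NUMBER_PHRASES[phrase], text)
    return " ".join(text.split())


print(normalize_title("One Hundred Years Of Solitude"))  # 100 years of solitude
print(normalize_title("100 Years of Solitude"))          # 100 years of solitude
```

Both spellings map to the same key, so mentions under either form can be counted together.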
For an improvement, it'd be handy if the comments that mention other books had a way to link you to that book, so you could see other comments. For example, if you click Dune, and see someone saying "yeah, I liked Dune, and also x, y and z", then those other books should be clickable too, so you could see what other people say about them. Seems like the information could already be there in the database as a list-like comment could potentially appear in many book recommendations, but who knows.
leobg 948 days ago
Very cool. This one’s wrong though: “Zero: The Biography of a Dangerous Idea”. Comments are talking about other books with “zero” in the title, such as Thiel’s “Zero To One”. Perhaps parse longer titles first, and eliminate them, before matching for shorter titles? Great MVP. Had in fact been thinking about how great it would be to gather book data from HN myself just yesterday. So am really happy to see that someone actually made it. Plus, it looks great and is fun to use.
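The "longer titles first" idea can be sketched in a few lines. This is a plain string-matching illustration, not how the app works (it uses a NER model), and `match_titles` is a made-up helper name.

```python
import re
from typing import List


def match_titles(comment: str, titles: List[str]) -> List[str]:
    """Match longer titles first and blank out each hit so shorter titles cannot re-match it."""
    remaining = comment
    found: List[str] = []
    for title in sorted(titles, key=len, reverse=True):
        # Lookarounds act as word boundaries that also work for titles ending in punctuation.
        pattern = re.compile(rf"(?<!\w){re.escape(title)}(?!\w)", re.IGNORECASE)
        if pattern.search(remaining):
            found.append(title)
            # Overwrite the matched text with spaces so "Zero" cannot match inside "Zero to One".
            remaining = pattern.sub(lambda m: " " * len(m.group()), remaining)
    return found


titles = ["Zero", "Zero to One"]
print(match_titles("Thiel's Zero To One is worth reading", titles))  # ['Zero to One']
print(match_titles("Zero is a fun history of a number", titles))     # ['Zero']
```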
- tracyhenry 948 days ago
  Thanks. In theory this is the model's fault: it is not learning that "Zero to One" should be treated as a whole title. This is one of the limitations I mentioned in my root comment. Should be fixable with more training data!
farcaster 947 days ago
Amazing results!
I did something similar with RoBERTa and my own Kindle library to graph (with D3.js) all mentions/citations between my books (which of my books cite other books I have). I sorted the final graph by publication date to see some cool historical patterns of books citing other, older books [1].
I also manually annotated ~1000 book mentions, but I combine RoBERTa with string search (I list all titles I want to search a priori) to reduce the number of false positives. I also augmented the dataset with thousands of book titles and metadata from Goodreads.
I explain the whole process in a blog post [2].
[1] https://thiagolira.blot.im/_projects/book_graph/main.html
[2] https://medium.com/mlearning-ai/graphing-citations-between-b...
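The combination described above, model predictions checked against a predefined title list, could look roughly like this. It is a simplified sketch: `filter_mentions` is a hypothetical helper, the predicted spans are hard-coded rather than produced by RoBERTa, and the matching is exact apart from case.

```python
from typing import List


def filter_mentions(predicted: List[str], known_titles: List[str]) -> List[str]:
    """Keep only predicted spans that match a title from the a-priori list."""
    known = {title.casefold() for title in known_titles}
    return [span for span in predicted if span.casefold() in known]


# Hypothetical model output: one real title and one false positive.
predicted_spans = ["dune", "Open source"]
library = ["Dune", "Open"]
print(filter_mentions(predicted_spans, library))  # ['dune']
```

The trade-off is recall: a title missing from the list, or written in an unexpected form, is dropped even when the model found it correctly.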
- tracyhenry 947 days ago
  Thank you!
  The medium post is amazingly written! I basically did the same thing - and you beat me with the data augmentation piece. I tried using nlpaug [0] but it didn't improve the model performance. I'll definitely try swapping book titles around.
  [0] https://github.com/makcedward/nlpaug
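  A minimal sketch of the title-swapping augmentation mentioned above: take a labeled comment, replace the tagged title with a different one, and re-tag the new tokens. The BIO tag names and the helper `swap_title` are assumptions for illustration, and only the first tagged title is swapped.

```python
from typing import List, Tuple


def swap_title(tokens: List[str], tags: List[str], new_title: str) -> Tuple[List[str], List[str]]:
    """Replace the first tagged title span with new_title, re-tagging it B-BOOK/I-BOOK."""
    new_tokens = new_title.split()
    if "B-BOOK" not in tags or not new_tokens:
        return tokens, tags  # nothing to swap
    start = tags.index("B-BOOK")
    end = start + 1
    # Extend the span over the I-BOOK tokens that continue the title.
    while end < len(tags) and tags[end] == "I-BOOK":
        end += 1
    new_tags = ["B-BOOK"] + ["I-BOOK"] * (len(new_tokens) - 1)
    return tokens[:start] + new_tokens + tokens[end:], tags[:start] + new_tags + tags[end:]


tokens = ["I", "loved", "Steve", "Jobs", "a", "lot"]
tags = ["O", "O", "B-BOOK", "I-BOOK", "O", "O"]
aug_tokens, aug_tags = swap_title(tokens, tags, "The Art of War")
print(aug_tokens)  # ['I', 'loved', 'The', 'Art', 'of', 'War', 'a', 'lot']
print(aug_tags)    # ['O', 'O', 'B-BOOK', 'I-BOOK', 'I-BOOK', 'I-BOOK', 'O', 'O']
```

  Each labeled comment can then yield many training examples with the same context and different titles.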
  - Siira 947 days ago
    Can you also share the source code of the model and the site? I am starting to learn NLP, and I love indie applications like this.
Tycho 948 days ago
Sounds good. Blocked by my work firewall though.
A few years ago I found an article that was something like '100 short books everyone should read before they're 40'. It was a mix of fiction and non-fiction. I've never been able to find it again! But I really liked the list because these are books you can consume in a few hours and may be life changing.
I remember a few of the titles: Games People Play, Meditations, The Prince, The Art of War. (I suppose it may have been non-fiction only, although I think The Awakening may have been on there.)
Wish I could find the link again.
- ZeroGravitas 948 days ago
  If it was Oliver Sacks' Awakenings then it is non-fiction, though it did get turned into a movie.
  - Tycho 948 days ago
    Different book - Kate Chopin
- Rd6n6 948 days ago
  I don’t understand how software engineers get away with browsing the internet for fun at work when nobody else can
  - themodelplumber 948 days ago
    Sounds more like personal development than fun?
    - sillysaurusx 948 days ago
      That's a bit like saying watching porn is more for personal development than fun. Perhaps you'll learn something, but it's incidental.
      I've learned a lot from HN. But it wouldn't be good to fool myself into thinking that an employer wants to fund my personal development in this regard. Otherwise, they'd pay me to HN all day.
      The crux of the issue is that it's impossible to work 8 hours every day. We all invent lies to fill the downtime.
      - themodelplumber 948 days ago
        Is all that hyperbole really necessary? Each new sentence seems primed to leak edge and corner cases. Without giving more attention to such a rhetorical blind spot, I wonder how one could imagine they know the crux from the passenger side door.
        sillysaurusx 948 days ago
        Which sentence is mistaken?
        themodelplumber 948 days ago
        The one with all the generalizations
        sillysaurusx 948 days ago
        If it’s mistaken, it should be easy to explain why. Otherwise I’m inclined to believe it’s merely an uncomfortable truth.
        Would your employer pay you to HN all day? If not, precisely how much of your day are they comfortable with you HN’ing? Are you sure it’s officially approved?
        kritiko 947 days ago
        There’s often good intel on here that I have not been exposed to on other sources. Obviously spending 40 hours a week reading HN and getting into politics arguments on here is a waste, but there’s plenty of relevant news for most folks in tech if you stick to those topics.
        loopz 947 days ago
        Do you collect stats, how assertions line up with facts? Otherwise, what may seem likely and catchy, might just as well be opinionated, unsubstantiated and patchy.
  - chadcmulligan 948 days ago
    https://xkcd.com/303/
    Waiting for Compiles is the usual, there's a lot of waiting in software - waiting for compiles, scripts to run, someone else to do something.
MarcScott 948 days ago
HN really likes Neal Stephenson. I've never read a book of his that I didn't love, so will be definitely looking though more of the recommended fiction from the community here.
- samuel 948 days ago
  REAMDE was crap, IMO, and I'm a Stephenson fan.
  And the problem with Stephenson is that he's rarely succinct, so a bad book from him turns into a huge loss of time.
  - macintux 948 days ago
    I addressed that problem with Seveneves by skimming about 1/3rd of it.
    - ImaCake 947 days ago
      I really liked Seveneves, but totally understand why someone wouldn't like parts of it. I am curious as to which third you skimmed though as the book is effectively three different books with very different moods.
      - macintux 947 days ago
        The middle third, since by the time I reached that point I was exhausted and ready to find out how it all ended.
        Most comments I’ve read here indicate the last part of the book is the most unpopular, so perhaps I should go back someday and read the middle.
        ImaCake 946 days ago
        The first third is very intense, I also was a bit fatigued for that middle third. When I have reread Seveneves I have skipped it. I really like the last third, but it does feel like a completely different book.
  - the__alchemist 947 days ago
    Only one I haven't read. Skip it I guess. Reading DoDo now, so verdict's out, but I absolutely adored every other one of his books. Fall was a pleasant surprise, since it seemed controversial.
  - wrycoder 947 days ago
    Well, I liked it. The merging of different groups of people from different cultures and classes was zany, and the action was non-stop.
    - gibspaulding 947 days ago
      Reamde would have been my first recommendation to introduce someone to Stephenson. Constant action, an unusually direct storyline for him, and not too too much time spent off in the weeds.
      Of course some might argue those are the best things about his books, but while Cryptonomicon is perhaps my favorite sci-fi novel I've read, I think it takes a certain type of person to enjoy a book where the plot gets interrupted for 5-page descriptions of how to eat Captain Crunch or a who fucked who of the Greek pantheon.
srcreigh 948 days ago
You helped me spend $150 on books! Two comments
1. I regret you earned $0 for helping me spend so much on books. Have you considered setting up affiliate links or a donation button? Maybe affiliate links as a service will be your next project.
2. The Amazon links are for Amazon.com, but I'm in Canada. Maybe easy internationalized Amazon affiliate links will be your next project.
- rahimnathwani 948 days ago
  For #2, there are services OP could use, that will automatically switch links to the right country store, e.g. https://geniuslink.com/how-it-works/for-affiliates
- gricardo99 948 days ago
```
   You helped me spend $150 on books!
```
  Check your local Library. Depending on where you are, it could be a fantastic resource for books.
  - malshe 948 days ago
    After reading such comments here on HN, last month I got myself a local library card and it has turned out to be a great decision! I am using Libby app to get digital books and even audiobooks! Absolutely fantastic
    - Rebelgecko 947 days ago
      The Libby app has also made some great usability improvements lately for people who have multiple library cards. If you live in an area with multiple library systems (like city and county), it's totally worth signing up for both.
      - dudus 947 days ago
        Thanks for mentioning Libby. Just installed and spent 3h on it. I had no idea local libraries could be this convenient
      - malshe 947 days ago
        Thanks! I will check it out
    - mitchdoogle 946 days ago
      In addition to Libby, there is also Hoopla and Kanopy. Hoopla is pretty similar with ebooks and audio books, while Kanopy is mostly video
      - malshe 946 days ago
        Thanks, downloaded both the apps!
  - indigodaddy 947 days ago
    Or just order used books from thriftbooks.com. It’s the only place I buy books for the last few years now. Cheapest prices (almost) always, but even if they’re not quite the cheapest for a particular title, the free shipping on any order over $10 always puts it over the top.
    Note also that addall.com used book search doesn’t include thriftbooks in the results, so I just always go straight to thriftbooks and don’t bother searching addall.com anymore..
    No affiliation, just a longtime happy customer.
  - _joel 947 days ago
    Unless you're in the UK, where a lot of local libraries have shut down, unfortunately :(
- xpe 948 days ago
  I regret that so many people regret that other people are not monetizing.
  - lostgame 948 days ago
    You know what, if commenter OP finds value in the services offered; and wishes to compensate the author of the software - just gonna say - I have no problem with that.
    - mihaic 948 days ago
      You might not have a problem with that, but some of us dislike knowing that monetization has to become omnipresent, as it changes everything.
      - srcreigh 948 days ago
        Affiliate programs are the most anti-big corp monetization strategy ever.
        Considering I already buy books on Amazon, if there's any way I can just find an affiliate (any affiliate), Amazon gets 5.5% less revenue.
        For tracyhenry, they would get ~$8.25 CAD straight out of Amazon's pocket for my $150 purchase.
        https://associates.amazon.ca/help/node/topic/GRXPHT8U84RAYDX...
        dublinben 948 days ago
        Amazon would get even less revenue if you bought your books somewhere else, like https://bookshop.org/ or directly from an independent book store.
        scns 948 days ago
        You can use Pi-hole's monetization link.
        (edit) just scroll down that page: https://pi-hole.net/donate/#sponsorship
        soperj 948 days ago
        except it becomes much harder to find genuine recommendations for things on the web.
      - ijidak 948 days ago
        How do you pay your bills?
        People should get paid for work.
        Whether that work is having a job. Or making a website.
        I don't see the difference...
        Someone who does useful work deserves wages.
      - thaufeki 948 days ago
        A Patreon/crypto address to make a donation to is the compromise here, surely.
  - MathCodeLove 948 days ago
    I regret that some people seem to think that any sort of compensation for services rendered or monetization in any way is automatically bad or wrong somehow.
    - ijidak 948 days ago
      Agree.
      People should get paid for work.
      Whether that work is having a job. Or making a website.
      Someone who does useful work deserves wages.
      Even 2,000 years ago the Bible said: "the worker deserves his wages."
      Most of the people who are against monetization are perfectly happy to get paid by their employer.
      Is direct employment the only morally upright way to receive payment for hard work?
    - darwinwhy 948 days ago
      I regret having read this entire comment chain.
      - amelius 948 days ago
        I regret that you did not get compensated for your lost time.
Andrew_nenakhov 948 days ago
Ok #3 is Dune. It'll surely be super helpful in building my interstellar empire.
Step 1: make elites addicted to drugs
Step 2: monopolize drug trade
Step 3: install a religious fundamentalistic regime with yourself at its head
(All very logical until this point, but next step might be a problem, can anyone offer advice)
Step 4: transform into a worm
??!
- robotresearcher 948 days ago
  Don't forget the step of achieving prescience, which allows you to figure out what the '??!' is.
  - Andrew_nenakhov 948 days ago
    That's what drugs are for, no?!
motohagiography 947 days ago
If you walked into someone's house and looked at their bookshelves, and they had most of these on them, or their books were mostly in the union of a subset of these, I'd wonder what one might speculate about them.
Looking forward to the criticism that results from mapping the co-ordinates of this ontology, as one could weave a narrative around most of the books that aggregates them into types and categories themselves, then transmit the criticism without the substance to some believers, which could codify into an "anti-HN" ideology (which is just a peculiar form of fan club.) Calling "hacker-critical" as a pseudo academic backlash trend now, and a "hacker studies" course designed to encircle these ideas with criticism as levers to manage people who have them. Really, if you aren't using AI to create predictive levers about people's beliefs and behaviors to manage and extract value from them, what are we doing with it. :)
Super cool to create this though, as it would be really interesting for other communities, potentially subreddits.
[-]
- gibspaulding 947 days ago
  The algorithm is definitely missing some recommendations in its current form, but I suspect if it weren't I'd be pretty close to your description.
  Edit: I just did an HN search for the few books I couldn't find with this app and was able to find comments recommending all of them. Not sure if this means I need to branch out or just that HN reads a lot..
rustmachine 948 days ago
Cool project, and cool results. As an anthropologist who reads HN as a way to keep abreast of the tech community and tech insights, it's interesting to see Atlas Shrugged as one of the most often recommended books. Interesting and maybe slightly disturbing. HN would make for quite interesting source material for someone who wanted to study tech culture.
[-]
- dang 948 days ago
  I'd be careful about that generalization. This software seems to be going more by mentions than by recommendations - e.g. the top reply to https://news.ycombinator.com/item?id=16323808 ("Ask HN: Which are the most damaging books you've read?") is being counted as a recommendation.
  Sentiment analysis is hard. In fact I've never seen it work yet.
ramraj07 948 days ago
Great work but do note that the list basically looks slightly better than an amazon list (atlas shrugged lol). I think some effort into more useful ranking (looking for metrics of controversiality or maybe page rank) might make it more useful!
[-]
- vavooom 948 days ago
  I am also curious to know if the # of votes is integrated into the ranking at all, possibly weighted. Could also attempt NLP Text Sentiment analysis to influence the model as well.
  Regardless, fantastic work already!
  [-]
  - tracyhenry 948 days ago
    Right now the ranking is a simple combination of sentiment and length. Including #vote definitely sounds useful!
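A minimal sketch of how votes could be folded into such a comment ranking. The app's actual ranking function is not shown in the thread, so the function name, the weights, the log damping, and the assumption that sentiment lies in [-1, 1] are all hypothetical.

```python
import math

def comment_score(sentiment, n_words, votes=0,
                  w_sentiment=1.0, w_length=0.5, w_votes=0.5):
    """Hypothetical ranking score; a higher score is shown earlier.

    sentiment is assumed to be in [-1, 1]; n_words and votes are counts.
    log1p damps the effect of very long or heavily upvoted comments.
    """
    length_term = math.log1p(max(n_words, 0))
    vote_term = math.log1p(max(votes, 0))
    return w_sentiment * sentiment + w_length * length_term + w_votes * vote_term

# Made-up comments, purely for illustration.
comments = [
    {"id": "a", "sentiment": 0.9, "n_words": 12, "votes": 1},
    {"id": "b", "sentiment": 0.4, "n_words": 150, "votes": 30},
    {"id": "c", "sentiment": -0.8, "n_words": 80, "votes": 5},
]
ranked = sorted(
    comments,
    key=lambda c: comment_score(c["sentiment"], c["n_words"], c["votes"]),
    reverse=True,
)
print([c["id"] for c in ranked])
```

The weights would need tuning against real data; the sketch only shows where a vote term could enter.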
goopthink 947 days ago
This is not a criticism of the work done, but I think the top 20-40 mentions are extremely obvious and a regular reader might be able to guess a good portion of these recommendations. What is really interesting - and started with the “categories” - is tying the recommendations to explicit context. I didn’t dive too deep into the recommendations, but are the categories by book category, or by originating topic of conversation? It’s a narrow distinction, but a useful one. I’d love to see deep learning pull up a hierarchy of conversational topics on Hacker News and match recommendations to those trees.
leobg 948 days ago
BTW, going through that list, I see why I love the HN crowd. 70 % of those books I’ve read myself, and did so before coming to HN. There must be some strong personality type filtering going on.
[-]
- reducesuffering 948 days ago
  I think it's been quite obvious there's some personality type filtering going on, as with most online communities. I'm quite curious how it'd be quantified. Surely software engineers, startup founders, ADHD, INTJ, and Myers-Briggs-is-bogus types are overrepresented. Might tell us a bit more about ourselves...
- konschubert 947 days ago
  The top book, "Thinking Fast and Slow" is full of experiments that are flat out false. https://replicationindex.com/category/thinking-fast-and-slow...
  [-]
  - leobg 946 days ago
    Well, minus that one. If it trends on Blinkist, better to skip it.
- cinntaile 948 days ago
  There's a strange error in there. The Art of War by Sun Tzu is listed twice, why is that? Since it finds the right book and author?
sushisource 948 days ago
Heh, for a minute there I thought you meant Warhammer 40k books specifically, and I thought that was a pretty funny thing to be scraping from HN :)
[-]
- russellbeattie 948 days ago
  I'm pretty sure there's more Warhammer 40k books than there are days in the year... It's like someone heard the term "space opera" and thought that meant "soap opera in space".
  Recommendations would include comments like, "This novel is really the one that ties the previous 37 books together." or "You might want to skip the next dozen books if you're squeamish about things that ooze."
  [-]
  - LanceH 948 days ago
    While I don't consider the 40k books on par with the better science fiction out there, I do enjoy that they bring a bit of scale and what it means to space. It's a different take from the rosy, post-scarcity, future of space. Bad things are really bad. Unattended good things turn bad on their own just from drift.
    Then there is the unashamed embrace of over-the-top in so many different ways.
    [-]
    - KiranRao0 947 days ago
      I've always considered 40k a satire of the entire sci-fi genre (and in many ways a satire on modern politics). In that way, I find it quite refreshing. And your statement of "unashamed embrace of over-the-top" resonates quite well.
- bnbond 948 days ago
  Same. I'm a little disappointed it's not.
spookyuser 948 days ago
This is really incredible!
A while ago I created something adjacent to this that looks for hacker news review of books on goodreads (https://github.com/spookyuser/hacker-reads)
So I'm very curious how you managed to find book titles, I ran into a lot of issues trying to figure out, for example, with "Clean Code" whether to search for "Clean Code" or "Clean Code: A Handbook of Agile Software Craftsmanship" since people mentioning the book used both instances. And of course someone mentioning just "Clean Code" might be referring to the concept not the book. I ended up settling on `${titleMinusColon} - ${author}` but I'd love to know what your approach was given that you used deep learning to search.
EDIT: Just read your comment below on your approach, very interesting!
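The title-matching problem described above (short title vs. full title with subtitle vs. title plus author) can be sketched as generating several search variants per book. This is only an illustration of the `${titleMinusColon} - ${author}` idea in Python; the function name is hypothetical and it is not the code from hacker-reads.

```python
from typing import List

def title_variants(full_title: str, author: str) -> List[str]:
    """Return search strings for one book, most specific first, without duplicates."""
    full = full_title.strip()
    # Everything before the first colon is treated as the main title.
    main = full.split(":", 1)[0].strip()
    candidates = [full, main, f"{main} - {author.strip()}"]
    variants = []
    for candidate in candidates:
        if candidate and candidate not in variants:
            variants.append(candidate)
    return variants

print(title_variants(
    "Clean Code: A Handbook of Agile Software Craftsmanship",
    "Robert C. Martin",
))
```

Matching on the bare main title still cannot tell the book from the concept, which is the gap the NER approach is meant to close.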
concernedctzn 948 days ago
Found it interesting that I couldn't find results for Knuth (The Art of Computer Programming) or SICP on here. Maybe the casual way we refer to these texts is hard to detect as a reference to a book, or their importance is just implied community knowledge?
[-]
- tracyhenry 948 days ago
  If there is no search result for the book name then it just means it's not in my current book database (which is limited).
  [-]
  - Siira 947 days ago
    Do you mean the training data? So each book title must be present in the training labels, and new titles can't be learned automatically?
- ir193 947 days ago
  I thought SICP is the most mentioned book on HN, but "SICP" doesn't look like a book name.
amelius 948 days ago
Perhaps you can do the same for research papers. Would the code need to be changed in any way?
[-]
- tracyhenry 948 days ago
  Not much - but it needs a new set of training data for research papers. Btw - there seems to be an existing website for this already: https://www.hackernewspapers.com/ Although it only looks for posts.
  I'd assume that Arxiv links are often there. So it's a problem that can be addressed with an easier solution (just looking for Arxiv links).
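The "just look for arXiv links" approach could be sketched roughly as follows. The regular expression is an assumption for illustration: it only covers new-style identifiers (such as 1706.03762) in `abs` or `pdf` URLs, not the older `hep-th/9901001` form.

```python
import re

# Matches http(s)://arxiv.org/abs/<id> and /pdf/<id>, with an optional version suffix.
ARXIV_LINK = re.compile(
    r"https?://(?:www\.)?arxiv\.org/(?:abs|pdf)/([0-9]{4}\.[0-9]{4,5})(?:v[0-9]+)?",
    re.IGNORECASE,
)

def arxiv_ids(comment_text):
    """Return the sorted, de-duplicated arXiv identifiers linked in a comment."""
    return sorted({match.group(1) for match in ARXIV_LINK.finditer(comment_text)})

sample = (
    "See https://arxiv.org/abs/1706.03762 and the PDF at "
    "https://arxiv.org/pdf/1706.03762v5.pdf, also http://arxiv.org/abs/1810.04805"
)
print(arxiv_ids(sample))
```

Papers mentioned by title only would still be missed, which is the same recall problem the book app addresses with NER.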
Rebelgecko 947 days ago
This is awesome, thanks for putting it together! For me, I've had the most luck with HN fiction recommendations so I went there first. The distinction between "Literature & Fiction" vs "Science Fiction & Fantasy" seems to be a bit arbitrary. For example The Hobbit is in one category but LOTR is in the other, Neal Stephenson and Andy Weir are in both, etc. It might make sense to just merge all the fiction together. Plus that way you can short-circuit any debates about "is science fiction literature" :)
Edit: another little nit: it looks like quite a few books list audiobook narrators as coauthors
bachmeier 948 days ago
Interesting idea, but this is mentions of books, not recommendations. It includes comments by someone that's reading the book, has it on their reading list, or read it and thought it was terrible.
[-]
- tracyhenry 948 days ago
  The intention was to only show recommendations. But because of limited training data (we hand labeled ~4000 comments), the model wasn't able to filter out bad ones effectively. More training data should be able to solve it.
johannesha 947 days ago
Nice! I built a similar NER for book recommendations fine-tuned on a manually labeled dataset of book recommendations mentioned in podcast transcripts. The whole project is open-source and I already added a few podcast shows with all their book recommendations (I have to add a lot more though): https://github.com/JohannesHa/PodcastBookLibraryMonoRepo
metalliqaz 947 days ago
A book that I and others have recommended doesn't show up in the database.
Animal, Vegetable, Junk: A History of Food, from Sustainable to Suicidal by Mark Bittman
rahimnathwani 948 days ago
This is awesome. The best thing is that it's so fast to navigate. I like how the HN comments are styled just like on HN.
A couple of thoughts:
* It would be great if each book were to have its own URL (for sharing).
* Consider allowing the search to allow author input, e.g. if I want to find the book 'Who' by Geoff Smart, the single-word title isn't specific enough to show that book at the top of the search results.
[-]
- soco 948 days ago
  If I look for one single word and that single word is the answer, shouldn't that be the very first result? I mean that's a 100% match right there...
  [-]
  - rahimnathwani 948 days ago
    If the dataset were perfect, maybe. But, if a book with a single-word title has only few comments, it's plausible that most/all of those comments are false matches.
    In the case of the book I searched 'Who', showing it in 4th position seemed about right.
  - tracyhenry 948 days ago
    Yes, the search can definitely be improved (e.g. to include author). Right now it's SELECT * FROM books WHERE name LIKE '%{search_string}%'
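A hedged sketch of how the search could include authors and put exact title matches first. It uses SQLite through Python's standard library; the `books` table and `name` column come from the query above, while the `author` column, the sample rows, and the function name are assumptions. Bound parameters replace string interpolation so that user input cannot alter the SQL.

```python
import sqlite3

def search_books(conn, query):
    """Match the query against title or author; exact title matches sort first."""
    pattern = f"%{query}%"
    sql = (
        "SELECT name, author FROM books "
        "WHERE name LIKE ? OR author LIKE ? "
        "ORDER BY (LOWER(name) = LOWER(?)) DESC, name"
    )
    return conn.execute(sql, (pattern, pattern, query)).fetchall()

# In-memory demo data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (name TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Who Moved My Cheese?", "Spencer Johnson"),
        ("Who", "Geoff Smart"),
        ("Zero to One", "Peter Thiel"),
    ],
)
print(search_books(conn, "who"))
print(search_books(conn, "smart"))
```

Note that `%` and `_` typed by the user still act as LIKE wildcards here; escaping them is left out of the sketch.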
deaddabe 948 days ago
Impressive work.
What data source are you using for the books, authors and covers? I looked at OpenLibrary [1] but the covers are not the same, so I suppose it is something else? Maybe Amazon directly somehow?
[1] https://openlibrary.org/search?q=zero+to+one&mode=everything
[-]
- tracyhenry 948 days ago
  I crawled about 20k books from Amazon. Thanks for pointing me to Open Library!
mrazomor 947 days ago
A loss which might be worth debugging (maybe it contributes to the whole pattern of losses -- didn't dig deeper): "Brave: A Teen Girl's Guide to Beating Worry and Anxiety" is never explicitly mentioned, but all ~60 misclassified references are actually referring to "Brave" browser or "Brave New World" book.
samuel 948 days ago
This is amazing. Thank you!
Does it take into account negative reviews/comments? I have seen that Why We Sleep is being recommended in the 6 months tab, but, while it was received with a lot of praise, it was soon criticized by other researchers in the field and I would expect that the HN crowd would have followed that trend.
[-]
- tracyhenry 948 days ago
  When I labeled the comments, I didn't label books that were criticized. So in theory the model should filter out negative reviews. But currently the training dataset is pretty limited in size so you still can see some negative ones. I suspect that with more training data this problem will go away.
zeristor 948 days ago
Can’t find my recommendations for J. Scott Turner’s “The Extended Organism”
To summarise: organisms evolving to change the environment around them to their benefit. I went to Foyle’s one day with buying a book on termite mounds in mind; that is one chapter in the book.
I found out a year too late that UCL had hosted a talk by Dr Turner.
rg111 947 days ago
This is great work. There are a lot of great recommendations on HN. Before this, I had this page archived and have been reading through it- https://hackernewsbooks.com/top-books-on-hacker-news.
[-]
- prewett 947 days ago
  Neither list seems to include much fiction. I've been reading "Diaspora" based on HN recommendations, as well as "Snow Crash", "Cryptonomicon", and "The Martian".
  ("Cryptonomicon" was a good story, but I really hated all the jumping around every five pages. "Diaspora" has sort of so-so writing, but very hard-math sci-fi and quite interesting ideas. I think it's the "hardest" sci-fi book I've read, which includes "The Martian" and the red-green-blue mars books.)
  [-]
  - rg111 941 days ago
    Thanks for sharing your opinions of these books.
    I just bought Cryptonomicon. Looking forward to reading it.
    The book I am reading right now is also chosen based on HN recommendations (The Talent Code). And I am about to read the GTD book by Allen. (I hate self-help generally with very particular exceptions)
  - tracyhenry 943 days ago
    Snow Crash, Cryptonomicon and The Martian are all in the top 20 of the all-time list.
Brajeshwar 947 days ago
This is brilliant. Thanks for doing this. One tiny request -- can you please link the primary title to the homepage. I want to be able to click somewhere to come back to the home page after browsing around a bit.
After packing/archiving my library of physical books around 2010-11, I went all digital. However, I came back and now try to stay roughly at a 1:4 (physical:digital) book ratio, after my daughter complained about me reading on the Kindle: "Are you really reading a book?"
I re-started buying physical books in 2018. I have a knack of buying books recommended by Hackernews comments and the curated list of people I follow on Twitter. I re-started with less than 10 physical books around mid-2018. Between me and my daughter, we might have crossed 200+ physical books. I need to figure out a better way to deal with this.
[-]
- tracyhenry 947 days ago
  Will do, good point! For now you can just delete all paths so you can go to https://hacker-recommended-books.vercel.app/ which btw redirects to the top book of the all-time list of all categories.
godmode2019 948 days ago
This is very impressive, well done on deploying this.
95% of every book I have ever read or owned is in the first 20 pages.
It's almost just as fun to read the comment chain about each book.
You must be independently wealthy because I know no one cares if there is an affiliate link. I believe affiliates are always paid to the last cookie you have.
soheil 948 days ago
"You're delusional. Where did I ever recommend reading Atlas Shrugged? Ayn Rand is nuts."
Interesting that that's one of the recommendations.
inanutshellus 948 days ago
I'm reminded of Goodhart's Law... So long as your project remains secret it'll be valuable. Once someone sees money being made from it, it'll kick off disingenuous recommendations... anyway... high quality problem to have I guess!
cweill 948 days ago
Great execution, and very neat app!
But, what's wrong with using Amazon affiliate links? If anything, monetizing would be great since it would give you more incentive to maintain this wonderful application? And it doesn't cost us users anything.
[-]
- tracyhenry 948 days ago
  Great point. I'm on a student visa which forbids any non-work income. That's one reason why :)
brianshaler 947 days ago
Nice work! I noticed it accurately picked out solaris and associated a few recent comments, none of which were about the operating system.
There was a fantastic HN comment[0] which actually spurred me to buy and read it. Do your queries go back far enough to pick this one up? It's an interesting example where one sentence mentioning the author alludes to an association with the title of the book in a sentence that is explicitly about the OS.
[0] https://news.ycombinator.com/item?id=15170979
maiensch 948 days ago
Love it, will you do a write-up on how to replicate this with other sources? I'm currently analyzing both Indie Hackers and StartupsForTheRestOfUs Interview Transcriptions and this could be a fun analysis!
[-]
- tracyhenry 947 days ago
  You can refer to this comment: https://news.ycombinator.com/item?id=28601741
  [-]
  - kenan7 947 days ago
    amazing stuff, hey, can you also make comments clickable please? in order to get the context.
    [-]
    - tracyhenry 947 days ago
      Thanks. Comments are clickable. Try hovering over the commenter or date?
      [-]
      - kenan7 947 days ago
        By the way, I love Barry Farber's books,
        Do you think comments like these also could be picked up by your algorithm?
        https://news.ycombinator.com/item?id=738446
      - kenan7 947 days ago
        Ah, they actually are clickable! Sorry and thank you.
zsmi 948 days ago
It's a really interesting project. And I am sure it's really hard.
I was curious how many times some common textbooks were mentioned but didn't find them via the search, which could be user error. But to give a specific example. None of the books in this comment thread were found:
https://news.ycombinator.com/item?id=19893447
Comment text like this: "CMOS VLSI Design: A Circuits and Systems Perspective (4th Edition)" by Weste and Harris
should've been caught, right?
[-]
- tracyhenry 948 days ago
  It could be that I don't have this book in my book database.
cyberge99 948 days ago
I’m Curious as to why you didn’t choose to monetize with affiliate links.
It seems a simple and easily justifiable reward. I didn’t click the links, but hopefully you used smile.amazon for charity.
This is novel and useful. Thank you.
defect0 948 days ago
Noticed an issue. Some, but not all, comments referencing Strunk and White's Elements of Style are showing up instead as Erin Gates' Elements of Style: Designing a Home & a Life
[-]
- tracyhenry 948 days ago
  Good catch! This is the limitation mentioned in my root comment - the algorithm will fail when two books have similar names. The partial solution is to look at authors too when available. Something to be included in the future.
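The "look at authors too when available" idea might be sketched like this: when several database books share a similar title, prefer the one whose author surname appears in the comment, and refuse to guess otherwise. The function name and the candidate structure (title plus a list of surnames) are hypothetical, not the app's implementation.

```python
import re

def pick_by_author(comment, candidates):
    """Pick the candidate title whose author surnames appear in the comment.

    candidates: list of (title, [surname, ...]) pairs with similar titles.
    Returns the single best title, or None when nothing or several match equally.
    """
    if not candidates:
        return None
    words = set(re.findall(r"[a-z]+", comment.lower()))
    scored = [
        (sum(1 for surname in surnames if surname.lower() in words), title)
        for title, surnames in candidates
    ]
    best = max(hits for hits, _ in scored)
    winners = [title for hits, title in scored if hits == best]
    if best == 0 or len(winners) != 1:
        return None
    return winners[0]

candidates = [
    ("The Elements of Style", ["Strunk", "White"]),
    ("Elements of Style: Designing a Home & a Life", ["Gates"]),
]
print(pick_by_author("Strunk and White's Elements of Style is still worth rereading.", candidates))
print(pick_by_author("Elements of Style is great.", candidates))
```

Returning None for the ambiguous case leaves room for a fallback, for example the most frequently mentioned candidate.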
nautilius 948 days ago
Amazing and super useful: If I start reading today, and I read a book a day, it'll only take 112 years to finish, assuming that no additional books will be recommended in the next century.
[-]
- wrycoder 947 days ago
  I noticed that as I got several pages deep, the recommendations started to repeat.
  [-]
  - nautilius 947 days ago
    Then maybe only 30 years ;-)
clktmr 947 days ago
Interestingly I recently read "A Fire Upon the Deep" based on a HN recommendation. A quick search showed it has many mentions, yet it's not listed in your app at all.
jp42 948 days ago
Slightly off-topic. The best book recommendations I've gotten came in one of the following ways; dedicated recommendation services or apps never worked for me:
- told by friend
- someone I admire read book and commented on it
- I'm working on some problem, and during that exploration I come across books.
- random people mentioning book on platform like HN on a topic/post of my interest.
[-]
- rahimnathwani 948 days ago
  The most life-changing book recommendation I got was from HN: 'Teach your child to read in 100 easy lessons'.
  [-]
  - jp42 948 days ago
    Thanks for the comment! My son is 3y8m. He knows letters and many words. Looks like this book could be what he needs to get to the next level.
- TakerofVita 948 days ago
  Yeah, in my experience a lot of 'general' book reviews are super critical and don't really try to hook you. Going through several reviews, you just come away with the collected gripes and nitpicks of what is otherwise a good book.
  I find that I get sold on a lot more when it is just a random single comment on some thread somewhere that focuses on a single aspect of a book.
  If you can find a hyper specific subreddit/forum/etc. for a sub-genre you like, then you will spend more time reading books than reviews...
- spookyuser 948 days ago
  > random people mentioning book on platform like HN on a topic/post of my interest.
  Same! Some of my favorite book recommendations have especially come from this one, I don't know why but a one line comment on a HN thread of "what book changed your life" has become my favorite way for discovering books.
baby 948 days ago
Interestingly it cannot differentiate between the different Harry Potter recommendations (the original books, fanfics, and that book on philosophy that mentions Harry Potter)
ehutch79 948 days ago
It has atlas shrugged in the top 10?
[-]
- gjm11 948 days ago
  "Atlas Shrugged" is a polarizing book: people tend to either love it or hate it. And the people who love it love to tell other people how great it is, whereas many of the people who hate it just don't talk about it (because there's generally little need to talk about the badness of bad books).
  I think a book list is more useful if it has some books in it that some love and some hate, rather than only books that no one minds very much. Maybe some of them will turn out to be ones I love.
  (I happen not to be a Rand fan myself.)
- hermitcrab 947 days ago
  Ugh. I'm not sure which is more offensive, the ideas behind it or the prose.
- Kiro 947 days ago
  What's so strange about that? Objectivism used to be huge among hackers.
city17 947 days ago
Great idea. I think it's interesting how the hivemind decides what are great books. There are some objectively great books in the list, but also very debatable ones.
For example, The Design of Everyday Things has some interesting content, but I found it almost unreadable. It's such a poorly organized and written book that its design seems to go against almost every rule discussed in the actual content. Keeps surprising me that this is such a highly praised book.
jcims 947 days ago
This is amazing! I cashed in a bunch of Audible credits on this one. Having the comments right there was super helpful to understand the context of the mentions and other books that were related.
Do you think it would work on podcast transcripts? The formatting of the titles wouldn't be nearly as regularized so it might have quite a few false positives/negatives, but there's a lot of book conversations out there.
mdp2021 948 days ago
Suggestions: any way to notify you (your system) of book recommendations your processing has missed?
You could have a form to notify you of a post which seems to be not processed, e.g. "https://news.ycombinator.com/item?id=28549134", or "id=28591398" etc.
(BTW: very great work, and thank you for your invaluable service)
Brajeshwar 947 days ago
I have another suggestion. Can you please link the Amazon links with the option they have to lead me to my home country's (correct) store? I don't know how, but I have seen ways where they open to the right store for me; otherwise, it is too much clicking or copy-paste-searching.
I just bought 5 Children's Books and I don't mind if you can benefit from the affiliate link that goes from your account.
wombatmobile 948 days ago
I like this a lot.
The longer extracts are more useful than the shorter extracts.
For Brave New World, I noticed the first 100 - 200 comments are short, and not useful as reviews so much as indicators of preference. Then after that, the comments are longer, and hence, more useful because they explain something.
It would be useful to be able to filter word length so as to be able to distinguish between Opinion Mode vs Review Mode.
throwaway158497 947 days ago
Great app and great execution, especially showing comments next to the books. Is there a chance you can open source some of it?
What are your future plans?
awinter-py 948 days ago
it confused rationalist harry potter fanfic with 'harry potter hogwarts hardcover journal and elder wand pen set', amazing
[-]
- wizzwizz4 948 days ago
  To my knowledge, /jk Rowling doesn't allow people to sell Harry Potter fanfic, even though she's fine with its existence.
kvathupo 948 days ago
In anticipation of getting flagged into oblivion, am I the only one who's disappointed in this selection of books?
Of course, taste is subjective, and it should perhaps be expected that much of the list is in line with what is read by the general public, but many of the books are either presenting fact or attempting to convince the reader of the veracity of a certain viewpoint. I'd like to read more open-ended works that ask for interpretation on the part of the reader or, at the least, don't explicitly spell out what they want the reader to walk away with. (certainly some books here fit the bill, e.g. Infinite Jest, Pride & Prejudice, etc.). Again, interests are subjective.
In light of this, book recommendations?
[-]
- awillen 948 days ago
  I think that's just the nature of pulling books from HN comments - a lot of those comments are trying to convince people of a viewpoint, so it seems unsurprising that this is the kind of list you'd end up with.
  Not good or bad, just a function of where they're coming from.
  And as for book recommendations, Children of Time by Adrian Tchaikovsky.
- themodelplumber 948 days ago
  Personally I wouldn't recommend others' books to someone who is left unfulfilled by such a huge list. I would rather recommend writing or other subjectively-pinned activities, to hold the subject accountable and help them stay out of the critic zone long enough to find their way into more fulfilling growth.
Gatsky 947 days ago
Hmmm, this just permanently ended the problem of working out what to read next.
Fantastic work, and a terrific interface to boot.
johnbatch 947 days ago
I started reading Unsong based on this HN thread [1] a month ago. I guess it’s not on Amazon as a book, so it didn’t match as a book?
[1] https://news.ycombinator.com/item?id=28125249
andrew_ 947 days ago
Great work, but might be a few holes in this. I recommended a book about 10 days ago [1] that doesn't appear in your index.
[1] https://news.ycombinator.com/item?id=28482475
[-]
- dragonwriter 947 days ago
  I’ve frequently suggested Patterns of Democracy and it doesn't seem to have picked it up, despite apparently picking up books with only a single mention.
personjerry 948 days ago
The problem with reading book lists like this is that nobody has time to read all the books. There's a ton of crap out there and I want HN to help me filter through it.
Thus the problem with existing solutions is NOT "limited recall" or "insufficient rules" or "no Amazon link".
And the problem with this "solution" is that there is no justification for why a book is great and applicable to my circumstances, and people have to trust your black box. Otherwise I'm likely to waste my time, just like reading books from any other crappy recommendation engine.
With a deep learning model reducing all the reviews to "book names" you've successfully removed the value of the book discussions themselves. Therefore, for me this engine and all similar engines are strictly worse than simply going through the actual big threads themselves, i.e. https://news.ycombinator.com/item?id=21900498
Edit: I've just seen the embedded comments by switching to a desktop browser. It's a nice addition. However, for me to make sure I'm not wasting my time going through arbitrary books and comments, I would need to know why a book is ranked highly compared to other books. And I want to be sure that ranking is tailored to me, at a very, very high accuracy.
[-]
- adewinter 948 days ago
  > With a deep learning model reducing all the reviews to "book names" you've successfully removed the value of the book discussions themselves.
  It literally shows each comment in full that it extracted the book name from. It also includes a link to the comment in the original thread. What more could you possibly want?
  [-]
  - personjerry 948 days ago
    Oh, I was on mobile and could not see the comments section. It's interesting for sure. But what I want in particular is to learn why a book is ranked highly compared to other books. And I want to be sure that ranking is tailored to me, at a very, very high accuracy.
    [-]
    - lbriner 948 days ago
      They are ranked highly because of the number of times they are recommended in a comment.
- FredPret 948 days ago
  There's no way around the black box element of a book review, but Nassim Taleb suggests waiting a few decades and, if the book is still well known, then reading it.
  [-]
  - bachmeier 948 days ago
    Wow. What a helpful piece of advice (I guess he's smarter than the rest of us so it's hard to understand the genius of his strategy). Any mention of the cost of missing out on the content in the book for a few decades?
    [-]
    - dwighttk 948 days ago
      Just read older books until there aren’t any and then move onto newer ones. There are too many that are a couple decades old for you to ever run out.
      Also occasionally break the rule for a book you want to read. It isn’t like that would kill you.
    - phgn 948 days ago
      The idea behind reading older books is that they're already proven to be useful - it's a filter for you to spend less time on useless information. It's generally called the "Lindy effect"
    - lifekaizen 948 days ago
      Love this question. I could imagine him suggesting reading academic papers for cutting edge things; like his 'barbell' exercise strategy of mostly walking with occasional HIITs.
  - dhosek 948 days ago
    I guess it's ok to read The C Programming Language, then.
    [-]
    - FredPret 948 days ago
      Technical manuals are more like a journal in the sense that you have to read all the new ones if you want to keep up.
      Novels, philosophies, and histories are works that can stand the test of time if they’re good enough
      [-]
      - blondin 947 days ago
        the c programming language stood the test of time.
- tracyhenry 948 days ago
  > I want HN to help me filter through them.
  The comments panel shows the actual recommendations. And the books are ranked by number of recommendations. Is this not enough?
FinanceAnon 948 days ago
Awesome idea and nice looking UI! I will definitely visit when I am looking for new books to read.
One thing that I've noticed is that when I select another book, the scrollbar in the comment section doesn't automatically scroll up.
[-]
- tracyhenry 948 days ago
  I know this but I wasn't able to fix it. Would love suggestions on how to keep the scroll position in one div (for the books) but not the other (for the comments) when doing client-side navigation using Next.js...
  [-]
  - RheingoldRiver 947 days ago
    I was going to make this suggestion too, but now that I saw this comment, instead I'll suggest asking on the next.js subreddit? It seems somewhat active, maybe someone there can help - https://old.reddit.com/r/nextjs/
alanbernstein 948 days ago
This is great. I just read Permutation City, which I coincidentally see recommended on HN all the time, so I was surprised not to see it in the search results or the top of the fiction or scifi lists. Any idea why that is?
[-]
- tracyhenry 948 days ago
  That might be that the book database I used is quite limited, sadly.
GuB-42 948 days ago
Interesting lack of 1984, even though it is mentioned way too often. The lesser known "Animal Farm" and other dystopias like "Brave New World" and "Fahrenheit 451" are here.
Is it because it is a number?
[-]
- tracyhenry 948 days ago
  It's because my book database doesn't have it. In fact my model identifies 132 mentions of 1984, some examples:
  https://news.ycombinator.com/item?id=20285306
  https://news.ycombinator.com/item?id=12518804
  https://news.ycombinator.com/item?id=22724495
  [-]
  - SquishyPanda23 948 days ago
    This is great thank you.
    On the topic of Brave New World, the site categorizes it as a Reference.
- qiqitori 947 days ago
  Another book that occasionally gets mentions (51 results for https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...) but appears to be missing on the site AFAICT (not related to 1984, but just came to mind):
  https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
  Unrelated to the above, but it would also be nice if the site could search by author (I don't seem to get any hits when putting in author names) or even year of publication.
- SquishyPanda23 948 days ago
  The book title is "Nineteen Eighty-Four", but nobody spells it that way.
  The app may need to special case it.
  [-]
  - Zobat 947 days ago
    There are suspiciously few misspellings in the names of the books, even with words like 'millionaire' and 'righteous'. There's also 'Calculus' with 132 comments, and just browsing through the comments I can find at least four different books referenced, plus some comments just talking about calculus in general.
    That said it's an interesting project that clearly took some effort to put together.
    Edit: Saw that the author commented on books with similar names. Many of the comments I saw had the book author's name in them as well; the next iteration should perhaps look for and match on those too.
timClicks 947 days ago
Interesting. I wonder what my book's title needs to do to be classed as a book title by this engine. It's actually mentioned in the same comments as a few of the peers in its topic (Rust).
[-]
- wiz21c 947 days ago
  yeah, it's HN, rust is special down here :-)
hipitihop 947 days ago
This is very nice, thanks for making it. I'm looking forward to improvements in the sentiment analysis.
How often are you updating your results? Can I download the recommendations dataset for offline queries?
amai 942 days ago
Very useful. Just one thing: You should reduce the long tail and filter out all books which have been mentioned in only one comment. I think these books are not representative of Hacker News.
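A minimal sketch of such a filter, assuming the mentions are available as (book title, comment id) pairs. The data shape, the function name `filter_long_tail`, and the `min_comments` threshold are all hypothetical and not taken from the app:

```python
from collections import defaultdict

def filter_long_tail(mentions, min_comments=2):
    """Keep only books mentioned in at least `min_comments` distinct comments.

    `mentions` is an iterable of (title, comment_id) pairs (assumed shape).
    Returns a dict mapping each kept title to its distinct-comment count.
    """
    comments_by_book = defaultdict(set)
    for title, comment_id in mentions:
        # A set deduplicates repeated mentions within the same comment.
        comments_by_book[title].add(comment_id)
    return {
        title: len(ids)
        for title, ids in comments_by_book.items()
        if len(ids) >= min_comments
    }

# Hypothetical sample data, for illustration only.
mentions = [
    ("Thinking in Systems", 101),
    ("Thinking in Systems", 102),
    ("Permutation City", 103),
    ("Thinking in Systems", 102),  # same comment mentioning the book twice
]
print(filter_long_tail(mentions))
```

Counting distinct comments rather than raw mentions keeps one enthusiastic comment from lifting a book over the threshold.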
Borrible 947 days ago
Thank you, this is absolutely funtastic work.
So many interesting books I will never have time to read. Could somebody please find a solution for the problem of mortality? If there is time.
spockz 947 days ago
Thank you for this app. I would consider removing the nested scroll bar for the comments. Also, on iPhone, in the last 6 months view I don’t see any links to Amazon.
jb1991 947 days ago
I would really love a way to export the whole list of books.
OhHiMarkos 947 days ago
Great job OP, this app is super helpful. Does this app still aggregate books from comments or was it a one time thing?
Is it OK to call this type of application crowdsourcing?
thecleaner 947 days ago
God bless you! This is amazing stuff. Could you write some sort of whitepaper on this topic? It seems like a really good text extraction/cleanup project.
endofreach 948 days ago
Amazing. I appreciate that there are no affiliate links. But I honestly think: you should put affiliate links.
Also, if it makes sense, have a monthly list.
alanlammiman 947 days ago
This is really great, congratulations. Is it a business? Or are you open-sourcing it by any chance? Would love to try it on some other forums too.
codeisawesome 947 days ago
What a fantastic site and application of ML! This is an inspiring project :)
Wonder how “Focus” went under “Christian Books & Bibles”, though.
devenvdev 947 days ago
Great work! Question: will this be updated (at least annually) and how do I subscribe? I would actually pay for this :)
luxurytent 947 days ago
Incredible work and appreciated your detailed summary. I now have a reading list for the next while :-)
enos_feedler 947 days ago
I would love this for mentions of podcasts and Twitter accounts on the podcasts I'm listening to.
xupybd 947 days ago
Interesting. This has missed my comments despite picking up some comments replying to me?
apples_oranges 947 days ago
It's always interesting to see what new versions of this kind of website come out. This is very well done :)
zerop 947 days ago
Could you give a higher confidence score to recommendations from users with higher karma?
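One possible way to do this, as a rough sketch: weight each recommendation by a log-damped function of the recommender's karma, so high-karma users count for more without drowning out everyone else. The `(user, karma)` input shape and the formula itself are assumptions for illustration, not how the app ranks books:

```python
import math

def weighted_score(recommendations):
    """Sum one log-damped karma weight per recommendation.

    `recommendations` is an iterable of (user, karma) pairs (assumed shape).
    Each recommendation counts at least 1.0; negative karma is treated as 0.
    """
    return sum(1.0 + math.log1p(max(karma, 0)) for _user, karma in recommendations)

# Hypothetical users and karma values.
print(weighted_score([("alice", 10), ("bob", 25000)]))
```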
[-]
- tracyhenry 947 days ago
  y on the todo list!
jeron 948 days ago
40k good books out there and I can only read like 24 a year if I really push myself
pknerd 947 days ago
This looks pretty cool. I wonder how else NLP could be used to find other insights
elymar 947 days ago
Just bought Thinking in Systems thanks to this. Appreciate the effort, nice tool.
oakfr 948 days ago
This is really cool stuff. Would be really nice to do the same for movies :)
tracyhenry 948 days ago
Although viewable on mobile, this app is best viewed on larger screens! :)
figassis 948 days ago
This is amazing, thank you.
qwert12345887 948 days ago
Could this be done to get a list of blogs posted here, with topic analysis?
kierkegaard7 947 days ago
Nice. Is there an easy way to download it all? (beyond scripting one)
LyalinDotCom 947 days ago
this is amazing. could we also get an exported Excel version?
nickthemagicman 948 days ago
Came here for the Warhammer stayed for the book recc's.
muddi900 942 days ago
The first book in the past 6 months section is incorrect.
wiz21c 947 days ago
A cursory look showed me that most, if not all, of the books have good reviews on Amazon. Somehow, I hoped that HN'ers would point to books with bad reviews (that is, misunderstood books).
the_arun 948 days ago
Thinking out loud here: what is the difference between the Google Search algorithm and AI-based deep learning? I guess they are both trying to do the same thing, that is, structuring unstructured data?
justinzollars 948 days ago
Wow! this is amazing. Nice work! :) made my day
rajeshmr 947 days ago
Can you provide a way to export the dataset?
tomerbd 947 days ago
Hey any chance you can share the source code?
floatingatoll 947 days ago
If you write a series of blog posts about these books, and talk about whether we should read them or not, that would be totally cool to see on HN and also very meta.
hivedotone 947 days ago
OP what's the best way to contact you?
oakfr 948 days ago
@tracyhenry: how does the system work exactly? I cannot find any documentation on your website.
[-]
- tracyhenry 948 days ago
  hey, you can scroll down to find a long comment of mine documenting the approach.
  [-]
  - dang 948 days ago
    https://news.ycombinator.com/item?id=28596207
donohoe 947 days ago
Harry Potter and the Goblet of Fire
sunjester 947 days ago
I normally give a lot of negative feedback, but this is amazing. What a great mind to think of this.
zerop 947 days ago
Please change the word (hacker) in the URL. It's blocked by my firewalls.
tinmandespot 948 days ago
This makes me happy
unobatbayar 947 days ago
Thank you so much
Jondar 947 days ago
I like your cat
artursapek 948 days ago
this is awesome, thanks for making it
Borlands 948 days ago
Brilliant
begueradj 948 days ago
some are interesting
bonniejawker 948 days ago
are you planning to open source the app? could you do one for lobste.rs too?
throwaway59553 947 days ago
This is very interesting.
In some cases a user recommended one book, but the book listed is a different one. For example, lots of people are recommending Spivak's "Calculus", or "Calculus Made Easy" by another author, but the one listed is "Calculus" by James Stewart. The same happens for the book "Calculus: Early Transcendentals".
Sun Tzu's Art of War is also listed twice, as two different editions.
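A minimal sketch of how same-titled books could be told apart, assuming each candidate book record carries an author name: pick the candidate whose author surname appears in the comment, and decline to guess otherwise. The record shape and the function name `pick_edition` are hypothetical, and plain substring matching is a simplification:

```python
def pick_edition(comment_text, candidates):
    """Return the single candidate whose author surname appears in the comment.

    Returns None when no surname, or more than one, is found, so that
    ambiguous comments are not attributed to an arbitrary book.
    """
    text = comment_text.lower()
    matches = [
        book for book in candidates
        if book["author"].split()[-1].lower() in text
    ]
    return matches[0] if len(matches) == 1 else None

# Hypothetical candidate records sharing one title.
candidates = [
    {"title": "Calculus", "author": "James Stewart"},
    {"title": "Calculus", "author": "Michael Spivak"},
]
print(pick_edition("I'd recommend Spivak's Calculus.", candidates))
```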
newbie789 947 days ago
I have a quick question about how this works. For example according to this page it says that Atlas Shrugged has 290 recommendations. Are all of these posts about the book positive or actual recommendations? Personally if you see me posting about that book it’s likely me being skeptical about the actual breadth of its appeal (like this post) or just trashing that book in general.
For example, if I had posted “Atlas Shrugged is a bad book for children with a laughable premise that could have maybe been comedic in its terribleness if it weren’t jam-packed with literally hundreds of extra pages of masturbatory drivel. It makes the reader dumber with every sentence.” Would your NLP model count that as a recommendation? Technically I said “book for children” in it, so maybe yes?
Edit: This has now been addressed.
supperburg 948 days ago
Surprised to see “A pattern language” on there. I’ve read most of it in preparation for building my house. It’s more of a dictionary than a book but it’s unbelievably useful. It’s just a huge list of little things that an architect would notice over the span of his career. Little things that are important but not obvious. If you’re building a house, another really good book is “what not to build.”
I also recommend “Islamic imperialism” from Yale, “the bomb in my garden” by mahdi obeidi and “nothing to envy.”
[-]
- gjm11 948 days ago
  Most likely the main reason "A Pattern Language" is popular here on HN is that it spawned a movement in software engineering: https://en.wikipedia.org/wiki/Software_design_pattern
  (Plus the fact that it's a good book on its own terms. At least, it is so far as I can tell; I am not an architect and maybe some of the advice in it is actually terrible. But it seems almost always reasonable and frequently insightful, and it's well written, and the "pattern language" idea that software engineering borrowed from it is a nice one. (Though the software-engineering borrowings don't generally amount to actual pattern languages as opposed to miscellaneous grab-bags of alleged patterns.))