43 comments

  • lysozyme 11 days ago
    Probably worth mentioning that David Baker’s lab released a similar model (it predicts protein structure along with bound DNA and ligands) just a couple of months ago, and it is open source [1].

    It’s also worth remembering that it was David Baker who originally came up with the idea of extending AlphaFold from predicting just proteins to predicting ligands as well [2].

    1. https://github.com/baker-laboratory/RoseTTAFold-All-Atom

    2. https://alexcarlin.bearblog.dev/generalized/

    Unlike AlphaFold 3, which predicts only a small, preselected subset of ligands, RoseTTAFold All-Atom predicts a much wider range of small molecules. While I am certain that neither network is up to the task of designing an enzyme, these are exciting steps.

    One of the more exciting aspects of the RoseTTAFold paper is that they train the model to predict structures, but then also use the structure-prediction model as the denoising model in a diffusion process, enabling them to actually design new functional proteins. Presumably, DeepMind is working on this problem as well.
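
    To make that concrete, here is a toy sketch of the idea (my own illustration, not code from either paper): a network trained to predict clean coordinates from noisy ones can be reused as the denoiser inside a generative sampling loop. The update rule and the denoiser(coords, noise_level) signature are assumptions for illustration only.

      import torch

      def sample_structure(denoiser, n_atoms=100, steps=50):
          # Start from pure Gaussian noise in 3D coordinate space.
          x = torch.randn(n_atoms, 3)
          for t in reversed(range(1, steps + 1)):
              noise_level = t / steps
              # The structure-prediction network doubles as the denoiser:
              # given noisy coordinates, it guesses the "clean" structure.
              x0_hat = denoiser(x, noise_level)
              # Crude stand-in for a proper DDPM/score-based update:
              # move toward the prediction and re-inject a little noise.
              x = x0_hat + 0.5 * noise_level * torch.randn_like(x)
          return x

      # Hypothetical denoiser with the assumed signature; a real one would be
      # a trained structure-prediction network.
      dummy_denoiser = lambda coords, noise_level: coords * (1.0 - noise_level)
      designed = sample_structure(dummy_denoiser)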

    • refulgentis 10 days ago
      I appreciated this, but it's probably worth mentioning: when you say AlphaFold 3, you're talking about AlphaFold 2.

      TFA announces AlphaFold 3.

      Post: "Unlike AlphaFold 3, which predicts only a small, preselected subset of ligands, RosettaFold All Atom predicts a much wider range of small molecules"

      TFA: "AlphaFold 3...*models large biomolecules such as proteins, DNA and RNA*, as well as small molecules, also known as ligands"

      Post: "they also use the structure predicting model as the denoising model in a diffusion process...Presumably, DeepMind is working on this problem as well."

      TFA: "AlphaFold 3 assembles its predictions using a diffusion network, akin to those found in AI image generators."

    • theGnuMe 11 days ago
      And that tech just got $1b in funding.
      • LeanderK 10 days ago
        can you expand? Who got 1bn of funding?
    • hackernewds 10 days ago
      Coming up with ideas is cheaper than executing the ideas. Predicting a wide range of molecules okay-ish is cheaper than predicting a small range of molecules very well.
  • nybsjytm 11 days ago
    Important caveat: it's only about 70% accurate. Why doesn't the press release say this explicitly? It seems intentionally misleading to only report accuracy relative to existing methods, which apparently are just not so good (30%, 50% in various settings). https://www.fastcompany.com/91120456/deepmind-alphafold-3-dn...
    • porphyra 10 days ago
      They also had a headline for AlphaZero that convinced everyone that they crushed Stockfish and that classical chess engines were a thing of the past, when in fact it was about 50 Elo better than the Stockfish version they were testing against, or roughly the same as how much Stockfish improves each year.
      • kriro 10 days ago
        I think AlphaZero is a lot more interesting than Stockfish though. Most notably it led me to reevaluate positional play. IIRC A0 at around 2-3 ply is still above super-GM level, which is pretty mind-blowing. Based on this I have increased my strategy-to-tactics ratio quite a bit. FWIW Stockfish is always evolving and adapting and has incorporated ideas from A0.
        • RUnconcerned 10 days ago
          Stockfish has not incorporated ideas from AlphaZero. Stockfish's NN eval technology, NNUE, comes from Shogi, and it predates AlphaZero there.

          The 2nd strongest engine, Leela Chess Zero, is indeed directly inspired by AlphaZero, though, and did surpass Stockfish until NNUE was introduced.

          • abecedarius 10 days ago
            Hmm: NNUE was introduced in 2018, the AlphaZero preprint 2017, AlphaGo 2015-2016. I checked this because my memory claimed that it was AlphaGo's success that sparked the new level of interest in NN evaluation.

            Wouldn't surprise me if AlphaZero's improvements had no influence in that timeline, but for AlphaGo it would.

            • thom 10 days ago
              The original NNUE paper cites AlphaZero[0]. The architectures are different because NNUE is optimized for CPUs and uses integer quantization and a much smaller network. I don't think one could credibly claim that it would have come about if not for Google making so much noise about their neural network efforts in Go, Chess and Shogi.

              0: https://github.com/asdfjkl/nnue/blob/main/nnue_en.pdf

          • thom 10 days ago
            For whatever it's worth, the NNUE training dataset contains positions from Leela games and several generations of self-play. Stockfish wouldn't be where it is if not for Google's impact. AlphaFold will likely have a similar impact on our understanding of protein structure. I don't know why everyone is so offended by them puffing their chests out a little bit here, the paper's linked in the article.
        • AlexCoventry 10 days ago
          How do you train for strategic thinking in chess? I read a book on positional chess once, but that's as far as I've gone.
          • kriro 9 days ago
            The first thing I'd recommend is constantly evaluating positions from a strategic POV ("Evaluate like a GM" is a good book, alternatively look at a lot of positions and evaluate like you were an engine and then engine check).

            Second (or first if you lack even the basics to do said evaluation) is understand strategic concepts. A good starting point would be "Simple Chess" the next step would be pawn structures ("Power of Pawns" -> "Chess Structures" would be my recommendations, the latter is probably the greatest chess book in recent times imo). There's also many Chessable courses, I'm quite fond of "Developing Chess Intuition" by GM Raven Sturt and the "Art of..." series by CM Can Kabadayi for lower rated players. The sky is the limit, there's good books all the way up, for example "Mastering Chess Strategy" usually recommended for 2000+ ELO

            Third study great positional players like Carlsen, Karpov, Petrosian etc.

            I'd say the most important thing to realize is that just like tactics puzzles, there's strategic puzzles but they are not as obvious.

      • mda 10 days ago
        AlphaZero indeed crushed Stockfish with a novel technique; I think it deserved all the praise.
        • hibikir 10 days ago
          It definitely deserved a lot of praise, but the testing situation wasn't really against a fully fledged stockfish running on similar hardware, but one that, among other things, had no opening library.

          The issue is not whether alphazero was impressive, but that we should be careful about the specific claims of the press releases, as they are known to oversell. The whole thing would have been impressive enough if the games had been against the last release of stockfish with good hardware, just for the way it played.

      • thom 10 days ago
        And then what happened is AlphaZero changed the professional game in various interesting ways, and all its ideas were absorbed into Stockfish. A little bombast is forgivable for technology that goes on to have a big impact, and I don’t doubt it’s the same story here.
        • thealig 10 days ago
          > all its ideas were absorbed into Stockfish

          don't think that is true, Stockfish incorporated NNUE techniques through a fork https://www.chess.com/news/view/stockfishnnue-strongest-ches...

          being transparent with the setup of your invention is always a good thing.

        • GaggiX 10 days ago
          >all its ideas were absorbed into Stockfish

          That's not true at all: Stockfish still uses only human heuristics for search and NNUE for eval, a completely different architecture from AlphaZero, derived from Yu Nasu's shogi NNUE work.

          • thom 9 days ago
            It's a neural network trained on self-play games (many of them lifted from Leela Zero). I get that it's a different shape of network, but people really seem touchy about crediting Google with the kick up the bum that led us here. AlphaZero had a massive effect on chess globally, whatever people think about its press releases. My main point is that people should update the heuristic that wastes energy arguing about bold claims when clearly something amazing has happened that everyone in the industry will react to and learn from.
            • nybsjytm 9 days ago
              I don't have any particular thoughts about DeepMind's board game algorithms or how they were advertised, but even if I happened to think it was the most innovative and influential research in years, I'd still ask for honest communication about the work. It's part of being a healthy research community - although clearly the AI community falls well short on this, and nobody could say it's only DeepMind's fault.
            • GaggiX 9 days ago
              >It's a neural network trained on self-play games

              It's not, it's just supervised learning on evaluations. There is no self-play involved when training the model.

              • thom 9 days ago
                Where do the evaluations come from? The idea that Stockfish isn't benefiting hugely from Google having created and advertised AlphaZero is preposterous, can we please just stop?
                • GaggiX 9 days ago
                  >Where do the evaluations come from?

                  Good datasets are selected empirically, they are usually a mix of different sources, not a single engine.

                  >The idea that Stockfish isn't benefiting hugely from Google having created and advertised AlphaZero is preposterous, can we please just stop?

                  I have not said anything about AlphaZero, I am just reporting where you are wrong. Your arguments are simply not very convincing.

                  • thom 9 days ago
                    Okay, well, no sale I guess. Stockfish's training dataset is mostly self-play games from an engine directly inspired by AlphaZero. It moved to neural network evaluation after a fork based on a paper that cites AlphaZero. It plays chess more like AlphaZero than Stockfish 11. Yes, it's extremely interesting that it continues to edge out Leela with a fast, rough approximation of the latter's evaluation but much faster search. But it (and human chess) wouldn't be where it is today without AlphaZero, and I was originally responding to someone dismissing it based on the perceived over-zealousness of its marketing, as people seem to want to do with TFA. I merely submit that both of these Google innovations are exciting and impactful, and we should forgive their presentation, which nevertheless has been kind enough to link to the original papers which have all the information we need to help change the world.
                    • nybsjytm 9 days ago
                      > someone dismissing it based on the perceived over-zealousness of its marketing, as people seem to want to do with TFA

                      Sorry, but that's nothing but a reading comprehension problem for you

                      • thom 9 days ago
                        A lot of that going around. Have a great weekend.
    • hereme888 10 days ago
      That's what I thought. They go from "predicting all of life's molecules" to "it's a 50% improvement...and we HOPE to...transform drug discovery..."

      Seems unfortunately typical of Google these days: "Gemini will destroy GPT-4..."

    • Aunche 11 days ago
      IIRC the next best models have all been using AlphaFold 2's methodology, so that's still a massive improvement.

      Edit: I see now that you're probably objecting to the headline that got edited on HN.

      • nybsjytm 11 days ago
        Not just the headline, the whole press release. And not questioning that it's a big improvement.
    • bluerooibos 11 days ago
      That's pretty good. Based on the previous performance improvements of Alpha-- models, it'll be nearing 100% in the next couple of years.
      • akira2501 11 days ago
        > it'll be nearing 100% in the next couple of years.

        What are you basing this on? There is no established "moores law" for computational models.

        • OrigamiPastrami 10 days ago
          It's the internet. There is no source more cited than "trust me bro"
        • bluerooibos 10 days ago
          Computational models have been shown to improve with computing power though, right?

          It's a tongue in cheek comment about how fast models have been improving over the last few years, but I forgot HN scrutinizes every comment like it's a scientific paper.

          • akira2501 10 days ago
            The ROI decreases exponentially with language models. At this point, each percentage point of accuracy costs you tens of billions, and the projections show a solid wall approaching with the current implementation tricks.

            It was an off-the-cuff comment; at this point, though, HN apparently needs to bully everyone who refuses to go along with the zeitgeist, as if being negative near a thing would destroy it.

            I guess the 'h' no longer stands for 'hacker.'

      • nybsjytm 11 days ago
        Just "Alpha-- models" in general?? That's not a remotely reasonable way to reason about it. Even if it were, why should it stop DeepMind from clearly communicating accuracy?
        • dekhn 11 days ago
          The way I think about this (specifically, deepmind not publishing their code or sharing their exact experimental results): advanced science is a game played by the most sophisticated actors in the world. Demis is one of those actors, and he plays the games those actors play better than anybody else I've ever seen. Those actors don't care much about the details of any specific system's accuracy: they care to know that it's possible to do this, and some general numbers about how well it works, and some hints what approaches they should take. And Nature, like other top journals, is more than willing to publish articles like this because they know it stimulates the most competitive players to bring their best games.

          (I'm not defending this approach, just making an observation)

          • nybsjytm 11 days ago
            I think it's important to qualify that the relevant "game" is not advanced science per se; the game is a business whose product is science. The aim isn't to do novel science; it's to do something which can be advertised as novel science. That isn't to cast aspersions on the personal motivations of Hassabis or any other individual researcher working there (which itself isn't to remove their responsibilities to public understanding); it's to cast aspersions on the structure that they're part of. And it's not to say that they can't produce novel or important science as part of their work there. And it's also not to say that the same tension isn't often present in the science world - but I think it's present to an extreme degree at DeepMind.

            (Sometimes the distinction between novel science and advertisably novel science is very important, as seems to be the case in the "new materials" research dopylitty linked to in these comments: here https://www.404media.co/google-says-it-discovered-millions-o...)

          • moomin 10 days ago
            Anyone remember how he marketed his computer games?
            • nybsjytm 10 days ago
              No, how?
              • moomin 9 days ago
                Massively overpromising unachievable things. From wikipedia:

                https://en.wikipedia.org/wiki/Republic:_The_Revolution

                Initial previews of Republic in 2000 focused upon the purported level of detail behind the game's engine, the "Totality Engine". Described as "the most advanced graphics engine ever seen, (with) no upper bound on the number of polygons and objects", it was claimed the game could "render scenes with an unlimited number of polygons in real time".[14] Tech demonstrations of Republic at this time showcased a high polygonal level of detail,[21] with the claim that players would be able to zoom smoothly from the buildings in Novistrana to assets such as flowers upon the balconies of buildings with no loss of detail.[22] The game was further purported to have artificial intelligence that would simulate "approximately one million individual citizens" at a high level of detail,[23][19] each with "their own unique and specific AI" comprising "their own daily routine, emotions, beliefs and loyalties".

                I feel like it's always worth bearing in mind when he talks about upcoming capability.

                • nybsjytm 8 days ago
                  Thanks, I'd never heard about this before. I definitely think it helps in understanding his commentary. It's really a shame that DeepMind picked up his communication style.
        • 7734128 11 days ago
          I'm quite hyped for the upcoming BetaFold, or even ReleaseCandidateFold models. They just have to be great.
      • tsimionescu 10 days ago
        Which specific AlphaX model evolved like that? Most of the ones that were in the press had essentially a single showing, typically very good, but didn't really improve after that.
  • uptownfunk 11 days ago
    Very sad to see they did not make it open source. When you have a technology that has the potential to be a gateway for drug development and for cures of new diseases, and instead you choose to make it closed, it is a huge disservice to the community at large. Sure, release your own product alongside it, but making it closed source does not help the scientific community upon which all these innovations were built. Especially if you have lost a loved one to a disease which this technology will one day be able to help cure, it is very disappointing.
    • DrScientist 10 days ago
      I suspect https://www.isomorphiclabs.com/ is the reason.

      There are 3 basic ways to fund research.

      - Taxes - most academic research

      - Begging - research charities

      - Profits - companies like Google.

      Sometimes the lines get blurred - but I don't think you can expect Google to release as much of their work for free as people who are paid via central taxes.

      • abecedarius 10 days ago
        Worth noting that they did release the AlphaFold 2 weights after a while. Milking an expensive discovery for a limited period should be considered laudable, unless you think tax funding of all research would be awesome and it's just a weird anomaly how the org producing these results was a tiny heterodox startup very recently.
        • DrScientist 10 days ago
          > Worth noting that they did release the AlphaFold 2 weights after a while.

          Yes - though I don't think Isomorphic labs existed at that point.

          Obviously the real reason AlphaFold was possible was the huge taxpayer-funded effort running over decades to generate a diverse, high quality 3D structure dataset.

          However that's why we put taxes into research - to spur innovation in a pre-competitive way - so that's fine.

          What's not fine is any benefiting company avoiding paying any tax back on resulting profits - that's just free riding - and many of the big tech companies are, in my view, guilty.

          • uptownfunk 10 days ago
            You know I can appreciate that. For some reason because it’s related to medicine, it feels insidious to keep it closed.
          • abecedarius 10 days ago
            The data is a treasure. It was just as available to others, before and after.
            • DrScientist 6 days ago
              Yep, not knocking AlphaFold - AlphaFold showed it could be done [1], and others have subsequently followed.

              However if all the tech companies hoover up all the profits and simultaneously avoid paying the appropriate level of taxes - then the cycle of innovation isn't sustainable. As well as paying for that pre-competitive data, taxes train the next generation of PhDs.

              [1] There were other groups making progress with DL-based structure prediction before AlphaFold, but AlphaFold was a leap forward.

    • DonsDiscountGas 10 days ago
      I'd be willing to bet that OpenFold has an implementation inside of a few months.

      https://openfold.io/

    • Xeyz0r 9 days ago
      Openness and collaboration could have far-reaching implications for public health and well-being, but there are lots of aspects to being open.
    • robertlagrant 10 days ago
      > the community at large

      Which community? Not any I'm part of.

    • falcor84 11 days ago
      The closer it gets to enabling full drug discovery, the closer it also gets to enabling bioterrorism. Taking it to the extreme, if they had the theory of everything, I don't think I'd want it to be made available to the whole world as it is today.

      On a related note, I highly recommend The Talos Principle 2, which really made me think about these questions.

      • pythonguython 11 days ago
        Any organization/country that has the ability to use a tool like this to create a bio weapon is already sophisticated enough to do bioterrorism today.
        • ramon156 10 days ago
          Alright, but now picture this: it's now open to the masses, meaning an individual could probably even do it.
          • tsimionescu 10 days ago
            The problem of producing bioweapons is not computational, it's physical in nature. Even if predictions from these tools became 100% accurate and encompassed 100% of the chemistry, you would still need to actually do the manual steps to breed the bioagents. And, very importantly, you need to do so without getting you and your co-workers deathly ill long before finishing the thing. Which requires extremely sophisticated machinery.

            Alternatively, you can go today to some of the poorer corners of the world, find some people with drug-resistant tuberculosis, pay them a pittance to give you bodily fluids, and disperse those in a large crowd, say at a concert or similar. You'll get a good chunk of the effects of the worst possible bioterrorism.

            • DrScientist 10 days ago
              To be honest, apart from the containment systems you mentioned, much of basic biology research doesn't actually need that much sophisticated kit for basic cloning and genetic manipulation.

              A lot of the key reagents can just be bought - and BTW it's why code like Screepy exists ( https://edinburgh-genome-foundry.github.io/ ).

              I think the real thing that stops it is not that you can't make stuff that kills people, but the problem of specificity - i.e. how do you stop it killing yourself.

              • rolph 10 days ago
                > kit for basic cloning and genetic manipulation. A lot of the key reagents can just be bought

                Along with the cryo-fridges required to keep said reagents.

                Add about 10,000 USD to your purchase request.

                • flobosg 10 days ago
                  Most reagents and kits for molecular cloning will be fine at -20°C.
            • hackernewds 10 days ago
              These are very specific ideas you have..
              • BriggyDwiggs42 10 days ago
                It’s very frustrating when people consider the possession of information equivalent to malice. It suggests that the right way to run society is to keep people stupid and harmless.
                • baq 10 days ago
                  Harmless is enough. The smart ones will figure out that knowing shortens lives.
                • FeepingCreature 10 days ago
                  Okay, I'll live in the harmless society, you go live in the harmful society.
                  • bronco21016 10 days ago
                    We all live in the harmful society already right? I’m not aware of where to find this harmless society.

                    Suffering of finite beings is inevitable. While a very worthwhile goal, creating a harmless civilization isn’t possible. There are some common sense things we should do to prevent harm like negative consequences (prison etc) for needlessly harming each other. However, locking up knowledge doesn’t make much sense to me.

                    I’d rather explore the bounds of this world than mindlessly collect my drip of Soma and live comatose. To me that sounds more harmful.

                    • FeepingCreature 10 days ago
                      I don't care about people harming people incidentally. I also don't want to shut down knowledge. But there is "knowledge" and there are "materials" that everyone agrees must be controlled and limited, like high explosives and bioweapons. Then the question is if large AI weights are a kind of "knowledge" or a kind of "material", and IMO they're much closer to material despite being data.

                      > I’d rather explore the bounds of this world than mindlessly collect my drip of Soma and live comatose. To me that sounds more harmful.

                      This only once again demonstrates that winning a debate is entirely about getting to define the choice under consideration. To me, it's not about Soma, it's about "humanity survives" and "humanity goes extinct due to out-of-control superintelligence." I don't want to die, so I'm for AI regulation.

                  • BriggyDwiggs42 10 days ago
                    Remember, I also said stupid. Ima go live in the non-dummy society and you can do whatever.
              • baq 10 days ago
                These are ideas you should read and think about because intelligence agencies all over the world have been thinking about them for the past hundred years.
          • pythonguython 10 days ago
            I hear you, but I don't think an individual can. If I gave you $20,000 and an open source 70% accurate protein folding model and told you to develop, mass produce, and build a deployment mechanism for a highly infectious and deadly pathogen, I don't think you could make that happen. Nor do I think you could do it if you had a PhD in microbiology.
          • uptownfunk 10 days ago
            The risks don't exceed what is already out there. If someone wants to do damage, especially in America, there are more than enough ways they can do it already. The technology should be made free. I also wonder how much the claims are being exaggerated and are marketing-speak vs. real results. Is there any benchmark for this that they have published?
            • bongodongobob 10 days ago
              No. I love to be egalitarian as well but this AI thing really feels different. We didn't just invent a better plow or a more durable sword. We're working on making a better brain. I think social media shows us a pretty good slice of the average person and it's not great. Now imagine they can manipulate the smartest person in the world to do dangerous, dumb shit.
              • MaxikCZ 10 days ago
                > we didn't just invent a better plow

                But we invented metalworking. And if metal were for kings' chairs only, we would still have no plows.

          • TeMPOraL 10 days ago
            Honestly, I wouldn't worry about bioterrorism as much as a handling mishap. Stick new proteins into a bacterium the wrong way, don't wash hands thoroughly enough, and suddenly something is eating all the trees in the region, or whatnot.

            Designing an effective, lethal pathogen - fast enough to do damage, but slow enough to not burn itself out - is hard. Accidentally making something ecologically damaging is probably much simpler, and I imagine the future holds plenty of such localized minor ecophagy[0] events.

            --

            [0] - Yes, I totally just learned that term from https://en.wikipedia.org/wiki/Gray_goo a minute ago.

          • consumer451 10 days ago
            You raise an extremely important point. It appears to me that most people do not understand the implications of your point.

            Organized terrorism by groups is actually extremely rare. What is much less rare are mass shootings in the USA, by deranged individuals.

            What would a psychopathic mass shooter type choose as a weapon if he not only had access to semi-automatic weapons, but now we added bio-weapons to the menu?

            It seems very clear to me that when creating custom viruses becomes high school level knowledge, and the tools can be charged on a credit card, nuclear weapons will be relegated to the second most likely way that our human civilization will end.

            I believe the two concepts being brought together here are the Law of Large Numbers, and the sudden ability for one single human to kill at least millions.

            • tsimionescu 10 days ago
              > It seems very clear to me that when creating custom viruses becomes high school level knowledge

              That would be very bad indeed, but there is no path from AI to that. Making custom viruses is never going to be an easy task even if you had a magic machine that could explain the effects of adding any chemical to the mix. You still need to procure the chemicals and work with them in very careful ways, often for a long time, in a highly controlled environment. It's still biology lab work, even if you know exactly what you have to do.

              Also, bioweapons already exist and have been used in a few conflicts, even as recently as WWII. They're terrifying in many ways, but are not really comparable to the horror of nuclear weapons hitting major cities.

              • TeMPOraL 10 days ago
                > You still need to procure the chemicals and work with them in very careful ways, often for a long time, in a highly controlled environment. It's still biology lab work, even if you know exactly what you have to do.

                You can get that as-a-Service, and I imagine that successes in computational biology will make mail-order protein synthesis broadly available. At that point, making a bioweapon or creating a grey goo (green goo) scenario will be a divide-and-conquer issue: how many pieces you need to procure independently from different facilities, so that no one suspects what you're doing until you mix them together and the world goes poof.

                • tsimionescu 10 days ago
                  We know the principles of how to make very powerful and dangerous inorganic compounds today, with extreme precision. Do you see any chemistry-as-a-service products that sell to the general public? Is it easy to obtain the components and expertise to make sarin gas, a clearly existing and much simpler to synthesize substance than some hypothetical green goo bioweapon?
                  • TeMPOraL 10 days ago
                    > Do you see any chemistry-as-a-service products that sell to the general public?

                    Sort of? Depends on how general you insist the general public to be. Never used one myself, but I used to lurk on nootropic and cognitive enhancement groups, and I recall some people claiming they managed to get experimental nootropics synthesized and sent from abroad, without any special license or access. And then there's all the lab supply companies - again, I never tried, but talking with people I never got the impression it's in any way restricted, other than being niche; I never heard of them e.g. requiring a verified association with a university lab or something. Hell, back in high school, my classmate managed to get his hands on some uranium salts (half for chemistry nerdom, half for pure bragging rights), with zero problems.

                    > Is it easy to obtain the components and expertise to make sarin gas, a clearly existing and much simpler to synthesize substance than some hypothetical green goo bioweapon?

                    Given that I know for a fact that making several kinds of explosives and propellants is a bored middle-schooler level problem, I imagine sarin is also synthesizable by a smart amateur. Fortunately, the intersection of being able to make it, and having a malicious reason for it, is vanishingly small. But I don't doubt that, should a terrorist group decide to use some of either, there's approximately nothing that can stop them from cooking some up.

                    What makes me more nervous about potential biosafety issues in the future is that, well, sarin is only effective as far as the air circulation will carry it; pathogens have indefinite range.

              • consumer451 10 days ago
                > Making custom viruses is never going to be an easy task even if you had a magic machine that could explain the effects of adding any chemical to the mix. You still need to procure the chemicals and work with them in very careful ways, often for a long time, in a highly controlled environment. It's still biology lab work, even if you know exactly what you have to do.

                You appear to be talking about today. I am referring to some point in the future.

                If you extrapolate our technological progress out to the future, it certainly seems possible, at some point.

                • tsimionescu 10 days ago
                  Not based on AI biochemistry simulators, at the very least.
      • LouisSayers 10 days ago
        Why do you need AI for bioterrorism? There are plenty of well known biological organisms that can kill us today...
      • BriggyDwiggs42 10 days ago
        Oh please, like a terrorist can't fork over a couple bucks to do the bioterrorism. This excuse is utter BS, whether it's applied to LLMs or to AlphaFold. The motivator is profit, not safety.
      • datadeft 10 days ago
        Why on Earth would any terrorist want to invest into this when you can just purchase guns and explosives much easier?
        • AlexCoventry 10 days ago
          Because a pathogen could potentially sicken or kill many more people.
  • renonce 11 days ago
    > What is different about the new AlphaFold3 model compared to AlphaFold2?

    > AlphaFold3 can predict many biomolecules in addition to proteins. AlphaFold2 predicts structures of proteins and protein-protein complexes. AlphaFold3 can generate predictions containing proteins, DNA, RNA, ions, ligands, and chemical modifications. The new model also improves the protein complex modelling accuracy. Please refer to our paper for more information on performance improvements.

    AlphaFold 2 generally produces looping “ribbon-like” predictions for disordered regions. AlphaFold3 also does this, but will occasionally output segments with secondary structure within disordered regions instead, mostly spurious alpha helices with very low confidence (pLDDT) and inconsistent position across predictions.

    So the criticism towards AlphaFold 2 will likely still apply? For example, it’s more accurate for predicting structures similar to existing ones, and fails at novel patterns?
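
    (If you want to flag those segments yourself: AlphaFold-style PDB outputs store per-residue pLDDT in the B-factor column, so a minimal Biopython sketch - the filename and the <50 "very low confidence" cutoff are just illustrative conventions - looks like this:)

      from Bio.PDB import PDBParser

      # Minimal sketch: AlphaFold-style PDB files carry per-residue pLDDT in the
      # B-factor column. "model.pdb" is a hypothetical filename.
      structure = PDBParser(QUIET=True).get_structure("pred", "model.pdb")
      low_confidence = []
      for residue in structure.get_residues():
          plddt = next(residue.get_atoms()).get_bfactor()  # same value for every atom of a residue
          if plddt < 50:
              low_confidence.append(residue.get_id()[1])
      print(f"{len(low_confidence)} residues with pLDDT < 50 (likely disordered or spurious)")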

    • dekhn 11 days ago
      I am not aware of anybody currently criticizing AF2's abilities outside of its training set. In fact, in the most recent papers (written by crystallographers) they are mostly arguing about atomic-level details of side chains at this point.
    • rolph 11 days ago
      The problem is that biomolecules are "chaperoned" to fold properly; only specific regions, such as alpha helices or beta pleated sheets, will fold de novo.

      Chaperone (protein)

      https://en.wikipedia.org/wiki/Chaperone_(protein)

      • DrScientist 10 days ago
        Chaperones exist - however many proteins will quite happily fold in isolation without any external help.
        • rolph 10 days ago
          Most proteins require chaperones to fold properly.
          • DrScientist 6 days ago
            Most? Evidence?

            Almost all the 3D structures that Alphafold was trained on were generated from crystals of pure protein.

            ie made without chaperones.

      • staticautomatic 10 days ago
        In principle couldn’t we just incorporate knowledge about chaperones into the model?
        • flobosg 10 days ago
          In a way it is already incorporated. Broadly speaking, chaperones function by restricting the available conformational sampling space for the protein to fold. Some researchers even consider the ribosome as a chaperone of sorts for the nascent protein chain it synthesizes.

          Protein structure prediction methods do the same: they find ways of restricting the conformational space to explore, in hopes of finding the global minimum-energy conformation representing the native structure of the protein.

          • staticautomatic 10 days ago
            Then it’s not clear to me exactly why should chaperones be a problem, though I get the gist intuitively
            • rolph 10 days ago
              If you want a protein or any other biomolecule to fold properly, a chaperone system must be either designed or elucidated.

              The primary sequence is not the only consideration for proper folding.

              Chaperones allow higher-energy folding events to occur and be maintained until subsequent modification stabilizes the high-energy structural motif.

              Chaperones also enforce an A-before-B-before-C regime of folding so that the sequence doesn't just crumple up according to the energy of hydrostatic interactions.

              • staticautomatic 10 days ago
                Sure, I get the mechanics. My question is, if we can incorporate knowledge about chaperones into the models as explicit or latent variables, so to speak, then why can’t the models predict something like “probability of molecule a given the presence of chaperone b”?
                • rolph 10 days ago
                  It sure can, given enough computation. Chaperones are often proteins themselves but can be otherwise; they are subject to the same forces, so they twist and turn, fold and conform.

                  They often interact with each other, and must exert influence at the proper stage of modification.

                  Other effects beyond folding occur, such as addition or elimination of prosthetic groups.

                  The take-home message is the fallacy of oversimplifying the process of many molecules, plus the ionic environment, interacting to influence a single molecule.

        • rolph 10 days ago
          The order of folding is crucial; this is a progression of folding events.

          The difference is akin to origami vs a crumpled ball.

    • COGlory 11 days ago
      >So the criticism towards AlphaFold 2 will likely still apply? For example, it’s more accurate for predicting structures similar to existing ones, and fails at novel patterns?

      Yes, and there is simply no way to bridge that gap with this technique. We can make it better and better at pattern matching, but it is not going to predict novel folds.

  • LarsDu88 11 days ago
    As a software engineer, I kind of feel uncomfortable about this new model. It outperforms Alphafold 2 at ligand binding, but Alphafold 2 also had some more hardcoded and interpretable structural reasoning baked into the model architecture.

    There's so many things you can incorporate into a protein folding model such as structural constraints, rotational equivariance, etc, etc

    This new model simply does away with some of that, achieving greater results. And the authors simply use distillation from data output by AlphaFold 2 and AlphaFold 2-Multimer to get better results for those cases where you would otherwise wind up with implausible results.

    To really train this new model end-to-end from scratch, you have to run all those previous models and output their predictions to do the distillation! Makes me feel a bit uncomfortable.
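
    To make the distillation setup concrete, here's a minimal toy sketch (my own illustration, not the AlphaFold 3 pipeline): a frozen teacher's predictions become pseudo-labels for examples without experimental structures, and the student trains on a mix of real and distilled targets.

      import torch
      import torch.nn as nn

      # Toy stand-ins: linear layers in place of the real teacher (e.g. AlphaFold 2)
      # and the new student model; random tensors in place of real features/structures.
      teacher, student = nn.Linear(64, 3), nn.Linear(64, 3)
      opt = torch.optim.Adam(student.parameters(), lr=1e-3)

      feats_with_xtal = torch.randn(100, 64)    # examples with experimental structures
      xtal_coords     = torch.randn(100, 3)
      feats_no_xtal   = torch.randn(400, 64)    # examples without experimental structures

      with torch.no_grad():
          pseudo_labels = teacher(feats_no_xtal)        # teacher predictions as pseudo-labels

      features = torch.cat([feats_with_xtal, feats_no_xtal])
      targets  = torch.cat([xtal_coords, pseudo_labels])

      for step in range(1000):
          loss = ((student(features) - targets) ** 2).mean()   # same loss on real and distilled targets
          opt.zero_grad(); loss.backward(); opt.step()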

    • sangnoir 11 days ago
      > Makes me feel a bit uncomfortable.

      Why? Do compilers which can't bootstrap themselves also make you uncomfortable due to dependencies on pre-built artifacts? I'm not saying you're unjustified to feel that way, but sometimes more abstracted systems are quicker to build and may have better performance than those built from the ground up. Selecting which one is better depends on your constraints and taste

      • arjvik 10 days ago
        Compilers are deterministic (for the most part, and it's incredibly rare to introduce a compiler bug that self-replicates in future compilers (unless you're Ken Thompson and are reflecting upon trust itself)).

        AlphaFold 2's output, on the other hand, is noisy, and using that to train AlphaFold 3, which presumably may be used to train what becomes AlphaFold 4, results in a cascade of errors.

    • amitport 11 days ago
      Consider that humans also learn from other humans, and sometimes surpass their teachers.

      A bit more comfortable?

      • Balgair 11 days ago
        Ahh, but the new young master is able to explain their work and processes to the satisfaction of the old masters. In the 'Science' of our modern times it's a requirement to show your work (yes, yes, I know about the replication crisis and all that terrible jazz).

        Not being able to ascertain how and why the ML/AI is achieving results is not quite the same and more akin to the alchemists and sorcerers with their cyphers and hidden laboratories.

        • falcor84 11 days ago
          > the new young master is able to explain their work and processes to the satisfaction of the old masters

          Yes, but it's one level deep - in general they wouldn't be able to explain their work to their master's master (note "science advances one funeral at a time").

        • hackerdood 10 days ago
          I’ll add that, especially when it comes to playing Go, professionals who are at the peak of their ability can often find the best move at a given point but be unable to explain why beyond “it feels right” or “it looks right”.
        • borgdefense 10 days ago
          [flagged]
  • mchinen 11 days ago
    I am trying to understand how accurate the docking predictions are.

    Looking at the PoseBusters paper [1] they mention, they say they are 50% more accurate than traditional methods.

    DiffDock, which is the best DL-based system, gets 30-70% depending on the dataset, and traditional methods get 50-70%. The paper highlighted some issues with the DL-based methods, and given that DeepMind would have had time to incorporate this into their work and develop with the PoseBusters paper in mind, I'd hope it's significantly better than 50-70%. They say 50% better than traditional, so I expected something like 70-85% across all datasets.

    I hope a paper will appear soon to illuminate these and other details.

    [1] https://pubs.rsc.org/en/content/articlehtml/2024/sc/d3sc0418...

  • zmmmmm 10 days ago
    So much of the talk about their "free server" seems to be trying to distract from the fact that they are not releasing the model.

    I feel like it's an important threshold moment if this gets accepted into scientific use without the model being available - reproducibility of results becomes dependent on the good graces of a single commercial entity. I kind of hope that like OpenAI it just spurs creation of equivalent open models that then actually get used.

  • wuj 11 days ago
    This tool reminds me that the human body functions much like a black box. While physics can be modeled with equations and constraints, biology is inherently probabilistic and unpredictable. We verify the efficacy of a medicine by observing its outcomes: the medicine is the input, and the changes in symptoms are the output. However, we cannot model what happens in between, as we cannot definitively prove that the medicine affects only its intended targets. In many ways, much of what we understand about medicine is based on observing these black-box processes, and this tool helps to model that complexity.
    • a_bonobo 10 days ago
      Classic essay in this vein:

      >Can a biologist fix a radio? — Or, what I learned while studying apoptosis

      https://www.cell.com/cancer-cell/pdf/S1535-6108(02)00133-2.p...

      >However, if the radio has tunable components, such as those found in my old radio (indicated by yellow arrows in Figure 2, inset) and in all live cells and organisms, the outcome will not be so promising. Indeed, the radio may not work because several components are not tuned properly, which is not reflected in their appearance or their connections. What is the probability that this radio will be fixed by our biologists? I might be overly pessimistic, but a textbook example of the monkey that can, in principle, type a Burns poem comes to mind. In other words, the radio will not play music unless that lucky chance meets a prepared mind.

    • bamboozled 10 days ago
      I’d say it’s always been the case for medicine: when people first used medicines, the intention was never to fully understand what happens, just to save a life or eliminate or reduce symptoms.

      Now we’ve built explainable systems like computers and software, we try to overlay that onto everything and it might not work.

      To quote Alan Watts, humans like to try to square out wiggly systems because we’re not great at understanding wiggles.

  • qwertox 11 days ago
    > Thrilled to announce AlphaFold 3 which can predict the structures and interactions of nearly all of life’s molecules with state-of-the-art accuracy including proteins, DNA and RNA. [1]

    There's a slight mismatch between the blog's title and Demis Hassabis' tweet, where he uses "nearly all".

    The blog's title suggests that it's a 100% solved problem.

    [1] https://twitter.com/demishassabis/status/1788229162563420560

    • TaupeRanger 10 days ago
      First time reading a Deep Mind PR? This is literally their modus operandi.
    • bamboozled 10 days ago
      How to make the share price go up…surprised?
    • bmau5 11 days ago
      Marketing vs. Reality :)
  • tea-coffee 11 days ago
    This is a basic question, but how is the accuracy of the predicted biomolecular interactions measured? Are the predicted interactions compared to known interactions? How would the accuracy of predicting unknown interactions be assessed?
    • joshuamcginnis 11 days ago
      Accuracy can be assessed in two main ways: computationally and experimentally. Computationally, they would compare the predicted structures and interactions with known data from databases like the PDB (Protein Data Bank). Experimentally, they can use tools like X-ray crystallography and NMR (nuclear magnetic resonance) to obtain the actual molecule structure and compare it to the predicted result. The outcomes of each approach would be fed back into the model for refining future predictions.

      https://www.rcsb.org/
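
      As a minimal sketch of the computational comparison (random stand-in arrays below, where matched Cα coordinates from the predicted model and the deposited PDB entry would go), a common agreement metric is the RMSD after optimal superposition via the Kabsch algorithm:

        import numpy as np

        def kabsch_rmsd(pred, ref):
            """RMSD between two (N, 3) coordinate sets after optimal superposition."""
            P = pred - pred.mean(axis=0)              # center both coordinate sets
            Q = ref - ref.mean(axis=0)
            U, _, Vt = np.linalg.svd(P.T @ Q)         # Kabsch: optimal rotation via SVD
            d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
            R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
            return np.sqrt(((P @ R.T - Q) ** 2).sum(axis=1).mean())

        pred = np.random.randn(120, 3)                # stand-in predicted Cα coordinates
        ref  = np.random.randn(120, 3)                # stand-in experimental Cα coordinates
        print(f"RMSD after superposition: {kabsch_rmsd(pred, ref):.2f} Å")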

      • dekhn 11 days ago
        AlphaFold very explicitly (unless something has changed) removes NMR structures as references because they are not accurate enough. I have a PhD in NMR biomolecular structure and I wouldn't trust the structures for anything.
        • JackFr 11 days ago
          Sorry, I don’t mean to be dense - do you mean you don’t trust AlphaFolds structures or NMRs?
          • dekhn 11 days ago
            I don't trust NMR structures in nearly all cases. The reasons are complex enough that I don't think it's worthwhile to discuss on Hacker News.
        • fabian2k 11 days ago
          Looking at the supplementary material (section 2.5.4) for the AlphaFold 3 paper it reads to me like they still use NMR structures for training, but not for evaluating performance of the model.
          • dekhn 11 days ago
            I think it's implicit in their description of filtering the training set, where they say they only include structures with a resolution of 9 Å or less. NMR structures don't really have a resolution; that's more specific to crystallography. However, I can't actually verify that no NMR structures were included without directly inspecting their list of selected structures.
            • fabian2k 11 days ago
              I think it is very plausible that they don't use NMR structures here, but I was looking for a specific statement on it in the paper. I think your guess is plausible, but I don't think the paper is clear enough here to be sure about this interpretation.
              • dekhn 11 days ago
                Yes, thanks for calling that out. In verifying my statement I actually was confused because you can see they filter NMR out of the eval set (saying so explicitly) but don't say that in the test set section (IMHO they should be required to publish the actual selection script so we can inspect the results).
                • fabian2k 11 days ago
                  Hmm, in the earlier AlphaFold 2 paper they state:

                  > Input mmCIFs are restricted to have resolution less than 9 Å. This is not a very restrictive filter and only removes around 0.2% of structures

                  NMR structures are more than 0.2% so that doesn't fit to the assumption that they implicitly remove NMR structures here. But if I filter by resolution on the PDB homepage it does remove essentially all NMR structures. I'm really not sure what to think here, the description seems too soft to know what they did exactly.
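
                  One mechanical detail worth noting: NMR entries typically have no numeric resolution at all, so whether a "resolution < 9 Å" filter drops or keeps them depends entirely on how it's implemented. A minimal sketch with a made-up metadata table:

                    import pandas as pd

                    # Hypothetical metadata; real NMR entries usually have a null resolution,
                    # so a numeric resolution cutoff can exclude them as a side effect.
                    entries = pd.DataFrame({
                        "pdb_id":     ["1ABC", "2DEF", "3GHI"],
                        "method":     ["X-RAY DIFFRACTION", "SOLUTION NMR", "ELECTRON MICROSCOPY"],
                        "resolution": [1.8, None, 3.2],
                    })
                    kept = entries[entries["resolution"] < 9.0]   # NaN < 9.0 is False, so the NMR entry is dropped
                    print(kept["pdb_id"].tolist())                # ['1ABC', '3GHI']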

        • panabee 11 days ago
          interesting observation and experience. must have made thesis development complex, assuming the realization dawned on you during the phd.

          what do you trust more than NMR?

          AF's dependence on MSAs also seems sub-optimal; curious to hear your thoughts?

          that said, it's understandable why they used MSAs, even if it seems to hint at winning CASP more than developing a generalizable model.

          arguably, MSA-dependence is the wise choice for early prediction models as demonstrated by widespread accolades and adoption, i.e., it's an MVP with known limitations as they build toward sophisticated approaches.

          • dekhn 11 days ago
            My realizations happened after my PhD. When I was writing my PhD I still believed we would solve the protein folding and structure prediction problems using classical empirical force fields.

            It wasn't until I started my postdocs, where I started learning about protein evolutionary relationships (and competing in CASP), that I changed my mind. I wouldn't say it so much as "multiple sequence alignments"; those are just tools to express protein relationships in a structured way.

            If AlphaFold now, or in the future, requires no evolutionary relationships based on sequence (UniProt) and can work entirely by training on just the proteins in the PDB (many of which are evolutionarily related) and still be able to predict novel folds, it will be very interesting times. The one thing I have learned is that evolutionary knowledge makes many hard problems really easy, because you're taking advantage of billions of years of nature and an easy readout.

        • carlsborg 10 days ago
          Would you trust the CryoEM structures more?
          • dekhn 10 days ago
            yes, albeit with significant filtering.
        • heyoni 10 days ago
          Nice to see you on this thread as well! :)
  • gajnadsgjoas 10 days ago
    Can someone tell me what the direct implications of this are? I often see "helps with drug design" but I'm too far from this industry and have never seen an example of such drugs.
    • Syzygies 10 days ago
      We're this much closer to being hacked?
  • itissid 11 days ago
    Noob here. Can one make the following deduction:

    In transformer-based architectures, where one typically uses a variation of the attention mechanism to model interactions, even if one does not consider the autoregressive assumption about the domain's "nodes" (amino acids, words, image patches), if the number of final states that the nodes eventually take can be permuted only in a finite way (i.e. they have sparse interactions between them), then these architectures are an efficient way of modeling such domains.

    In plain English, the final states of words in a sentence and of amino acids in a protein can be arranged in only so many ways, and transformers do a good job of modeling that.

    Also, can one assume this won't do well for domains where there is, say, sensitivity to initial conditions, like chaotic systems such as weather, where the number of final states just explodes?
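
    For concreteness, here's a minimal numpy sketch of the plain attention mechanism I mean (not AlphaFold's actual module): every position attends to every other position, with no autoregressive ordering assumed.

      import numpy as np

      def self_attention(X, Wq, Wk, Wv):
          # X: (seq_len, d) embeddings of the domain's "nodes" (residues, words, patches).
          Q, K, V = X @ Wq, X @ Wk, X @ Wv
          scores = Q @ K.T / np.sqrt(K.shape[-1])          # all pairwise interactions
          weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
          weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
          return weights @ V                               # each node mixes information from all others

      rng = np.random.default_rng(0)
      d = 16
      X = rng.normal(size=(50, d))                          # e.g. 50 residues
      Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
      updated = self_attention(X, Wq, Wk, Wv)               # (50, d) updated representations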

  • ak_111 11 days ago
    If you work in this space, I would be interested to know: what material impact has AlphaFold had on your workflow since its release 4 years ago?
  • dsign 11 days ago
    For a couple of years I've been expecting that ML models would be able to 'accelerate' bio-molecular simulations, using physics-based simulations as ground truth. But this seems to be a step beyond that.
    • dekhn 11 days ago
      When I competed in CASP 20 years ago (and lost terribly) I predicted that the next step to improve predictions would be to develop empirically fitted force fields to make MD produce accurate structure predictions (MD already uses empirically fitted force fields, but they are not great). This area was explored, there are now better force fields, but that didn't really push protein structure prediction forward.

      Another approach is fully differentiable force fields - the idea that the force field function itself is a trainable structure (rather than just the parameters/weights/constants) that can be optimized directly towards a goal. Also explored, produced some interesting results, but nothing that would be considered transformative.
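
      A toy sketch of that second idea (purely illustrative, not any published force field): make the energy function itself a small neural network over pairwise distances, fit it to reference energies, and get forces from autograd for free. Everything below is stand-in data.

        import torch
        import torch.nn as nn

        class NeuralPairPotential(nn.Module):
            """Energy as a learnable function of all pairwise distances (toy example)."""
            def __init__(self):
                super().__init__()
                self.mlp = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

            def forward(self, coords):                       # coords: (N, 3)
                dists = torch.cdist(coords, coords)          # (N, N) pairwise distances
                i, j = torch.triu_indices(coords.shape[0], coords.shape[0], offset=1)
                return self.mlp(dists[i, j].unsqueeze(-1)).sum()   # sum over unique pairs

        model = NeuralPairPotential()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)

        # Stand-in training set: conformations paired with reference energies
        # (in reality these would come from experiment or quantum calculations).
        confs   = [torch.randn(20, 3) for _ in range(100)]
        targets = [torch.randn(()) for _ in range(100)]

        for epoch in range(5):
            for x, e_ref in zip(confs, targets):
                loss = (model(x) - e_ref) ** 2
                opt.zero_grad(); loss.backward(); opt.step()

        # Differentiability gives forces directly: F = -dE/dx.
        x = confs[0].clone().requires_grad_(True)
        forces = -torch.autograd.grad(model(x), x)[0]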

      The field still generally believes that if you had a perfect force field and infinite computing time, you could directly recapitulate the trajectories of proteins folding (from fully unfolded to final state along with all the intermediates), but that doesn't address any practical problems, and is massively wasteful of resources compared to using ML models that exploit evolutionary information encoded in sequence and structures.

      In retrospect I'm pretty relieved I was wrong, as the new methods are more effective with far fewer resources.

  • s1artibartfast 11 days ago
    The article was heavy on the free research aspect, but light on the commercial application.

    I'm curious about the business strategy. Does Google intend to license out tools, partner, or consult for commercial partners?

    • a_bonobo 10 days ago
      This version of the press release puts Isomorphic Labs far more in focus; Isomorphic now seems to be the commercial arm, more or less licensing access out.

      The new AlphaFold server does not do everything the paper says AlphaFold 3 does. You cannot predict docking with the server! That is the main interest of pharma companies: 'does our medication bind to the target protein?'. From the FAQ: 'AlphaFold Server is a web-service that offers customized biomolecular structure prediction. It makes several newer AlphaFold3 capabilities available, including support for a wider range of molecule type' - that's not ALL AlphaFold3 capabilities. Isomorphic prints the money with those additional capabilities.

      It's hilarious that Google says they don't allow this for safety reasons, pure OpenAI fluff. It's just money.

    • candiodari 11 days ago
      I wonder what the license for RoseTTAFold is. On github you have:

      https://github.com/RosettaCommons/RoseTTAFold/blob/main/LICE...

      But there's also:

      https://files.ipd.uw.edu/pub/RoseTTAFold/Rosetta-DL_LICENSE....

      Which is it?

    • throwtappedmac 11 days ago
      [flagged]
    • ilrwbwrkhv 11 days ago
      as soon as google tries to think commercially this will shut down so the longer it stays pure research the better. google is bad with productization.
      • s1artibartfast 11 days ago
        I don't think it was ever pure research. The article talks about Isomorphic Labs, which is the commercial branch for drug discovery.

        I do agree that Google seems bad at commercialization, which is why I'm curious on what the strategy is.

        It is hard to see them being paid consultants or effective partners for pharma companies, let alone developing drugs themselves.

  • nsoonhui 10 days ago
    Here's something that bugs me about ML: all we have is a prediction and no explanation of how we come to that prediction, i.e. no deeper understanding of the underlying principles.

    So even though we got a good match this time, how can we be sure that the match will be equally good next time? And how can we use ML to predict structures for which we have no baseline to start with or experimental result to benchmark against? In the absence of physics-like principles, how can we ever be sure that the next ML result is correct?

    • coriny 10 days ago
      There is a biennial structure prediction contest called CASP [1], in which a set of newly determined structures is used to benchmark the prediction methods. Some of these structures will be "novel", and so can be used to estimate the performance of current methods on predicting "structure that we have no baseline to start with".

      CASP-style assessments are something that should be done for more research fields, but it's really hard to persuade funders and researchers to put up the money and embargo the data as required.

      [1] https://en.wikipedia.org/wiki/CASP

    • throwaway4aday 10 days ago
      Speaking of physics, we should borrow the quote "Shut up and calculate" to describe the situation: it works so use it now and worry about the explanations later.
      • d0mine 10 days ago
        Except the model is not open-source. You can't calculate anything.
  • ricksunny 11 days ago
    I'm interested in how they measure accuracy of binding site identification and binding pose prediction. This was missing for the hitherto widely-used binding pose prediction tool AutoDock Vina (and in silico binding pose tools in general). Despite the time I invested in learning & exercising that tool, I avoided using it for published research because I could not credibly cite its general-use accuracy. Is / will AlphaFold 3 be citeable in the sense of "I have run AlphaFold on this particular target of interest and this array of ligands, and have found these poses of X kJ/mol binding energy, and this is known to an accuracy of Y% because of AlphaFold 3's training set results cited below"?
    • l33tman 11 days ago
      I've never trusted those predicted binding energies. If you have predicted a ligand/protein complex, have high confidence in it, and want to study the binding energy, I really think you should do a full MD simulation: you can pull the ligand-protein complex apart and measure the change in free energy explicitly.
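
      For reference, one relation commonly used to turn such pulling (steered-MD) runs into a free-energy estimate is Jarzynski's equality, which connects the work W done over many nonequilibrium pulls to the equilibrium free-energy difference:

          \langle e^{-W/(k_B T)} \rangle = e^{-\Delta F/(k_B T)}

      so averaging the exponentiated work over repeated pulls recovers \Delta F. This is just the textbook relation, not a claim about what any particular tool implements.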

      Also, and this is an unfounded guess only, the problem of protein/ligand docking is quite a bit more complex than protein folding - there seems to be a finite set of overall folds used in nature, while docking a small ligand to a big protein with flexible sidechains and even flexible large-scale structures can have induced fits that are really important to know and estimate, and I'm just very sceptical that it will ever be possible, in a general fashion, to predict these accurately with an AI model given the limited training data.

      Though you just need some hints, then you can run MD sims on them to see what happens for real.

  • lumb63 11 days ago
    Would anyone more familiar with the field be able to provide some cursory resources on the protein folding problem? I have a background in computer science and a half a background in biology (took two semesters of OChem, biology, anatomy; didn’t go much further).
  • _xerces_ 11 days ago
    A video summary of why this research is important: https://youtu.be/Mz7Qp73lj9o?si=29vjdQtTtIOk_0CV
    • ProllyInfamous 11 days ago
      Thanks for this informative video summary. As a layperson, with a BS in Chemistry, it was quite helpful in understanding main bulletpoints of this accomplishment.
  • xnx 11 days ago
    Very cool that anyone can login to https://golgi.sandbox.google.com/ and check it out
  • roody15 11 days ago
    I wonder if, in the not too distant future, these AI predictions could be explained back into "humanized" understanding. Much like ChatGPT can simplify complex topics … could the model in the future provide feedback to researchers on why it is making this prediction?
  • reliablereason 11 days ago
    It would be very useful if one day they used it to predict the structure and interactions of the known variants too.

    That would be very helpful when predicting whether a mutation in a protein would lead to loss of function for the protein.

  • DF1PAW 10 days ago
    Does it also simulate prion-based diseases (i.e. misfolded structures)?
  • mfld 11 days ago
    The improvement on predicting protein/RNA/ligand interactions might facilitate many commercially relevant use cases. I assume pharma and biotech will eagerly get in line to use this.
  • thenerdhead 11 days ago
    A lot of accelerated article previews recently. It seems like humanity is making a lot of breakthroughs.

    This is nothing short of amazing for all those suffering from disease.

  • dev1ycan 11 days ago
    Excited, but it's been a fair while now and I have yet to see something truly remarkable come out of this.
  • bbstats 11 days ago
    Zero-shot nearly beating trained catboost is pretty amazing.
  • niwaniwaaa 9 days ago
    How can I get my outputs in PDB format? Or is that not possible?
  • sidcool 10 days ago
    And Google is giving the service away for free. Pretty good.
  • _akhe 11 days ago
    Google's Game of Life 3D: Spiral edition
  • Metacelsus 11 days ago
    From: https://www.nature.com/articles/d41586-024-01383-z

    >Unlike RoseTTAFold and AlphaFold2, scientists will not be able to run their own version of AlphaFold3, nor will the code underlying AlphaFold3 or other information obtained after training the model be made public. Instead, researchers will have access to an ‘AlphaFold3 server’, on which they can input their protein sequence of choice, alongside a selection of accessory molecules. [. . .] Scientists are currently restricted to 10 predictions per day, and it is not possible to obtain structures of proteins bound to possible drugs.

    This is unfortunate. I wonder how long until David Baker's lab upgrades RoseTTAFold to catch up.

    • l33tman 11 days ago
      That sucks a bit. I was just wondering why they are also touting that third-party company, which commercialises research tools, in their own blog post. Maybe there are some corporate agreements with them that prevent them from opening the system...

      Imagine the goodwill for humanity from releasing these pure research systems for free. I just have a hard time understanding how you can justify keeping it closed. Let's hope it will be replicated by someone who doesn't have to hide behind the "responsible AI" curtain, as it seems they are doing now.

      Are they really thinking that someone who needs to predict 11 structures per day are more likely to be a nefarious evil protein guy than someone who predicts 10 structures a day? Was AlphaFold-2 (that was open-sourced) used by evil researchers?

      • perihelions 11 days ago
        - "Imagine the goodwill for humanity for releasing these pure research systems for free."

        The entire point[0] is that they want to sell an API to drug-developer labs, at exclusive-monopoly pricing. Those labs in turn discover life-saving drugs, and recoup their costs from e.g. parents of otherwise-terminally-ill children—again, priced as an exclusive monopoly.

        [0] As signaled by "it is not possible to obtain structures of proteins bound to possible drugs"

        It's a massive windfall for Alphabet, and it'd be a profound breach of their fiduciary duties as a public company to do anything other than lock-down and hoard this API, and squeeze it for every last billion.

        This is a deeply, deeply, deeply broken situation.

        • karencarits 11 days ago
          What is the current status of drugs where the major contribution is from AI? Are they protectable like other drugs? Or are they uncopyrightable, like AI art and so on?
        • goggy_googy 11 days ago
          What makes this such a "deeply broken situation"?

          I agree that late-stage capitalism can create really tough situations for poor families trying to afford drugs. At the same time, I don't know any other incentive structure that would have brought us a breakthrough like AlphaFold this soon. For the first time in history, we have ML models that are beating out the scientific models by huge margins. The very fact that this comes out of the richest, most competitive country in the history of the world is not a coincidence.

          The proximate cause of the suffering for terminally-ill children is really the drug company's pricing. If you want to regulate this, though, you'll almost certainly have fewer breakthroughs like AlphaFold. From a utilitarian perspective, by preserving the existing incentive structure (the "deeply broken situation" as you call it), you will be extending the lifespans of more people in the future (as opposed to extending lifespans of more people now by lowering drug prices).

          • firefoxbrower 11 days ago
            Late-stage capitalism didn't bring us AlphaFold, scientists did, late-stage capitalism just brought us Alphabet swooping in at literally the last minute. Socialize the innovation because that requires potential losses, privatize the profits, basically. It's reminiscent of "Heroes of CRISPR," where Doudna and Charpentier are supposedly just some middle-men, because stepping in at the last minute with more funding is really what fuels innovation.

            AlphaFold wasn't some lone genius breakthrough that came out of nowhere; everything but the final steps was basically created in academia through public funding. The key insights - some combination of realizing that the sequence-to-structure-to-function relationship puts analyzable constraints on sequence conservation, and working out which ML models could be applied to this - were made in academia a long time ago. AlphaFold's training set, the PDB, is also the result of decades of publicly funded work. After that, the problem was just getting enough funding amidst funding cuts and inflation to optimize. David Baker at IPD did so relatively successfully, Jinbo Xu is less of a fundraiser but was able to keep up basically alone with one or two grad students at a time, etc. AlphaFold1 threw way more people and money at basically copying what Jinbo Xu had already done and barely beat him at that year's CASP. Academics were leading the way until very, very recently; it's not like the problem was stalled for decades.

            Thankfully, the funding cuts will continue until research improves, and after decades of inflation cutting into grants, we are being rewarded by funding cuts to almost every major funding body this year. I pledge allegiance to the flag!

            EDIT: Basically, if you know any scientists, you know the vast majority of us work for years with little consideration for profit because we care about the science and its social impact. It's grating for the community, after being treated worse every year, to then see all the final credit go to people or companies like Eric Lander and Google. Then everyone has to start over, pick some new niche that everyone thinks is impossible, only to worry about losing it when someone begins to get it to work.

            • iknowstuff 11 days ago
              Why haven't the academics created a non-profit foundation with open source models like this, then? If Alphabet doesn't provide much, then they will be supplanted by non-profits. I see nothing broken here.
              • j-wags 11 days ago
                I work at Open Force Field [1] which is the kind of nonprofit that I think you're talking about. Our sister project, OpenFold [2], is working on open source versions of AlphaFold.

                We're making good progress but it's difficult to interface with fundamentally different organizational models between academia and industry. I'm hoping that this model will become normalized in the future. But it takes serious leaps of faith from all involved (professors, industry leaders, grant agencies, and - if I can flatter myself - early career scientists) to leave the "safe route" in their organizations and try something like this.

                [1] https://openforcefield.org/ [2] https://openfold.io/

              • firefoxbrower 11 days ago
                Individual labs somehow manage to do that and we're all grateful. Martin Steinegger's lab put out ColabFold, RELION is the gold standard for cryo-EM despite being academic software and the development of more recent industry competitors like cryoSPARC. Everything out of the IPD is free for academic use. Someone has to fight like hell to get all those grants, though, and from a societal perspective, it's basically needlessly redundant work.

                My frustrations aren't with a lack of open source models, some poor souls make them. My disagreement is with the perception that academia has insufficient incentive to work on socially important problems. Most such problems are ONLY worked on in academia until they near the finish line. Look at Omar Yaghi's lab's work on COFs and MOFs for carbon/emission sequestration and atmospheric water harvesting. Look at all the thankless work numerous labs did on CRISPR-Cas9 before the Broad Institute even touched it. Look at Jinbo Xu's work, on David Baker's lab's and the IPD's work, etc. Look at what labs first solved critical amyloid structures, infuriatingly recently, considering the massive negative social impacts of neurodegenerative diseases.

                It's only rational for companies that only care about their own profit maximization to socialize R&D costs and privatize any possible gains. This can work if companies aren't being run by absolute ghouls who delay the release of a new generation of drugs to minimize patent-duration overlap, or who push things that don't work for short-term profit. This can also work if we properly fund and credit publicly funded academic labs. This is not what's happening, however; instead, publicly funded research is increasingly demeaned, defunded, and dismantled due to the false impression that nothing socially valuable gets done without a profit motive. It's okay, though, I guess: under this kind of LSC worldview everything always corrects itself, so preempting problems doesn't matter, and we'll finally learn how much actual innovation is publicly funded when we get the Minions movie, aducanumab, and WeWork over and over again for a few decades while strangling the last bit of nature we have left.

          • YeGoblynQueenne 8 days ago
            It is such a surprise when economics and philosophy of morality end up proving that it was a moral duty of large tech companies and billionaires to become filthy rich. Those people were working for the good of humanity all along, we just didn't look at the data close enough to get it.

            Well, allegedly.

        • lupire 11 days ago
          The parents of those otherwise terminally ill children disagree with you in the strongest possible terms.
        • iknowstuff 11 days ago
          Is it broken if it yields new drugs? Is there a system that yields more? The whole point of capitalism is that it incentivizes this in a way that no other system does.
          • l33tman 11 days ago
            My point one level up in the comments here was not really that the system is broken, but more like asking how you can run these companies (Google and that other part run by the DeepMind founder, who I bet already has more money than he can ever spend) and still sleep well knowing you're the rich capitalist a-hole commercializing life-science work that your parent company has allocated maybe one part in a million of their R&D budget into creating.

            It's not like Google is ever going to make billions on this anyway; the AlphaFold algorithms are not super advanced and you don't require the datasets of GPT-4 to train them, so others will hopefully catch up... though I'm also pretty sure it requires GPU-hours beyond what a typical non-profit academic outfit has available, unfortunately. :/

      • staminade 11 days ago
        Isomorphic Labs? That's an Alphabet-owned startup run by Demis Hassabis that they created to commercialise the AlphaFold work, so it's not really a 3rd party at all.
      • SubiculumCode 11 days ago
        There is at least some difference between a monitored server and a privately run one, if negative consequences are possible.
    • mhrmsn 11 days ago
      Also no commercial use, from the paper:

      > AlphaFold 3 will be available as a non-commercial usage only server at https://www.alphafoldserver.com, with restrictions on allowed ligands and covalent modifications. Pseudocode describing the algorithms is available in the Supplementary Information. Code is not provided.

      • moralestapia 11 days ago
        How easy/hard would it be for the scientific community to come up with an "OpenFold" model which is pretty much AF3 but fully open source and without restrictions in it?

        I can imagine training will be expensive, but I don't think it will be at a GPT-4 level of expensive.

      • obmelvin 11 days ago
        If you need to submit to their server, I don't know who would use it for commercial reasons anyway. Most biotech startups and pharma companies are very careful about entering sequences into online tools like this.
      • pantalaimon 11 days ago
        What's the point in that - I mean, who does non-commercial drug research?
      • p3opl3 11 days ago
        Yes, because that's going to stop competitors... I guess that's why they didn't release the code.

        This is yet another large part of a biotech related Gutenberg moment.

        • natechols 11 days ago
          The DeepMind team was essentially forced to publish and release an earlier iteration of AlphaFold after the Rosetta team effectively duplicated their work and published a paper about it in Science. Meanwhile, the Rosetta team just published a similar work about co-folding ligands and proteins in Science a few weeks ago. These are hardly the only teams working in this space - I would expect progress to be very fast in the next few years.
          • dekhn 11 days ago
            How much has changed- I talked with David Baker at CASP around 2003 and he said at the time, while Rosetta was the best modeller, every time they updated its models with newly determined structures, its predictions got worse :)
            • natechols 11 days ago
              It's kind of amazing in retrospect that it was possible to (occasionally) produce very good predictions 20 years ago with at least an order of magnitude smaller training set. I'm very curious whether DeepMind has tried trimming the inputs back to an earlier cutoff point and re-training their models - assuming the same computing technologies were available, how well would their methods have worked a decade or two ago? Was there an inflection point somewhere?
    • tepal 11 days ago
      Or OpenFold, which is the more literal reproduction of AlphaFold 2: https://github.com/aqlaboratory/openfold
      • LarsDu88 11 days ago
        Time for an OpenFold3? Or would it be an OpenFold2?
    • wslh 11 days ago
      The AI ball is rolling fast; I see similarities with cryptography in the 90s.

      I have a story to tell for the record: back in the 90s we developed a home banking app for Palm (with a modem). It was impossible to perform RSA because of the speed, so I contacted the CEO of Certicom, which had the only elliptic curve cryptography implementation at that time. Fast forward and ECC is everywhere.

    • ranger_danger 11 days ago
      Not just unfortunate, but doesn't this make it completely untrustable? How can you be sure the data was not modified in any way? How can you verify any results?
      • dekhn 11 days ago
        You determine a crystal structure of a protein that did not previously have a known structure, and compare the prediction to the experimentally determined structure.
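
        A minimal sketch of how such a comparison is often quantified - RMSD over matched atoms after optimal superposition via the Kabsch algorithm (generic textbook code, not CASP's actual scoring, which uses measures like GDT_TS; it assumes the two coordinate arrays are already paired atom-by-atom):

            import numpy as np

            def kabsch_rmsd(pred, expt):
                """pred, expt: (N, 3) arrays of matched atom coordinates (e.g. C-alphas)."""
                p = pred - pred.mean(axis=0)                 # center both structures
                q = expt - expt.mean(axis=0)
                h = p.T @ q                                  # 3x3 covariance matrix
                u, _, vt = np.linalg.svd(h)
                d = np.sign(np.linalg.det(vt.T @ u.T))       # guard against reflections
                r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T      # optimal rotation
                return np.sqrt((((p @ r.T) - q) ** 2).sum(axis=1).mean())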

        There is a biennial competition known as CASP where some new structures, not yet published, are used for testing predictions from a wide range of protein structure prediction groups (so, basically blind predictions which are then compared when the competition wraps up). AlphaFold beat all the competitors by a very wide margin (much larger than the regular rate of improvement in the competition), and within a couple of years the leading academic groups adopted the same techniques and caught up.

        It was one of the most important and satisfying moments in structure prediction in the past two-plus decades. The community was a bit skeptical, but as it's been repeatedly tested, validated, and reproduced, people are generally of the opinion that DeepMind "solved" protein structure prediction (with some notable exceptions), and did so without having to solve the full "protein folding problem" (which is actually great news while also being somewhat depressing).

        • ranger_danger 10 days ago
          By data I meant between the client and server, nothing actually related to how the program itself works, but just the fact that it's controlled by a proprietary third party.
    • rolph 11 days ago
      in other words, this has been converted to a novelty, and has no use for scientific purposes.
      • ebiester 11 days ago
        No. It just means that scientific purposes will have an additional tax paid to google. This will likely reduce use in academia but won't deter pharmaceutical companies.
    • Jerrrry 11 days ago
      The second amendment prevents the government's overreaching perversion to restrict me from having the ability to print biological weapons from the comfort of my couch.

      Google has no such restriction.

      • gameman144 11 days ago
        I know this is tongue in cheek, but you absolutely can be restricted from having a biological weapons factory in your basement (similar to not being able to pick "nuclear bombs" as your arms to bear).
        • timschmidt 11 days ago
          Seems like the recipe for independence, and agreed-upon borders, and thus whatever interpretation of the second amendment one wants, involves exactly choosing nuclear bombs, and managing to stockpile enough of them before being bombed oneself. At least at the nation-state scale. Sealand certainly resorted to arms at several points in its history.
          • gameman144 11 days ago
            The second amendment only applies to the United States -- it's totally normal to have one set of rights for citizens and another set for the government itself.
      • dekhn 11 days ago
        Sergey once said "We don't have an army per se" (he was referring to the size of Google's physical security group) at TGIF.

        There was a nervous chuckle from the audience.

      • SubiculumCode 11 days ago
        /s is strong with this one
    • niemandhier 11 days ago
      The logical consequence is to put all scientific publications under a license that restricts the right to train commercial ai models on them.

      Science advances because of an open exchange of ideas, the original idea of patents was to grant the inventor exclusive use in exchange for disclosure of knowledge.

      Those who did not patent, had to accept that their inventions would be studied and reverse engineered.

      The „as a service“ model, breaks that approach.

    • dwroberts 11 days ago
      This turns it into a tool that deserves to be dethroned by another group, frankly. What a strange choice.
    • photochemsyn 11 days ago
      Well, it's because you can design deadly viruses using this technology. Viruses gain entry to living cells via cell-surface receptor proteins whose normal job is to bind signalling molecules, alter their conformation and translate that external signal into the cellular interior where it triggers various responses from genomic transcription to release of other signal molecules. Viruses hijack such mechanisms to gain entry to cells.

      Thus if you can design a viral coat protein to bind to a human cell-surface receptor, such that it gets translocated into the cell, then it doesn't matter so much where that virus came from. The cell's firewall against viruses is the cell membrane, and once inside, the biomolecular replication machinery is very similar from species to species, particularly within restricted domains, such as all mammals.

      Thus viruses from rats, mice, bats... aren't going to have major problems replicating in their new host - a host they only gained access to because of nation-state actors working in collaboration on such gain-of-function research in at least two labs on opposite sides of the world, with funds and material provided by the two largest economic powers, for reasons that are still rather opaque, though suspiciously banal...

      Now while you don't need something like AlphaFold3 to do recklessly stupid things (you could use directed evolution, making millions of mutated proteins, throwing them at a wall of human cell receptors and collecting what stuck), it makes it far easier. Thus Google doesn't want to be seen as enabling, though given their predilection for classified military-industrial contracting to a variety of nation-states, particularly with AI, with revenue now far more important than silly "don't be evil" statements, they might bear watching.

      On the positive side, AlphaFold3 will be great for fields like small molecular biocatalysis, i.e. industrial applications in which protein enzymes (or more robust heterogenous catalysts designed based on protein structures) convert N2 to ammonia, methane to methanol, or selectively bind CO2 for carbon capture, modification of simple sugars and amino acids, etc.

  • moconnor 11 days ago
    Stepping back, the high-order bit here is that an ML method is beating physically-based methods at accurately predicting the world.

    What happens when the best methods for computational fluid dynamics, molecular dynamics, nuclear physics are all uninterpretable ML models? Does this decouple progress from our current understanding of the scientific process - moving to better and better models of the world without human-interpretable theories and mathematical models / explanations? Is that even iteratively sustainable in the way that scientific progress has proven to be?

    Interesting times ahead.

    • dekhn 11 days ago
      If you're a scientist who works in protein folding (or one of those other areas) and strongly believe that science's goal is to produce falsifiable hypotheses, these new approaches will be extremely depressing, especially if you aren't proficient enough with ML to reproduce this work in your own hands.

      If you're a scientist who accepts that probabilist models beat interpretable ones (articulated well here: https://norvig.com/chomsky.html), then you'll be quite happy because this is yet another validation of the value of statistical approaches in moving our ability to predict the universe forward.

      If you're the sort of person who believes that human brains are capable of understanding the "why" of how things work in all its true detail, you'll find this an interesting challenge- can we actually interpret these models, or are human brains too feeble to understand complex systems without sophisticated models?

      If you're the sort of person who likes simple models with as few parameters as possible, you're probably excited because developing more comprehensible or interpretable models that have equivalent predictive ability is a very attractive research subject.

      (FWIW, I'm in the camp of "we should simultaneously seek simpler, more interpretable models, while also seeking to improve native human intelligence using computational augmentation")

      • jprete 11 days ago
        The goal of science has always been to discover underlying principles and not merely to predict the outcome of experiments. I don't see any way to classify an opaque ML model as a scientific artifact since by definition it can't reveal the underlying principles. Maybe one could claim the ML model itself is the scientist and everyone else is just feeding it data. I doubt human scientists would be comfortable with that, but if they aren't trying to explain anything, what are they even doing?
        • dekhn 11 days ago
          That's the aspirational goal. And I would say that it's a bit of an inflexible one: for example, if we had an ML model that could generate molecules that cure diseases and pass FDA approval, I wouldn't really care if scientists couldn't explain the underlying principles. But I'm an ex-scientist who is now an engineer, because I care more about tools that produce useful predictions than about understanding underlying principles. I used to think that in principle we could identify all the laws of the universe, simulate them with enough accuracy, inspect the results, and gain enlightenment, but over time I've concluded that's a really bad way to waste lots of time, money, and resources.
          • panarky 11 days ago
            It's not either-or, it's yes-and. We don't have to abandon one for the other.

            AlphaFold 3 can rapidly reduce a vast search space in a way physically-based methods alone cannot. This narrowly focused search space allows scientists to apply their rigorous, explainable, physical methods, which are slow and expensive, to a small set of promising alternatives. This accelerates drug discovery and uncovers insights that would otherwise be too costly or time-consuming.

            The future of science isn't about AI versus traditional methods, but about their intelligent integration.
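
            A toy sketch of that funnel shape (everything here is a stand-in: the scoring functions are random placeholders, not real APIs):

                import random

                def learned_score(ligand):          # stand-in for a fast ML predictor
                    return random.random()

                def physics_free_energy(ligand):    # stand-in for a slow MD/FEP calculation
                    return random.gauss(-5.0, 2.0)

                library = [f"ligand_{i}" for i in range(10_000)]
                shortlist = sorted(library, key=learned_score, reverse=True)[:100]   # cheap pass over everything
                hits = [l for l in shortlist if physics_free_energy(l) < -8.0]       # expensive pass on the survivors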

            • nextos 11 days ago
              Or you can treat AlphaFold as a black box / oracle and work at systems biology level, i.e. at pathway and cellular level. Protein structures and interactions are always going to be hard to predict with interpretable models, which I also prefer.

              My only worry is that AlphaFold and others, e.g. ESM, seem to be a bit fragile for out-of-distribution sequences. They are not doing a great job with unusual sequences, at least in my experience. But hopefully they will improve and provide better uncertainty measures.

          • hammock 10 days ago
            > if we had an ML that could generate molecules that cure diseases that would pass FDA approval, I wouldn't really care if scientists couldn't explain the underlying principles

            It’s actually required as part of the submission for FDA approval that you posit a specific Mechanism of Action for why your drug works the way it does. You can’t get approval without it

            • jdietrich 10 days ago
              A substantial proportion of FDA-approved drugs have an unknown mechanism of action - we can handwave about protein interactions, but we have no useful insight into how they actually work. Drug discovery is bureaucratically rigid, but scientifically haphazard.
            • dekhn 10 days ago
              How much do you believe that the MoA actually matches what is happening in the underlying reality of a disease and its treatment?

              Vioxx is a nice example of a molecule that got all the way to large-scale deployment before being taken off the market for side effects that were known. Only a decade before that, I saw a very proud pharma scientist explaining their "mechanism of action" for vioxx, which was completely wrong.

          • adrianN 10 days ago
            Underlying principles are nice for science, whatever works is nice for engineering. There is plenty of historical precedent where we build stuff that works without knowing exactly why it works.
          • gandalfthepink 10 days ago
            Me like thee career path. Interesting.
        • ak_111 11 days ago
          Discovering underlying principles and predicting outcomes are two sides of the same coin, in that there is no way to confirm you have discovered underlying principles unless they have some predictive power.

          Some have tried to come up with other criteria to confirm you have discovered an underlying principle without predictive power, such as aesthetics - but this is seen by the majority of scientists as basically a cop-out. See the debate around string theory.

          Note that this comment is summarizing a massive debate in the philosophy of science.

          • chasd00 11 days ago
            If all you can do is predict an outcome without being able to explain how then what have you really discovered? Asking someone to just believe you can predict outcomes without any reasoning as to how, even if you're always right, sounds like the concept of faith in religion.
            • lordnacho 10 days ago
              The how is actually just further hypotheses. It's turtles all the way down:

              There is a car. We think it drives by burning petrol somehow.

              How do we test this? We take petrol away and it stops driving.

              Ok, so we know it has something to do with petrol. How does it burning the petrol make it drive?

              We think it is caused by the burned petrol pushing the cylinders, which are attached to the wheels through some gearing. How do we test it? Take away the gearing and see if it drives.

              Anyway, this never ends. You can keep asking questions, and as long as the hypothesis is something you can test, you are doing science.

              • hammock 10 days ago
                >There is a car. We think it drives by burning petrol somehow. How do we test this? We take petrol away and it stops driving.

                You discovered a principle.

                Better example:

                There is a car. We don’t know how it drives. We turn the blinkers on and off. It still drives. Driving is useful. I drive it to the store

              • dekhn 10 days ago
                In the vein of "can a biologist fix a radio" and "can a neuroscientist understand a microprocessor", see https://review.ucsc.edu/spring04/bio-debate.html which is an absolutely wonderful explanation of how geneticists and biochemists would go about reverse-engineering cars.

                The best part is where the geneticist ties the arms of all the suit-wearing employees and it has no functional effect on the car.

            • dumpsterdiver 11 days ago
              > what have you really discovered?

              You’ve discovered magic.

              When you read about a wizard using magic to lay waste to invading armies, how much value would you guess the armies place in whether or not the wizard truly understands the magic being used against them?

              Probably none. Because the fact that the wizard doesn’t fully understand why magic works does not prevent the wizard from using it to hand invaders their asses. Science is very much the same - our own wizards used medicine that they did not understand to destroy invading hordes of bacteria.

              • pineaux 10 days ago
                Exactly! The magic to lay waste to invading armies is packaged into a large flask and magical metal birds are flown to above the army. There the flask is released from the birds bellies and gently glides down. When the flask is at optimum height it releases the power of the sun and all that are beneath it get vaporized. A newer version of this magic is attached to a gigantic fireworks rocket that can fly over whole mountain ranges and seas.
              • YeGoblynQueenne 8 days ago
                Do you know what the stories say happens to wizards who don't understand magic?

                https://youtu.be/B4M-54cEduo?si=RoRZIyWRULUnNKLM

            • pas 11 days ago
              it's still an extremely valuable tool. just as we see in mathematics, closed forms (and short and elegant proofs) are much coveted luxury items.

              for many basic/fundamental mathematical objects we don't (yet) have simple mechanistic ways to compute them.

              so if a probabilistic model spits out something very useful, we can slap a nice label on it and call it a day. that's how engineering works anyway. and then hopefully someday someone will be able to derive that result from "first principles" .. maybe it'll be even more funky/crazy/interesting ... just like mathematics arguably became more exciting by the fact that someone noticed that many things are not provable/constructable without an explicit Axiom of Choice.

              https://en.wikipedia.org/wiki/Nonelementary_integral#Example...

              • thfuran 11 days ago
                >closed forms (and short and elegant proofs) are much coveted luxury items.

                Yes, but we're taking about roughly the opposite of a proof

                • pas 10 days ago
                  but in usual natural sciences we don't have proofs, only data and models, and then we do model selection (and through careful experiments we end up with confidence intervals)

                  and it seems with these molecular biology problems we constantly have the problem of specificity (model prediction quality) vs sensitivity (model applicability), right? but due to information theory constraints there's also a dimension along model size/complexity.

                  so if a ML model can push the ROC curve toward the magic left-up corner then likely it's getting more and more complex.

                  and at one point we simply are left with models that are completely parametrized by data and there's virtually zero (direct) influence of the first principles. (I mean that at one point as we get more data even to do model selection we can't use "first principles" because what we know through that is already incorporated into previous versions of the models. Ie. the information we gained from those principles we already used to make decisions in earlier iterations.)

                  Of course then in theory we can do model distillation, and if there's some hidden small/elegant theory we can probably find it. (Which would be like a proof through contradiction, because it would mean that we found model with the same predictive power but with smaller complexity than expected.)

                  // NB: it's 01:30 here, but independent of ignorance-o-clock ... it's quite possible I'm totally wrong about this, happy to read any criticism/replies

            • jcims 11 days ago
              Isn’t that basically true of most of the fundamental laws of physics? There’s a lot we don’t understand about gravity, space, time, energy, etc., and yet we compose our observations of how they behave into very useful tools.
          • thfuran 11 days ago
            >there is no way to confirm you have discovered underlying principles unless they have some predictive power.

            Yes, but a perfect oracle has no explanatory power, only predictive.

            • nkingsy 11 days ago
              increasing the volume of predictions produces patterns that often lead to underlying principles.
              • mikeyouse 11 days ago
                And much of the 20th century was characterized by a very similar progression - we had no clue what the actual mechanism of action was for hundreds of life saving drugs until relatively recently, and we still only have best guesses for many.

                That doesn’t diminish the value that patients received in any way even though it would be more satisfying to make predictions and design something to interact in a way that exactly matches your theory.

                • qp11 10 days ago
                  We were using the compass for navigation for thousands of years without any clue about what it was doing or why. Of course a lot of people got lost because compasses are not perfect. And the same will happen here. The Theory of Bounded Rationality applies.
        • gradus_ad 11 days ago
          That ship sailed with Quantum physics. Nearly perfect at prediction, very poor at giving us a concrete understanding of what it all means.

          This has happened before. Newtonian mechanics was incomprehensible spooky action at a distance, but Einstein clarified gravity as the bending of spacetime.

          • drdeca 10 days ago
            I think this relies on either the word “concrete” or a particular choice of sense for “concrete understanding”.

            Like, quantum mechanics doesn’t seem, to me, to just be a way of describing how to predict things. I view it as saying substantial things about how things are.

            Sure, there are different interpretations of it, which make the same predictions, but, these different interpretations have a lot in common in terms of what they say about “how the world really is” - specifically, they have in common the parts that are just part of quantum mechanics.

            The dao that can be spoken in plain language without getting into the mathematics is not the eternal dao, or whatever.

        • jdietrich 10 days ago
          The goal of science has always been to predict the outcome of experiments, because that's what distinguishes science from philosophy or alchemy or faith. Anyone who believes that they've discovered an underlying principle is almost certainly mistaken; with time, "underlying principles" usually become discredited theories or, sometimes, useful but crude approximations that we teach to high schoolers and undergrads.

          Prediction is understanding. What we call "understanding" is a cognitive illusion, generated by plausible but brittle abstractions. A statistically robust prediction is an explanation in itself; an explanation without predictive power explains nothing at all. Feeling like something makes sense is immeasurably inferior to being able to make accurate predictions.

          Scientists are at the dawn of what chess players experienced in the 90s. Humans are just too stupid to say anything meaningful about chess. All of the grand theories we developed over centuries are just dumb heuristics that are grossly outmatched by an old smartphone running Stockfish. Maybe the computer understands chess, maybe it doesn't, but we humans certainly don't and we've made our peace with the fact that we never will. Moore's law does not apply to thinking meat.

        • toxik 11 days ago
          Kepler famously compiled troves of data on the night sky, and just fitted some functions to them. He could not explain why but he could say what. Was he not a scientist?
          • jprete 10 days ago
            He did attempt to explain why. Wikipedia: "On 4 February 1600, Kepler met Tycho Brahe....Tycho guarded his data closely, but was impressed by Kepler's theoretical ideas and soon allowed him more access. Kepler planned to test his theory from Mysterium Cosmographicum based on the Mars data, but he estimated that the work would take up to two years (since he was not allowed to simply copy the data for his own use)."
            • toxik 8 days ago
              Mixed it up! I meant Tycho Brahe actually.
          • YeGoblynQueenne 8 days ago
            Sure he was. And then Newton came along and said it's all because of gravity and Kepler's laws were nothing but his laws of motion applied to planets.

            Newton was a bit of a brat but everybody accepted his explanation. Then the problem turned to trying to explain gravity.

            Thus science advances, one explanation at a time.

          • chemicalnovae 10 days ago
            He might not have been able to explain why _but_ I'd bet anything he would have wanted to if he could.
        • strogonoff 11 days ago
          Can underlying principles be discovered using the framework of scientific method? The primary goal of models and theories it develops is to support more experiments and eventually be disproven. If no model can be correct, complete and provable in finite time, then a theory about underlying principles that claims completeness would have to be unfalsifiable. This is reasonable in context of philosophy, but not in natural sciences.

          Scientific method can help us rule out what underlying principles are definitely not. Any such principles are not actually up to be “discovered”.

          If probabilistic ML comes along and does a decent job at predicting things, we should keep in mind that those predictions are made not in context of absolute truth, but in context of theories and models we have previously developed. I.e., it’s not just that it can predict how molecules interact, but that the entire concept of molecules is an artifact of just some model we (humans) came up with previously—a model which, per above, is probably incomplete/incorrect. (We could or should use this prediction to improve our model or come up with a better one, though.)

          Even if a future ML product could be creative enough to actually come up with and iterate on models all on its own from first principles, it would not be able to give us the answer to the question of underlying principles for the above-mentioned reasons. It could merely suggest us another incomplete/incorrect model; to believe otherwise would be to ascribe it qualities more fit for religion than science.

          • jltsiren 11 days ago
            I don't find that argument convincing.

            People clearly have been able to discover many underlying principles using the scientific method. Then they have been able to explain and predict many complex phenomena using the discovered principles, and create even more complex phenomena based on that. Complex phenomena such as the technology we are using for this discussion.

            Words don't have any inherent meaning, just the meaning they gain from usage. The entire concept of truth is an artifact of just some model (language) we came up with previously - a model which, per above, is probably incomplete/incorrect. The kind of absolute truth you are talking about may make sense when discussing philosophy or religion. Then there is another idea of truth more appropriate for talking about the empirical world: less absolute, less immutable, less certain, but more practical.

            • strogonoff 10 days ago
              > The kind of absolute truth you are talking about may make sense when discussing philosophy or religion.

              Exactly—except you are talking about it, too. When you say “discovering underlying principles”, you are implying the idea of absolute truth where there is none—the principles are not discovered, they are modeled, and that model is our fallible human construct. It’s a similar mistake as where you wrote “explain”: every model (there should always be more than one) provides a metaphor that 1) first and foremost, jives with our preexisting understanding of the world, and 2) offers a lossy map of some part of [directly inaccessible] reality from a particular angle—but not any sort of explanation with absolute truth in mind. Unless you treat scientific method as something akin to religion, which is a common fallacy and philosophical laziness, it does not possess any explanatory powers—and that is very much by design.

              • jltsiren 10 days ago
                Now we come back to words gaining their meaning from usage.

                You are assigning meanings to words like "discovering", "principles", and "explain" that other people don't share. Particularly people doing science. Because these absolute philosophical meanings are impossible in the real world, they are also useless when discussing the reality. Reserving common words for impossible concepts would not make sense. It would only hinder communication.

        • fire_lake 11 days ago
          What if the underlying principles of the universe are too complex for human understanding but we can train a model that very closely follows them?
          • dekhn 11 days ago
            Then we should dedicate large fractions of human engineering towards finding ethical ways to improve human intelligence so that we can appreciate the underlying principles better.
            • refulgentis 11 days ago
              I spent about 30 minutes reading this thread and links from it: I don't really follow your line of argument. I find it fascinating and well-communicated; the lack of understanding is on me: my attention flits around like a butterfly, in a way that makes it hard for me to follow people writing original content.

              High level, I see a distinction between theory and practice, between an oracle predicting without explanation, and a well-thought out theory built on a partnership between theory and experiment over centuries, ex. gravity.

              I have this feeling I can't shake that the knife you're using is too sharp, both in the specific example we're discussing, and in general.

              In the specific example, folding, my understanding is we know how proteins fold & the mechanisms at work. It just takes an ungodly amount of time to compute and you'd still confirm with reality anyway. I might be completely wrong on that.

              Given that, the proposal to "dedicate...engineer[s] towards finding ethical ways to improve...intelligence so that we can appreciate the underlying principles better" begs the question of if we're not appreciating the underlying principles.

              It feels like a close cousin of the physics theorist/experimentalist debate pre-LHC, circa 2006: the experimentalists wanted more focus on building colliders or new experimental methods, and at the extremes thought string theory was a complete waste of time.

              Which was working towards appreciating the underlying principles?

              I don't really know. I'm not sure there's a strong divide between the work of recording reality and explaining it. I'll peer into a microscope in the afternoon, and take a shower in the evening, and all of a sudden, free associating gives me a more high-minded explanation for what I saw.

              I'm not sure a distinction exists for protein folding; in fact, I'm virtually certain this distinction does not exist in reality, only in extremely stilted examples (e.g. a very successful oracle at Delphi).

              • mistermann 11 days ago
                There's a much easier route: consciousness is not included in the discussion...what a coincidence.
          • Wilduck 11 days ago
            That sounds like useful engineering, but not useful science.
            • mrbungie 11 days ago
              I think that a lot of scientific discoveries originate from initial observations made during engineering work or just out of curiosity without rigour.

              Not saying ML methods haven't shown important reproducibility challenges, but to just shut them down due to not being "useful science" is inflexible.

        • SJC_Hacker 11 days ago
          What if it turns out that nature simply doesn't have nice, neat models that humans can comprehend for many observable phenomena?
          • empath-nirvana 10 days ago
            I read an article arguing that the "unreasonable effectiveness of mathematics" is basically the result of a drunk looking for his keys under a lamp post because that's where the light is. We know how to use math to model parts of the world, and everywhere we look there's _something_ we can model with math, but that doesn't mean that's all there is to the universe. We could be understanding .0000001% of what's out there to understand, and it's the stuff that's amenable to mathematical analysis.
        • exe34 11 days ago
          The ML model can also be an emulator of parts of the system that you don't want to personally understand, to help you get on with focusing on what you do want to figure out. Alternatively, the ML model can pretend to be the real world while you do experiments with it to figure out aspects of nature in minutes rather than hours-days of biological turnaround.
        • flawsofar 10 days ago
          The machine understands, we do not, and so it is not science?

          Can we differentiate?

        • Invictus0 11 days ago
          Maybe the science of the past was studying things of lesser complexity than the things we are studying now.
        • empath-nirvana 10 days ago
          If have an oracle that can predict the outcome of experiments does it _matter_ if you understand why?
        • fsloth 10 days ago
          AFAIK in wet science you need (or needed) to do tons of experiments with liquids of specific molar compositions and temperatures splurging in and out of test tubes - basically just physically navigating a search space. I would view an AI model with super powerful guesstimation capability as a much faster way of A) cutting through the search space B) providing accidental discoveries while at it

          Now, if we look at history of science and technology, there is a shit ton of practical stuff that was found only by pure accident - discoveries of which could not be predicted from any previous theory.

          I would view both A) and B) as net positives. But our teaching of the next generation of scientists needs to adapt.

          The worst case scenario is of course that the middle-management-driven enshittification of science will proceed to a point where there are only a few people who actually are scientists and not glorified accountants. But I'm optimistic this will actually supercharge science.

          With good luck we will get rid of both of the biggest pathologies in modern science - 1. number of papers published and referred to as a KPI 2. Hype-driven, super-politicized funding where you can focus on only one topic "because that's what's hot" (i.e. string theory).

          The best possible outcome is we get excitement and creativity back into science. Plus level up our tech level in this century to something totally unforeseen (singularity? That’s just a word for “we don’t know what’s gonna happen” - not a specific concrete forecasted scenario).

          • tsimionescu 10 days ago
            > singularity? That’s just a word for “we don’t know what’s gonna happen” - not a specific concrete forecasted scenario

            It's more specific than you make it out. The singularity idea is that smart AIs working on improving AI will produce smarter AIs, leading to an ever increasing curve that at some point hits a mathematical singularity.
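
            As a toy illustration of "hits a mathematical singularity" (just the textbook finite-time blow-up, not a claim about real AI progress): if capability I grows as

                \frac{dI}{dt} = k I^{p}, \quad p > 1 \;\Rightarrow\; I(t) = \left(I_0^{1-p} - (p-1) k t\right)^{-1/(p-1)}

            which diverges at the finite time t* = 1 / ((p-1) k I_0^{p-1}). Plain exponential growth (p = 1) never does this; the singularity story needs the "improvement gets easier as you get smarter" exponent p > 1.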

            • fsloth 10 days ago
              No it's not specific at all in predicting technological progress, which was the point of my comment.

              Nobody knows what singularity would actually mean from the point of view of specific technological development.

        • melagonster 10 days ago
          they offered a good tool for science... so this is a part of science.
      • stouset 10 days ago
        > If you're the sort of person who believes that human brains are capable of understanding the "why" of how things work in all its true detail, you'll find this an interesting challenge- can we actually interpret these models, or are human brains too feeble to understand complex systems without sophisticated models?

        I think chess engines, weirdly enough, have disabused me of this notion.

        There are lots of factors a human considers when looking at a board. Piece activity. Bishop and knight imbalances. King safety. Open and semi-open file control. Tempo. And on and on.

        But all of them are just convenient shortcuts that allow us to substitute reasonable guesses for what really matters: exhaustively calculating a winning line through to the end. "Positional play" is a model that only matters when you can't calculate trillions of lines thirty moves deep, and it's infinitely more important that a move survives your opponent's best possible responses than that it satisfies some cohesive higher-level principle.
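
        To make "exhaustively calculating" concrete, here's a toy sketch (not any real engine's code; the Game interface is a hypothetical placeholder):

            def negamax(game, depth):
                """Best achievable score for the side to move, searching `depth` plies."""
                if depth == 0 or game.is_over():
                    return game.score_for_side_to_move()         # horizon / terminal evaluation
                best = float("-inf")
                for move in game.legal_moves():
                    game.push(move)
                    best = max(best, -negamax(game, depth - 1))  # opponent's best reply is our worst case
                    game.pop()
                return best

        Everything a human would call "positional understanding" lives in the evaluation at the horizon; with enough depth, the heuristics matter less and less.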

        • kobenni 10 days ago
          I don't understand why you would draw this conclusion. The deep search you describe is an algorithm that humans can understand perfectly fine. Humans just can't solve it in their heads and need to let a computer handle the number crunching. Just like a scientist may understand the differential equations to describe a system perfectly fine, but require a computer to approximate the solution for an initial value problem.
          • stouset 10 days ago
            “Knowing” that some line works to some ridiculous depth is different than understanding how and why.

            And at some level the answer is simply “because every possible refutation fails” and there is no simpler pattern to match against nor intuition to be had. That is the how and why of it.

          • xixixao 10 days ago
            The scientist can understand “how” the model works, how many layers there are, that each neuron has a weight, that some are connected… Parent comment and yours show that “understanding” is a fuzzy concept.
        • searealist 10 days ago
          Chess engines actually do both now. They have ML models to evaluate positions, essentially a much more advanced version of your positional description, and deep calculations.
          • stouset 10 days ago
            That might be the best we can practically achieve with technology, but the point stands. If positional evaluation says one thing but an exhaustive analysis of lines finds a solution 60 moves deep, that one is going to win.
            • searealist 10 days ago
              Humans also do search. Also, engines aren't doing an exhaustive search when they are 20 moves deep. They heavily prune.
              • stouset 10 days ago
                Yes, I understand how chess engines work.

                Ignore the existence of engines for a moment. The reason a particular line works, at the end of the day, is simply because it does. Just because we have heuristics that help us skip a lot of evaluation doesn’t mean the heuristics have intrinsic meaning within the game. They don’t.

                They’re shortcuts that let us skip having to do the impossible. The heuristics will always lose to concrete analysis to a deep enough depth.

                And that’s my point. We come up with models that give us an intuition for “why” things are a certain way. Those models are inarguably helpful toward having a gut feeling. But models aren’t the thing itself, and every model we’ve found breaks down at some deeper point. And maybe at some level things simply “are” some way with no convenient shorthand explanation.

                • searealist 10 days ago
                  So your point is that we are not omnipotent? Ok.
                  • TylerLives 10 days ago
                    Because of our limitations, we have to compress reality more in order to reason about it. This means we're blind to some ideas that computers are not. Just like a depth 1 chess engine can't see what's happening at depth 3 but has to make an imperfect guess.
        • anonylizard 10 days ago
          Fine-tuned LLMs can play chess at grandmaster level.

          So it's clear that there are in fact 'deeper patterns to chess' that allow one to play very well without any search (since LLMs cannot search). It's just that those patterns are probably rather different from the ones humans understand.

      • croniev 11 days ago
        I'm in the following camp: it is wrong to think about the world or the models as "complex systems" that may or may not be understood by human intelligence. There is no meaning beyond that which is created by humans. There is no 'truth' that we can grasp in parts but not entirely. Being unable to understand these complex systems means that we have framed them in a way (e.g. millions of matrix operations) that does not allow for our symbol-based, causal mode of reasoning. That is on us, not on our capabilities or the universe.

        All our theories are built on observation, so these empirical models yielding such useful results is a great thing - it satisfies the need for observing and acting. Missing explainability of the models merely means we have less ability to act more precisely - but it does not devalue our ability to act coarsely.

        • visarga 11 days ago
          But the human brain has limited working memory and experience. Even in software development we are often teetering at the edge of the mental power to grasp and relate ideas. We have tried so much to manage complexity, but real world complexity doesn't care about human capabilities. So there might be high dimensional problems where we simply can't use our brains directly.
          • jvanderbot 11 days ago
            A human mind is perfectly capable of following the same instructions as the computer did. Computers are stupidly simple and completely deterministic.

            The concern is about "holding it all in your head", and depending on your preferred level of abstraction, "all" can perfectly reasonably be held in your head. For example: "This program generates the most likely outputs" makes perfect sense to me, even if I don't understand some of the code. I understand the system. Programmers went through this decades ago. Physicists had to do it too. Now, chemists I suppose.

            • ajuc 11 days ago
              Abstraction isn't the silver bullet. Not everything is abstractable.

              "This program generates the most likely outputs" isn't a scientific explanation, it's teleology.

              • jvanderbot 11 days ago
                "this tool works better than my intuition" absolutely is science. "be quiet and calculate" is a well worn mantra in physics is it not?
                • drdeca 10 days ago
                  “calculate” in that phrase, refers to doing the math, and the understanding that that entails, not pressing the “=“ button on a calculator.
                  • d0mine 10 days ago
                    Why do you think systems of partial differential equations (common in physics) somehow provide more understanding than the corresponding ML math? At the end of the day, both can produce results using lots of matrix multiplications.
                    • drdeca 10 days ago
                      ... because people understand things about what is described when dealing with such systems in physics, and people don't understand how the weights in ML learned NNs produce the overall behavior? (For one thing, the number of parameters is much greater with the NNs)
                      • d0mine 10 days ago
                        Looking at Navier-Stokes equations tells you very little about the weather tomorrow.
                        • drdeca 10 days ago
                          Sure. It does tell you things about fluids though.
              • mistermann 11 days ago
                What is an example of something that isn't abstractable?
                • ajuc 10 days ago
                  Stuff that we can't program directly, but can program using machine learning.

                  Speech recognition. OCR. Reccomendation engines.

                  You don't write OCR by going "if there's a line at this angle going for this long and it crosses another line at this angle then it's an A".

                  There are too many variables, and the influence of each one is too small and too tightly coupled with the others, to abstract it into something that is understandable to a human brain.
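
                  (A minimal sketch of that difference in Python, using scikit-learn's bundled digits dataset: no hand-written stroke rules anywhere, just parameters learned from labelled examples.)

                    from sklearn.datasets import load_digits
                    from sklearn.model_selection import train_test_split
                    from sklearn.neural_network import MLPClassifier

                    # nobody wrote "if a line at this angle crosses another line..." rules;
                    # the pixel-to-digit mapping is learned from examples
                    X, y = load_digits(return_X_y=True)
                    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
                    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                                        random_state=0).fit(X_tr, y_tr)
                    print("held-out accuracy:", clf.score(X_te, y_te))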

                  • mistermann 9 days ago
                    AI arguably accomplishes this using some form of abstraction though does it not?

                    Or, considering the art world broadly: artists routinely engage in various forms of unusual abstraction.

                    • ajuc 9 days ago
                      > AI arguably accomplishes this using some form of abstraction though does it not?

                      It's unabstractable for people, because the most abstract model that works still has far too many variables for our puny brains.

                      > artists routinely engage in various forms of unusual abstraction

                      Abstraction in art is just another, unrelated meaning of the word. Like execution of a program vs execution of a person. You could argue executing the journalist for his opinions isn't bad, because execution of mspaint.exe is perfectly fine, but it won't get you far :)

                      • mistermann 2 days ago
                        > It's unabstractable for people, because the most abstract model that works still has far too many variables for our puny brains.

                        Abstraction doesn't have to be perfect, just as "logic" doesn't have to be.

                        > Abstraction in art is just another, unrelated meaning of the word.

                        Speaking of art: have you seen the movie The Matrix? It's rather relevant here.

            • GenerocUsername 11 days ago
              This is just wrong.

              While the individual operations in a computed solution could each be carried out by a human, the billions of rapid computations involved are unachievable by humans. In just a few seconds, a computer can perform more basic arithmetic operations than a human could in a lifetime.

              • jvanderbot 11 days ago
                I'm not saying it's achievable, I'm saying it's not magic. A chemist who wishes to understand what the model is doing can get as far as anyone else, and can reach a level of "this prediction machine works well and I understand how to use and change it". Even if it requires another PhD in CS.

                That the tools became complex is not a reason to fret in science. No more than statistical physics or quantum mechanics or CNNs for image processing - it's complex and opaque and hard to explain, but perfectly reproducible. "It works better than my intuition" is a level of sophistication that most methods can only hope to achieve.

        • EventH- 11 days ago
          "There is no 'truth' that we can grasp in parts but not entirely."

          The value of pi is a simple counterexample.

        • Invictus0 11 days ago
          > There is no 'truth' that we can grasp in parts but not entirely

          It appears that your own comment is disproving this statement

        • slibhb 11 days ago
          > There is no 'truth' that we can grasp in parts but not entirely.

          If anyone actually thought this way -- no one does -- they definitely wouldn't build models like this.

      • interroboink 11 days ago
        > ... and strongly believe that science's goal is to produce falsifiable hypotheses, these new approaches will be extremely depressing

        I don't quite understand this point — could you elaborate?

        My understanding is that the ML model produces a hypothesis, which can then be tested via normal scientific method (perform experiment, observe results).

        If we have a magic oracle that says "try this, it will work", and then we try it, and it works, we still got something falsifiable out of it.

        Or is your point that we won't necessarily have a coherent/elegant explanation for why it works?

        • variadix 11 days ago
          There is an issue scientifically. I think this point was expressed by Feynman: the goal of scientific theories isn’t just to make better predictions, it’s to inform us about how and why the world works. Many ancient civilizations could accurately predict the position of celestial bodies with calendars derived from observations of their period, but it wasn’t until Copernicus proposed the heliocentric model and Galileo provided supporting observations that we understood the why and how, and that really matters for future progress and understanding.
          • interroboink 11 days ago
            I agree the how/why is the main driving goal. That's kinda why I feel like this is not depressing news: there's a new frontier to discover and attempt to explain. Scientists love that stuff (:

            Knowing how to predict the motion of planets but without having an underlying explanation encourages scientists to develop their theories. Now, once more, we know how to predict something (protein folding) but without an underlying explanation. Hurray, something to investigate!

            (Aside: I realize that there are also more human factors at play, and upsetting the status quo will always cause some grief. I just wanted to provide a counterpoint that there is some exciting progress represented here, too).

            • variadix 11 days ago
              I was mainly responding to the claim that these black boxes produce a hypothesis that is useful as a basis for scientific theories. I don’t think it does, because it offers no explanation as to the how and why, which is as we agree the primary goal. It doesn’t provide a hypothesis per se, just a prediction, which is useful technologically and should indicate that there is more to be discovered (see my response to the sibling reply) scientifically but offers no motivating explanation.
          • Invictus0 11 days ago
            But we do know why, it's just not simple. The atoms interact with one another because of a variety of fundamental forces, but since there can be hundreds of thousands of atoms in a single protein, it's plainly beyond human comprehension to explain why it folds the way it does, one fundamental force interaction at a time.
            • variadix 11 days ago
              Fair. I guess the interesting thing for protein folding research then is that there appears to be a way to approximate/simplify the calculations required to predict folding patterns that doesn’t require the precision of existing folding models and software. In essence, AlphaFold is an existence proof that there should be a way to model protein folding more efficiently.
        • dekhn 11 days ago
          People will be depressed because they spent decades getting into professorship positions and publishing papers with ostensibly comprehensible interpretations of the generative processes that produced their observations, only to be "beat" in the game by a system that processed a lot of observations and can make predictions in a way that no individual human could comprehend. And those professors will have a harder time publishing, and therefore getting promoted, in the future.

          Whether ML models produce hypotheses is something of an epistemological argument that I think muddies the waters without bringing any light. I would only use the term "ML models generate predictions". In a sense, the model itself is the hypothesis, not any individual prediction.

      • narrator 11 days ago
        What if our understanding of the laws of the natural sciences are subtly flawed and AI just corrects perfectly for our flawed understanding without telling us what the error in our theory was?

        Forget trying to understand dark matter. Just use this model to correct for how the universe works. What is actually wrong with our current model and if dark matter exists or not or something else is causing things doesn't matter. "Shut up and calculate" becomes "Shut up and do inference."

        • dekhn 11 days ago
          All models are wrong, but some models are useful.
          • narrator 10 days ago
            The black-box AI models could calculate epicycles perfectly, so the medieval Catholic Church could say: just use those instead of being a geocentrism denier.
        • RandomLensman 11 days ago
          High accuracy could result from pretty incorrect models. When and where that would then go completely off the rails is difficult to say.
        • visarga 11 days ago
          ML is accustomed to the idea that all models are bad, and there are ways to test how good or bad they are. It's all approximations and imperfect representations, but they can be good enough for some applications.

          If you think carefully, humans operate in the same regime. Our concepts are all like that: imperfect, approximate, glossing over some details. Our fundamental grounding and test is survival, an unforgiving filter, but one lax enough to allow for anti-vaxxer movements during the pandemic. The survival test does not test for truth directly; it only filters out ideas that fail to support life.

          • mistermann 11 days ago
            Also lax enough for the hilarious mismanagement of the situation by "the experts". At least anti-vaxxers have an excuse.
        • coffeebeqn 10 days ago
          Wouldn’t learning new data and results give us more hints to the true meaning of the thing? I fail to see how this is a bad thing in anyone’s eye.
      • divbzero 11 days ago
        There have been times in the past when usable technology surpassed our scientific understanding, and instead of being depressing it provided a map for scientific exploration. For example, the steam engine was developed by engineers in the 1600s/1700s (Savery, Newcomen, and others) but thermodynamics wasn’t developed by scientists until the 1800s (Carnot, Rankine, and others).
        • jprete 11 days ago
          I think the various contributors to the invention of the steam engine had a good idea of what they were trying to do and how their idea would physically work. Wikipedia lists the prerequisites as the concepts of a vacuum and pressure, methods for creating a vacuum and generating steam, and the piston and cylinder.
          • exe34 11 days ago
            That's not too different from the AlphaFold people knowing that there's a sequence-to-sequence translation, that an enormous amount of cross-talk happens between the parts of the molecule, and that if you get the potential fields just right, it'll fold the way nature intended. They're not just blindly fiddling with a bunch of levers. What they don't know is the individual detailed interactions going on and how to approximate them with analytical equations.
      • cynicalkane 10 days ago
        What always struck me about Chomskyists is that they chose a notion of interpretable model that requires unrealistic amounts of working interpretation. Chomsky grammars incur significant polynomial memory and computational costs as they approach something resembling human grammar. And you say, ok, the human brain can handle much more computation than that, and that's fine. But (for example) context-free grammars aren't just O(n^3) in computational cost; for a realistic description of human language they're O(n^3) in human-interpretable rules.

        Other Chomsky-like models of human grammars have different asymptotic behavior and different choices of n, but the same fundamental problem; the big-O constant factor isn't neurons firing but rather human connections between the n inputs. How can you conceive of human minds being able to track O(n^3) (or whatever) cost where that n is everything being communicated -- words, concepts, symbols, representations, all that jazz and the polynomial relationships between them?
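
        (For readers wondering where the cubic term comes from, here is a minimal CYK recognizer sketch in Python for a grammar in Chomsky normal form; the grammar encoding is hypothetical, but the three nested loops over span lengths, start positions, and split points are exactly the O(n^3).)

          def cyk(words, lexicon, binary_rules, start="S"):
              # lexicon: set of (A, word) pairs; binary_rules: set of (A, (B, C)) pairs
              n = len(words)
              chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
              for i, w in enumerate(words):
                  chart[i][i + 1] = {a for a, word in lexicon if word == w}
              for span in range(2, n + 1):               # O(n) span lengths
                  for i in range(0, n - span + 1):       # O(n) start positions
                      for k in range(i + 1, i + span):   # O(n) split points
                          for a, (b, c) in binary_rules:
                              if b in chart[i][k] and c in chart[k][i + span]:
                                  chart[i][i + span].add(a)
              return start in chart[0][n]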

        But I feel an apology is in order: I've had quite a few beers before coming home, and it's probably a mistake to try to express academically charged and difficult views on the Internet while in an inebriated state. Probably the alcohol has substantially decreased my mental computational power. However, it has only mildly impaired my ability to string together words and sentences in a grammatically complex fashion. In fact, I often feel that the more sober and clear-minded I am, the simpler my language is. Maybe human grammar is actually sub-polynomial. I have observed the same in ChatGPT; the more flowery and wordy it has become over time, the dumber its output.

        • dekhn 10 days ago
          There is a ballmer peak for pontificating.

          As an aside but relevant to your point, my entire introduction to DNA and protein analysis was based on Chomsky grammars. My undergrad thesis advisor David Haussler handed me a copy of an article by David Searls "The Linguistics of DNA" (https://www.scribd.com/document/461974005/The-Linguistics-of...) . At the time, Haussler was in the middle of applying HMMs and other probabilistic graphical models to sequence analysis, and I knew all about DNA as a molecule, but not how to analyze it.

          Searls' paper basically walks through Chomsky's hierarchy, and how to apply it, using linguistic techniques to "parse" DNA. It was mind-bending and mind-expanding for me (it takes me a long time to read papers, for example I think I read this paper over several months, learning to deal with parsing along the way). To this day I am astounded at how much those approaches (linguistics, parsing, and grammars) have evolved - and yet not much has changed! People were talking about generative models in the 90s (and earlier) in much the same way we treat LLMs today. While much of Chomsky's thinking on how to make real-world language models isn't particularly relevant, we still are very deeply dependent on his ideas for grammar...

          Anyway, back to your point. While CFGs may be O(n^3), I would say that there is an implicit, latent O(n)-parseable grammar underlying human linguistics, and our brains can map that latent space to their own internal representation in O(1) time, where the n roughly correlates to the complexity of the idea being transferred. It does not seem even remotely surprising that we can make multi-language models that develop their own compact internal representation that is presumably equidistant from each source language.

      • ggm 10 days ago
        For some, this conversation started when the machine-derived four-colour map proof was announced, almost five decades ago in 1976.
      • coffeemug 11 days ago
        > If you're the sort of person who believes that human brains are capable of understanding the "why" of how things work in all its true detail

        This seems to me an empirical question about the world. It’s clear our minds are limited, and we understand complex phenomena through abstraction. So either we discover we can continue converting advanced models to simpler abstractions we can understand, or that’s impossible. Either way, it’s something we’ll find out and will have to live with in the coming decades. If it turns out further abstractions aren’t possible, well, enlightenment thought had lasted long enough. It’s exciting to live at a time in humanity’s history when we enter a totally uncharted new paradigm.

      • RajT88 11 days ago
        > can we actually interpret these models, or are human brains too feeble to understand complex systems without sophisticated models?

        I think we will have to develop a methodology and supporting toolset to be able to derive the underlying patterns driving such ML models. It's just too much for a human to comb through by themselves and make sense of.

      • pishpash 11 days ago
        So the work to simplify ML models, reduce dimensions, etc. becomes the numeric way to seek simple actual scientific models. Scientific computing and science become one.
      • bamboozled 10 days ago
        Do you think a model will be able to truly comprehend everything, too?
      • ThomPete 11 days ago
        The goal of science should always be to seek good explanations that are hard to vary.
    • GistNoesis 11 days ago
      The frontier in model space is kind of fluid. It's all about solving differential equations.

      In theoretical physics, you know the equations, you solve equations analytically, but you can only do that when the model is simple.

      In numerical physics, you know the equations, you discretize the problem on a grid, and you solve the constraint defined by the equations with various numerical integration schemes like RK4. But you can only do that when the model is small and you know the equations, and you get a single solution.
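
      (A minimal RK4 sketch in Python on a toy ODE, just to make the scheme above concrete.)

        import numpy as np

        def rk4_step(f, t, y, h):
            # one classical Runge-Kutta step for dy/dt = f(t, y)
            k1 = f(t, y)
            k2 = f(t + h / 2, y + h * k1 / 2)
            k3 = f(t + h / 2, y + h * k2 / 2)
            k4 = f(t + h, y + h * k3)
            return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

        # integrate dy/dt = -y from y(0) = 1 up to t = 1 and compare with exp(-1)
        y, t, h = 1.0, 0.0, 0.01
        for _ in range(100):
            y, t = rk4_step(lambda t, y: -y, t, y, h), t + h
        print(y, np.exp(-1.0))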

      Then you want the result faster, so you use mesh-free methods and adaptive grids. This works on bigger models, but you still have to know the equations, and you are still finding a single solution to the differential equations.

      Then you compress this adaptive grid with a neural network, while still knowing the governing equations, and you get things like Physics-Informed Neural Networks (https://arxiv.org/pdf/1711.10561 and following papers), where you can bound the approximation error. This method lets you solve for all solutions of the differential equations simultaneously, sharing the computation.
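
      (A minimal, illustrative PINN-style sketch in Python, assuming PyTorch: a tiny network is fit so that the residual of a known ODE, du/dt = -u with u(0) = 1, is small at collocation points. The same idea, scaled up, is what the paper applies to PDEs.)

        import torch

        net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                                  torch.nn.Linear(32, 1))
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        t = torch.linspace(0, 1, 100).reshape(-1, 1).requires_grad_(True)

        for _ in range(2000):
            u = net(t)
            du_dt, = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)
            residual = du_dt + u                  # governing equation du/dt = -u
            bc = net(torch.zeros(1, 1)) - 1.0     # boundary condition u(0) = 1
            loss = (residual ** 2).mean() + (bc ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()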

      Then, when knowing your governing equations explicitly is too complex, you assume that there are some implicit governing stochastic equations, and you learn the end result of the dynamics with a diffusion model. That's what AlphaFold is doing.

      ML is kind of a memoization technique, analogous to hashlife in the Game of Life, that lets you reuse your past computational effort. You are free to choose where on this ladder of memory-compute trade-offs you want to sit when modelling the world.

    • nexuist 11 days ago
      As a steelman, wouldn't the abundance of infinitely generate-able situations make it _easier_ for us to develop strong theories and models? The bottleneck has always been data. You have to do expensive work in the real world and accurately measure it before you can start fitting lines to it. If we were to birth an e.g. atomically accurate ML model of quantum physics, I bet it wouldn't take long until we have mathematical theories that explain why it works. Our current problem is that this stuff is super hard to manipulate and measure.
      • moconnor 11 days ago
        Maybe; AI chess engines have improved human understanding of the game very rapidly, even though humans cannot beat engines.
        • whymauri 10 days ago
          I've seen generative models for molecular structures produce results that looked nonsensical at first glance; however, when passed along to more experienced medicinal chemists, they identified a bit of 'creativity' that only a very advanced practitioner would understand or appreciate. Those hypotheses, which would not be produced by most experts, served as an anchor for further exploration of novel structures and ideas.

          So in a way, what you say is already possible. Just as GMs in chess specialize in certain openings or play styles, master chemists have pre-existing biases that can affect their designs; algorithms can have different biases which push exploration to interesting places. Once you have a good latent representation of the relevant chemical space, you can optimize for this sort of creativity (a practical but boring example is to push generation outside of patent space).

      • alfalfasprout 11 days ago
        This is an important aspect that's being ignored IMO.

        For a lot of problems, you currently don't have an analytical solution, and the alternative is a brute-force-ish numerical approach. As a result, the computational cost of simulating things enough times to be able to detect behavior that can inform theories/models (potentially yielding a good analytical result) is not viable.

        In this regard, ML models are promising.

    • xanderlewis 11 days ago
      It depends whether the value of science is human understanding or pure prediction. In some realms (for drug discovery, and other situations where we just need an answer and know what works and what doesn’t), pure prediction is all we really need. But if we could build an uninterpretable machine learning model that beats any hand-built traditional ‘physics’ model, would it really be physics?

      Maybe there’ll be an intermediate era for a while where ML models outperform traditional analytical science, but then eventually we’ll still be able to find the (hopefully limited in number) principles from which it can all be derived. I don’t think we’ll ever find that Occam’s razor is no use to us.

      • failTide 11 days ago
        > But if we could build an uninterpretable machine learning model that beats any hand-built traditional ‘physics’ model, would it really be physics?

        At that point I wonder if it would be possible to feed that uninterpretable model back into another model that makes sense of it all and outputs sets of equations that humans could understand.

      • gmarx 11 days ago
        The success of these ML models has me wondering if this is what Quantum Mechanics is. QM is notoriously difficult to interpret yet makes amazing predictions. Maybe wave functions are just really good at predicting system behavior but don't reflect the underlying way things work.

        OTOH, Newtonian mechanics is great at predicting things under certain circumstances yet, in the same way, doesn't necessarily reflect the underlying mechanism of the system.

        So maybe philosophers will eventually tell us the distinction we are trying to draw, although intuitive, isn't real

        • kolinko 11 days ago
          That’s what thermodynamics is - we initially only had laws about energy/heat flow, and only later we figured out how statistical particle movements cause these effects.
      • RandomLensman 11 days ago
        Pure prediction is only all we need if the total end-to-end process is predicted correctly - otherwise there could be pretty nasty traps (e.g., drug works perfectly for the target disease but does something unexpected elsewhere etc.).
        • gus_massa 11 days ago
          > e.g., drug works perfectly for the target disease but does something unexpected elsewhere etc.

          That's very common. It's the reason to test the new drug in a petri dish, then rats, then dogs, then humans, and only if all tests pass, send it to the pharmacy.

    • topaz0 11 days ago
      In case it's not clear, this does not "beat" experimental structure determination. The matches to experiment are pretty close, but they will be closer in some cases than others and may or may not be close enough to answer a given question about the biochemistry. It certainly doesn't give much information about the dynamics or chemical perturbations that might be relevant in biological context. That's not to pooh-pooh alphafold's utility, just that it's a long way from making experimental structure determination unnecessary, and much much further away from replacing a carefully chosen scientific question and careful experimental design.
    • UniverseHacker 11 days ago
      It means we now have an accurate surrogate model or "digital twin" that can be experimented on almost instantaneously. So we can massively accelerate the traditional process of developing mechanistic understanding through experiment, while also immediately be able to benefit from the ability to make accurate predictions, even without needing understanding.

      In reality, science has already pretty much gone this way long ago, even if people don't like to admit it. Simple, reductionist explanations for complex phenomena in living systems don't really exist. Virtually all of medicine nowadays is empirical: try something, and if you can prove it's safe and effective, you keep doing it. We almost never have a meaningful explanation for how it really works, and when we think we do, it gets proven wrong repeatedly, while the treatment keeps working as always.
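
      (A minimal sketch of the surrogate idea in Python, with a stand-in function playing the role of the slow experiment; everything here is hypothetical and only meant to show the shape of the workflow.)

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def expensive_experiment(x):
            # stand-in for a slow simulation or wet-lab measurement
            return np.sin(3 * x[0]) + x[1] ** 2

        rng = np.random.default_rng(0)
        X = rng.uniform(-1, 1, size=(200, 2))
        y = np.array([expensive_experiment(x) for x in X])

        surrogate = RandomForestRegressor(random_state=0).fit(X, y)

        # the surrogate can now be queried thousands of times per second,
        # e.g. to screen candidate conditions before running real experiments
        candidates = rng.uniform(-1, 1, size=(10_000, 2))
        best = candidates[np.argmax(surrogate.predict(candidates))]
        print(best)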

      • imchillyb 11 days ago
        Medicine can be explained fairly simply, and why it works the way it does is also explained by this:

        Imagine a very large room that has every surface covered by on-off switches.

        We cannot see inside of this room. We cannot see the switches. We cannot fit inside of this room, but a toddler fits through the tiny opening leading into the room. The toddler cannot reach the switches, so we equip the toddler with a pole that can flip the switches. We train the toddler, as much as possible, to flip a switch using the pole.

        Then, we send the toddler into the room and ask the toddler to flip the switch or switches we desire to be flipped, and then do tests on the wires coming out of the room to see if the switches were flipped correctly. We also devise some tests for other wires to see if that naughty toddler flipped other switches on or off.

        We cannot see inside the room. We cannot monitor the toddler. We can't know what _exactly_ the toddler did inside the room.

        That room is the human body. The toddler with a pole is a medication.

        We can't see or know enough to determine what was activated or deactivated. We can invent tests to narrow the scope of what was done, but the tests can never be 100% accurate because we can't test for every effect possible.

        We introduce chemicals, then we hope and pray that the chemicals only turned on or off the things we wanted turned on or off. We craft some qualification testing for proof, and do a 'long-term' study to determine whether other things were turned on or off, a short circuit occurred, or we broke something.

        I sincerely hope that even without human understanding, our AI models can determine what switches are present, which ones are on and off, and how best to go about selecting for the correct result.

        Right now, modern medicine is almost a complete crap-shoot. Hopefully modern AI utilities can remedy the gambling aspect of medicine discovery and use.

        • tsimionescu 10 days ago
          The more important point was that medications that do work still come in two forms: ones where we have a good idea of the mechanism of action that makes them work, and ones where we don't.

          For example, we have a good idea of why certain antibiotics cure tuberculosis - we understand that tuberculosis is caused by certain bacteria, and we know how antibiotics affect the cellular chemistry of those bacteria to kill them. We also understand the dynamics of this, the fact that the body's immune system still has to be functioning well enough to kill many of the bacteria as well, etc. We don't fully understand all of the side-effects and possible interactions with other diseases or medications in every part of the body, but we understand the gist of it all.

          Then there are drugs and diseases where we barely understand any of it. We don't have for example a clear understanding of what depression is, what the biochemistry of it is. We do know several classes of drugs that help with depression in certain individuals, but we know those drugs don't help with other individuals, and we have no way of predicting which is which. We know some of the biochemical effects of these drugs, but since we don't understand the underlying cause of depression, we don't actually know why the drugs help, or what's the difference in individuals where they don't help.

          There are also widely used medications where we understand even less. Metamizole, a very widely used painkiller sold as Novalgin or Analgin and other names, discovered in 1922, has no firmly established mechanism of action.

      • mathgradthrow 11 days ago
        instead of "in mice", we'll be able to say "in the cloud"
        • topaz0 11 days ago
          "In nimbo" (though what people actually say is "in silico").
        • unsupp0rted 11 days ago
          In vivo in humans in the cloud
          • dekhn 11 days ago
            One of the companies I worked for, "insitro", is specifically named to mean the combination of "in vivo, in vitro, in silico".
        • d_silin 11 days ago
          "in silico"
    • philip1209 11 days ago
      It makes me think about how Einstein was famous for making falsifiable real-world predictions to accompany his theoretical work. And sometimes it took years for proper experiments to be run (such as measuring a solar eclipse during the outbreak of a world war).

      Perhaps the opportunity here is to provide a quicker feedback loop for theory about predictions in the real world. Almost like unit tests.

      • HanClinto 11 days ago
        > Perhaps the opportunity here is to provide a quicker feedback loop for theory about predictions in the real world. Almost like unit tests.

        Or jumping the gap entirely to move towards more self-driven reinforcement learning.

        Could one structure the training setup to be able to design its own experiments, make predictions, collect data, compare results, and adjust weights...? If that loop could be closed, then it feels like that would be a very powerful jump indeed.

        In the area of LLMs, the SPAG paper from last week was very interesting on this topic, and I'm very interested in seeing how this can be expanded to other areas:

        https://github.com/Linear95/SPAG
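
        (A minimal sketch of such a closed loop in Python with scikit-learn: the model picks the next "experiment" where it is most uncertain, observes the result, and refits; run_experiment is a hypothetical stand-in for a real measurement.)

          import numpy as np
          from sklearn.gaussian_process import GaussianProcessRegressor

          def run_experiment(x):
              # placeholder for a real measurement
              return np.sin(5 * x) + 0.1 * np.random.randn()

          pool = np.linspace(0, 1, 200).reshape(-1, 1)   # candidate experiments
          X = pool[:2]
          y = np.array([run_experiment(x[0]) for x in X])

          model = GaussianProcessRegressor()
          for _ in range(20):
              model.fit(X, y)                             # adjust weights
              mean, std = model.predict(pool, return_std=True)
              pick = pool[np.argmax(std)]                 # design the next experiment
              X = np.vstack([X, [pick]])
              y = np.append(y, run_experiment(pick[0]))   # collect data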

      • goggy_googy 11 days ago
        Agreed. At the very least, models of this nature let us iterate/filter our theories a little bit more quickly.
        • jprete 11 days ago
          The model isn't reality. A theory that disagrees with the model but agrees with reality shouldn't be filtered, but in this process it will be.
    • CapeTheory 11 days ago
      Many of our existing physical models can be decomposed into "high-confidence, well tested bit" plus "hand-wavy empirically fitted bit". I'd like to see progress via ML replacing the empirical part - the real scientific advancement then becomes steadily reducing that contribution to the whole by improving the robust physical model incrementally. Computational performance is another big influence though. Replacing the whole of a simulation with an ML model might still make sense if the model training is transferrable and we can take advantage of the GPU speed-ups, which might not be so easy to apply to the foundational physical model solution. Whether your model needs to be verified against real physical models depends on the seriousness of your use-case; for nuclear weapons and aerospace weather forecasts I imagine it will remain essential, while for a lot of consumer-facing things the ML will be good enough.
      • jononor 11 days ago
        Physics-informed machine learning is a whole (nascent) subfield that is very much in line with this thinking. Steve Brunton has some good stuff about this on YouTube.
    • 6gvONxR4sf7o 11 days ago
      "Best methods" is doing a lot of heavy lifting here. "Best" is a very multidimensional thing, with different priorities leading to different "bests." Someone will inevitably prioritize reliability/accuracy/fidelity/interpretability, and that's probably going to be a significant segment of the sciences. Maybe it's like how engineers just need an approximation that's predictive enough to build with, but scientists still want to understand the underlying phenomena. There will be an analogy to how some people just want an opaque model that works on a restricted domain for their purposes, but others will be interested in clearer models or unrestricted/less restricted domain models.

      It could lead to a very interesting ecosystem of roles.

      Even if you just limit the discussion to using the best model of X to design a better Y, limited to the model's domain of validity, that might translate the usage problem to finding argmax_X of valueFunction of modelPrediction of design of X. In some sense a good predictive model is enough to solve this with brute force, but this still leaves room for tons of fascinating foundational work. Maybe you start to find that the (wow so small) errors in modelPrediction are correlated with valueFunction, so the most accurate predictions don't make it the best for argmax (aka optimization might exploit model errors rather than optimizing the real thing). Or maybe brute force just isn't computationally feasible, so you need to understand something deeper about the problem to simplify the optimization to make it cheap.
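
      (A minimal sketch of that brute-force argmax in Python; surrogate_model, value_function, and candidate_designs are hypothetical placeholders with an sklearn-like predict interface.)

        import numpy as np

        def best_design(surrogate_model, value_function, candidate_designs):
            # argmax over X of valueFunction(modelPrediction(design(X)))
            scores = [value_function(surrogate_model.predict([x])[0])
                      for x in candidate_designs]
            # caveat from above: if the model's small errors correlate with the
            # value function, this argmax chases those errors, not the real thing
            return candidate_designs[int(np.argmax(scores))]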

    • andrewla 10 days ago
      Physicists like to retroactively believe that our understanding of physical phenomena preceded the implementation of uses of those phenomena, when the reality is that physics has always come in to clean up after the engineers. There are some rare exceptions, but usually the reason that scientific progress can be made in an area is that the equipment to perform experiments has been commoditized sufficiently by engineering demand for it.

      We had semiconductors and superconductors before we understood how they worked -- in both cases, arguably, we still don't completely understand the phenomena. Things like the dynamo and the electric motor were invented by practice and later explained by scientists, not derived from first principles. Steam engines and pumps were invented before we had the physics to describe how they worked.

    • pen2l 11 days ago
      The most moneyed and well-coordinated organizations have honed a large hammer, and they are going to use it for everything. So almost certainly, for future big findings in the areas you mention, probabilistically inclined models coming from ML will be the new gold standard.

      And yet the only thing that can save us from ML will be ML itself, because it is ML that has the best chance of extrapolating patterns from these black-box models into human-interpretable ones. I hope we dedicate explicit effort to this endeavor, and so continue the advance and expansion of human knowledge, with human ingenuity working in tandem with computers at our assistance.

      • optimalsolver 11 days ago
        Spoiler: "Interpretable ML" will optimize for output that either looks plausible to humans, reinforces our preconceptions, or appeals to our aesthetic instincts. It will not converge with reality.
        • JoshuaDavid 10 days ago
          Empirically, this does not seem to be what we see: from https://transformer-circuits.pub/2023/monosemantic-features/...

          > One strong theme is the prevalence of context features (e.g. DNA, base64) and token-in-context features (e.g. the in mathematics – A/0/341, < in HTML – A/0/20). 29 These have been observed in prior work (context features e.g. [38, 49, 45] ; token-in-context features e.g. [38, 15] ; preceding observations [50] ), but the sheer volume of token-in-context features has been striking to us. For example, in A/4, there are over a hundred features which primarily respond to the token "the" in different contexts. 30 Often these features are connected by feature splitting (discussed in the next section), presenting as pure context features or token features in dictionaries with few learned features, but then splitting into token-in-context features as more features are learned.

          > [...]

          > The general the in mathematical prose feature (A/0/341) has highly generic mathematical tokens for its top positive logits (e.g. supporting the denominator, the remainder, the theorem), whereas the more finely split machine learning version (A/2/15021) has much more specific topical predictions (e.g. the dataset, the classifier). Likewise, our abstract algebra and topology feature (A/2/4878) supports the quotient and the subgroup, and the gravitation and field theory feature (A/2/2609) supports the gauge, the Lagrangian, and the spacetime

          I don't think "hundreds of different ways to represent the word 'the', depending on the context" is a-priori plausible, in line with our preconceptions, or aesthetically pleasing. But it is what falls out of ML interpretation techniques, and it does do a quantitatively good job (as measured by fraction of log-likelihood loss recovered) as an explanation of what the examined model is doing.

        • kolinko 11 days ago
          That is not considered interpretable then, and I think most people working in the field are aware of this gotcha.

          Iirc when the EU required banks to have interpretable rules for loans, a plain explanation was not considered enough. What was required was a clear process that was used from the beginning - i.e. you can use an AI to develop an algorithm to make a decision, but you can’t use AI to make a decision and explain the reasons afterwards.

        • DoctorOetker 11 days ago
          Spoiler: basic / hard sciences describe nature mathematically.

          Open a random physics book, and you will find lots and lots of derivations (using more or less acceptable assumptions depending on circumstance under consideration).

          Derivations and assumptions can be formally verified, see for example https://us.metamath.org

          Ever more intelligent machine learning algorithms and data structures replacing human heuristic labor will simply shift the expected minimum deliverable from associations to ever more rigorous proofs in terms of fewer and fewer assumptions.

          Machine learning will ultimately be used as automated theorem provers, and their output will eventually be explainable by definition.

          When do we classify an explanation as explanatory? When it succeeds in deriving a conclusion from acceptable assumptions without hand waving. Any hand waving would result in the "proof" not having passed formal verification.
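
          (As a tiny illustration of what "no hand waving" means formally, here is a Lean 4 statement; the checker either accepts the given justification or refuses to compile. Nothing about the example is specific to this thread.)

            -- the claim and its justification are both machine-checked:
            -- if the term after := did not actually prove the statement,
            -- Lean would reject it
            theorem add_comm_example (a b : Nat) : a + b = b + a :=
              Nat.add_comm a b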

      • barrenko 10 days ago
        Interpretable AI is the same scam as alignment.
    • slibhb 11 days ago
      It's interesting to compare this situation to earlier eras in science. Newton, for example, gave us equations that were very accurate but left us with no understanding at all of why they were accurate.

      It seems like we're repeating that here, albeit with wildly different methods. We're getting better models but by giving up on the possibility of actually understanding things from first principles.

      • slashdave 11 days ago
        Not comparable. Our current knowledge of the physics involved in these systems is complete. It is just impossibly difficult to calculate from first principles.
    • t14n 11 days ago
      A new-ish field of "mechanistic interpretability" is trying to poke at weights and activations and find human-interpretable ideas w/in them. Making lots of progress lately, and there are some folks trying to apply ideas from the field to Alphafold 2. There are hopes of learning the ideas about biology/molecular interactions that the model has "discovered".

      Perhaps we're in an early stage of Ted Chiang's story "The Evolution of Human Science", where AIs have largely taken over scientific research and a field of "meta-science" developed where humans translate AI research into more human-interpretable artifacts.

    • tomrod 11 days ago
      A few things:

      1. Research can then focus on where things go wrong

      2. ML models, despite being "black boxes," can still have brute-force assessment performed of the parameter space over covered and uncovered areas by input information

      3. We tend to assume parsimony (i.e. Occam's razor), giving preference to simpler models when all else is equal. More complex black-box models excelling at prediction let us know the actual causal pathway may be more complex than simple models allow. This is okay too. We'll get it figured out. Not everything is closed-form, especially considering quantum effects may cause statistical/expected outcomes instead of deterministic outcomes.

    • ChuckMcM 11 days ago
      Interesting times indeed. I think the early history of medicines takes away from your observation though. In the 19th and early 20th century people didn't know why medicines worked, they just did. The whole "try a bunch of things on mice, pick the best ones and try them on pigs, and then the best of those and try a few on people" kind of thing. In many ways the mice were a stand in for these models, at the time scientists didn't understand nearly as much about how mice worked (early mice models were pretty crude by today's standards) but they knew they were a close enough analog to the "real thing" that the information provided by mouse studies was usefully translated into things that might help/harm humans.

      So when your tools can produce outputs that you find useful, you can then use those tools to develop your understanding and insights. As a tool, this is quite good.

    • jeffreyrogers 11 days ago
      I asked a friend of mine who is chemistry professor at a large research university something along these lines a while ago. He said that so far these models don't work well in regions where either theory or data is scarce, which is where most progress happens. So he felt that until they can start making progress in those areas it won't change things much.
      • mensetmanusman 11 days ago
        Major breakthroughs happen when clear connections can be made and engineered between the many bits of solved but obscured solutions.
    • adw 11 days ago
      > What happens when the best methods for computational fluid dynamics, molecular dynamics, nuclear physics are all uninterpretable ML models?

      A better analogy is "weather forecasting".

      • wayeq 10 days ago
        interesting choice considering the role chaos theory plays in forever rendering long term weather predictions impossible, by humans or LLMs.
    • wslh 11 days ago
      This is the topic of epistemology of the sciences in books such as "New Direction in the Philosophy of Mathematics" [1] and happened before with problems such as the four color theorem [2] where AI was not involved.

      Going back to the uninterpretable ML models in the context of AlphaFold 3, I think one method for trying to explain the findings is similar to the experimental method physics applies to reality: you perform experiments on the reality (in this case AlphaFold 3) to come up with sound conclusions. AI/ML is an interesting black-box system.

      There are other open discussions on this topic. For example, can our human brains absorb that knowledge, or are they limited somehow by the scientific language that we have now?

      [1] https://www.google.com.ar/books/edition/New_Directions_in_th...

      [2] https://en.wikipedia.org/wiki/Four_color_theorem

    • advisedwang 11 days ago
      In physics, we already deal with the fact that many of the core equations cannot be analytically solved for more than the most basic scenarios. We've had to adapt to using approximation methods and numerical methods. This will have to be another place where we adapt to a practical way of getting results.
    • thegrim33 11 days ago
      Reminds me of the novel Blindsight - in it there are special individuals who work as synthesists, whose job it is to observe and understand, and then somehow translate back to the "lay person", the seemingly undecipherable actions/decisions of advanced computers and augmented humans.
    • salty_biscuits 10 days ago
      I'd say it's not new. Take fluid dynamics as an example: the Navier-Stokes equations predict the motion of fluids very well, but you need to solve them approximately on a computer in order to get useful predictions for most setups. I guess the difference is that the equation is compact and the derivation from continuum mechanics is easy enough to follow. People still rely on heuristics to answer "how does a wing produce lift?". Those heuristic models are completely useless at "how much lift will this particular wing produce under these conditions?". Seems like the same kind of situation. Maybe progress forward will look like producing compact models, or tooling to reason about why a particular thing happened.
    • jononor 11 days ago
      I think it likely that instead of replacing existing methods, we will see a fusion. Or rather, many different kinds of fusions - depending on the exact needs of the problems at hand (or in science, the current boundary of knowledge). If nothing else then to provide appropriate/desirable level of explainability, correctness etc. Hypothetically the combination will also have better predictive performance and be more data efficient - but it remains to be seen how well this plays out in practice. The field of "physics informed machine learning" is all about this.
    • signal_space 11 days ago
      Is alphafold doing model generation or is it just reducing a massive state space?

      The current computational and systems biochemistry approaches struggle to model large biomolecules and their interactions due to the large degrees of freedom of the models.

      I think it is reasonable to rely on statistical methods to lead researchers down paths that have a high likelihood of being correct versus brute forcing the chemical kinetics.

      After all chemistry is inherently stochastic…

    • tambourine_man 11 days ago
      Our metaphors and intuitions were already crumbling and stagnating. See quantum physics: sometimes a particle, sometimes a wave, and what constitutes a measurement anyway?

      I’ll take prediction over understanding if that’s the best our brains can do. We’ve evolved to deal with a few orders of magnitude around a meter and a second. Maybe dealing with light-years and femtometer/seconds is too much to ask.

    • Jupe 10 days ago
      > Does this decouple progress from our current understanding of the scientific process - moving to better and better models of the world without human-interpretable theories and mathematical models / explanations?

      Replace "human-interpretable theories" with "every man interpretable theories", and you'll have a pretty good idea of how > 90% of the world feels about modern science. It is indistinguishable from magic, by the common measure.

      Obtuse example: My parents were alive when the first nuclear weapon was detonated. They didn't know that they didn't know this weapon was being built, let alone that it might have ignited the atmosphere.

      With sophisticated enough ML, that 90% will become 99.9% - save the few who have access to (and can trust) ML tools that can decipher the "logic" from the original ML tools.

      Yes, interesting times ahead... indeed.

    • danielmarkbruce 11 days ago
      "better and better models of the world" does not always mean "more accurate" and never has.

      We already know how to model the vast majority of things, just not at a speed and cost which makes it worthwhile. There are dimensions of value - one is accuracy, another speed, another cost, and in different domains additional dimensions. There are all kinds of models used in different disciplines which are empirical and not completely understood. Reducing things to the lowest level of physics and building up models from there has never been the only approach. Biology, geology, weather, materials all have models which have hacks in them, known simplifications, statistical approximations, so the result can be calculated. It's just about choosing the best hacks to get the best trade off of time/money/accuracy.

    • robwwilliams 10 days ago
      This is a key but secondary concern to many of us working in molecular genetics who will use AlphaFold 3 to evaluate pair-wise interactions. We often have genetic support for an interaction between proteins A and B. For example, in a study of genetic variation in responses of mice to morphine, I currently have two candidate proteins that interact epistatically, suggesting a possible "lock and key" model: the mu opiate receptor (MOR) and FGF12. I can now evaluate the likelihood of a direct molecular interaction between these proteins, and the possible amino acid substitutions that account for individual differences.

      In other words I bring a hypothesis to AF3 and ask for it to refute or affirm.

    • insane_dreamer 11 days ago
      For me the big question is how do we confidently validate the output of this/these model(s).
      • topaz0 11 days ago
        It's the right question to ask, and the answer is that we will still have to confirm them by experimental structure determination.
    • torrefatto 11 days ago
      You are conflating the whole scientific endeavor with a very specific problem, for which this specific approach is effective at producing results that fit the observable world. This has nothing to do with science as a whole.
    • ldoughty 11 days ago
      My argument is: weather.

      I think it is fine & better for society to have applications and models for things we don't fully understand... We can model lots of small aspects of weather, and we have a lot of factors nailed down, but not necessarily all the interactions.. and not all of the factors. (Additional example for the same reason: Gravity)

      Used responsibly, of course. I wouldn't think it a good idea to have an AI model design an airplane whose workings no engineer understands :-)

      And presumably all of this is followed by people trying to understand the results (expanding potential research areas)

    • burny_tech 11 days ago
      We need to advance mechanistic interpretability (field reverse engineering neural networks) https://www.youtube.com/watch?v=P7sjVMtb5Sg https://www.youtube.com/watch?v=7t9umZ1tFso https://www.youtube.com/watch?v=2Rdp9GvcYOE
    • trueismywork 11 days ago
      To paraphrase Kahan, what's interesting to me is not whether a method is accurate enough, but whether you can predict how accurate it will be. So if ML methods can predict that they're right 98% of the time, then we can build this into our systems, even if we don't understand how they work.

      Deterministic methods can predict a result with a single run; ML methods will need an ensemble of results to show the same confidence. It is possible that, at the end of the day, the difference in cost might not be that high over time.
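
      (A minimal sketch of the ensemble idea in Python with scikit-learn: several models trained on bootstrap resamples, whose disagreement gives a rough, empirical error bar; the data here is synthetic.)

        import numpy as np
        from sklearn.ensemble import GradientBoostingRegressor

        rng = np.random.default_rng(0)
        X = rng.uniform(-1, 1, size=(500, 1))
        y = np.sin(4 * X[:, 0]) + 0.1 * rng.standard_normal(500)

        # an ensemble of models trained on bootstrap resamples; the spread of
        # their predictions is a rough estimate of how accurate we can expect to be
        members = []
        for seed in range(10):
            idx = rng.integers(0, len(X), len(X))
            members.append(GradientBoostingRegressor(random_state=seed).fit(X[idx], y[idx]))

        x_new = np.array([[0.3]])
        preds = np.array([m.predict(x_new)[0] for m in members])
        print("prediction:", preds.mean(), "+/-", preds.std())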

    • ozten 11 days ago
      Science has always given us better, but error-prone, tooling to see further and make better guesses. There is still a scientific test: in a clinical trial, is this new drug safe and effective?
    • Brian_K_White 10 days ago
      Perhaps an AI can be made to produce the work as well as a final answer, even if it has to reconstruct or invent the work backwards rather than explain its own internal inscrutable process.

      "produce a process that arrives at this result" should be just another answer it can spit out. We don't necessarily care if the answer it produces is actually the same as what originally happened inside itself. All we need is that the answer checks out when we try it.

    • visarga 11 days ago
      No, science doesn't work that way. You can't just calculate your way to scientific discoveries; you've got to test them in the real world. Learning, both in humans and AI, is based on the signals provided by the environment. There are plenty of things not written anywhere, so the models can't simply train on human text to discover new things. They learn directly from the environment to do that, like AlphaZero did when it beat humans at Go.
    • goodmachine 10 days ago
      In order for that not to happen (uninterpretable ML models), there is some research on symbolic distillation, aka symbolic regression:

      https://arxiv.org/abs/2006.11287

      https://www.science.org/doi/10.1126/sciadv.aay2631
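
      To make "symbolic distillation" concrete, here is a deliberately tiny sketch of the idea (not the method from either paper): query a black-box model on sample inputs, then search a small library of candidate closed-form expressions for the one that best reproduces its outputs.

          import numpy as np

          def distill(black_box, xs):
              """Fit simple symbolic candidates of the form a*g(x) + b to a black box."""
              ys = np.array([black_box(x) for x in xs])
              basis = {
                  "a*x + b": xs,
                  "a*x**2 + b": xs ** 2,
                  "a*sin(x) + b": np.sin(xs),
              }
              best = None
              for name, g in basis.items():
                  A = np.stack([g, np.ones_like(xs)], axis=1)
                  coef, *_ = np.linalg.lstsq(A, ys, rcond=None)   # least-squares fit of (a, b)
                  mse = np.mean((A @ coef - ys) ** 2)
                  if best is None or mse < best[0]:
                      best = (mse, name, coef)
              return best

          # Pretend this network is opaque; we only see its inputs and outputs.
          opaque = lambda x: 3.0 * np.sin(x) + 0.5
          print(distill(opaque, np.linspace(-3, 3, 200)))

      The papers linked above search a far larger expression space (e.g., with genetic programming), but the payoff is the same: a formula a human can actually read.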

    • goggy_googy 11 days ago
      I think at some point, we will be able to produce models that are able to pass data into a target model and observe its activations and outputs and put together some interpretable pattern or loose set of rules that govern the input-output relationship in the target model. Using this on a model like AlphaFold might enable us to translate inferred chemical laws into natural language.
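
      A crude version of that probing loop already exists in standard tooling. The sketch below uses a made-up two-layer stand-in for the target model and a PyTorch forward hook to record a hidden layer's activations, which is the raw material a second "interpreter" model (or a human) would then try to turn into rules; it is purely illustrative, not a description of how anyone probes AlphaFold.

          import torch
          import torch.nn as nn

          # Stand-in "target" model; imagine this is the opaque network of interest.
          target = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

          captured = {}
          def save_activation(module, inputs, output):
              captured["hidden"] = output.detach()

          # Hook the hidden layer so every forward pass records its activations.
          handle = target[1].register_forward_hook(save_activation)

          x = torch.randn(32, 8)        # probe inputs
          y = target(x)                 # outputs we want to explain
          hidden = captured["hidden"]   # (32, 16) activations for an interpreter model
          handle.remove()

          print(hidden.shape, y.shape)

      The hard part the comment points at is not capturing (input, activation, output) triples but compressing them into something a person can read.
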
    • nico 11 days ago
      Even if we don't understand the models themselves, we can still use them as a basis for understanding

      For example, I have no idea how a computer works in every minute detail (ie, exactly the physics and chemistry of every process that happens in real time), but I have enough of an understanding of what to do with it, that I can use it as an incredibly useful tool for many things

      Definitely interesting times!

      • tecleandor 8 days ago
        Not the same. There is a difference between "I cannot understand the deeper details of a certain model, but some others can, and there's the possibility of explaining it in detail" and "Nobody can understand it and there's not a clear cause-effect that we know".

        Except for weird cases, computers (or cars, or cameras, or lots of other man-made devices) are clearly understood, and you (or another specialist) can clearly show why a device does X when you input Y into it.

    • sdwr 11 days ago
      > Does this decouple progress from our current understanding of the scientific process?

      Thank God! As a person who uses my brain, I think I can say, pretty definitively, that people are bad at understanding things.

      If this actually pans out, it means we will have harnessed knowledge/truth as a fundamental force, like fire or electricity. The "black box" as a building block.

      • tantalor 11 days ago
        This type of thing is called an "oracle".

        We've had stuff like this for a long time.

        Notable examples:

        - Temple priestesses

        - Tea-leaf reading

        - Water scrying

        - Palmistry

        - Clairvoyance

        - Feng shui

        - Astrology

        The only difference is, the ML model is really quite good at it.

        • unsupp0rted 11 days ago
          > The only difference is, the ML model is really quite good at it.

          That's the crux of it: we've had theories of physics and chemistry since before writing was invented.

          None of that mattered until we came upon the ones that actually work.

    • mnky9800n 11 days ago
      I believe it simply tells us that our understanding of mechanical systems, especially chaotic ones, is not as well defined as we thought.

      https://journals.aps.org/prresearch/abstract/10.1103/PhysRev...

    • bluerooibos 11 days ago
      > What happens when...

      I can only assume that existing methods would still be used for verification. At least we understand the logic behind those methods. The ML models might become more accurate on average, but they could still occasionally produce results that are way off, so their error rate would have to match that of the existing methods before verification could be dropped.

    • tnias23 11 days ago
      I wonder if ML can someday be employed in deciphering such black box problems; a second model that can look under the hood at all the number crunching performed by the predictive model, identify the pattern that resulted in a prediction, and present it in a way we can understand.

      That said, I don’t even know if ML is good at finding patterns in data.

      • lupire 11 days ago
        > That said, I don’t even know if ML is good at finding patterns in data.

        That's the only thing ML does.

    • theGnuMe 11 days ago
      The models are learning an encoding based on evolutionarily related and known structures. We should be able to derive fundamental properties from those encodings eventually. Or at least our biophysically programmed models should map into that encoding. That might be a reasonable approach to looking at the folding energy landscape.
    • TheBicPen 10 days ago
      Perhaps related, the first computer-assisted mathematics proof: https://en.wikipedia.org/wiki/Four_color_theorem

      I'm sure that similar arguments for and against the proof apply here as well.

    • slashdave 11 days ago
      In terms of docking, you can call the conventional approaches "physically-based", however, they are rather poor physical models. Namely, they lack proper electrostatics, and, most importantly, basically ignore entropic contributions. There is no reason for concern.
    • fnikacevic 11 days ago
      I can only hope the models will be sophisticated enough and willing to explain their reasoning to us.
    • Gimpei 11 days ago
      Might be easier to come up with new models with analytic solutions if you have a probabilistic model at hand. A lot easier to evaluate against data and iterate. Also, I wouldn't be surprised if we develop better tools for introspecting these models over time.
    • Grieverheart 11 days ago
      Perhaps for understanding the structure itself, but having the structure available allows us to focus on a coarser level. We also don't want to use quantum mechanics to understand the everyday world, and that's why we have classical mechanics, etc.
    • jncfhnb 11 days ago
      These processes are both beyond human comprehension, because they contain vast layers of tiny interactions, and not practical to simulate. This tech will allow for exploration, with accurate simulations to better understand new ideas if needed.
    • phn 11 days ago
      I'm not a scientist by any means, but I imagine even accurate opaque models can be useful in moving the knowledge forward. For example, they can allow you to accurately simulate reality, making experiments faster and cheaper to execute.
    • RandomLensman 11 days ago
      We could be entering a new age of epicycles - high accuracy but very flawed understanding.
    • timschmidt 11 days ago
      There will be an iterative process built around curated training datasets: continually improved top-tier models, teams reverse engineering the models' understanding and reasoning, and applying that to improve datasets and training.
    • cgearhart 11 days ago
      This is a neat observation. Slightly terrifying, but still interesting. Seems like there will also be cases where we discover new theories through the uninterpretable models—much easier and faster to experiment endlessly with a computer.
    • yieldcrv 10 days ago
      I think it creates new fields of study, such as diagnosing these models' behaviors without the doctor having an intricate understanding of all of the model's processes/states, just like with natural organisms
    • JacobThreeThree 10 days ago
      As a tool people will use it as any other tool, by experimenting, testing, tweaking and iterating.

      As a scientific theory for fundamentally explaining the nature of the universe, maybe it won't be as useful.

    • mberning 11 days ago
      I would assume that, given enough hints from AI, and if it is deemed important enough, humans will come in to figure out the “first principles” required to arrive at the conclusion.
      • RobCat27 11 days ago
        I believe this is the case also. With a well-enough-performing AI/ML/probabilistic model, where you can change the model's input parameters and get a highly accurate prediction basically instantly, we can test theories approximately and extremely fast rather than running completely new experiments, which will always come with their own set of errors and problems.
    • jes5199 11 days ago
      every time the two systems disagree, it's an opportunity to learn something. both kinds of models can be improved with new information, done through real-world experiments
    • jpadkins 11 days ago
      Hook the protein model up to an LLM model, have the LLM interpret the results. Problem solved :-) Then we just have to trust the LLM is giving us correct interpretations.
    • krzat 11 days ago
      We will get better at understanding black boxes. If a model can be compressed into a simple math formula, then it's both easier to understand and to compute.
    • MobiusHorizons 11 days ago
      Is it capable of predictions though? I.e., can it accurately predict the folding of new molecules? Otherwise, how do you distinguish accuracy from overfitting?
    • andy_ppp 10 days ago
      What happens if we get to the stage of being able to simulate every chemical and electrical reaction in a human brain, is doing this torture or wrong?
      • jasondigitized 10 days ago
        So the Matrix?
        • andy_ppp 9 days ago
          The brains were in “the real” in the Matrix or did I not watch it closely enough :-)
    • dyauspitr 11 days ago
      Whatever it is, if we needed to, we could follow each instruction through the black box. It’s never going to be as opaque as something organic.
    • abledon 10 days ago
      Next decade we will focus on building out debugging and visualization tools for deep learning, to glance inside the current black box.
    • ogogmad 11 days ago
      Some machine learning models might be more interpretable than others. I think the recent "KAN" model might be a step forward.
    • tobrien6 11 days ago
      I suspect that ML will be state-of-the-art at generating human-interpretable theories as well. Just a matter of time.
    • bbor 11 days ago
      This is exactly how the physicists felt at the dawn of quantum physics - the loss of meaningful human inquiry to blindly effective statistics. Sobering stuff…

      Personally, I’m convinced that human reason is less pure than we think it to be, and that the move to large mathematical models might just be formalizing a lack-of-control that was always there. But that’s less of a philosophy of science discussion and more of a cognitive science one

    • jasondigitized 10 days ago
      All I can see anymore is that March of Progress illustration [1] with a GPU being added to the far right. Interesting times indeed.

      [1] https://en.m.wikipedia.org/wiki/March_of_Progress

    • kylebenzle 11 days ago
      That is not a real concern, just a confusion about how statistics works :(
    • hyperthesis 10 days ago
      Engineering often precedes Science. It's just more data.
    • GuB-42 11 days ago
      We already have the absolute best method for accurately predicting the world, and it is by experimentation. In the protein folding case, it works by actually making the protein and analyzing it. For designing airplanes, computer models are no match for building the thing, or even using physical models and wind tunnels.

      And despite having this "best method", it didn't prevent progress in theoretical physics; theory and experimentation complement each other.

      ML models are just another kind of model that can help both engineering and fundamental research. The way they work is close to the old guy in the shop who knows intuitively what good design is, because he has seen it all. The fact that old guys in shops are sometimes better than modeling with physics equations can still help scientific progress, as scientists can work together with the old guy, combining the strengths of intuition and experience with those of scientific reasoning.

    • thelastparadise 11 days ago
      The ML models will help us understand that :)
    • thomasahle 11 days ago
      > Stepping back, the high-order bit here is an ML method is beating physically-based methods for accurately predicting the world.

      I mean, it's just faster, no? I don't think anyone is claiming it's a more _accurate_ model of the universe.

      • Jerrrry 11 days ago
        Collision libraries and fluid libraries have had baked-in memorized look-up tables that were generated with ML methods nearly a decade ago.

        World is still here, although the Matrix/metaverse is becoming more attractive daily.

    • kajic 10 days ago
      It’s much easier to reverse engineer a solution that you don’t understand (and discover important underlying theories on that journey), than it is to arrive at that same solution and the underlying theories without knowing in advance where you are going.

      For this reason, discoveries made by AI will be immensely useful for accelerating scientific progress, even if those discoveries are opaque at first.

    • mycall 10 days ago
      A New Kind Of Science?
    • scotty79 11 days ago
      We should be thankful that we live in a universe that obeys math simple enough for us to comprehend, so that we were able to reach this level.

      Imagine if optics were complex enough that it would require an ML model to predict anything.

      We'd be in a permanent stone age without a way out.

      • lupire 11 days ago
        What would a universe look like that lacked simple things, and somehow only complex things existed?

        It makes me think of how rings like Z[√-5] have irreducible elements that are not prime, where some large things cannot be uniquely expressed as a combination of smaller things.

    • aaroninsf 11 days ago
      The top HN response to this should be,

      what happens is an opportunity has entered the chat.

      There is a wave coming (I won't try to predict if it's the next one) where the hot thing in AI/ML is going to be profoundly powerful tools for analyzing other such tools and rendering them intelligible to us,

      which will I imagine mean providing something like a zoomable explainer. At every level there are footnotes; if you want to understand why the simplified model is a simplification, you look at the fine print. Which has fine print. Which has...

      Which doesn't mean there isn't a stable level at which some formal notion of "accurate" can be said to exist: the minimum viable level of simplification.

      Etc.

      This sort of thing will of course be the input to many other things.

    • flawsofar 11 days ago
      How do they compare on accuracy per watt?
  • j7ake 11 days ago
    So it’s okay now to publish a computational paper with no code? I guess Nature’s reporting standards don’t apply to everyone.

    > A condition of publication in a Nature Portfolio journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications.

    > Authors must make available upon request, to editors and reviewers, any previously unreported custom computer code or algorithm used to generate results that are reported in the paper and central to its main claims.

    https://www.nature.com/nature-portfolio/editorial-policies/r...

    • dekhn 11 days ago
      Nature has long been willing to break its own rules to be at the forefront of publishing new science.
    • boxed 11 days ago
      Are you an editor or reviewer?
      • j7ake 11 days ago
        If you read the standards, they apply broadly, beyond reviewers or editors.

        > A condition of publication in a Nature Portfolio journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications.

        • boxed 10 days ago
          That's a much better rule!
      • HanClinto 11 days ago
        Good question.

        Also makes me wonder -- where's the line? Is it reasonable to have "layperson" reviewers? Is it reasonable to think that regular citizens could review such content?

        • Kalium 11 days ago
          I think you will find that for the vast, vast majority of scientific papers there is significant negative expected value to even attempting to have layperson reviewers. Bear in mind that we're talking about papers written by experts in a specific field aimed at highly technical communication with other people who are experts in the same field. As a result, the only people who can usefully review the materials are drawn from those who are also experts in the same field.

          For an instructive example, look up the seminal paper on the structure of DNA: https://www.mskcc.org/teaser/1953-nature-papers-watson-crick... Ask yourself how useful comments from someone who did not know what an X-ray is, never mind anything about organic chemistry, would be in improving the quality of research or quality of communication between experts in both fields.

        • _just7_ 11 days ago
          No, in fact most journals have peer reviews cordoned off, not viewable to the general public.
          • lupire 11 days ago
            That's pre-publication review, not scientific peer review. Special interests try to conflate the two, to bypass peer review and transform science into a religion.

            Peer review properly refers to the general process of science advancing by scientists reviewing each other's published work.

            Publishing a work is the middle, not the end of the research.

  • dopylitty 11 days ago
    This reminds me of Google’s claim that another “AI” discovered millions of new materials. The results turned out to be a lot of useless noise, but that was only apparent after actual experts spent hundreds of hours reviewing the results [0]

    0: https://www.404media.co/google-says-it-discovered-millions-o...

    • dekhn 11 days ago
      The alphafold work has been used across the industry (successfully, in the sense of blind prediction), and has been replicated independently. The work on alphafold will likely net Demis and John a Nobel prize in the next few years.

      (that said, one should always inspect Google publications with a fine-toothed comb and lots of skepticism, as they have a tendency to juice the results)

      • nybsjytm 11 days ago
        >The alphafold work has been used across the industry (successfully, in the sense of blind prediction), and has been replicated independently.

        This is clearly an overstatement, or at least very incomplete. See for instance https://www.nature.com/articles/s41592-023-02087-4:

        "In many cases, AlphaFold predictions matched experimental maps remarkably closely. In other cases, even very high-confidence predictions differed from experimental maps on a global scale through distortion and domain orientation, and on a local scale in backbone and side-chain conformation. We suggest considering AlphaFold predictions as exceptionally useful hypotheses."

        • dekhn 11 days ago
          Yep, I know Paul Adams (used to work with him at Berkeley Lab) and that's exactly the paper he'd publish. If you read that paper carefully (as we all have, since it's the strongest we've seen from the crystallography community so far) they're basically saying the results from AF are absolutely excellent, and fit for purpose.

          (put another way: if Paul publishes a paper saying your structure predictions have issues, and mostly finds tiny local issues and some distortion and domain orientation, rather than absolutely incorrect fold prediction, it means your technique works really well, and people are just quibbling about details.)

          • natechols 11 days ago
            I also worked with the same people (and share most of the same biases) and that paper is about as close to a ringing endorsement of AlphaFold as you'll get.
          • nybsjytm 11 days ago
            I don't know Paul Adams, so it's hard for me to know how to interpret your post. Is there anything else I can read that discusses the accuracy of AlphaFold?
            • dekhn 11 days ago
              Yes, https://predictioncenter.org/casp15/ https://www.sciencedirect.com/science/article/pii/S0959440X2... https://dasher.wustl.edu/bio5357/readings/oxford-alphafold2....

              I can't find the link at the moment but from the perspective of the CASP leaders, AF2 was accurate enough that it's hard to even compare to the best structures determined experimentally, due to noise in the data/inadequacy of the metric.

              A number of crystallographers have also reported that the predictions helped them find errors in their own crystal-determined structures.

              If you're not really familiar enough with the field to understand the papers above, I recommend spending more time learning about the protein structure prediction problem, and how it relates to the experimental determination of structure using crystallography.

              • nybsjytm 11 days ago
                Thanks, those look helpful. Whenever I meet someone with relevant PhDs I ask their thoughts on AlphaFold, and I've gotten a wide variety of responses, from responses like yours to people who acknowledge its usefulness but are rather dismissive about its ultimate contribution.
                • dekhn 11 days ago
                  The people who are most likely to deprecate AlphaFold are the ones whose job viability is directly affected by its existence.

                  Let me be clear: DM only "solved" (and really didn't "solve") a subset of a much larger problem: creating a highly accurate model of the process by which real proteins adopt their folded conformations, or how some proteins don't adopt folded conformations without assistance, or how some proteins don't adopt a fully rigid conformation, or how some proteins can adopt different shapes in different conditions, or how enzymes achieve their catalytic abilities, or how structural proteins produce such rigid structures, or how to predict whether a specific drug is going to get FDA approval and then make billions of dollars.

                  In a sense we got really lucky because CASP has been running so long and with so many contributors that it became recognized that winning at CASP meant "solving protein structure prediction to the limits of our ability to evaluate predictions", and that Demis and his associates had such a huge drive to win competitions that they invested tremendous resources and state-of-the-art technology, while sharing enough information that the community could reproduce the results in their own hands. Any problem we want solved, we should gamify, so that DeepMind is motivated to win the game.

                  • panabee 11 days ago
                    this is very astute, not only about deepmind but about science and humanity overall.

                    what CASP did was narrowly scope a hard problem, provide clear rules and metrics for evaluating participants, and offer a regular forum in which candidates can showcase skills -- they created a "game" or competition.

                    in doing so, they advanced the state of knowledge regarding protein structure.

                    how can we apply this to cancer and deepen our understanding?

                    specifically, what parts of cancer can we narrowly scope that are still broadly applicable to a complex heterogenous disease and evaluate with objective metrics?

                    [edited to stress the goal of advancing cancer knowledge, not to "gamify" cancer science but to create structures that invite more ways to increase our understanding of cancer.]

      • 11101010001100 11 days ago
        Depending on your expected value of quantum computing, the Nobel committee shouldn't wait too long.
        • dekhn 11 days ago
          Personally I don't expect QC to be a competitor to ML in protein structure prediction for the foreseeable future. After spending more money on molecular dynamics than probably any other human being, I'm really skeptical that physical models of protein structures will compete with ML-based approaches (that exploit homology and other protein sequence similarities).
    • Laaas 11 days ago
      > We have yet to find any strikingly novel compounds in the GNoME and Stable Structure listings, although we anticipate that there must be some among the 384,870 compositions. We also note that, while many of the new compositions are trivial adaptations of known materials, the computational approach delivers credible overall compositions, which gives us confidence that the underlying approach is sound.

      Doesn't seem outright useless.

  • _obviously 11 days ago
    [flagged]
  • weregiraffe 11 days ago
    s/predicts/attempts to predict
    • dekhn 11 days ago
      AlphaFold has been widely validated- it's now appreciated that its predictions are pretty damn good, with a few important exceptions, instances of which are addressed with the newer implementation.
      • AtlasBarfed 11 days ago
        "pretty damn good"

        So... what percentage of the time? If you made an AI to pilot an airplane, how would you verify its edge conditions, you know, like plummeting out of the sky because it thought it had to nosedive?

        Because these AIs are black box neural networks, how do you know they are predicting things correctly for things that aren't in the training dataset?

        AI has so many weasel words.

        • dekhn 11 days ago
          As mentioned elsewhere in this thread, and trivially determinable by reading, AF2 is constantly being evaluated in blind predictions where the known structure is hidden until after the prediction. There's no weasel here; the process is well understood and accepted by the larger community.
    • pbw 11 days ago
      A prediction is a prediction; it's not necessarily a correct prediction.

      The weatherman predicts the weather, even if he's sometimes wrong, we don't say "he attempts to predict" the weather.

    • jasonjmcghee 11 days ago
      The title OP gave accurately reflects the title of Google's blog post. Title should not be editorialized.
      • jtbayly 11 days ago
        Unless the title is clickbait, which it appears this is…
    • matt-attack 11 days ago
      Syntax error
      • adrianmonk 11 days ago
        Legal without the trailing slash in vi!
  • MPSimmons 11 days ago
    Not sure why the first thing they point it at wouldn't be prions.
  • TaupeRanger 11 days ago
    So after 6 years of this "revolutionary technology", what we have to show for all the hype and breathless press releases is: ....another press release saying how "revolutionary" it is. Fantastic. Thanks DeepMind.
  • nojvek 11 days ago
    So much hyperbole from recent Google releases.

    I wish they didn't hype AI so much, but I guess that's what people want to hear, so they say that.

    • sangnoir 11 days ago
      I don't blame them for hyping their products - if only to fight the sentiment that Google is far behind OpenAI because they were not first to release a LLM.
  • tonyabracadabra 11 days ago
    Very cool, and what’s cooler is this rap about alphafold3 https://heymusic.ai/blog/news/alphafold-3
  • ein0p 11 days ago
    I’m inclined to ignore such pr fluff until they actually demonstrate a _practical_ result. Eg. cure some form of cancer or some autoimmune disease. All this “prediction of structure” has been in the news for years, and it seems to have resulted in nothing practically usable IRL as far as I can tell. I could be wrong of course, I do not work in this field
    • dekhn 11 days ago
      the R&D of all major pharma is currently using AlphaFold predictions when they don't have experimentally determined structures. I cannot share further details but the results suggest that we will see future pharmaceuticals based on AF predictions.

      The important thing to recognize is that protein structures are primarily hypothesis-generation machines and tools to stimulate ideas, rather than direct targets of computational docking. Currently, structures rarely capture the salient details required to identify a molecule that has precisely the biological outcome desired, because the biological outcome is an extremely complex function that incorporates a wide array of other details, such as other proteins, metabolism, and more.

      • ein0p 11 days ago
        Sure. If/when we see anything practical, that’ll be the right moment to pay attention. This is much like “quantum computing” where everyone who doesn’t know what it is is excited for some reason, and those that do know can’t even articulate any practical applications
    • arolihas 11 days ago
      There are a few AI-designed drugs in various phases of clinical trials, these things take time.