"Managing the Complexity of Molecules: Letting Matter Compute Itself" by Gregory Kovacs (Stanford University and SRI International)

Thursday, December 6, 2018 - 3:30pm
Physics and Astronomy Colloquium

Thursdays, 3:30-4:30 pm

1-434 Physics and Astronomy
Reception from 3:30-4:30 p.m.
(unless otherwise posted)

Guest Speaker: Gregory Kovacs (SRI International and Stanford University)

Talk Title: “Managing the Complexity of Molecules: Letting Matter Compute Itself”


Person-millennia are spent each year seeking useful molecules for medicine, food, agriculture and other uses. Biomolecules, which are composed of interchangeable building blocks such as amino acids, represent a near-infinite number of combinatorial possibilities. As an example, antibodies, which make up the majority of the top-grossing medicines today, are composed of 1,100 amino acids chosen from the twenty used by living things. The binding part (variable region), which allows the antibody to recognize other molecules, is composed of 110 to 130 amino acids, giving rise to at least 10^143 possible combinations. However, there are apparently only about 10^80 atoms in the universe, illustrating the intractability of exploring the entire space of possibility. This is just one example of biological complexity…
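The estimate above follows from simple counting: a chain of n positions, each filled by one of 20 amino acids, has 20^n possible sequences. The short Python sketch below reproduces that arithmetic for the 110 to 130 residue range quoted in the abstract. The function and constant names are illustrative only, and the 10^80 atom count is taken from the abstract rather than computed.

```python
import math

AMINO_ACIDS = 20      # building blocks per position, as stated in the abstract
ATOMS_EXPONENT = 80   # abstract's figure: roughly 10^80 atoms in the universe


def log10_sequences(length: int, alphabet: int = AMINO_ACIDS) -> float:
    """Return log10 of the number of distinct sequences of the given length.

    Working in log space avoids printing very large integers:
    log10(alphabet ** length) == length * log10(alphabet).
    """
    return length * math.log10(alphabet)


if __name__ == "__main__":
    for length in (110, 130):
        exponent = log10_sequences(length)
        excess = exponent - ATOMS_EXPONENT
        print(f"{length} residues: about 10^{exponent:.1f} sequences, "
              f"10^{excess:.1f} times the quoted atom count")
```

Even the shortest variable region in this range exceeds the quoted atom count by more than sixty orders of magnitude, which is the sense in which exhaustive enumeration is intractable.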

Machine learning (ML), artificial intelligence (AI), and “big data” are often put forth as the solutions to all problems, particularly by pontificating TED presenters giving talks dripping with hyperbole. Expecting these methods to provide intelligent de novo prediction of molecular structure and function within our lifetimes is utter rubbish. For example, a neural network trained on daily weather patterns in Palo Alto cannot develop an internal model for global weather. In a similar way, finite and reasonable molecular training sets will not magically cause a generalizable model of molecular quantum mechanics to arise within a neural network, no matter how many layers it is endowed with. Regardless of the algorithms chosen, one simply cannot yet ask a computer to “compute” a drug that cures HIV.

With that provocative preface, we turn to the notion of letting matter compute itself. Massive combinatorial libraries can now be intelligently and efficiently created and mined with appropriate molecular readouts (AKA “the question vector”) at ever-increasing throughputs, presently surpassing 10^12 unique molecules in a few hours. Once “matter-in-the-loop” exploration is embraced, AI, ML and other methods can be brought to bear usefully in closed-loop methods to follow veins of opportunity in molecular spaces. Several examples of mining massive molecular spaces will be presented, including drug discovery and AI-guided continuous-flow chemical synthesis – all real, all working today.

For more information, contact Yaroslav Tserkovnyak

We thank the following people for their contributions to the wine fund for the post-colloquium reception:
Professors Katsushi Arisaka, Andrea Ghez, Karoly Holczer, Huan Huang, HongWen Jiang, Per Kraus, Alexander Kusenko, Matthew Malkan, Mayank Mehta, Warren Mori, Ni Ni, Seth Putterman, David Saltzberg, Yaroslav Tserkovnyak, Vladimir Vassiliev, Shenshen Wang, and Nathan Whitehorn.
