Jonathan Chan is an undergraduate senior majoring in Chemistry. As an Outreach Coordinator for Caltech’s Chemistry Club and Editor-in-Chief of the Caltech Undergraduate Research Journal, Jonathan is passionate about getting kids of all ages interested in science.



Automating the Art of Making Molecules

Maddie Lewis for Caltech Letters

Synthetic organic chemistry, the art and science of making molecules, is hard. Performing any single chemical reaction almost always involves multiple steps: setting up a container for the reaction, adding appropriate chemicals, isolating products when the reaction has completed, and then finally, concentrating, purifying, and analyzing the final product. This entire process can take anywhere from several hours to several days. At each step of the way, unforeseen problems may require the chemist to start from scratch and attempt a new approach. And even before chemists step into the lab, they encounter substantial intellectual challenges, like planning out how they will make their desired product before they actually make it. In short, traditional synthesis involves significant mental and physical work, which can be a bottleneck when chemists need to screen and optimize thousands of chemical reactions to produce compounds for clinical testing and drug commercialization.

If synthetic chemistry is so hard, why do people still do it? My personal experience as a chemist tells me that there are two main reasons. First, chemists simply love chemistry for its own sake: synthesis allows us to indulge our chemical curiosities and express ourselves artistically in order to expand the boundaries of humankind’s chemical knowledge. Second, and more importantly, synthetic chemistry benefits society immensely. In particular, the large-scale industrial syntheses of complex natural products—molecules that are made in nature by plants, fungi, and other living organisms—play a critical role in improving human health. These molecules and their analogues serve as important pharmaceutical drugs, vitamins, nutritional goods, cosmetics, dyes, and fertilizers. For example, the Pacific yew tree, a natural producer of the chemotherapy drug Paclitaxel, is endangered by overharvesting; as such, finding more efficient ways to generate Paclitaxel is increasingly important. Another drug obtained through synthetic means, sildenafil—more commonly known as Viagra—has dramatically improved the sexual health of millions of people worldwide and has earned its creator tens of billions of dollars in revenues.

Because synthesis is so vital to human health and society, yet so demanding on chemists, the automation of synthesis is a holy grail in the field. If computers and machines can do the work of a chemist, whether that work be physical (performing, purifying, and analyzing reactions) or intellectual (planning syntheses), then chemists have more time to engage in less tedious chemical tasks. Although synthetic automation has been a topic of discussion among chemists since the late 1970s, only recently have machines become dexterous enough and computers smart enough that the automation of synthesis is now feasible.

Flow technologies and robotic arms are two ways to speed up the physical processes of synthesis. In flow technologies, chemical reactions take place in channels with a constant flow of reactants pumping through them, rather than in a single reaction vessel. Pfizer, one of the world’s largest pharmaceutical companies, has recently leveraged a flexible flow-based system for chemical reaction discovery. Pfizer’s automated instrument can switch between different types of chemicals, heat reactions to a variety of temperatures, and execute and evaluate over 1,500 reactions a day.

In another flow-based example, the Cronin lab at the University of Glasgow has recently designed a “Chemputer”, a hybrid of computer software, physical syringe pumps, valves, and conventional laboratory glassware. The autonomous Chemputer is capable of translating published experimental synthetic methods into computer code that guides lab equipment through reaction, workup, and purification steps in order to obtain several target drug molecules including sildenafil (Viagra).

The software controls of Cronin’s Chemputer direct lab equipment—tubes, valves, pumps, etc.—to automate chemical reactions.

Science, Vol 363

A second approach to physical automation uses a robotic arm and preloaded chemical cartridges to automate synthetic steps such as heating, mixing, and separating chemicals (see the picture below). In contrast to flow techniques, this particular robot, developed at the University of Liverpool, is built to handle only solids. The device consists of a robotic arm on a moving box and is about the same size as a person. Every day, it sifts through thousands of materials to optimize the extraction of hydrogen from water. However, the robot is not a mindless worker bee; its algorithm iteratively determines its next step or action based on a continuously updated model relating chemical composition to experimental results. This allows the system to predict and perform reactions that should be more successful than any other reaction studied so far. This machine-learning process informs strategic chemical choices that humans would never be able to make on their own.
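The idea of letting each result decide the next experiment can be sketched in a few lines of Python. This is only an illustration, not the Liverpool team's algorithm: it uses plain greedy hill climbing, and the scoring function `simulated_score`, the 0 to 100 percent composition scale, and the peak at 70 percent are all made-up assumptions standing in for real lab measurements.

```python
def simulated_score(percent_a):
    """Hypothetical stand-in for a lab measurement; best at 70% component A."""
    return -(percent_a - 70) ** 2


def closed_loop(start=10, step=10, max_rounds=20):
    """Greedy hill climbing over compositions from 0 to 100 percent."""
    best, best_score = start, simulated_score(start)
    history = [best]
    for _ in range(max_rounds):
        # "Run" the neighboring compositions and look at their results.
        neighbors = [c for c in (best - step, best + step) if 0 <= c <= 100]
        top = max(neighbors, key=simulated_score)
        if simulated_score(top) <= best_score:
            break  # no neighboring composition beats the current best
        best, best_score = top, simulated_score(top)
        history.append(best)
    return best, history


best, history = closed_loop()
print(best, history)
```

In this toy setting the loop walks from 10 percent up to 70 percent and then stops, because neither neighbor scores better. A real system would likely need a statistical model that also accounts for noisy measurements and many variables at once.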

The robot chemist at the University of Liverpool.

Nature

While physical automation of the synthesis of molecules has progressed greatly, the “true” game of synthesis lies in the planning process behind making the molecules themselves. This process, known as “retrosynthesis,” involves much creativity and decision-making, which makes it more difficult than it might seem. However, that hasn’t stopped researchers from trying to make computers think like chemists. In programming a computer to plan syntheses, researchers can take several approaches.

One approach is to provide the computer program with an exhaustive, human-generated list of reactions that enable a desired chemical transformation. The program then combines such reactions into synthetic routes to target molecules in ways that are similar to those used to evaluate combinations of chess moves. A leader in this first approach is Bartosz A. Grzybowski and his team at the Ulsan National Institute of Science and Technology in South Korea. Grzybowski’s Chematica software plans synthetic routes to target molecules using some 50,000 rules of chemical reactions the team fed into the system. Recently, the team used its planning software to devise syntheses for eight molecules. For most molecules, Chematica needed less than 20 minutes to plan their syntheses; in contrast, a human might take hours to come up with possible reaction schemes. The proposed routes were then tested in the lab, and all were more efficient than previously established routes. Grzybowski attributes such success to the “ice cold objectivity of an algorithm over a human brain.”
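To make the rule-based idea concrete, here is a minimal sketch of a search that chains one-step rules backward from a target until every ingredient can be bought. It is not Chematica's method or data: the molecule names, the `RULES` table, and the `PURCHASABLE` set are hypothetical placeholders, and the search is a simple exhaustive recursion that prefers routes with the fewest steps.

```python
# Hypothetical one-step "rules": each product maps to the alternative
# sets of precursors that could make it. Not Chematica's real rule base.
RULES = {
    "target": [("A", "B"), ("C",)],
    "A": [("D", "X")],
    "X": [("E",)],
    "C": [("F", "G")],
}

# Hypothetical starting materials assumed to be purchasable.
PURCHASABLE = {"B", "D", "E", "F", "G"}


def plan(molecule, depth=5):
    """Return the shortest route as a list of (product, precursors) steps.

    Returns [] if the molecule can be bought, and None if no route is
    found within `depth` levels of recursion.
    """
    if molecule in PURCHASABLE:
        return []
    if depth == 0 or molecule not in RULES:
        return None
    best = None
    for precursors in RULES[molecule]:
        route = [(molecule, precursors)]
        for precursor in precursors:
            sub_route = plan(precursor, depth - 1)
            if sub_route is None:
                break  # this option leads to a dead end
            route += sub_route
        else:
            # Every precursor was reachable; keep the shortest route so far.
            if best is None or len(route) < len(best):
                best = route
    return best


route = plan("target")
for product, precursors in route:
    print(product, "<-", " + ".join(precursors))
```

With this toy table, the search rejects the three-step route through A and returns the two-step route that makes the target from C, which in turn is made from F and G. Real planners face tens of thousands of rules per step, so they need scoring functions and pruning rather than exhaustive recursion.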

An output of one of Chematica’s early (mid-2015) successful synthetic designs. Color-codes differentiate between commercially available, restricted or unpopular chemicals, and numerical scores of several synthetic pathways are indicated.

Cell Press

Alternatively, a more ambitious approach is to design a program that learns by itself what chemists already know. At the forefront of this second approach is Marwin Segler, an organic chemist and artificial-intelligence researcher at the University of Münster in Germany. First, Segler “taught” several deep-learning neural networks millions of chemical transformations by showing the system essentially all reactions ever published in the field of organic chemistry. After asking the system to construct general chemical rules from these reactions, Segler combined the neural networks to devise synthetic routes to several target molecules, just like a seasoned human chemist would: retrosynthetically. To test the validity of the AI system’s intellectual handiwork, trained organic chemists evaluated the plausibility of the resulting synthetic routes in a double-blind test, meaning neither the trained participants nor the test administrators knew which synthetic routes were planned by humans and which by the machine. When asked to assess the AI synthetic pathways for target molecules alongside routes reported in the literature, professional chemists expressed no preference for the routes that had been devised by real people. In other words, the AI’s synthetic routes were at least as plausible as those proposed by real chemists. The best thing about the system, though, might be its lightning-fast speed; the algorithm took only 5.4 seconds to find a six-step route for a precursor to a drug candidate.

As an amateur synthetic organic chemist, I initially reacted to these stories of automated synthesis with both amazement and apprehension. I struggle to run five reactions in the lab a day, while Pfizer’s flow system runs 1,500 a day (albeit with less material in the system). Planning a six-step synthesis for the drug precursor that Segler tested his AI on would have taken me at least half an hour. Segler’s AI system did it in 5.4 seconds. Would I soon be out of a job if a computer-cum-machine could carry out my physical and intellectual duties far more efficiently than I ever could? Perhaps. But then, I would probably consider such work no longer worthwhile for me or other chemists to do, just as doing long division by hand has little intellectual value today. An automation revolution in synthesis would simply promote chemistry innovation. Automated systems reduce the costs of experimentation, thereby reducing chemists’ reluctance to perform riskier, potentially breakthrough experiments. In this way, automation is no longer a labor-saving solution in and of itself, but becomes a tool for exploring new techniques and promoting future innovation. Paradoxically, automation encourages us chemists to continue working efficiently.

Automation would foster another type of innovation as well. It would allow us chemists to work at a higher intellectual level than we do today; rather than planning and executing syntheses, we would have more time to think about which chemical problems we should solve with our new computer aides, whom to include in making such decisions, and how this new chemistry would affect our society, our environment, and the laws governing intellectual property. We would be better able to contextualize our chemical results and ensure that they are used for the greatest public good. Any way you slice it, automating the art of synthesis would better enable us to do meaningful work and maximize our impact on improving society.

