Harnessing Artificial Intelligence to Breed Crops that Adapt to Changing Climates
- Aryan Inamdar
- Jul 4, 2021
Until recently, the field of plant breeding looked a lot like it did in centuries past. A breeder might examine, for example, which tomato plants were most resistant to drought and then cross the most promising plants to produce the most drought-resistant offspring. This process would be repeated, plant generation after generation, until, over the course of roughly seven years, the breeder arrived at what seemed the optimal variety.
Now, with the global population expected to increase to nearly 10 billion by 2050 and climate change shifting growing conditions, crop breeder and geneticist Steven Tanksley doesn’t think plant breeders have that kind of time. “We have to double the productivity per acre of our major crops if we’re going to stay on par with the world’s needs,” says Tanksley, a professor emeritus at Cornell University in Ithaca, NY.
To speed up the process, Tanksley and others are turning to artificial intelligence (AI). Using computer science techniques, breeders can rapidly assess which plants grow the fastest in a particular climate, which genes help plants thrive there, and which plants, when crossed, produce an optimum combination of genes for a given location, opting for traits that boost yield and stave off the effects of a changing climate. Large seed companies in particular have been using components of AI for more than a decade. With computing power rapidly advancing, the techniques are now poised to accelerate breeding on a broader scale. Crop breeders still grapple with tradeoffs such as higher yield versus marketable appearance. And even the most sophisticated AI cannot guarantee the success of a new variety. But as AI becomes integrated into agriculture, some crop researchers envisage an agricultural revolution with computer science at the helm.
During the “green revolution” of the 1960s, researchers developed new chemical pesticides and fertilizers along with high-yielding crop varieties that dramatically increased agricultural output. But the reliance on chemicals came with the heavy cost of environmental degradation. “If we’re going to do this sustainably,” says Tanksley, “genetics is going to carry the bulk of the load.”
Plant breeders lean not only on genetics but also on mathematics. As the genomics revolution unfolded in the early 2000s, plant breeders found themselves inundated with genomic data that traditional statistical techniques couldn’t wrangle (5). Plant breeding “wasn’t geared toward dealing with large amounts of data and making precise decisions,” says Tanksley. In 1997, Tanksley began chairing a committee at Cornell that aimed to incorporate data-driven research into the life sciences. There, he encountered an engineering approach called operations research that translates data into decisions. In 2006, Tanksley cofounded the Ithaca, NY-based company Nature Source Improved Plants on the principle that this engineering tool could make breeding decisions more efficient. “What we’ve been doing almost 15 years now,” says Tanksley, “is redoing how breeding is approached.”
Such approaches try to tackle complex scenarios. Suppose, for example, a wheat breeder has 200 genetically distinct lines. The breeder must decide which lines to breed together to optimize yield, disease resistance, protein content, and other traits. The breeder may know which genes confer which traits, but it’s difficult to decipher which lines to cross in what order to achieve the optimum gene combination. The number of possible combinations, says Tanksley, “is more than the stars in the universe.”
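To get a feel for how quickly the options multiply, here is a rough back-of-the-envelope sketch. It rests on a deliberately crude assumption that is not from Tanksley's work: the breeder makes one pairwise cross per generation and draws from the same pool of 200 lines each time. Real breeding schemes are far richer, so this is only a lower-end illustration of the growth.

```python
from math import comb

N_LINES = 200  # number of genetically distinct lines in the example above


def pairwise_crosses(n_lines):
    """Number of distinct two-parent crosses among n_lines."""
    return comb(n_lines, 2)


def crossing_sequences(n_lines, generations):
    """Ordered sequences of crosses, assuming (crudely) one cross per
    generation chosen from the same pool every time."""
    return pairwise_crosses(n_lines) ** generations


print(pairwise_crosses(N_LINES))                  # choices for a single cross
print(f"{crossing_sequences(N_LINES, 6):.2e}")   # choices across six generations
```

Even under this simplification, a single cross offers 19,900 choices, and the count grows exponentially with every added generation, before any of the trait tradeoffs are considered.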
An operations research approach enables a researcher to solve this puzzle by defining the primary objective and then using optimization algorithms to predict the quickest path to that objective given the relevant constraints. Auto manufacturers, for example, optimize production given the expense of employees, the cost of auto parts, and fluctuating global currencies. Tanksley’s team optimizes yield while selecting for traits such as resistance to a changing climate. “We’ve seen more erratic climate from year to year, which means you have to have crops that are more robust to different kinds of changes,” he says.
For each plant line included in a pool of possible crosses, Tanksley inputs DNA sequence data, phenotypic data on traits like drought tolerance, disease resistance, and yield, as well as environmental data for the region where the plant line was originally developed. The algorithm projects which genes are associated with which traits under which environmental conditions and then determines the optimal combination of genes for a specific breeding goal, such as drought tolerance in a particular growing region, while accounting for genes that help boost yield. The algorithm also determines which plant lines to cross together in which order to achieve the optimal combination of genes in the fewest generations.
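The article does not describe the actual algorithm, so the following is only a toy sketch of the general idea of "define an objective, then search for the best cross." Everything in it is hypothetical: the line names, the loci, the per-locus weights, and the scoring rule (the weighted count of favorable alleles that at least one parent carries). It scores every pair by brute force, which only works because the pool is tiny.

```python
from itertools import combinations

# Hypothetical data: for each line, the loci where it carries the favorable allele.
LINES = {
    "A": {0, 1, 2},
    "B": {2, 3},
    "C": {3, 4, 5},
    "D": {0, 5},
}

# Hypothetical weights: how much each locus matters for the breeding goal.
WEIGHTS = {0: 1.0, 1: 2.0, 2: 1.5, 3: 1.0, 4: 3.0, 5: 0.5}


def cross_score(a, b, lines=LINES, weights=WEIGHTS):
    """Weighted count of favorable alleles present in at least one parent,
    i.e. the best an offspring of this cross could inherit."""
    return sum(weights[locus] for locus in lines[a] | lines[b])


def best_cross(lines=LINES, weights=WEIGHTS):
    """Exhaustively score every pair of lines and return the top pair."""
    pairs = combinations(sorted(lines), 2)
    return max(pairs, key=lambda p: cross_score(*p, lines, weights))


if __name__ == "__main__":
    pair = best_cross()
    print(pair, cross_score(*pair))
```

With realistic pool sizes and multi-generation plans, exhaustive search like this becomes infeasible, which is where dedicated optimization methods come in.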
Nature Source Improved Plants conducts, for example, a papaya program in southeastern Mexico where the once predictable monsoon season has become erratic. “We are selecting for varieties that can produce under those unknown circumstances,” says Tanksley. But the new papaya must also stand up to ringspot, a virus that nearly wiped papaya from Hawaii altogether before another Cornell breeder developed a resistant transgenic variety. Tanksley’s papaya isn’t as disease resistant. But by plugging “rapid growth rate” into their operations research approach, the team bred papaya trees that produce copious fruit within a year, before the virus accumulates in the plant.
“Plant breeders need operations research to help them make better decisions,” says William Beavis, a plant geneticist and computational biologist at Iowa State University in Ames, who also develops operations research strategies for plant breeding. To feed the world in rapidly changing environments, researchers need to shorten the process of developing a new cultivar to three years, Beavis adds.
The big seed companies have investigated the use of operations research since around 2010, with Syngenta, headquartered in Basel, Switzerland, leading the pack, says Beavis, who spent over a decade as a statistical geneticist at Pioneer Hi-Bred in Johnston, IA, a large seed company now owned by Corteva, which is headquartered in Wilmington, DE. “All of the soybean varieties that have come on the market within the last couple of years from Syngenta came out of a system that had been redesigned using operations research approaches,” he says. But large seed companies primarily focus on grains key to animal feed such as corn, wheat, and soy. To meet growing food demands, Beavis believes that the smaller seed companies that develop vegetable crops that people actually eat must also embrace operations research. “That’s where operations research is going to have the biggest impact,” he says, “local breeding companies that are producing for regional environments, not for broad adaptation.”
In collaboration with Iowa State colleague and engineer Lizhi Wang and others, Beavis is developing operations research-based algorithms to, for example, help seed companies choose whether to breed one variety that can survive in a range of different future growing conditions or a number of varieties, each tailored to specific environments. Two large seed companies, Corteva and Syngenta, and Kromite, a Lambertville, NJ-based consulting company, are partners on the project. The results will be made publicly available so that all seed companies can learn from their approach.
Useful farming AI requires good data, and plenty of it. To collect sufficient inputs, some researchers take to the skies. Crop researcher Achim Walter of the Institute of Agricultural Sciences at ETH Zürich in Switzerland and his team are developing techniques to capture aerial crop images. Every other day for several years, they have deployed image-capturing sensors over a wheat field containing hundreds of genetic lines. They fly their sensors on drones or on cables suspended above the crops or incorporate them into handheld devices that a researcher can use from an elevated platform.
Meanwhile, they’re developing imaging software that quantifies growth rate captured by these images. Using these data, they build models that predict how quickly different genetic lines grow under different weather conditions. If they find, for example, that a subset of wheat lines grew well despite a dry spell, then they can zero in on the genes those lines have in common and incorporate them into new drought-resistant varieties.
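As a minimal sketch of what "quantifying growth rate" could look like once image-derived measurements exist, the snippet below fits a least-squares slope to plant height over time for each line. The line names and numbers are invented for illustration, and a straight-line fit is an assumption of this sketch, not the ETH Zürich team's actual model.

```python
# Hypothetical measurements: canopy height (cm) per wheat line,
# taken every other day during a dry spell.
DAYS = [0, 2, 4, 6]
HEIGHTS = {
    "W1": [10.0, 12.0, 14.0, 16.0],
    "W2": [10.0, 11.0, 12.0, 13.0],
}


def growth_rate(days, heights):
    """Least-squares slope of height against time (cm per day)."""
    mean_x = sum(days) / len(days)
    mean_y = sum(heights) / len(heights)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, heights))
    variance = sum((x - mean_x) ** 2 for x in days)
    return covariance / variance


def rank_lines(days, heights_by_line):
    """Lines ordered from fastest- to slowest-growing."""
    rates = {name: growth_rate(days, h) for name, h in heights_by_line.items()}
    return sorted(rates, key=rates.get, reverse=True)


if __name__ == "__main__":
    print(rank_lines(DAYS, HEIGHTS))
```

Lines that rank highly during a dry spell would be the natural candidates to compare for shared genes.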
Research geneticist Edward Buckler at the US Department of Agriculture and his team are using machine learning to identify climate adaptations in 1,000 species in a large grouping of grasses spread across the globe. The grasses include food and bioenergy crops such as maize, sorghum, and sugar cane. Buckler says that when people rank the most photosynthetically efficient and water-efficient species, this group comes out on top. Still, he and collaborators, including plant scientist Elizabeth Kellogg of the Donald Danforth Plant Science Center in St. Louis, MO, and computational biologist Adam Siepel of Cold Spring Harbor Laboratory in NY, want to uncover genes that could make crops in this group even more efficient for food production in current and future environments. The team is first studying a select number of model species to determine which genes are expressed under a range of different environmental conditions. They’re still probing just how far the predictive power of this machine-learning approach can go.
Such approaches could be scaled up, massively. To probe the genetic underpinnings of climate adaptation for crop species worldwide, Daniel Jacobson, the chief researcher for computational systems biology at Oak Ridge National Laboratory in TN, has amassed “climatype” data for every square kilometer of land on Earth. Using the Summit supercomputer, his team then compared each square kilometer to every other square kilometer to identify similar environments. The result can be viewed as a network of GPS points connected by lines that show the degree of environmental similarity between points.
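The all-against-all comparison can be sketched at toy scale as follows. The article does not say how similarity is measured, so this example makes several assumptions: each cell is described by just two invented features (mean temperature and annual rainfall), features are min-max rescaled, similarity is Euclidean distance, and two cells are linked when that distance falls below an arbitrary threshold. The cell names and values are hypothetical.

```python
from itertools import combinations
from math import dist

# Hypothetical climate vectors: (mean temperature in C, annual rainfall in mm).
CELLS = {
    "cell_1": (25.0, 1200.0),
    "cell_2": (24.0, 1100.0),
    "cell_3": (5.0, 300.0),
    "cell_4": (6.0, 400.0),
}


def rescale(cells):
    """Min-max rescale each feature to [0, 1] so no single unit dominates."""
    columns = list(zip(*cells.values()))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return {
        name: tuple((v - lo) / span for v, lo, span in zip(values, lows, spans))
        for name, values in cells.items()
    }


def similarity_edges(cells, max_distance=0.2):
    """Compare every cell with every other cell; keep pairs that are close."""
    scaled = rescale(cells)
    edges = {}
    for a, b in combinations(sorted(scaled), 2):
        d = dist(scaled[a], scaled[b])
        if d <= max_distance:
            edges[(a, b)] = d
    return edges


if __name__ == "__main__":
    for pair, d in similarity_edges(CELLS).items():
        print(pair, round(d, 3))
```

The number of pairs grows with the square of the number of cells, which helps explain why a comparison covering every square kilometer of land calls for a supercomputer.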
In collaboration with the US Department of Energy’s Center for Bioenergy Innovation, the team combines this climatype data with GPS coordinates associated with individual crop genotypes to project which genes and genetic interactions are associated with specific climate conditions. Right now, they’re focused on bioenergy and feedstocks, but they’re poised to explore a wide range of food crops as well. The results will be published so that other researchers can conduct similar analyses.