Researchers at UC San Diego, UC Irvine and Columbia University have received a $3.6 million Department of Energy (DOE) grant to develop new machine learning methods to understand climate data and predict the ramifications of a warming world.
“We want to understand what is driving extreme heat in Houston or blizzards in Boston,” said CSE Assistant Professor Rose Yu, principal investigator on the project. “Climate studies often focus on short-range correlations, like the atmospheric dynamics near Texas that drive extreme heat. But what about the long-range pieces, like phenomena in the middle of the Pacific that change climate dynamics worldwide?”
Big, Big Data
The project, which includes CSE Assistant Professor Yian Ma, in the Halıcıoğlu Data Science Institute, and CSE Professor Lawrence Saul, will advance machine learning approaches to better handle the many large datasets that contribute to the overall climate picture.
“We want to combine simulation and observational data,” said Yu. “Traditionally in climate science, people develop simulation models, but the process is very slow and has limited resolution – about a 100 to 150 square kilometer range. That's just not enough to understand the hidden drivers that lead to major events, such as extreme heat or hurricanes.”
The group will incorporate data from satellite images, climate stations, ocean-deployed sensors and other sources. Each of these contributes enormous amounts of data, so the team will have to devise new ways to reduce it for analysis.
Yu and colleagues will develop a machine learning framework based on latent variable models, which use mathematical models to infer quantities that cannot be observed directly. Using this approach, they intend to map highly complex information from climate models and observational data into simpler forms that can be more readily interpreted.
Paradoxically, the datasets are both large and small: Overall climate information is massive, but extreme climate events can be quite rare. One of the many challenges the group will face is training the algorithm even when data points are unavoidably sparse and imbalanced.
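To give a rough sense of what a latent variable model does, here is a minimal sketch using principal component analysis, one of the simplest linear latent variable techniques. It is an illustration only, not the framework the team is building; the synthetic data, the function name `fit_linear_latent_model` and all sizes are assumptions made for this example.

```python
import numpy as np

def fit_linear_latent_model(X, n_latent):
    """Fit a simple linear latent variable model (PCA) with an SVD.

    Hypothetical helper for illustration; returns the data mean, the
    latent directions, the latent coordinates of each sample, and the
    fraction of variance those latent coordinates explain.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                      # center the observations
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_latent]         # directions of greatest variance
    latent = Xc @ components.T         # low-dimensional representation
    explained = (S[:n_latent] ** 2).sum() / (S ** 2).sum()
    return mean, components, latent, explained

# Synthetic stand-in for high-dimensional observations: 50 observed
# variables that are secretly driven by only 2 hidden factors plus noise.
rng = np.random.default_rng(0)
n_samples, n_observed, n_latent = 500, 50, 2
hidden = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_observed))
X = hidden @ mixing + 0.1 * rng.normal(size=(n_samples, n_observed))

mean, components, latent, explained = fit_linear_latent_model(X, n_latent)
X_hat = latent @ components + mean     # reconstruction from 2 numbers per sample
print(f"variance explained by {n_latent} latent variables: {explained:.3f}")
```

In this toy setting, two latent coordinates per sample capture nearly all of the variation in the 50 observed variables, which is the sense in which complex data can be mapped to a simpler form.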
The Physical World
Yu notes the group will have to customize existing algorithms to make them more science-focused. Specifically, the team will embed principles from physics and other disciplines to better define data parameters.
“Climate science is driven by physical principles,” said Yu. “If we just use an off-the-shelf machine learning model, its predictions may not adhere to physical laws. Such models would not make any sense to our end users, who are climate scientists.”
While challenging, incorporating physics into machine learning will improve predictions by adding explicit constraints. For example, there’s no chance the temperature will hit 200 degrees Fahrenheit. The hybrid model will also account for the constant uncertainty in climate data: a prediction is not a single temperature but rather a temperature range.
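As a rough illustration of these two ideas, the sketch below adds a penalty to an ordinary squared-error loss whenever a prediction exceeds a physical bound, and summarizes an ensemble of predictions as a range rather than a single number. It is not the project's hybrid model; the function names, the penalty weight, the quantile levels and the toy numbers are all assumptions, and the 200-degree bound simply reuses the example above.

```python
import numpy as np

# Illustrative upper bound in degrees Fahrenheit, borrowed from the
# article's example; a real model would use domain-specific limits.
MAX_PLAUSIBLE_TEMP_F = 200.0

def constrained_loss(pred, target, upper=MAX_PLAUSIBLE_TEMP_F, weight=10.0):
    """Mean squared error plus a penalty for physically implausible values."""
    mse = np.mean((pred - target) ** 2)
    violation = np.maximum(pred - upper, 0.0)   # zero when within bounds
    return mse + weight * np.mean(violation ** 2)

def prediction_range(ensemble_preds, lower_q=0.05, upper_q=0.95):
    """Summarize an ensemble (members x locations) as a low/high range."""
    low = np.quantile(ensemble_preds, lower_q, axis=0)
    high = np.quantile(ensemble_preds, upper_q, axis=0)
    return low, high

target = np.array([95.0, 101.0, 98.0])
plausible = np.array([96.0, 100.0, 99.0])
implausible = np.array([96.0, 250.0, 99.0])     # one value breaks the bound

loss_plausible = constrained_loss(plausible, target)
loss_implausible = constrained_loss(implausible, target)

# Hypothetical ensemble: 100 noisy forecasts for the same three locations.
rng = np.random.default_rng(0)
ensemble = target + rng.normal(scale=2.0, size=(100, 3))
low, high = prediction_range(ensemble)
```

The penalty leaves in-bounds predictions untouched and makes out-of-bounds predictions more costly than squared error alone would, which is one simple way a training objective can encode a physical constraint.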
The seven investigators on the grant bring a wide range of expertise to solve these and other problems and better capture the nuances associated with climate change. They hope their results will inform better policy. In addition, the methods they develop could provide other, ongoing benefits to the broader science community.
“I'm really excited because we can potentially demonstrate the huge impact of AI to improve our fundamental understanding and accelerate scientific discoveries,” said Yu. “We need advanced machine learning technologies, and these can have a broader impact on many fields beyond climate science.”