Machine learning scales sparse geophysical surveys to improve watershed-level predictions and guide smarter field studies to fill gaps

Image courtesy of Hang Chen, Lawrence Berkeley National Laboratory
The ModEx framework uses limited geophysical data and machine learning to build watershed-scale subsurface maps, then feeds them into hydrologic models that identify where to measure next.
The Science
Mapping underground water pathways across a whole watershed is hard because field measurements are costly and cover only small areas. A team of scientists led by Lawrence Berkeley National Laboratory (LBNL) used a portable geophysical scanner to image the shallow subsurface of a mountain catchment in Colorado. They then trained a machine learning model on the collected data to learn the relationship between land-surface features and the geophysical measurements. This let them map the subsurface across the entire watershed. When the map was plugged into a water-flow simulation, the simulation matched stream observations better than previous iterations had. The combined surface and subsurface flow model also pointed scientists to the locations where new measurements would be most useful.
The Impact
Machine learning helps scientists do more with less data. By linking geophysical scans, machine learning, and water-flow models in a repeating cycle, researchers can build better maps of subsurface water paths without exhaustive drilling. The approach cut prediction uncertainty by more than half compared with uniform survey designs. It also produces a priority map showing where to measure next, saving time and money. Because the method relies on open-source data and standard tools, it can be applied to other watersheds and geophysical methods, supporting water resource management, flood forecasting, and environmental protection.
Summary
Researchers at Lawrence Berkeley National Laboratory and the University of Iowa developed a Model–Experiment (ModEx) framework that combines limited electromagnetic induction (EMI) surveys, Random Forest machine learning, and the ParFlow-CLM hydrologic model to characterize subsurface properties across the Trail Creek Catchment in Colorado’s East-Taylor River Watershed. The Random Forest model was trained on sparse EMI measurements collected over just six days, using topographic attributes derived from Light Detection and Ranging (LiDAR), which maps terrain with laser pulses, as predictors. It then predicted subsurface resistivity across the full catchment. The resistivity maps were converted into hydraulic parameters using borehole constraints and fed into a physics-based hydrologic model that simulates coupled surface–subsurface flow.
The team tested three resistivity-based parameterization scenarios and demonstrated that EMI-informed models reproduced the timing of spring streamflow peaks and the spatial patterns of intermittent flow. The framework’s decision tools, which included uncertainty maps, petrophysical similarity clustering, watershed zonation, and a new investigation interest index, identified the most informative locations for future surveys and reduced uncertainty by more than 50% relative to uniform sampling designs. The approach is transferable to other geophysical methods and watershed settings.
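To make the idea of predicting resistivity from topography and then ranking where to survey next more concrete, the following is a minimal Python sketch, not the authors' implementation. It uses a toy bagged ensemble of one-split regression trees as a heavily simplified stand-in for Random Forest, and it treats the spread of the individual tree predictions as a rough uncertainty proxy. That proxy, the synthetic elevation and slope data, and all function names here are assumptions made for illustration; the study's investigation interest index is not reproduced.

```python
import random
import statistics


def fit_stump(X, y, feature):
    """Fit a one-split regression tree on one feature by minimizing squared error."""
    best = None
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= thr]
        right = [t for row, t in zip(X, y) if row[feature] > thr]
        ml, mr = statistics.fmean(left), statistics.fmean(right)
        sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    if best is None:  # feature is constant in this sample: predict the mean
        m = statistics.fmean(y)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])


def fit_forest(X, y, n_trees=200, seed=0):
    """Bagging: each stump sees a bootstrap resample and one random feature."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        feature = rng.randrange(d)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict_with_spread(forest, row):
    """Return the ensemble mean and the standard deviation across trees."""
    preds = [ml if row[f] <= thr else mr for f, thr, ml, mr in forest]
    return statistics.fmean(preds), statistics.pstdev(preds)


# Synthetic stand-in data (NOT from the study):
# (elevation in m, slope in degrees) -> log10 resistivity.
rng = random.Random(42)
surveyed_X = [(rng.uniform(2800, 3200), rng.uniform(0, 35)) for _ in range(60)]
surveyed_y = [1.5 + 0.002 * (e - 2800) + 0.01 * s + rng.gauss(0, 0.05)
              for e, s in surveyed_X]
candidates = [(rng.uniform(2800, 3500), rng.uniform(0, 45)) for _ in range(10)]

forest = fit_forest(surveyed_X, surveyed_y)
results = [(i, *predict_with_spread(forest, c)) for i, c in enumerate(candidates)]

# Rank unsurveyed candidates by ensemble spread, largest first.
ranked = sorted(results, key=lambda r: r[2], reverse=True)
for i, mean, spread in ranked[:3]:
    print(f"candidate {i}: predicted log10 resistivity {mean:.2f}, spread {spread:.3f}")
```

In this sketch the candidates with the largest spread would be surveyed first. A real workflow would likely combine such an uncertainty estimate with other criteria, such as the clustering and zonation tools described above.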
Contact
Hang Chen
University of Iowa; Lawrence Berkeley National Laboratory
Eoin L. Brodie, Watershed Function SFA LRM
Lawrence Berkeley National Laboratory
Funding
This work was supported as part of the Watershed Function Science Focus Area funded by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research under Contract No. DE-AC02-05CH11231 to Lawrence Berkeley National Laboratory.
Publications
H. Chen, et al., “A ModEx Framework for Watershed Subsurface Investigation With Limited Geophysical Data Using Machine Learning and Hydrologic Modeling.” Geophysical Research Letters 53, e2025GL119953 (2026). [DOI: 10.1029/2025GL119953]
