URAP

Project Descriptions
Fall 2025

AI for Streamflow Forecasting and Water Supply Prediction in California

Laurel Larsen, Professor  
Geography  

Applications for Fall 2025 are closed for this project.

The Environmental Systems Dynamics Laboratory (ESDL) focuses on the interplay between biological, physical, and human aspects of the environment using a combination of physically-based and data-driven models. This internship aims to expand on our current work exploring the use of deep learning for environmental predictions.

The Environmental Systems Dynamics Lab (ESDL) uses a combination of physically-based and data-driven models to study environmental systems. This project expands our current research on deep learning for streamflow forecasting, in collaboration with the U.S. Army Corps of Engineers (USACE) and the Hydrologic Engineering Center (HEC), as part of broader efforts to improve California’s water supply predictions under climate change.
While deep learning models—especially long short-term memory (LSTM) networks—often outperform traditional models, their "black box" nature limits interpretability and generalizability. To address this, we incorporate physical constraints (e.g., water balance, process-based model outputs) into LSTM inputs. This improves predictive accuracy, particularly under non-stationary conditions and in data-sparse basins.
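As a rough illustration of what adding a physical constraint to the model inputs can look like, the sketch below derives a water-balance term and appends it to each input row. It is not the lab's actual pipeline: the function names, the choice of variables (precipitation, evapotranspiration, runoff simulated by a process-based model), and the units are all assumptions made for the example.

```python
from typing import List, Sequence


def storage_change(precip: Sequence[float],
                   evap: Sequence[float],
                   sim_runoff: Sequence[float]) -> List[float]:
    """Water-balance residual per time step: dS = P - ET - Q_sim.

    All three series are assumed to share one unit (e.g. mm/day).
    Q_sim is runoff from a process-based model, not the observed
    streamflow, so the prediction target does not leak into the inputs.
    """
    if not (len(precip) == len(evap) == len(sim_runoff)):
        raise ValueError("all input series must have the same length")
    return [p - e - q for p, e, q in zip(precip, evap, sim_runoff)]


def build_input_rows(precip: Sequence[float],
                     evap: Sequence[float],
                     sim_runoff: Sequence[float]) -> List[List[float]]:
    """Return one feature row per time step: [P, ET, Q_sim, dS]."""
    ds = storage_change(precip, evap, sim_runoff)
    return [list(row) for row in zip(precip, evap, sim_runoff, ds)]
```

Rows built this way could then be stacked into the sequence tensors an LSTM expects; whether the extra column helps is an empirical question for each basin.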
The first phase of this work focused on predicting snowpack and streamflow in the headwaters of the Tule River in the Sierra Nevada Mountains, and inflows to the Success Dam reservoir operated by USACE. We showed that hybrids of data-driven LSTM models and physically-based hydrologic models developed by HEC yield better out-of-sample forecasts than either model alone. In the second phase, we developed a physics-informed multi-timescale LSTM (MTS-PILSTM) capable of daily and hourly predictions, which are essential for flood forecasting in the Russian River Basin. We also updated open-source deep learning tools like NeuralHydrology to enhance predictive performance and model generalizability and to ingest data collected by USACE.
The current phase aims to expand this approach across multiple California watersheds using statewide datasets, focusing on human-managed systems (e.g., reservoirs, diversions) and transfer learning to improve scalability and generalizability.


This project, developed in collaboration with the U.S. Army Corps of Engineers (USACE) and the Hydrologic Engineering Center (HEC), is part of the overarching goal of predicting California water supply in a changing climate.
In the first phase, we performed a case study focused on snowpack in the Tule River basin and reservoir inflows at the Lake Success reservoir, located in the western California Sierra Nevada mountains. We developed a long short-term memory (LSTM) network to forecast snowpack accumulation and reservoir inflows, testing the predictive performance of novel machine learning models and comparing the results with those from a companion physical-modeling effort carried out by USACE HEC-HMS modelers. Finally, we integrated the two models and showed that the additional physical constraints result in better out-of-sample predictions.

The next phase of this project aims to develop an LSTM model, as well as a hybrid model combining LSTM and HEC-HMS state variables. The project's specific objectives are to (1) update deep learning algorithms, focusing on the LSTM model available in open-source repositories such as NeuralHydrology, and (2) evaluate the performance of these models in predicting streamflow in the Russian River Basin, California. These models will enhance streamflow predictability, which is crucial for water availability analysis, reservoir operations, and flood risk assessment.
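The listing does not say which skill score is used to evaluate streamflow predictions. One score that is standard in hydrology is the Nash-Sutcliffe efficiency (NSE), which compares model error against the variance of the observations. The sketch below is a minimal standalone version for illustration; the function name is hypothetical and it is not taken from the project's code.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              simulated: Sequence[float]) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect match; 0.0 means the model is no better than
    always predicting the observed mean; negative values are worse.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    obs_variance = sum((o - mean_obs) ** 2 for o in observed)
    if obs_variance == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - squared_error / obs_variance
```

Computing the same score for the LSTM, the physical model, and the hybrid on a held-out period gives one simple way to compare them on equal terms.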

Role: Student Tasks May Include:
• Extending NeuralHydrology’s PyTorch-based models
• Testing LSTM architectures and physical input configurations
• Running large-scale model tuning on UC Berkeley’s HPC clusters
• Implementing cross-validation and overfitting mitigation
• Applying transfer learning across basins
• Evaluating physical variable influence on model accuracy
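
For the cross-validation task, ordinary shuffled k-fold splits let a time-series model train on data that comes after the test period. A common alternative is an expanding-window split, sketched below under the assumption that samples are ordered in time. The function name and parameters are illustrative only and are not part of NeuralHydrology.

```python
from typing import List, Tuple


def expanding_window_splits(n_samples: int,
                            n_folds: int,
                            min_train: int) -> List[Tuple[range, range]]:
    """Return (train_indices, test_indices) pairs for ordered samples.

    Each fold trains on everything before its test block, so no fold
    ever sees data from the future of its test period. Samples left
    over after dividing into equal test blocks go to the first
    training window.
    """
    if n_folds < 1 or min_train < 1:
        raise ValueError("n_folds and min_train must be positive")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    first_test_start = n_samples - n_folds * test_size
    splits = []
    for k in range(n_folds):
        test_start = first_test_start + k * test_size
        train_idx = range(0, test_start)
        test_idx = range(test_start, test_start + test_size)
        splits.append((train_idx, test_idx))
    return splits
```

With 10 samples, 3 folds, and a minimum training length of 4, the test blocks are indices 4-5, 6-7, and 8-9, each preceded by its own training window.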

Qualifications:
Required: Ideal for students in Computer Science, Data Science, or Statistics with experience in Python, PyTorch, and deep learning (especially LSTMs). Must be comfortable working collaboratively and communicating technical ideas clearly.
Desired: Familiarity with time-series analysis and hydrological data. Experience with high-performance computing (parallel processing) and transfer learning.

Day-to-day supervisor for this project: Dr. Dino Bellugi (primary contact), Staff Researcher

Hours: to be negotiated

Off-Campus Research Site: Predominantly remote, but with weekly in-person meetings

Related website: http://esdlberkeley.com
Related website: geography.berkeley.edu

 Mathematical and Physical Sciences

Office of Undergraduate Interdisciplinary Studies, Undergraduate Division
College of Letters & Science, University of California, Berkeley