Data and model code in support of Stream nitrate dynamics driven primarily by discharge and watershed physical and soil characteristics at intensively monitored sites, Insights from deep learning

Metadata Updated: January 22, 2026

We developed a suite of models using deep learning to make hindcast predictions of the 7-day average backward-looking nitrate concentration at 46 predominantly agricultural sites across the midwestern and eastern United States. The models used daily observations of discharge and meteorological variables and static watershed attributes describing anthropogenic modification to hydrology, nitrogen application, climate, groundwater, land use and land cover, watershed physical attributes, and soils. Across all sites, discharge and watershed soil and physiographic attributes show a particularly strong influence on model performance. An analysis of drivers across sites revealed considerable regional differences related to controlling processes such as groundwater contributions. We tested several ways to pool data across sites to develop accurate models and make the most effective use of available data. Single-site models, in which models are trained and tested at a single location, showed generally strong predictive performance (median Kling-Gupta Efficiency = 0.66), and accuracy at poorly performing sites could be improved by grouping sites with similar characteristics. Developing a single model for all sites reduced performance at several locations with distinct characteristics, suggesting that there is a threshold of dissimilarity beyond which more data does not improve the model. While many deep learning studies have shown that national or even global models can outperform local models, it is not clear that this is true for water quality constituents. This study demonstrates how existing data can be combined effectively, using deep learning to develop accurate and interpretable models of instream nitrate at sites where varying processes are responsible for changes in nitrate concentration. 
This release provides code and data for running a suite of machine learning models to predict instream nitrate concentration, and for using explainable AI to analyze model outputs and compare modeling approaches.
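
The description above refers to two quantities that are easy to state in code: the backward-looking 7-day average used as the prediction target, and the Kling-Gupta Efficiency (KGE) used to summarize performance. The following is a minimal sketch, not the code shipped in this release. It assumes the commonly cited Gupta et al. (2009) form of KGE (correlation, variability ratio, and bias ratio); the release may use a different variant. The function names `trailing_mean` and `kge` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def trailing_mean(values, window=7):
    """Backward-looking mean: each output averages the current value and the
    (window - 1) values before it. Positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(fmean(values[i + 1 - window : i + 1]))
    return out


def kge(sim, obs):
    """Kling-Gupta Efficiency, assuming the Gupta et al. (2009) form:
    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = fmean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical daily nitrate values (mg/L), for illustration only.
daily_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
weekly_obs = trailing_mean(daily_obs)  # first six entries are None

obs = [2.1, 2.4, 3.0, 3.8, 3.1, 2.6]
sim = [2.0, 2.5, 2.9, 3.5, 3.3, 2.7]
print(f"KGE = {kge(sim, obs):.3f}")
```

A KGE of 1 indicates a perfect match between simulated and observed series, so the reported median of 0.66 for single-site models sits between that ideal and the lower values produced by a poorly correlated or biased prediction.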

Access & Use Information

Public: This dataset is intended for public access and use. License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties it is considered a U.S. Government Work.

Dates

Metadata Created Date January 11, 2026
Metadata Updated Date January 22, 2026

Metadata Source

Harvested from DOI USGS DCAT-US

Additional Metadata

Resource Type Dataset
Publisher U.S. Geological Survey
Identifier http://datainventory.doi.gov/id/dataset/USGS_661436d1d34e633466530330
Data Last Modified 2024-08-23T00:00:00Z
Category geospatial
Public Access Level public
Bureau Code 010:12
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://ddi.doi.gov/usgs-data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Datagov Dedupe Retained 20260121204831
Harvest Object Id 95096a89-86d3-4a1a-b598-7e30f3b23227
Harvest Source Id 2b80d118-ab3a-48ba-bd93-996bbacefac2
Harvest Source Title DOI USGS DCAT-US
Metadata Type geospatial
Old Spatial {"type": "Polygon", "coordinates": [[[-96.75, 38.32], [-96.75, 44.53], [-74.22, 44.53], [-74.22, 38.32], [-96.75, 38.32]]]}
Source Datajson Identifier True
Source Hash 534195e7d581c08ab9ab6a582e7c56f35639ad0c8cdb0503b72374b161c10cc4
Source Schema Version 1.1
Spatial {"type": "Polygon", "coordinates": [[[-96.75, 38.32], [-96.75, 44.53], [-74.22, 44.53], [-74.22, 38.32], [-96.75, 38.32]]]}
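
The Spatial field describes the study extent as a closed polygon ring of longitude, latitude values. As a rough sketch of how a reader might turn those values into a GeoJSON polygon and a bounding box, the snippet below pairs the numbers into positions and takes the minimum and maximum of each axis. The helper names `to_ring` and `bounding_box` are hypothetical and not part of the release.

```python
import json

# Coordinate values copied from the Spatial field, in longitude, latitude order.
flat = [-96.75, 38.32, -96.75, 44.53, -74.22, 44.53, -74.22, 38.32, -96.75, 38.32]


def to_ring(values):
    """Pair a flat lon, lat sequence into a list of [lon, lat] positions."""
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    return [[values[i], values[i + 1]] for i in range(0, len(values), 2)]


def bounding_box(ring):
    """Return (west, south, east, north) for a ring of [lon, lat] positions."""
    lons = [p[0] for p in ring]
    lats = [p[1] for p in ring]
    return (min(lons), min(lats), max(lons), max(lats))


ring = to_ring(flat)
# GeoJSON polygons hold a list of rings; the first is the exterior ring.
polygon = {"type": "Polygon", "coordinates": [ring]}
west, south, east, north = bounding_box(ring)

print(json.dumps(polygon))
print(f"lon {west} to {east}, lat {south} to {north}")
```

The resulting extent, roughly 96.75°W to 74.22°W and 38.32°N to 44.53°N, matches the midwestern and eastern United States coverage stated in the description.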
