Hydrology — catchment water balance from ERA5
Build a monthly water balance for a small catchment by pulling the
three drivers from CDS in one call: precipitation in, evaporation out,
and runoff out. ERA5 reports all three as flux variables (per-step
accumulations), so op="auto" resolves to sum and the per-month
GeoTIFF carries the actual monthly total in metres of water
equivalent.
Domain context. The catchment-scale water balance is
$$ P = ET + R + \frac{dS}{dt} $$
where $P$ is precipitation, $ET$ is actual evapotranspiration, $R$ is runoff (surface + subsurface), and $dS/dt$ is change in storage. For long enough periods the storage term averages near zero and the residual $P - ET - R$ should be small. We'll check that for one calendar year.
Step 1 — verify the variables exist in the catalog
All three live on reanalysis-era5-single-levels, with types: flux
set so the auto-routing picks sum.
from earthlens.ecmwf import Catalog

cat = Catalog()
for code in ("total-precipitation", "evaporation", "surface-runoff"):
    spec = cat.get_variable("reanalysis-era5-single-levels", code)
    print(f"{code:25s} nc={spec.nc_variable:5s} units={spec.units:25s} is_flux={spec.is_flux}")

total-precipitation       nc=tp    units=m                         is_flux=True
evaporation               nc=e     units=m of water equivalent     is_flux=True
surface-runoff            nc=sro   units=m                         is_flux=True
Step 2 — retrieve a year of monthly totals
One small catchment-sized box (~1° around Coello, Colombia), all three
variables, twelve months. We pass temporal_resolution="monthly" so
the retrieval uses the -monthly-means dataset and the request body
skips the day field. The aggregation runs at 1MS (month-start)
frequency.
from pathlib import Path
from earthlens import EarthLens, AggregationConfig

OUT = Path("data/era5-water-balance")
OUT.mkdir(parents=True, exist_ok=True)

earthlens = EarthLens(
    data_source="ecmwf",
    temporal_resolution="monthly",
    start="2022-01-01",
    end="2022-12-01",
    variables={
        "reanalysis-era5-single-levels-monthly-means": [
            "total-precipitation",
            "evaporation",
            "surface-runoff",
        ],
    },
    lat_lim=[4.0, 5.0],
    lon_lim=[-75.0, -74.0],
    path=str(OUT),
)
earthlens.download(aggregate=AggregationConfig(freq="1MS", op="auto"))
2026-05-10 01:40:20.985 | INFO | earthlens.ecmwf.backend:download:536 - Download ECMWF reanalysis-era5-single-levels-monthly-means/total-precipitation data for period 2022-01-01 00:00:00 till 2022-12-01 00:00:00
2026-05-10 01:40:21.584 | INFO | earthlens.ecmwf.backend:_api:724 - Requesting reanalysis-era5-single-levels-monthly-means from CDS; this may take several minutes
2026-05-10 01:40:21,987 INFO Request ID is ded1950e-e0a5-4bf4-a7a2-ab2315f6e839
2026-05-10 01:40:22,050 INFO status has been updated to accepted
2026-05-10 01:40:36,210 INFO status has been updated to running
2026-05-10 01:40:43,904 INFO status has been updated to successful
2026-05-10 01:40:44 | INFO | pyramids.base.config | Logging is configured.
2026-05-10 01:40:45.449 | INFO | earthlens.ecmwf.backend:download:536 - Download ECMWF reanalysis-era5-single-levels-monthly-means/evaporation data for period 2022-01-01 00:00:00 till 2022-12-01 00:00:00
2026-05-10 01:40:45.451 | INFO | earthlens.ecmwf.backend:_api:724 - Requesting reanalysis-era5-single-levels-monthly-means from CDS; this may take several minutes
2026-05-10 01:40:45,711 INFO Request ID is 0b5dd1e3-5cfc-4768-9dbc-3e480bbf66e6
2026-05-10 01:40:45 | INFO | ecmwf.datastores.legacy_client | Request ID is 0b5dd1e3-5cfc-4768-9dbc-3e480bbf66e6
2026-05-10 01:40:45,875 INFO status has been updated to accepted
2026-05-10 01:40:45 | INFO | ecmwf.datastores.legacy_client | status has been updated to accepted
2026-05-10 01:41:07,343 INFO status has been updated to successful
2026-05-10 01:41:07 | INFO | ecmwf.datastores.legacy_client | status has been updated to successful
2026-05-10 01:41:08 | INFO | multiurl.base | Downloading https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-cache-1/2026-05-09/6da6c47c0c4101a235f34ace58c7db8.nc
2026-05-10 01:41:08.709 | INFO | earthlens.ecmwf.backend:download:536 - Download ECMWF reanalysis-era5-single-levels-monthly-means/surface-runoff data for period 2022-01-01 00:00:00 till 2022-12-01 00:00:00
2026-05-10 01:41:08.710 | INFO | earthlens.ecmwf.backend:_api:724 - Requesting reanalysis-era5-single-levels-monthly-means from CDS; this may take several minutes
2026-05-10 01:41:09,102 INFO Request ID is 840a5c96-9403-45b7-afbd-b7c2eb88ebb7
2026-05-10 01:41:09 | INFO | ecmwf.datastores.legacy_client | Request ID is 840a5c96-9403-45b7-afbd-b7c2eb88ebb7
2026-05-10 01:41:09,162 INFO status has been updated to accepted
2026-05-10 01:41:09 | INFO | ecmwf.datastores.legacy_client | status has been updated to accepted
2026-05-10 01:41:30,718 INFO status has been updated to running
2026-05-10 01:41:30 | INFO | ecmwf.datastores.legacy_client | status has been updated to running
2026-05-10 01:41:42,368 INFO status has been updated to successful
2026-05-10 01:41:42 | INFO | ecmwf.datastores.legacy_client | status has been updated to successful
2026-05-10 01:41:42 | INFO | multiurl.base | Downloading https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-cache-2/2026-05-09/ef933a52dc248bc14b1cd0ce9399834d.nc
2026-05-10 01:41:43.218 | INFO | earthlens.ecmwf.backend:download:575 - ECMWF download summary: all 3 variables succeeded ([('reanalysis-era5-single-levels-monthly-means', 'total-precipitation'), ('reanalysis-era5-single-levels-monthly-means', 'evaporation'), ('reanalysis-era5-single-levels-monthly-means', 'surface-runoff')])
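Once the download finishes, it can be worth confirming that every (variable, month) file was written before loading anything. The sketch below builds the 36 expected filenames from the <cds_variable>_1MS_<YYYYMMDD>.tif pattern that Step 3 relies on and lists any that are absent. The helper names expected_files and missing_files are illustrative and not part of earthlens, and the assumption that the date stamp is the month start is inferred from the 1MS frequency rather than from documented behaviour.

```python
from pathlib import Path

import pandas as pd

# File-name stems as used in Step 3 (underscored variable names).
VARIABLES = ("total_precipitation", "evaporation", "surface_runoff")


def expected_files(agg_dir, variables=VARIABLES, start="2022-01-01", periods=12):
    """Return the paths we expect, one per (variable, month-start) pair."""
    months = pd.date_range(start, periods=periods, freq="MS")
    return [
        Path(agg_dir) / f"{var}_1MS_{month:%Y%m%d}.tif"
        for var in variables
        for month in months
    ]


def missing_files(agg_dir, **kwargs):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_files(agg_dir, **kwargs) if not p.exists()]
```

Calling missing_files("data/era5-water-balance/aggregated") should return an empty list when all three variables aggregated successfully.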
Step 3 — load the per-month GeoTIFFs into arrays
Each (variable, month) pair lands at
<path>/aggregated/<cds_variable>_1MS_<YYYYMMDD>.tif. Stack them
into a (month, lat, lon) cube per variable, then average over space
to get a single time series per variable.
import numpy as np
import pandas as pd
from pyramids.dataset import Dataset

agg_dir = OUT / "aggregated"


def stack_monthly(cds_variable: str) -> np.ndarray:
    """Stack the 12 monthly GeoTIFFs for one variable into a (12, lat, lon) cube."""
    paths = sorted(agg_dir.glob(f"{cds_variable}_1MS_*.tif"))
    return np.stack([Dataset.read_file(str(p)).read_array() for p in paths])


precip = stack_monthly("total_precipitation")
et = stack_monthly("evaporation")
runoff = stack_monthly("surface_runoff")

# Catchment mean per month — one number per (variable, month).
precip_mean = np.nanmean(precip, axis=(1, 2))
et_mean = np.nanmean(et, axis=(1, 2))
runoff_mean = np.nanmean(runoff, axis=(1, 2))

months = pd.date_range("2022-01-01", periods=12, freq="MS")
df = pd.DataFrame(
    {"P": precip_mean, "ET": et_mean, "R": runoff_mean},
    index=months,
)
df * 1000  # m -> mm
| | P | ET | R |
|---|---|---|---|
| 2022-01-01 | 2.861938 | -2.317952 | 0.697975 |
| 2022-02-01 | 6.916809 | -2.221828 | 2.208328 |
| 2022-03-01 | 11.556015 | -2.609737 | 4.806976 |
| 2022-04-01 | 12.418518 | -2.541585 | 5.221519 |
| 2022-05-01 | 15.173798 | -2.609251 | 7.470741 |
| 2022-06-01 | 13.788681 | -2.505731 | 6.464844 |
| 2022-07-01 | 10.142136 | -2.559916 | 4.798431 |
| 2022-08-01 | 7.104645 | -2.612860 | 3.006973 |
| 2022-09-01 | 7.507477 | -2.474343 | 3.020058 |
| 2022-10-01 | 9.890747 | -2.529038 | 3.594170 |
| 2022-11-01 | 7.742996 | -2.587045 | 2.605782 |
| 2022-12-01 | 3.974533 | -2.626493 | 1.360283 |
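One caveat on units: ECMWF documents that the ERA5 monthly-means products store accumulated variables as mean daily amounts (m per day) rather than monthly totals. If the aggregation step passes those values through unchanged (an assumption here, not something verified against earthlens), each row above is a mean daily rate and has to be multiplied by the number of days in its month. A minimal sketch, with to_monthly_total as a hypothetical helper name and the first two table rows as input:

```python
import pandas as pd


def to_monthly_total(daily_mean: pd.DataFrame) -> pd.DataFrame:
    """Scale mean daily amounts to monthly totals.

    Assumes a DatetimeIndex with one timestamp per calendar month.
    """
    days = pd.Series(daily_mean.index.days_in_month, index=daily_mean.index)
    return daily_mean.mul(days, axis=0)


# First two rows of the table above (values already scaled by 1000).
rates = pd.DataFrame(
    {
        "P": [2.861938, 6.916809],
        "ET": [-2.317952, -2.221828],
        "R": [0.697975, 2.208328],
    },
    index=pd.date_range("2022-01-01", periods=2, freq="MS"),
)
totals = to_monthly_total(rates)
print(totals.round(1))
```

Scaling by days_in_month rather than a fixed 30 keeps February and the 31-day months consistent, which matters once the terms are summed over the year.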
Step 4 — plot the monthly water-balance terms
ERA5 reports evaporation as negative when the surface loses water
to the atmosphere, so we flip the sign to plot its magnitude.
surface_runoff is already stored as a positive quantity and needs no
sign change.
import matplotlib.pyplot as plt
precip_mm = precip_mean * 1000
et_mm = -et_mean * 1000 # ERA5 evaporation is negative for sfc->atm flux
runoff_mm = runoff_mean * 1000
fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(months, precip_mm, marker="o", label="Precipitation (P)")
ax.plot(months, et_mm, marker="s", label="Evapotranspiration (|ET|)")
ax.plot(months, runoff_mm, marker="^", label="Surface runoff (R)")
ax.set_ylabel("mm / month (catchment mean)")
ax.set_title("ERA5 monthly water-balance terms — Coello bbox, 2022")
ax.legend()
ax.grid(alpha=0.3)
plt.tight_layout()
plt.show()
Step 5 — annual closure
Sum each term over the year and check the residual $P - |ET| - R \approx \Delta S$. For a 12-month period in a tropical catchment we expect a small residual relative to the total fluxes: a few percent of $P$ for a closed system, larger when the catchment has substantial groundwater export or interannual storage swings. Note that $R$ here is surface runoff only, so any subsurface runoff ends up in the residual alongside the storage change.
annual = pd.DataFrame(
    {
        "P (mm/yr)": [precip_mm.sum()],
        "|ET| (mm/yr)": [et_mm.sum()],
        "R (mm/yr)": [runoff_mm.sum()],
        "Residual P - |ET| - R": [precip_mm.sum() - et_mm.sum() - runoff_mm.sum()],
        "Residual / P (%)": [
            100 * (precip_mm.sum() - et_mm.sum() - runoff_mm.sum()) / max(precip_mm.sum(), 1e-9)
        ],
    }
)
annual.round(1)
| | P (mm/yr) | \|ET\| (mm/yr) | R (mm/yr) | Residual P - \|ET\| - R | Residual / P (%) |
|---|---|---|---|---|---|
| 0 | 109.099998 | 30.200001 | 45.299999 | 33.599998 | 30.799999 |
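The annual figure hides how the imbalance builds through the year. As an illustration, the monthly residual $P - |ET| - R$ and its running sum can be computed from the catchment means in Step 3. The values below are copied from that table; treating the running sum as a storage proxy is an interpretive assumption, not something ERA5 reports.

```python
import pandas as pd

months = pd.date_range("2022-01-01", periods=12, freq="MS")

# Catchment means copied from the Step 3 table (already scaled by 1000).
balance = pd.DataFrame(
    {
        "P": [2.861938, 6.916809, 11.556015, 12.418518, 15.173798, 13.788681,
              10.142136, 7.104645, 7.507477, 9.890747, 7.742996, 3.974533],
        "ET": [-2.317952, -2.221828, -2.609737, -2.541585, -2.609251, -2.505731,
               -2.559916, -2.612860, -2.474343, -2.529038, -2.587045, -2.626493],
        "R": [0.697975, 2.208328, 4.806976, 5.221519, 7.470741, 6.464844,
              4.798431, 3.006973, 3.020058, 3.594170, 2.605782, 1.360283],
    },
    index=months,
)

# ERA5 evaporation is negative for surface-to-atmosphere flux, so flip it.
et_loss = -balance["ET"]
balance["residual"] = balance["P"] - et_loss - balance["R"]
# Running sum of the residual: a rough proxy for accumulated storage change.
balance["cumulative"] = balance["residual"].cumsum()

print(balance[["residual", "cumulative"]].round(2))
```

The last entry of the cumulative column reproduces the annual residual in the table above, which is a quick consistency check on the monthly arithmetic.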
Notes
- op="auto" is critical here. All three variables are fluxes; the catalog's types: flux field tells the reducer to sum per-step accumulations instead of averaging them. A mean would give numbers ~30× too small (one slot's accumulation rather than a month's).
- ERA5-Land for higher-resolution catchment work. The single-levels
product is 0.25° native; ERA5-Land is 0.1°. Switch the dataset key to
"reanalysis-era5-land" and add evaporation-from-bare-soil, evaporation-from-vegetation-transpiration, sub-surface-runoff for a more detailed land-surface budget.
- Storage term. ERA5 also exposes
volumetric_soil_water_layer_1..4 on ERA5-Land for shallow soil moisture; differencing month-end values closes the residual.
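To make the storage note concrete: volumetric soil water is a fraction (m³ of water per m³ of soil), so it has to be multiplied by layer thickness before it can be compared with P, ET and R. The sketch below assumes the four soil layers of the ECMWF land-surface scheme (0-7, 7-28, 28-100 and 100-289 cm) and uses made-up soil-moisture values purely for illustration; soil_storage_mm is a hypothetical helper, not an earthlens function.

```python
import numpy as np

# Layer thicknesses in metres for the four soil layers (assumed:
# 0-7, 7-28, 28-100 and 100-289 cm).
LAYER_THICKNESS_M = np.array([0.07, 0.21, 0.72, 1.89])


def soil_storage_mm(vsw):
    """Convert volumetric soil water, shape (time, 4), to storage depth in mm."""
    vsw = np.asarray(vsw, dtype=float)
    return (vsw * LAYER_THICKNESS_M).sum(axis=-1) * 1000.0


# Hypothetical catchment-mean soil moisture at two consecutive month ends.
vsw = np.array(
    [
        [0.30, 0.31, 0.33, 0.35],
        [0.34, 0.33, 0.33, 0.35],
    ]
)
storage = soil_storage_mm(vsw)
delta_s = np.diff(storage)  # storage change between the two dates, in mm
print(delta_s)
```

The resulting delta_s is in the same units as the converted flux terms, so it can be subtracted from the monthly residual directly. The four layers only reach 2.89 m, so deeper groundwater changes would still be unaccounted for.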