Deep Learning for Discharge Forecasting in Central Asia

Through the SAPPHIRE Central Asia Project, we’ve helped national Meteorological and Hydrological Services (Hydromets) in Central Asia make a significant leap in river discharge forecasting. Using state-of-the-art deep learning approaches, we implemented a forecasting system that provides accurate, uncertainty-aware daily predictions of river discharge. The approach is highly scalable and transferable, delivering strong results in both Kyrgyzstan and Tajikistan. With a newly signed Memorandum of Understanding with Kazakh Hydromet and an implementation agreement in Turkmenistan, the system is now poised to expand across approximately 400 basins throughout Central Asia, covering an area more than 83 times the size of Switzerland.

4: Number of countries with working partnerships established
> 400: Number of basins from which data are available
+20%: Improved forecast accuracy in the crucial spring melt season
25-32%: Error reduction during the agricultural period

Why Accurate Discharge Forecasts Matter

Reliable forecasts of river discharge underpin virtually every water-related decision in semi-arid Central Asia. Accurate, timely discharge information enables irrigation authorities to match crop-water demand with supply; dam operators to optimise hydropower production while securing water for downstream users and meeting ecological flow requirements; and disaster-management agencies to issue early flood or drought warnings that can save lives and livelihoods.

The national Meteorological and Hydrological Services, commonly referred to as the Hydromets, form the backbone of discharge forecasting in Central Asia. They operate and maintain the national observation networks whose data feed river discharge forecasts, among other products. As the official national institutions, they issue the forecast bulletins used by irrigation departments, dam operators, emergency services, neighboring countries, and the Interstate Commission for Water Coordination (ICWC). Their forecasts provide crucial information for water allocation across sectors and across borders, for flood warnings, and for reservoir planning, helping to mitigate water-related risks.

Via the SAPPHIRE Central Asia Project, the Swiss Agency for Development and Cooperation (SDC) supports the modernisation of operational hydrology and forecasting in Central Asia. Jointly with the national Meteorological and Hydrological Services (Hydromets) in the region, hydrosolutions GmbH is implementing these upgrades into operational practice.

Figure 1: Spatial distribution of the 61 training basins in Kyrgyzstan

From Empirical Rules to Data-Driven Intelligence

To forecast river flow up to 10 days in advance, Hydromet agencies in Central Asia use simple linear relationships between current flows and past observations from the same time of year. While robust, this approach cannot capture the non-linear, snow- and glacier-dominated runoff formation that defines high-mountain basins. The limitation is most severe during the critical spring melt season, when these methods fail to produce sufficiently accurate forecasts.
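The legacy approach described above can be pictured as one small regression per station and time of year. The following is a minimal sketch in Python, not the Hydromets' operational procedure: the choice of predictor (mean flow in the preceding period), the function name `fit_line`, and all numbers are assumptions made for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical history for one station and one period of the year:
# one pair per past year (preceding-period flow, target-period flow).
previous_flow = [10.0, 12.0, 14.0, 16.0]
target_flow = [12.0, 14.0, 16.0, 18.0]

intercept, slope = fit_line(previous_flow, target_flow)

# Forecast for the current year from the flow observed in the preceding period.
forecast = intercept + slope * 13.0
```

Because the fitted relation is a straight line, it responds to a given change in current flow in the same way regardless of snowpack or temperature, which is where such a baseline can be expected to struggle during rapid melt.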

Recent work in hydrology has demonstrated that deep learning (DL) neural network models, which excel at language and vision tasks, outperform traditional models when trained on large, multi-basin datasets. They learn complex, non-linear relations directly from data, without imposing a rigid functional form. In our implementation for the Central Asia Hydromets, they provide short-term discharge forecasts up to 10 days ahead. While we also develop models for monthly and seasonal predictions, this post focuses on short-term forecasting only.

A Multi-Basin Deep Learning System for Kyrgyzstan

Here, we present implementation results from Kyrgyzstan, based on daily discharge data from all operational gauging stations of the Kyrgyz Hydromet. We trained three state-of-the-art DL forecasting models, namely the Temporal Fusion Transformer (TFT), the Time-Series Mixer (TSMixer), and TiDE, simultaneously on 61 gauged rivers (Figure 1).

These are sequence models: they learn not only from which values are present but also from the order in which they occur, which makes them particularly suited to time series modelling. Their advantage over traditional methods, such as benchmark linear regressions, is that they are trained across multiple basins. This lets the network share information between contrasting catchments and thus markedly improves generalisation to new periods, flow conditions, or even entirely different basins.

The forecasting pipeline is shown in Figure 2 below. It comprises the following components. First, the daily discharge time series continuously recorded by the Kyrgyz Hydromet serves a dual role: past observations are used as input for the model, while future values constitute the prediction target. Second, meteorological drivers combine downscaled historical ERA5‑Land reanalysis data on precipitation and temperature with ECMWF IFS ensemble forecasts of these atmospheric variables, providing the model with both past context and a forward-looking outlook. Third, each catchment is characterised by static descriptors (basin fingerprints)—including topography, land cover, and long‑term climatology. Finally, we optionally provide runoff-forming (ROF) indices from SnowMapper CA, which are physically based metrics that quantify snowmelt (see section 'Adding Cryosphere Intelligence' below).
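To make the four input groups concrete, the sketch below assembles one sample for a single basin and issue day. It is an illustration only: the class `Sample`, the function `make_sample`, the 30-day lookback, and the descriptor name `mean_elevation_m` are hypothetical and not taken from the SAPPHIRE code base, and the ensemble weather forecast is simplified to a single trace.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

HORIZON = 10   # forecast length in days, as stated in the text
LOOKBACK = 30  # hypothetical length of the input window in days

Weather = Tuple[float, float]  # (precipitation, temperature)


@dataclass
class Sample:
    """One training or inference sample for a single basin (illustrative)."""
    past_discharge: List[float]        # observed discharge before the issue day
    past_weather: List[Weather]        # reanalysis values before the issue day
    future_weather: List[Weather]      # weather forecast for the horizon
    static: Dict[str, float]           # basin descriptors
    target: List[float]                # discharge to be predicted
    rof: Optional[List[float]] = None  # optional runoff-forming index


def make_sample(discharge, reanalysis, forecast, static, t, rof=None):
    """Cut the input and target windows around issue day t (a list index)."""
    if t < LOOKBACK or t + HORIZON > len(discharge):
        raise ValueError("not enough data around the issue day")
    return Sample(
        past_discharge=discharge[t - LOOKBACK:t],
        past_weather=reanalysis[t - LOOKBACK:t],
        future_weather=forecast[:HORIZON],
        static=static,
        target=discharge[t:t + HORIZON],
        # The index covers both the past window and the forecast horizon.
        rof=None if rof is None else rof[t - LOOKBACK:t + HORIZON],
    )


# Synthetic data, for illustration only.
discharge = [float(i) for i in range(50)]
reanalysis = [(0.0, 5.0)] * 50
forecast = [(1.0, 6.0)] * 10
sample = make_sample(discharge, reanalysis, forecast,
                     {"mean_elevation_m": 3000.0}, t=30)
```

During training, the `target` window is known from the historical record; in operation it is the quantity the model is asked to predict.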

The models produce daily forecasts for the next 10 days and also generate hindcasts, that is, simulations of how well they would have predicted past events. Each forecast comes with a prediction interval (80% in this application) that shows how certain or uncertain it is.
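The text does not state how the interval is computed. One common approach, assumed here, is to forecast the 10% and 90% quantiles with a pinball (quantile) loss and then to check on hindcasts that roughly 80% of observations fall between them. The function names and all values below are illustrative.

```python
def pinball_loss(observed, predicted, q):
    """Mean pinball loss of a forecast for quantile level q (0 < q < 1)."""
    total = 0.0
    for y, p in zip(observed, predicted):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(observed)


def coverage(observed, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(lo <= y <= hi for y, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Hypothetical hindcast: observations with 10% and 90% quantile forecasts.
observed = [10.0, 12.0, 15.0, 11.0, 9.0]
lower_q10 = [8.0, 10.0, 11.0, 9.0, 8.0]
upper_q90 = [12.0, 14.0, 14.0, 13.0, 11.0]

hit_rate = coverage(observed, lower_q10, upper_q90)
loss_q90 = pinball_loss(observed, upper_q90, 0.9)
```

A hit rate far below 0.8 would indicate intervals that are too narrow, and one far above 0.8 intervals that are wider than necessary.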

Figure 2: End‑to‑end data flow for the forecasting pipeline showing dynamic meteorological inputs, static descriptors, and optional cryosphere signals feeding the DL model.

Table 1 summarises the main advantages of the deep learning-based approach.

Table 1: Key changes between the legacy forecasting system and the deep-learning system that is implemented under the SAPPHIRE Central Asia project.

Figure 3 shows the evolution of the operational prediction for an example station in Kyrgyzstan in the 2024 warm season. Overall, the deep learning models show remarkable skill. Over the evaluation period from 2017 to July 2025, the Neural Ensemble (i.e., the arithmetic mean of the three deep-learning models’ point forecasts) beats the legacy linear-regression method across all lead times during the agricultural period from April through September:

  • Pentad (5-day) forecasts: +4.6 % accuracy, −25 % mean‑absolute error
  • Decade (10-day) forecasts: +11 % accuracy, −32 % mean‑absolute error

The most significant gains (forecast accuracy improvements of up to 20%) are observed during the early melt period in spring, when linear models fail to track the rapid snowmelt dynamics.
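The Neural Ensemble and the relative change in mean absolute error quoted above can be sketched as follows. The forecast values are invented for illustration and do not reproduce the reported results; the "accuracy" metric is not defined in this post and is therefore left out.

```python
def ensemble_mean(*forecasts):
    """Arithmetic mean of several point forecasts, step by step."""
    return [sum(values) / len(values) for values in zip(*forecasts)]


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def relative_mae_change(observed, candidate, baseline):
    """Relative MAE change of candidate versus baseline (negative is better)."""
    base = mae(observed, baseline)
    return (mae(observed, candidate) - base) / base


# Hypothetical forecasts for three consecutive periods.
tft = [10.0, 12.0, 14.0]
tsmixer = [11.0, 13.0, 15.0]
tide = [12.0, 14.0, 16.0]
linear_regression = [13.0, 16.0, 18.0]
observed = [11.0, 14.0, 15.0]

neural_ensemble = ensemble_mean(tft, tsmixer, tide)
change = relative_mae_change(observed, neural_ensemble, linear_regression)
```

A value of `change` of -0.25 would correspond to the 25% error reduction reported for pentad forecasts.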

Figure 3: Evolution of the operational prediction for an example station in Kyrgyzstan for the year 2024. In red is the deep learning model (TSMixer) and in blue the traditional linear regression model. The shaded area corresponds to the 80% prediction interval. The discharge data was normalised to preserve data confidentiality.

Scaling to Central Asia

The DL approach described above is flexible and transfers to other countries. For example, the same models trained for Tajikistan improve 10-day discharge prediction accuracy by 12.5% compared to the traditional methods used by Tajik Hydromet. The data from Tajikistan and Kyrgyzstan were shared under existing Memoranda of Understanding (MoUs) that we signed with the Tajik and Kyrgyz Hydromets, respectively. Such MoUs establish a framework for long-term collaboration and ensure mutual trust. A newly signed MoU with the Kazakh Hydromet and an implementation agreement with Turkmenistan pave the way for a regional rollout to approximately 400 basins, covering an area more than 83 times the size of Switzerland.

Adding Cryosphere Intelligence

Snowmelt is the primary source of river flow in the high-mountain regions of Central Asia, so a forecasting model performs better when it knows how much snow is melting each day, both in the past and in the forecast period. We therefore added a physically based runoff-forming (ROF) input variable, covering snowmelt and rain-on-snow, from the SnowMapper CA system to the deep-learning models. When this extra cryosphere data was included, the TSMixer model’s errors for the April–September season (2017–2022) dropped noticeably at every forecast lead time, as shown in Figure 4: the version with ROF (orange) has a consistently lower normalised mean absolute error (nMAE) than the version without it (blue). Based on these results, we plan to integrate the snow-enhanced models into our routine operations in the 2025/26 period.

Figure 4: The normalised mean absolute error (nMAE) for different forecast lead times. Blue: TSMixer without any input from SnowMapper CA. Orange: TSMixer with the runoff-forming (ROF) variable (snowmelt and rain-on-snow) as input. The evaluation was performed on the agricultural period from 2017 to 2022.
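A comparison of this kind can be computed per lead time as sketched below. The post does not say how the error is normalised; dividing the mean absolute error by the mean observed discharge is an assumption made here, and the function names and values are illustrative.

```python
def nmae(observed, predicted):
    """MAE divided by the mean observed value (assumed normalisation)."""
    mean_obs = sum(observed) / len(observed)
    abs_err = sum(abs(o - p) for o, p in zip(observed, predicted))
    return abs_err / len(observed) / mean_obs


def nmae_by_lead_time(observed_by_lead, predicted_by_lead):
    """nMAE for each lead time; both arguments map lead day to a list."""
    return {
        lead: nmae(observed_by_lead[lead], predicted_by_lead[lead])
        for lead in observed_by_lead
    }


# Hypothetical hindcast values for lead times of 1 and 2 days.
observed_by_lead = {1: [10.0, 20.0], 2: [10.0, 20.0]}
without_rof = {1: [12.0, 17.0], 2: [13.0, 16.0]}
with_rof = {1: [11.0, 19.0], 2: [12.0, 17.0]}

scores_without = nmae_by_lead_time(observed_by_lead, without_rof)
scores_with = nmae_by_lead_time(observed_by_lead, with_rof)
```

Plotting the two dictionaries against lead time would give a figure with the same layout as Figure 4.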

Conclusions and Outlook

Deep learning ensembles elevate operational discharge forecasting from empirical rules to data-driven intelligence, exactly what Central Asia needs as climate variability tightens water margins and as new infrastructure developments impact water availability, also at transboundary scales. By delivering daily, uncertainty-aware predictions, the system supports everything from reservoir gate settings to warnings on future extreme conditions, ultimately helping societies navigate the region’s growing water challenges.

Looking ahead, the SAPPHIRE Project roadmap unfolds in three phases, concluding with the project's planned completion in November 2026. During 2025, we will complete user testing with the Kyrgyz and Tajik Hydromet services, integrate the system into the SAPPHIRE Forecast Tools web interface, and formally onboard the Kazakh Hydromet. In 2026, operational monthly forecasts will be released, providing water managers with a reliable one-month planning horizon. In the final phase, spanning the second half of 2026, we will incorporate seasonal outlooks and provide fully cryosphere-aware short-term forecasts, thereby closing the loop between weather-driven snow dynamics and operational decision-making. These milestones ensure that new capabilities reach practitioners as soon as they are validated and operationalised.
