Sabber Ahamed | Geophysicist & Data Scientist

Consider what you are asking a geodynamic simulation to do.

On one end of the problem: model the slow, viscous creep of the mantle — a fluid with viscosity $10^{21}$ times that of water — deforming continents over hundreds of millions of years, opening ocean basins, building mountain ranges, and driving the drift of tectonic plates at a few centimeters per year.

On the other end: capture an earthquake rupture that initiates on a fault segment, propagates at velocities approaching the shear wave speed, and arrests — all within 60 to 90 seconds.

These two phenomena are governed by the same underlying physics. They are causally connected: the slow tectonic loading accumulates the stress that the earthquake instantaneously releases. And yet the timescales they operate on differ by a factor of roughly $10^{14}$ — fourteen orders of magnitude.

This is the central challenge of computational geodynamics. And it is why machine learning, for all its recent power, has not yet solved it — but may be the only realistic path forward.

The Physics: One Framework, Many Regimes

Despite the apparent diversity of tectonic systems — thrust faults, normal faults, transform faults, mid-ocean ridges, subduction zones, metamorphic core complexes — the governing equations are largely shared. At the core is the Stokes equation for a viscous, incompressible fluid under gravitational body forces:

$-\nabla p + \nabla \cdot \boldsymbol{\tau} + \rho \mathbf{g} = 0$

$\nabla \cdot \mathbf{v} = 0$

where $p$ is pressure, $\boldsymbol{\tau}$ is the deviatoric stress tensor, $\rho$ is density, $\mathbf{g}$ is gravitational acceleration, and $\mathbf{v}$ is velocity. Coupled to this is the energy equation:

$\rho C_p \left( \frac{\partial T}{\partial t} + \mathbf{v} \cdot \nabla T \right) = \nabla \cdot (k \nabla T) + Q$

which governs how temperature $T$ evolves through advection, conduction, and internal heat production $Q$ . The rheology — how the material deforms in response to stress — closes the system, and this is where complexity explodes. Earth materials are neither purely viscous nor purely elastic. Depending on temperature, pressure, strain rate, and mineralogy, the mantle behaves as a non-Newtonian power-law fluid, the crust as an elastoplastic solid, and fault zones as frictional interfaces governed by rate-and-state laws.

What changes between a thrust fault and a mid-ocean ridge is not the equations — it is the geometry, the boundary conditions, the rheological regime, and the dominant physical process. This apparent simplicity masks enormous computational complexity.

The Multiscale Problem

My PhD dissertation, "Computational Journey Across Multiphysics and Multitemporal Scales: From Slow Tectonic Movement to Fast Earthquake Slip" (University of Memphis, 2018) [1], was named for exactly this challenge. Getting a single geodynamic model to faithfully represent physics across widely separated timescales is not an engineering inconvenience — it is a fundamental obstacle.

The relevant scales are:

Process	Timescale	Length Scale	Dominant Physics
Mantle convection	$10^6$ – $10^9$ yr	$10^2$ – $10^3$ km	Viscous Stokes flow, heat transport
Plate tectonics / rifting	$10^5$ – $10^8$ yr	$10^2$ – $10^3$ km	Viscoplastic flow, lithospheric flexure
Fault zone evolution	$10^3$ – $10^6$ yr	$1$ – $100$ km	Frictional creep, damage mechanics
Seismic cycle	$10^1$ – $10^3$ yr	$1$ – $100$ km	Viscoelastic relaxation, interseismic coupling
Earthquake nucleation	$10^{-2}$ – $10^1$ yr	$0.1$ – $10$ km	Rate-and-state friction, pore pressure
Dynamic rupture	$10^{-2}$ – $10^2$ s	$0.1$ – $500$ km	Elastodynamics, slip-weakening

Bridging even two adjacent rows in this table requires solving problems with fundamentally different numerical character. Long-timescale tectonic models use quasi-static solvers and large time steps. Rupture models must resolve the wave equation on millisecond timescales. The recent work of Jourdon et al. (2025) represents the state of the art: a coupled ASPECT–SeisSol framework that resolves geodynamics over millions of years and dynamic rupture over seconds — but only by loosely coupling two separate solvers in a one-way pipeline [2]. A fully coupled simulation remains out of reach.

Van Zelst et al. demonstrated a one-way coupling from geodynamic subduction model through seismic cycle to dynamic rupture, spanning $\sim 10^{14}$ in timescale [3]. The technical machinery required — separate codes, format conversions, stress field interpolation — underscores how fragile these couplings are in practice.

The Computational Cost: What It Actually Takes

Modern geodynamic codes — ASPECT, CitcomS, Underworld3 — are among the most sophisticated scientific software in existence [4, 5, 6]. They implement adaptive mesh refinement, scalable multigrid solvers, and massively parallel execution. And they are still not fast enough.

The TerraNeo project demonstrated solving the Stokes system governing mantle convection at more than one trillion degrees of freedom on petascale supercomputers [7]. To resolve Earth's mantle at 1 km global resolution requires systems exceeding $10^{12}$ unknowns — an exascale problem, as documented by Bauer et al. (2023) [8].

The recent HyTeG framework achieves this via matrix-free geometric multigrid methods, since traditional sparse matrix approaches are disqualified by memory constraints at this scale [9]. The DOE EQSIM project demonstrated fault-to-structure earthquake simulation across the 0–10 Hz band using all 9,402 nodes of the Frontier exascale machine — generating more than 3 petabytes of output data for 90 seconds of physical time [10].

These are not edge cases. They represent what it costs to do the science correctly.

For a single tectonic configuration — say, a subduction zone with a prescribed slab geometry and plate velocity — a high-resolution 3D ASPECT simulation may take:

Hours to days on a dedicated HPC allocation
Thousands of CPU-core-hours per run
Terabytes of output for the velocity, temperature, pressure, and composition fields

Now consider that to understand a geodynamic system — thrust faulting, core complex development, ridge spreading — you must explore a parameter space spanning fault geometry, rheological parameters, thermal gradients, and boundary velocities. Covering that space by brute force requires thousands of simulations. At hours per run, that is years of compute time.

This is the data generation problem.

The Data Generation Problem

Building a training dataset for a geodynamic ML model is not like scraping the web. Every data point costs real compute. And the cost is nonlinear: increasing resolution to capture small-scale features (fault zones, melt channels, shear bands) increases the problem size super-linearly due to CFL constraints, solver iterations, and mesh refinement overhead.

The challenges compound:

1. Coverage vs. cost. The parameter space for a realistic geodynamic system — rheological exponent, viscosity prefactor, activation energy, grain size, fault geometry, boundary velocities, thermal state — is easily 10- to 20-dimensional. Even coarse coverage with 10 samples per dimension is $10^{10}$ to $10^{20}$ simulations. This is impossible. Active learning and Latin hypercube sampling can help, but the fundamental tension remains.

2. Output complexity. Geodynamic simulations produce 4D field data: velocity, stress, temperature, and composition as functions of $(x, y, z, t)$ . These outputs are structured but enormous. A modest 3D mantle simulation might have $10^7$ spatial nodes and thousands of time steps. Storing and loading this data for ML training is itself a systems engineering problem.

3. Synthetic-to-real transfer. Even a large simulation database may fail to generalize to field observations. The "simulation-to-reality gap" — well-documented in seismology by Mousavi & Beroza (2022) [11] — is even more acute in geodynamics, where subsurface rheological structure is poorly constrained and the "ground truth" is geodetic and seismic data filtered through additional physical assumptions.

4. Temporal non-stationarity. Geodynamic systems evolve. A model trained on the stress state at one phase of a rifting cycle may fail entirely at another phase. Training on snapshots misses the history-dependence embedded in rate-and-state friction and viscoelastic relaxation.

Agarwal et al. (2020) established the paradigm for ML surrogate modeling in this context: train on $\sim$ 10,000 2D mantle convection simulations, build a neural network surrogate that predicts thermal evolution with 99.7% accuracy, then use the surrogate for parameter-space exploration [12]. Their 2021 follow-up extended this to full 2D temperature fields using convolutional autoencoders [13]. The most recent extension (2025) uses the ML-predicted steady state to initialize simulations, reducing time-to-convergence by a median factor of 3.75 [14].

This is real progress. But 10,000 2D simulations — feasible with a dedicated HPC campaign — becomes a different problem for 3D simulations of geometrically diverse tectonic systems.

What Machine Learning Can Realistically Do

Given these constraints, where does ML genuinely help?

Surrogates for the Stokes Solve

The most expensive component of a geodynamic timestep is almost always the Stokes solve — finding the velocity and pressure fields consistent with the viscosity structure and boundary conditions. A 2025 paper in Physics of Fluids demonstrated that a physics-constrained ML model can predict Stokes velocities as a function of the temperature field while enforcing mass conservation, bypassing the solver entirely and achieving up to 89× speedup [15]. The physics constraint was not cosmetic: enforcing $\nabla \cdot \mathbf{v} = 0$ within the network architecture was essential for physical consistency.

Neural Operators for Forward and Inverse Problems

Kong, Gurnis & Ross (2026) transformed the classical mantle convection solver into Fourier Neural Operators that learn mappings between thermal states separated by geological time intervals [16]. The key capability: the FNO's auto-differentiation enables gradient-based inversion — recovering the past thermal state of the mantle from present-day seismic tomography. This is the forward-inverse duality that makes neural operators especially powerful for geodynamics, where we almost always want to run the problem backwards.

For geometrically diverse systems — the core focus of our discussion — GINO (Geometry-Informed Neural Operator) and GNN-based operators (MeshGraphNet) are the right architectural choice. They represent the domain as a signed distance function or unstructured graph, enabling the same operator to generalize across thrust fault, normal fault, and ridge geometries without retraining. The recent Frontiers review on surrogate modeling [17] identifies this as the critical frontier: operators that generalize not just across parameter values but across domain geometry.

Physics-Informed Acceleration

Rather than building a pure surrogate, the IGyTheM framework (highlighted by the CIG community in 2026) uses ML as a fast surrogate for the thermodynamic solver embedded within the geodynamic code [18]. Traditional approaches use pre-computed lookup tables that sacrifice compositional flexibility. IGyTheM's ML surrogate predicts thermodynamic properties (density, entropy, seismic velocities) self-consistently at runtime, enabling fully coupled geodynamic-thermodynamic simulations at reasonable cost. This is the hybrid architecture that is likely to dominate the near term: classical numerical solver for the PDE, ML surrogate for expensive sub-components.

Probabilistic Hazard Assessment

The biggest near-term impact may be indirect. The GMD paper by Rackauckas et al. (2023) [19] demonstrates that physics-based ML surrogates trained on limited geodynamic simulation data can enable probabilistic parameter estimation that is impossible with the raw simulator. Given observations — geodetic displacements, seismicity rates, heat flow — the surrogate inverts for the rheological and stress parameters that best explain the data, with full uncertainty quantification. This is the operational application: not replacing the physics simulator, but unlocking the parameter inversions that scientific and hazard assessment questions require.

The Challenges That Remain

Honesty about the limitations matters here.

The coverage problem is not solved. For a system like core complex development or transform fault evolution — where geometry is complex and the relevant parameter space is high-dimensional — no neural operator trained today can reliably generalize to unseen configurations far from its training distribution. Gerya's (2022) review of subduction modeling explicitly identifies this as a frontier that requires ML augmentation [20], but the required simulation databases do not yet exist.

Temporal dynamics are hard. Most ML surrogate work in geodynamics predicts steady states or single-step outputs. Autoregressive prediction of long tectonic trajectories — accurately propagating the state of a rifting system through millions of years — accumulates errors in ways that neural operators have not yet solved for geodynamic problems. The climate surrogate literature [17] documents the same instability in long rollouts.

Multi-physics coupling is harder. Coupling viscous mantle flow to elastic crustal deformation to frictional fault mechanics to thermodynamics is not just a computational problem — it is an architectural one. Each physics regime has different characteristic scales, different numerical character, and different data representation requirements. A single ML model that handles all of them simultaneously does not exist. The one-way coupled approaches (Jourdon et al., van Zelst et al.) represent the state of the art, and even these required years of development by large teams.

Interpretability matters more here than in most domains. A model that predicts fault rupture propagation or crustal deformation must be understandable to geoscientists and defensible in a hazard assessment context. Black-box accuracy is not sufficient. Feature importance from Random Forests and gradient-based sensitivity analysis from PINNs are useful, but they do not fully substitute for mechanistic understanding. The field will need better tools for extracting physical insight from trained neural operators.

The Path Forward

The most credible near-term path is not replacement of classical solvers by ML, but strategic coupling:

Run targeted simulation campaigns covering the geometry and BC space of interest — 3,000–10,000 runs per tectonic system using ASPECT or Underworld3 on HPC, using Latin hypercube or active learning for sampling.
Train geometry-aware neural operators (GINO or GNN-based) on this database, with physics-informed loss terms encoding the Stokes and energy equations to reduce data requirements.
Use the surrogate for parameter-space exploration — probabilistic inversion, sensitivity analysis, uncertainty quantification — at a cost that makes these tasks operational rather than aspirational.
Couple physics-ML hybrids into existing codes for expensive sub-components (thermodynamics, rheology evaluation) rather than replacing the full solver.

The 2025 Advances and Perspectives review in Science China Earth Sciences [21] reaches a similar conclusion: ML in geodynamics is most powerful as a complement to physics-based modeling, not a replacement for it.

The simulation of Earth's tectonic systems remains genuinely hard. Not because the physics is unknown — the equations have been written down for decades. But because the scales are extreme, the data is scarce, the geometry is complex, and the stakes — seismic hazard, volcanic risk, resource assessment — are real.

Machine learning offers no magic. But it offers something valuable: the possibility of making the parameter-space exploration, uncertainty quantification, and rapid forward modeling that rigorous geoscience demands actually computationally tractable. That is not a small thing. It is, in the context of the problem, rather significant.

References

Ahamed, S. (2018). Computational Journey Across Multiphysics and Multitemporal Scales: From Slow Tectonic Movement to Fast Earthquake Slip. PhD Dissertation, University of Memphis. ProQuest ID: 10787825. https://www.proquest.com/openview/411308c78d420322539ea37cbffb3c99/1
Jourdon, A., Ragon, T., Gabriel, A.-A., Madden, E., & Ulrich, T. (2025). Coupling 3D Geodynamics and Dynamic Earthquake Rupture: Fault Geometry, Rheology and Stresses Across Timescales. Journal of Geophysical Research: Solid Earth. https://doi.org/10.1029/2025JB031730
van Zelst, I., Wollherr, S., Gabriel, A.-A., Madden, E. H., & van Dinther, Y. (2019). Modeling Megathrust Earthquakes Across Scales: One-Way Coupling From Geodynamics and Seismic Cycles to Dynamic Rupture. Journal of Geophysical Research: Solid Earth. https://doi.org/10.1029/2019JB017539
Bangerth, W., Dannberg, J., Gassmöller, R., Heister, T., et al. (2023). ASPECT v2.5.0: Advanced Solver for Problems in Earth's ConvecTion. Zenodo. https://aspect.geodynamics.org/
Zhong, S., McNamara, A., Tan, E., Moresi, L., & Gurnis, M. CitcomS. Computational Infrastructure for Geodynamics. https://github.com/geodynamics/citcoms
Moresi, L., Mansour, J., Beall, A., et al. (2022). Underworld3. https://www.underworldcode.org/
Bauer, S., Huber, M., Mohr, M., Ruede, U., & Wohlmuth, B. (2020). TerraNeo — Mantle Convection Beyond a Trillion Degrees of Freedom. In Software for Exascale Computing – SPPEXA 2016-2019. Springer. https://doi.org/10.1007/978-3-030-47956-5_19
Bauer, S., Huber, M., Mohr, M., Ruede, U., & Wohlmuth, B. (2023). Challenges for Mantle Convection Simulations at the Exa-Scale. Springer. https://doi.org/10.1007/978-3-031-29082-4_4
Ilangovan, P., Kohl, N., & Mohr, M. (2025). Highly Scalable Geodynamic Simulations with HyTeG. EGUsphere (preprint). https://egusphere.copernicus.org/preprints/2025/egusphere-2025-2552/
McCallen, D., Pitarka, A., et al. (2024). EQSIM — A Multidisciplinary Framework for Fault-to-Structure Earthquake Simulations on Exascale Computers. Earthquake Spectra. https://doi.org/10.1177/87552930241246235
Mousavi, S.M. & Beroza, G.C. (2022). Deep-Learning Seismology. Science, 377(6607), eabm4470. https://doi.org/10.1126/science.abm4470
Agarwal, S., Tosi, N., Breuer, D., Padovan, S., & Ruedas, T. (2020). A Machine-Learning-Based Surrogate Model of Mars' Thermal Evolution. Geophysical Journal International, 222(3), 1656–1670. https://doi.org/10.1093/gji/ggaa234
Agarwal, S., Tosi, N., Breuer, D., Padovan, S., Kessel, P., & Montavon, G. (2021). Deep Learning for Surrogate Modeling of Two-Dimensional Mantle Convection. Physical Review Fluids, 6, 113801. https://doi.org/10.1103/PhysRevFluids.6.113801
Agarwal, S., Tosi, N., Hüttig, C., Greenberg, D.S., & Bekar, A.C. (2025). Accelerating the Discovery of Steady-States of Planetary Interior Dynamics With Machine Learning. Journal of Geophysical Research: Machine Learning and Computation. https://doi.org/10.1029/2024JH000438
(2025). Physics-Based Machine Learning for Mantle Convection Simulations. Physics of Fluids, 37, 086624. https://doi.org/10.1063/5.0266116
Kong, C., Gurnis, M., & Ross, Z.E. (2026). Forward and Inverse Mantle Convection with Neural Operators. arXiv preprint. https://arxiv.org/abs/2601.23178
Gelbrecht, M. et al. (2023). Surrogate Modeling for the Climate Sciences Dynamics with Machine Learning and Data Assimilation. Frontiers in Applied Mathematics and Statistics. https://doi.org/10.3389/fams.2023.1133226
IGyTheM — Integrating Geodynamics with Thermodynamics using Machine Learning. CIG Community Newsletter, February 2026. https://community.geodynamics.org/t/cig-newsletter-february-2026/4451
Rackauckas, C. et al. (2023). Perspectives of Physics-Based Machine Learning Strategies for Geoscientific Applications Governed by Partial Differential Equations. Geoscientific Model Development, 16, 7375–7409. https://doi.org/10.5194/gmd-16-7375-2023
Gerya, T.V. (2022). Numerical Modeling of Subduction: State of the Art and Future Directions. Geosphere, 18(2), 503–561. https://doi.org/10.1130/GES02472.1
(2025). Advances and Perspectives in Numerical Modeling for Geodynamics. Science China Earth Sciences. https://doi.org/10.1007/s11430-025-1791-3