Zhancun Mu

GlobalTomo: A global dataset for physics-ML seismic wavefield modeling and full-waveform inversion

Shiqian Li, Zhi Li, Zhancun Mu, Shiji Xin, Zhixiang Dai, Kuangdai Leng, Ruihua Zhang, Xiaodong Song, Yixin Zhu

NeurIPS 2025 Datasets and Benchmarks Track • 2025

conference

Abstract

GlobalTomo is a global dataset for physics-ML seismic wavefield modeling and full-waveform inversion. It lets researchers apply machine learning to seismic data analysis and to modeling Earth’s internal structure.

Dataset Overview

Scale and Coverage

  • Global Coverage: Worldwide seismic data collection
  • Temporal Range: Multi-year data spanning various seismic events
  • Data Volume: Terabytes of processed seismic waveforms
  • Resolution: High-resolution spatial and temporal sampling

Data Types

  1. Seismic Waveforms: Raw and processed seismic signals
  2. Velocity Models: 3D Earth structure models
  3. Event Catalogs: Earthquake and other seismic event metadata
  4. Station Information: Global seismic station network data

Applications

Physics-ML Integration

  • Combining physical models with machine learning
  • Data-driven velocity model construction
  • Automated event detection and characterization

Full-Waveform Inversion

  • Enhanced inversion algorithms using ML
  • Improved computational efficiency
  • Better handling of complex Earth structures
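To make the idea of full-waveform inversion concrete, here is a minimal sketch under strong simplifying assumptions. It is not the method used for GlobalTomo: the forward model is a hypothetical analytic stand-in (a Ricker pulse arriving at `distance / velocity`), the misfit is the conventional least-squares waveform misfit, and the inversion is a plain exhaustive search over one velocity parameter. All names and numbers (`forward`, `l2_misfit`, the 1000 km distance, the 8 km/s velocity) are illustrative assumptions.

```python
import numpy as np

def ricker(t, f0):
    """Ricker wavelet with peak frequency f0 (Hz), centered at t = 0."""
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def forward(velocity, distance, t, f0=0.05):
    """Toy forward model (assumption): a Ricker pulse arriving at distance / velocity."""
    return ricker(t - distance / velocity, f0)

def l2_misfit(synthetic, observed, dt):
    """Least-squares waveform misfit: 0.5 * integral of the squared residual."""
    return 0.5 * np.sum((synthetic - observed) ** 2) * dt

# Hypothetical setup: receiver 1000 km from the source, true velocity 8 km/s.
dt = 0.5
t = np.arange(0.0, 400.0, dt)
distance, v_true = 1000.0, 8.0
observed = forward(v_true, distance, t)

# "Invert" by evaluating the misfit for every candidate velocity.
candidates = np.linspace(6.0, 10.0, 81)
misfits = np.array(
    [l2_misfit(forward(v, distance, t), observed, dt) for v in candidates]
)
v_best = candidates[np.argmin(misfits)]
```

The exhaustive search is only workable because the toy model has a single parameter. Realistic inversions have far too many unknowns for this and typically rely on gradient-based optimization, which is where faster ML forward models could plausibly help.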

Technical Details

Data Processing Pipeline

Raw Seismic Data → Quality Control → Feature Extraction → ML-Ready Format
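The four stages above can be sketched as a chain of small functions. This is an illustrative outline only: the page does not describe the actual processing steps, so the specific checks (finite values, non-flat traces, an amplitude bound), the normalization, and every function name here are assumptions, and the input is synthetic noise rather than real data.

```python
import numpy as np

def quality_control(traces, max_abs=1e3):
    """Keep traces that are finite, non-flat, and within an amplitude bound."""
    keep = [
        bool(
            np.all(np.isfinite(tr))
            and np.std(tr) > 0.0
            and np.max(np.abs(tr)) <= max_abs
        )
        for tr in traces
    ]
    return traces[np.array(keep, dtype=bool)]

def extract_features(traces):
    """Demean each trace and scale it to unit peak amplitude."""
    demeaned = traces - traces.mean(axis=1, keepdims=True)
    peak = np.max(np.abs(demeaned), axis=1, keepdims=True)
    return demeaned / peak

def to_ml_ready(features):
    """Pack features as a contiguous float32 array (n_traces, n_samples)."""
    return np.ascontiguousarray(features, dtype=np.float32)

def run_pipeline(raw):
    """Raw data -> quality control -> feature extraction -> ML-ready array."""
    return to_ml_ready(extract_features(quality_control(raw)))

# Synthetic stand-in for raw data: 4 traces of 200 samples, one corrupted.
rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 200))
raw[2, 10] = np.nan
batch = run_pipeline(raw)
```

Keeping each stage as a pure function makes it straightforward to swap in a different quality check or feature set without touching the rest of the chain.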

Machine Learning Applications

  • Neural network-based wavefield modeling
  • Deep learning for velocity estimation
  • Automated data quality assessment
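As a rough illustration of neural network-based wavefield modeling, the sketch below trains a tiny one-hidden-layer network to map a velocity value to a waveform. It is a toy under stated assumptions, not the architecture or training setup used with GlobalTomo: the "simulator" `toy_waveform` is a hypothetical analytic stand-in, the network is a plain NumPy MLP trained by full-batch gradient descent, and the sizes and learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_waveform(velocity, t, distance=1000.0, f0=0.05):
    """Stand-in simulator (assumption): Ricker pulse arriving at distance / velocity."""
    a = (np.pi * f0 * (t - distance / velocity)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

# Training set: velocity (input) -> waveform samples (output).
t = np.linspace(0.0, 300.0, 64)
v = rng.uniform(6.0, 10.0, size=(256, 1))
X = (v - 8.0) / 2.0                  # normalize inputs to [-1, 1]
Y = toy_waveform(v, t[None, :])      # shape (256, 64)

# One-hidden-layer MLP with tanh activation.
hidden = 32
W1 = rng.normal(scale=1.0, size=(1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(hidden, 64))
b2 = np.zeros(64)

def predict(inputs):
    """Surrogate forward pass: velocity features -> predicted waveform."""
    return np.tanh(inputs @ W1 + b1) @ W2 + b2

def mse(inputs, targets):
    """Mean squared error over all traces and samples."""
    return float(np.mean((predict(inputs) - targets) ** 2))

loss_before = mse(X, Y)
lr = 0.05
for _ in range(500):
    H = np.tanh(X @ W1 + b1)                 # hidden activations
    G = 2.0 * (H @ W2 + b2 - Y) / Y.size     # gradient of MSE w.r.t. predictions
    GH = (G @ W2.T) * (1.0 - H ** 2)         # backpropagate through tanh
    W2 -= lr * (H.T @ G)
    b2 -= lr * G.sum(axis=0)
    W1 -= lr * (X.T @ GH)
    b1 -= lr * GH.sum(axis=0)
loss_after = mse(X, Y)
```

Once trained, a surrogate like `predict` replaces a call to the simulator with a few matrix products, which is the basic appeal of learned wavefield models. A network this small is not expected to be accurate; it only shows the training loop.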

Impact

This dataset enables:

  • Advanced seismic imaging techniques
  • Better understanding of Earth’s internal structure
  • Improved earthquake hazard assessment
  • Development of next-generation seismic analysis tools

Access and Usage

The dataset is publicly available through our project website and includes:

  • Comprehensive documentation
  • Example usage scripts
  • Benchmark tasks for ML evaluation