Data Science for Geoscience



  Prof. Dr. Jef Caers (Stanford University, USA)
  2 days
  Data Science - Machine Learning
  10 CPD points

Course description

This course provides an overview of the areas of data science most relevant to geoscientific challenges and questions concerning the environment, earth resources and hazards. The focus lies on methods that treat common characteristics of geoscientific data: multivariate, multi-scale, compositional, geospatial and space-time. In addition, the course treats those statistical methods that allow quantification of the “human dimension”: the impact on humans (e.g. hazards, contamination) and the impact of humans on the environment (e.g. contamination, land use). The course focuses on developing skills that are not covered in traditional statistics and machine learning courses.

The material emphasizes exposure and application over in-depth methodological or theoretical development. Data science areas covered are: extreme value statistics, multivariate analysis, factor analysis, compositional data analysis, spatial information aggregation, spatial analysis and estimation, geostatistics and spatial uncertainty, treating data observed at different scales, and spatio-temporal modeling. The focus lies on developing practical skills with real data sets, running software and interpreting results.

Course objectives

The objectives of this course are to:

  • Discover fields of data science typically not covered in traditional courses
  • Identify a combination of data science methods to address a specific geoscientific question or challenge, whether related to the environment, earth resources or hazards, and its impact on humans
  • Use statistical software on real datasets and communicate the results to a non-expert audience


Course outline

Part I: Extremes
* Statistical analysis of skewed data
* Extreme value statistics
* Applications: size and magnitude distributions (volcanoes, diamonds, earthquakes), extreme flooding, weather, climate.
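
As a rough illustration of the kind of calculation extreme value statistics involves, the sketch below fits a Gumbel distribution (a special case of the generalized extreme value family, chosen here only for simplicity) to block maxima by the method of moments and computes a return level. The function names and the data are hypothetical and are not taken from the course material.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(maxima):
    """Method-of-moments estimates (location mu, scale beta) of a Gumbel distribution."""
    mean = statistics.mean(maxima)
    std = statistics.stdev(maxima)  # sample standard deviation
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def return_level(mu, beta, period):
    """Level exceeded on average once every `period` blocks (e.g. years)."""
    return mu - beta * math.log(-math.log(1.0 - 1.0 / period))


# Hypothetical annual maximum river stages in metres (made-up numbers).
annual_maxima = [3.1, 2.8, 4.0, 3.5, 5.2, 2.9, 3.8, 4.4, 3.3, 6.1]
mu, beta = fit_gumbel_moments(annual_maxima)
level_10 = return_level(mu, beta, 10)
level_100 = return_level(mu, beta, 100)
print(f"10-year level: {level_10:.2f} m, 100-year level: {level_100:.2f} m")
```

With only ten maxima the estimates are very uncertain; in practice a maximum likelihood fit of the full generalized extreme value distribution with confidence intervals would usually be preferred.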

Part II: Compositions
* Compositional data analysis
* Applications: geochemical data in Earth Resources
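
A common first step in compositional data analysis is a log-ratio transform. The sketch below shows the centred log-ratio (clr) transform under the assumption that all parts are strictly positive; the function name and the sample values are hypothetical.

```python
import math


def clr(composition):
    """Centred log-ratio transform of a composition with strictly positive parts."""
    if any(part <= 0 for part in composition):
        raise ValueError("clr is undefined for zero or negative parts")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log is the same as dividing by the geometric mean.
    return [value - mean_log for value in logs]


# Hypothetical three-part composition (made-up values), in percent and as fractions.
clr_percent = clr([70.0, 20.0, 10.0])
clr_fraction = clr([0.7, 0.2, 0.1])
print(clr_percent)
```

The transformed values sum to zero, and the result does not depend on whether the composition is expressed in percent or in fractions, which is why the transform is suited to closed data such as geochemical assays.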

Part III: Causality
* Multivariate analysis of compositional data
* Applications: pollution, water quality, anomaly detection, Earth Resources prospecting.

Part IV: Geospatial analysis
* Bayesian Aggregation of geospatial information
* Weights of Evidence method
* Logistic regression
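
To give a sense of the Weights of Evidence method, the sketch below computes the positive weight, negative weight and contrast for a single binary evidence pattern from cell counts. It assumes the usual log-ratio definition of the weights; the function name, argument names and counts are hypothetical.

```python
import math


def weights_of_evidence(n_total, n_deposit, n_pattern, n_both):
    """Return (W+, W-, contrast) for one binary evidence pattern.

    n_total:   number of unit cells in the study area
    n_deposit: cells containing a known occurrence (D)
    n_pattern: cells where the evidence pattern is present (B)
    n_both:    cells with both an occurrence and the pattern
    """
    n_no_deposit = n_total - n_deposit
    p_b_given_d = n_both / n_deposit
    p_b_given_not_d = (n_pattern - n_both) / n_no_deposit
    w_plus = math.log(p_b_given_d / p_b_given_not_d)
    w_minus = math.log((1.0 - p_b_given_d) / (1.0 - p_b_given_not_d))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts (made-up): 1000 cells, 50 occurrences, pattern on 200 cells,
# 30 occurrences falling on the pattern.
w_plus, w_minus, contrast = weights_of_evidence(1000, 50, 200, 30)
print(f"W+ = {w_plus:.3f}, W- = {w_minus:.3f}, contrast = {contrast:.3f}")
```

A positive W+ together with a negative W- indicates that occurrences are found on the pattern more often than chance would suggest; when the pattern is unrelated to the occurrences, both weights are close to zero.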

Part V: Spatial uncertainty
* Spatial analysis, geostatistics & spatial uncertainty
* Applications: interpolating remote sensing data, pollution data, groundwater/reservoir modeling
* Variogram Analysis
* Kriging
* Multiple-point geostatistics
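
Variogram analysis starts from an empirical semivariogram. The sketch below computes one for the simplest possible case, a regularly spaced 1D transect with integer lags; real data sets are usually 2D or 3D and irregularly sampled, which requires lag binning. The function name and the data are hypothetical.

```python
def empirical_semivariogram(values, max_lag):
    """Semivariance for integer lags 1..max_lag on a regularly spaced 1D transect."""
    gamma = {}
    for lag in range(1, max_lag + 1):
        squared_diffs = [
            (values[i + lag] - values[i]) ** 2 for i in range(len(values) - lag)
        ]
        if squared_diffs:
            # Half the mean squared difference between pairs separated by `lag`.
            gamma[lag] = sum(squared_diffs) / (2 * len(squared_diffs))
    return gamma


# Made-up transect with a pure linear trend.
transect = [0.0, 1.0, 2.0, 3.0, 4.0]
gamma = empirical_semivariogram(transect, 3)
print(gamma)
```

For this trend-only transect the semivariance keeps growing with lag and never levels off at a sill, which is the usual sign that a trend should be removed before fitting a variogram model for kriging.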

Participants' profile

Geoscientists and geo-engineers who wish to expand their knowledge of data science methods specifically applicable to earth science data sets: skewed, compositional/multivariate and spatio-temporal data.


Recommended reading

Coles, S. (2001). An introduction to statistical modeling of extreme values. London: Springer.

Pawlowsky-Glahn, V., & Buccianti, A. (2011). Compositional data analysis: Theory and applications. John Wiley & Sons.

Härdle, W., & Simar, L. (2003). Applied multivariate statistical analysis. Berlin: Springer.

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. New York: Springer.


About the instructor

Jef Caers received both an MSc (’93) in mining engineering / geophysics and a PhD (’97) in engineering from the Katholieke Universiteit Leuven, Belgium. He is currently Professor of Geological Sciences (since 2015), and was previously Professor of Energy Resources Engineering, at Stanford University, California, USA. He is also director of the Stanford Center for Earth Resources Forecasting, an industrial affiliates program in decision making under uncertainty with ~20 partners from the Earth Resources Industry. Dr. Caers’ research interests are quantifying uncertainty and risk in the exploration and exploitation of Earth Resources. He has published in a diverse range of journals covering Mathematics, Statistics, Geological Sciences, Geophysics, Engineering and Computer Science. He was awarded the Vistelius award by the IAMG in 2001 and was Editor-in-Chief of Computers and Geosciences (2010-2015). Dr. Caers has received several best paper awards and written four books: “Petroleum Geostatistics” (SPE, 2005), “Modeling Uncertainty in the Earth Sciences” (Wiley-Blackwell, 2011), “Multiple-point Geostatistics: stochastic modeling with training images” (Wiley-Blackwell, 2015) and “Quantifying Uncertainty in Subsurface Systems” (Wiley-Blackwell, 2018). Dr. Caers was awarded the 2014 Krumbein Medal of the IAMG for his career achievement.



