Accelerating Seismic Redatuming Using Tile Low-Rank Approximations on NEC SX-Aurora TSUBASA


  • Yuxi Hong, King Abdullah University of Science and Technology
  • Hatem Ltaief, King Abdullah University of Science and Technology
  • Matteo Ravasi, King Abdullah University of Science and Technology
  • Laurent Gatineau, NEC Deutschland GmbH, HPC Division
  • David Keyes, King Abdullah University of Science and Technology



To image subsurface discontinuities, seismic data recorded at the surface of the Earth must be numerically repositioned inside the subsurface, where the reflections originated, a process referred to as redatuming. The recently developed Marchenko method can handle full-wavefield data, including multiple arrivals. A downside of this approach is that a multi-dimensional convolution operator must be evaluated repeatedly to solve an expensive inverse problem. Because this operator applies multiple dense matrix-vector multiplications (MVM), we identify and leverage the data sparsity structure of each frequency matrix and propose to accelerate the MVM step using tile low-rank (TLR) matrix approximations. We study the impact of TLR on the MVM time-to-solution for different accuracy thresholds while assessing the quality of the resulting subsurface seismic wavefields, and show that TLR leads to minimal degradation in signal-to-noise ratio on a 3D synthetic dataset. We mitigate the load imbalance overhead and provide a performance evaluation on two distributed-memory systems. Our MPI+OpenMP TLR-MVM implementation reaches up to 3X speedup over the dense MVM counterpart from the NEC scientific library on 128 NEC SX-Aurora TSUBASA cards. Thanks to the second generation of high-bandwidth memory technology, it further attains up to 67X speedup over the dense MVM from Intel MKL running on 128 dual-socket 20-core Intel Cascade Lake nodes with DDR4 memory. This corresponds to 110 TB/s of aggregate sustained bandwidth for our TLR-MVM implementation, with no deterioration in the quality of the reconstructed seismic wavefields.
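The TLR-MVM idea summarized in the abstract can be illustrated with a minimal single-node NumPy sketch. It is not the authors' MPI+OpenMP implementation: the function names `compress_tlr` and `tlr_mvm`, the per-tile truncated SVD, the relative singular-value threshold `tol`, and the toy oscillatory kernel standing in for one frequency matrix are all assumptions made for illustration.

```python
import numpy as np

def compress_tlr(A, nb, tol):
    """Split A into nb x nb tiles and keep a truncated SVD of each tile.

    Assumption: singular values below tol * (largest singular value of the
    tile) are discarded. The paper's exact accuracy criterion may differ.
    """
    m, n = A.shape
    tiles = {}
    for i in range(0, m, nb):
        for j in range(0, n, nb):
            U, s, Vh = np.linalg.svd(A[i:i + nb, j:j + nb], full_matrices=False)
            k = max(1, int(np.sum(s > tol * s[0])))  # retained rank, at least 1
            # Fold the singular values into U so each tile is stored as U @ Vh.
            tiles[(i, j)] = (U[:, :k] * s[:k], Vh[:k, :])
    return tiles

def tlr_mvm(tiles, x, m):
    """Compute y = A @ x from the compressed tiles: y_i += U_ij (Vh_ij x_j)."""
    y = np.zeros(m, dtype=complex)
    for (i, j), (U, Vh) in tiles.items():
        y[i:i + U.shape[0]] += U @ (Vh @ x[j:j + Vh.shape[1]])
    return y

# Hypothetical stand-in for a single frequency matrix: an oscillatory kernel
# between 128 surface points and 128 subsurface points (illustrative only).
rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 128)[:, None]
xr = np.linspace(0.0, 1.0, 128)[None, :]
r = np.sqrt((xs - xr) ** 2 + 0.5 ** 2)
A = np.exp(-1j * 20.0 * r) / r
x = rng.standard_normal(128) + 1j * rng.standard_normal(128)

tiles = compress_tlr(A, nb=32, tol=1e-6)
y = tlr_mvm(tiles, x, A.shape[0])

stored = sum(U.size + Vh.size for U, Vh in tiles.values())
print("relative error:", np.linalg.norm(y - A @ x) / np.linalg.norm(A @ x))
print("stored entries / dense entries:", stored / A.size)
```

Because a matrix-vector product is bound by memory traffic, the benefit of this representation comes from reading fewer matrix entries per product; loosening `tol` lowers the retained ranks and the stored volume at the cost of accuracy, which is the trade-off the abstract evaluates against wavefield quality.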






How to Cite

Hong, Y., Ltaief, H., Ravasi, M., Gatineau, L., & Keyes, D. (2021). Accelerating Seismic Redatuming Using Tile Low-Rank Approximations on NEC SX-Aurora TSUBASA. Supercomputing Frontiers and Innovations, 8(2), 6–26.
