Core Module Optimizing PDE Sparse Matrix Models With HPCG Example

Earle Jennings

doi:10.14529/jsfi170205

Authors

Earle Jennings QSigma, Inc.

Abstract

This paper introduces a fundamentally new computer architecture for supercomputers. The core module is application compatible with an existing superscalar microprocessor, with minimized energy use, and is optimized for local sparse matrix operations. Optimized sparse matrix manipulation is discussed by analyzing the High Performance Conjugate Gradient (HPCG) benchmark specification. This analysis shows how the DRAM memory wall is removed for this benchmark, and for sparse matrix models of partial differential equations (PDEs) for a wide cross section of applications. By giving the programmer improved control over the configuration of the supercomputer, the potential for communication problems is minimized. Application compatibility is achieved while removing the superscalar instruction interpreter and multi-thread controller from the existing microprocessor’s hardware. These are transformed into compile-time utilities. The instruction cache is removed through an innovation in VLIW instruction processing. The data caches are unnecessary and are turned off in order to optimally implement sparse matrix models.
