SLOWER: A performance model for Exascale computing

Thomas Sterling; Daniel Kogler; Matthew Anderson; Maciej Brodowicz

doi:10.14529/jsfi140203

Authors

Thomas Sterling, Indiana University, Bloomington
Daniel Kogler, Indiana University, Bloomington
Matthew Anderson, Indiana University, Bloomington
Maciej Brodowicz, Indiana University, Bloomington

Abstract

A performance framework, SLOWER, is introduced to facilitate the development and optimization of extreme-scale abstract execution models and the future systems derived from them. SLOWER defines a six-dimensional design trade-off space based on sources of performance degradation that are invariant across system classes. Exemplar previous-generation execution models (e.g., vector) are examined in terms of the SLOWER parameters to illustrate their alternative responses to changing enabling technologies. New technology trends leading to nano-scale and the end of Moore's Law demand future innovations to address these same performance factors. An experimental execution model, ParalleX, is described to postulate one possible advanced abstraction upon which to base next-generation hardware and software systems. A detailed examination is presented of how this class of dynamic adaptive execution model addresses SLOWER for advances in efficiency and scalability. To represent the SLOWER trade-off space, a queue model has been developed and is described. A set of simulation experiments spanning ranges of key parameters is presented to expose some initial properties of the SLOWER framework.

References

A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman. LogGP: Incorporating long messages into the LogP model — one step closer towards a realistic model for parallel computation. In Proceedings of the Seventh Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’95, pages 95–105, New York, NY, USA, 1995. ACM.

G. M. Amdahl. Validity of the single processor approach to achieving large scale computing capabilities. In Proceedings of the April 18-20, 1967, Spring Joint Computer Conference, AFIPS ’67 (Spring), pages 483–485, New York, NY, USA, 1967. ACM.

G. E. Blelloch. Vector Models for Data-parallel Computing. MIT Press, Cambridge, MA, USA, 1990.

K. W. Cameron, R. Ge, and X.-H. Sun. logNP and log3P: Accurate analytical models of point-to-point communication in distributed systems. IEEE Trans. Comput., 56(3):314–327, Mar. 2007.

C.-K. Chui. The LogP and MLogP models for parallel image processing with multi-core microprocessor. In Proceedings of the 2010 Symposium on Information and Communication Technology, SoICT ’10, pages 23–27, New York, NY, USA, 2010. ACM.

D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: Towards a realistic model of parallel computation. In Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP ’93, pages 1–12, New York, NY, USA, 1993. ACM.

C. Dekate, M. Anderson, M. Brodowicz, H. Kaiser, B. Adelstein-Lelbach, and T. Sterling. Improving the scalability of parallel N-body applications with an event-driven constraint-based execution model. International Journal of High Performance Computing Applications, 26(3):319–332, 2012.

R. Dennard, F. Gaensslen, H.-N. Yu, V. Rideout, E. Bassous, and A. LeBlanc. Design of Ion-Implanted MOSFET’s with Very Small Physical Dimensions. IEEE Journal of Solid-State Circuits, SC-9(5):256–268, 1974.

D. Eppstein and Z. Galil. Parallel algorithmic techniques for combinatorial computation. Annual Review of Computer Science, 3(1):233–283, 1988.

F. Ino, N. Fujimoto, and K. Hagihara. LogGPS: A parallel computational model for synchronization analysis. SIGPLAN Not., 36(7):133–142, June 2001.

R. Pasupathy, S. Kim, A. Tolk, R. Hill, and M. Kuhl. Open-source simulation software “JAAMSIM”.