Scalability prediction for fundamental performance factors

Claudia Rosas, Judit Giménez, Jesús Labarta


Inferring the expected performance of parallel applications is harder than ever: applications must be modeled for systems that are restricted or do not yet exist, and performance analysts must identify and extrapolate application behavior using only the resources available. Prediction models can be based on detailed knowledge of the application's algorithms or on blind extrapolation of measurements from existing architectures and codes. This paper describes an intermediate methodology in which the combination of (a) essential knowledge about the fundamental factors in parallel codes and (b) detailed analysis of application behavior at low core counts on current platforms guides the modeling effort to estimate behavior at very large core counts. The methodology integrates several components, including an instrumentation package, visualization tools, simulators, analytical models, and very high-level information from the application running on production systems, to build a performance model.


Publishing Center of South Ural State University (454080, Lenin prospekt, 76, Chelyabinsk, Russia)