Applications for ultrascale computing

Milan Mihajlovic, Lars Ailo Bongo, Raimondas Ciegis, Neki Frasheri, Dragi Kimovski, Peter Kropf, Svetozar Margenov, Maya Neytcheva, Thomas Rauber, Gudula Runger, Roman Trobec, Roel Wuyts, Roman Wyrzykowski, Jing Gong


Studies of complex physical and engineering systems, represented by multi-scale and multi-physics computer simulations, have an increasing demand for computing power, especially when simulations of realistic problems are considered. This demand is driven by the increasing size and complexity of the studied systems or by time constraints. Ultrascale computing systems offer a possible solution to this problem. Future ultrascale systems will be large-scale complex computing systems combining technologies from high performance computing, distributed systems, big data, and cloud computing. Thus, the challenge of developing and programming complex algorithms on these systems is twofold. Firstly, the complex algorithms have to be either developed from scratch or redesigned in order to yield high performance while retaining correct functional behaviour. Secondly, ultrascale computing systems impose a number of non-functional cross-cutting concerns, such as fault tolerance or energy consumption, which can significantly impact the deployment of applications on large complex systems. This article discusses the state of the art of programming for current and future large-scale systems, with an emphasis on complex applications. We derive a number of programming and execution support requirements by studying several computing applications that the authors are currently developing, and discuss their potential and the upgrades necessary for ultrascale execution.


Publishing Center of South Ural State University (454080, Lenin prospekt, 76, Chelyabinsk, Russia)