Improving Reliability of Supercomputer CFD Codes on Unstructured Meshes

Andrey V. Gorobets, Pavel A. Bakhvalov

Abstract


The paper describes a particular technical solution aimed at improving the reliability and quality of a highly parallel computational fluid dynamics code written in C++. The code is based on rather complex high-accuracy numerical methods and models for simulating turbulent flows on unstructured hybrid meshes. The cost of software errors is very high in large-scale supercomputer simulations. Reproducing and localizing errors, especially “magic” unstable bugs related to wrong memory access, is extremely problematic due to the large amount of computing resources involved. In order to prevent, or at least notably filter out, memory bugs, an approach of increased reliability is proposed for representing mesh data and organizing memory access. A set of containers is proposed that causes no overhead in the release configuration compared to plain arrays. At the same time, it provides thorough access control in the safe mode configuration and additional compile-time protection from programming errors. Furthermore, it is fully compatible with heterogeneous computing within the OpenCL standard. The proposed approach provides internal debugging capabilities that allow us to localize problems directly in a supercomputer simulation.
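
The idea of a container that behaves like a plain array in the release configuration but checks every access in a safe mode can be illustrated with a minimal C++ sketch. This is not the authors' implementation: the names `MeshArray`, `Index`, `NodeTag`, `ElemTag` and the `SAFE_MODE` macro are hypothetical, and throwing an exception on a bad access is only one possible reaction. The tagged index type stands in for the compile-time protection mentioned above: an element index cannot be used to address an array of nodal data.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical build switch: compile with -DSAFE_MODE to enable access checks.
#ifdef SAFE_MODE
constexpr bool kSafeDefault = true;
#else
constexpr bool kSafeDefault = false;
#endif

// Strongly typed index. The Tag makes indices of different mesh entities
// incompatible types, so mixing them up is a compile-time error.
template <class Tag>
struct Index {
    std::size_t value;
    explicit constexpr Index(std::size_t v) : value(v) {}
};

struct NodeTag {};  // indices of mesh nodes
struct ElemTag {};  // indices of mesh elements
using NodeIdx = Index<NodeTag>;
using ElemIdx = Index<ElemTag>;

// Non-owning view over a plain array of T, addressed only by Index<Tag>.
// With Safe == false, operator[] reduces to a raw pointer access.
template <class T, class Tag, bool Safe = kSafeDefault>
class MeshArray {
public:
    MeshArray(T* data, std::size_t size, const char* name)
        : data_(data), size_(size), name_(name) {
        assert(data != nullptr || size == 0);
    }

    T& operator[](Index<Tag> i) const {
        // Safe is a compile-time constant, so the check is removed
        // by the compiler when Safe == false.
        if (Safe && i.value >= size_) {
            throw std::out_of_range(
                std::string("MeshArray '") + name_ + "': index " +
                std::to_string(i.value) + " out of range, size " +
                std::to_string(size_));
        }
        return data_[i.value];
    }

    std::size_t size() const { return size_; }

private:
    T* data_;
    std::size_t size_;
    const char* name_;  // used only to report which array was misused
};

// Example of use: sum a nodal field through the checked interface.
template <bool Safe>
double sum_nodal(const MeshArray<double, NodeTag, Safe>& a) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[NodeIdx(i)];
    return s;
}
```

In this sketch, a call such as `a[ElemIdx(0)]` on a `MeshArray<double, NodeTag>` does not compile, while an out-of-range `NodeIdx` is reported together with the array name only when the safe mode is enabled.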




Publishing Center of South Ural State University (454080, Lenin prospekt, 76, Chelyabinsk, Russia)