Optimizing Load Balance in a Parallel CFD Code for a Large-scale Turbine Simulation on a Vector Supercomputer


  • Osamu Watanabe, NEC Corporation
  • Kazuhiko Komatsu
  • Masayuki Sato
  • Hiroaki Kobayashi




A power-generation turbine is an essential piece of social infrastructure, and a turbine failure has severe social and economic consequences. It is therefore necessary to foresee such failures, but predicting them from observations of a real turbine is difficult. Numerical simulation of the events occurring inside the turbine is required instead. A multiphysics CFD code, “Numerical Turbine,” has been developed on vector supercomputer systems for large-scale simulations of unsteady wet steam flows inside a turbine. Numerical Turbine is a block-structured code parallelized with MPI, and its computational domain consists of grid blocks of different sizes. As a result, load imbalance occurs when the code runs in parallel under MPI. This paper builds an estimation model that derives the calculation time of each grid block from its calculation amount and calculation performance. It then proposes an OpenMP parallelization method for balancing the load of MPI applications. Based on the model, the proposed method reduces the load imbalance by taking into account how vector performance varies with the calculation amount. Moreover, the proposed method identifies the need to reduce the load imbalance without pre-execution. The performance evaluation shows that the proposed method reduces the load imbalance from 24.4% to 9.3%.
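The idea of estimating per-block time from calculation amount and calculation performance, then using OpenMP threads to even out MPI ranks, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's model: the saturating performance curve, the `half_size` parameter, the greedy thread assignment, the one-block-per-rank layout, and the imbalance metric (max minus mean, divided by max) are all hypothetical choices made here for the example.

```python
def vector_performance(cells, peak=1.0, half_size=1000.0):
    """Hypothetical curve: vector performance approaches `peak` as the
    amount of work per thread grows, and drops for small workloads."""
    return peak * cells / (cells + half_size)


def estimated_time(cells, threads=1):
    """Estimated time = calculation amount / calculation performance.
    Assumes the block's cells are split evenly across OpenMP threads."""
    per_thread = cells / threads
    return per_thread / vector_performance(per_thread)


def load_imbalance(times):
    """Assumed metric: (max - mean) / max over the per-rank times."""
    mean = sum(times) / len(times)
    return (max(times) - mean) / max(times)


def allocate_threads(block_cells, total_threads):
    """Greedy sketch: every rank starts with one thread, and each spare
    thread goes to the rank with the largest estimated time."""
    threads = [1] * len(block_cells)
    for _ in range(total_threads - len(block_cells)):
        times = [estimated_time(c, t) for c, t in zip(block_cells, threads)]
        threads[times.index(max(times))] += 1
    return threads


if __name__ == "__main__":
    blocks = [8000, 4000, 2000, 2000]  # cells per MPI rank (made-up sizes)
    before = [estimated_time(c) for c in blocks]
    threads = allocate_threads(blocks, total_threads=8)
    after = [estimated_time(c, t) for c, t in zip(blocks, threads)]
    print("threads per rank:", threads)
    print("imbalance before:", load_imbalance(before))
    print("imbalance after: ", load_imbalance(after))
```

Because the estimate depends only on block sizes and an assumed performance curve, the allocation can be computed before the simulation starts, which mirrors the abstract's point about not needing a pre-execution run.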






How to Cite

Watanabe, O., Komatsu, K., Sato, M., & Kobayashi, H. (2021). Optimizing Load Balance in a Parallel CFD Code for a Large-scale Turbine Simulation on a Vector Supercomputer. Supercomputing Frontiers and Innovations, 8(2), 114–130. https://doi.org/10.14529/jsfi210207
