Efficient Implementation of Liquid Crystal Simulation Software on Modern HPC Platforms

Authors

  • Ilya V. Afanasyev, Moscow Center of Fundamental and Applied Mathematics; Research Computing Center, Lomonosov Moscow State University
  • Dmitry I. Lichmanov, Moscow Center of Fundamental and Applied Mathematics; Research Computing Center, Lomonosov Moscow State University
  • Vladimir Yu. Rudyak, Faculty of Physics, Lomonosov Moscow State University
  • Vadim V. Voevodin, Moscow Center of Fundamental and Applied Mathematics; Research Computing Center, Lomonosov Moscow State University

DOI:

https://doi.org/10.14529/jsfi210306

Keywords:

NVIDIA GPU, NEC SX-Aurora TSUBASA, liquid crystals, HPC, co-design, performance optimization, Monte Carlo, cubic lattice

Abstract

In this paper we demonstrate the process of efficiently porting a software package for Markov chain Monte Carlo (MCMC) simulations on a finite cubic lattice to multiple modern architectures: Pascal, Volta and Turing NVIDIA GPUs, NEC SX-Aurora TSUBASA vector engines and Intel Xeon Gold processors. In the studied software, the MCMC methodology is used to simulate liquid crystal structures, but it can also be employed in a wide range of problems in mathematical physics and numerical methods. The main goals of this work are to determine the best software optimization strategy for this class of algorithms and to examine the speed and efficiency of such simulations on modern HPC platforms. We evaluate the effects of various optimizations, such as more suitable memory access patterns, multitasking for efficient utilization of the massive parallelism of the target architectures, improved cache hit rates and parallel workload balancing. We perform a detailed performance analysis for each target platform using software tools such as nvprof, Ftrace and VTune. On this basis, we evaluate and compare the efficiency of the developed computational kernels on different platforms and rank the platforms by their performance. The results show that the NVIDIA GPU and NEC SX-Aurora TSUBASA platforms, although they seem very different at first glance, in many cases require similar optimization approaches because of similarities in their data processing principles.

Published

2021-10-20

How to Cite

Afanasyev, I. V., Lichmanov, D. I., Rudyak, V. Y., & Voevodin, V. V. (2021). Efficient Implementation of Liquid Crystal Simulation Software on Modern HPC Platforms. Supercomputing Frontiers and Innovations, 8(3), 104–125. https://doi.org/10.14529/jsfi210306
