Preparing for In Situ Processing on Upcoming Leading-edge Supercomputers

James Kress, Randy Michael Churchill, Scott Klasky, Mark Kim, Hank Childs, David Pugmire


High-performance computing applications are producing increasingly large amounts of data, placing enormous stress on the capabilities of traditional post hoc visualization techniques. Because of the growing imbalance between compute and I/O, data reductions, including in situ visualization, are required. These reduced data are used for analysis and visualization in a variety of ways. Many of the visualization and analysis requirements are known a priori, but when they are not, scientists depend on the reduced data to accurately represent the simulation in post hoc analysis. The contribution of this paper is a description of the directions we are pursuing to help a large-scale fusion simulation code succeed on the next generation of supercomputers. These directions include the role of in situ processing in performing data reductions, as well as the tradeoffs between data size and data integrity within the context of complex operations in a typical scientific workflow.


Publishing Center of South Ural State University (454080, Lenin prospekt, 76, Chelyabinsk, Russia)