Continuum Computing - on a New Performance Trajectory beyond Exascale

Maciej Brodowicz; Thomas Sterling; Matthew Anderson

doi:10.14529/jsfi180301

Authors

Maciej Brodowicz Indiana University
Thomas Sterling Indiana University
Matthew Anderson Indiana University

Abstract

The end of Moore's Law is a cliche that nonetheless poses a hard barrier to future scaling of high performance computing systems. A factor of about 4x in device density is all that is left of this form of improved throughput, while a 5x gain is required just to reach the milestone of exascale. The remaining sources of performance improvement are better delivered efficiency, worth more than 10x, and alternative architectures that make better use of chip real estate. This paper discusses the set of principles guiding a potential future of non-von Neumann architectures as adopted by the experimental class of Continuum Computer Architecture (CCA), which is being explored by the Semantic Memory Architecture Research Team (SMART) at Indiana University. CCA comprises a homogeneous aggregation of cellular components (function cells) that are orders of magnitude smaller than lightweight cores. Individually, these cells are unable to accomplish a computation, but in combination they can do so with extreme cost efficiency and unprecedented scalability. It will be seen that a path exists, based on such unconventional methods as neuromorphic computing or dataflow, that not only will meet the likely exascale milestone in the same time frame with much better power, cost, and size, but also will set a new performance trajectory leading to zettaflops capability before 2030.
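The scaling argument in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the 4x, 5x, and 10x factors quoted above, and it assumes (this is not stated in the abstract) that the gains are independent and combine multiplicatively. The "more than 10x" efficiency gain is treated as exactly 10x, so the result is a lower bound. The function name `projected_exaflops` and all constant names are hypothetical.

```python
# Factors quoted in the abstract.
DENSITY_GAIN = 4.0      # "about 4x" remaining in device density
GAP_TO_EXASCALE = 5.0   # "5x gain required" to reach exascale
EFFICIENCY_GAIN = 10.0  # "more than 10x"; treated as a lower bound here

EXAFLOPS = 1.0                  # unit of measure for this sketch
ZETTAFLOPS = 1000.0 * EXAFLOPS  # SI prefixes: zetta = 1000 x exa


def projected_exaflops(baseline, *gains):
    """Scale a baseline by gain factors.

    Assumption: the gains are independent and multiply together.
    """
    result = baseline
    for gain in gains:
        result *= gain
    return result


# Starting point implied by the 5x gap: one fifth of an exaflops.
baseline = EXAFLOPS / GAP_TO_EXASCALE

# Device density alone falls short of the exascale milestone.
density_only = projected_exaflops(baseline, DENSITY_GAIN)

# Adding the efficiency gain passes exascale under the stated assumption.
with_efficiency = projected_exaflops(baseline, DENSITY_GAIN, EFFICIENCY_GAIN)

# Factor still needed from other sources to reach zettaflops.
remaining_for_zetta = ZETTAFLOPS / with_efficiency

if __name__ == "__main__":
    print(f"density only:      {density_only:.2f} exaflops")
    print(f"with efficiency:   {with_efficiency:.2f} exaflops")
    print(f"still needed for zettaflops: {remaining_for_zetta:.0f}x")
```

Under these assumptions, density scaling alone stops short of exascale, and even the combined density and efficiency gains leave a gap of roughly two orders of magnitude to zettaflops. That gap is what the abstract attributes to alternative architectures.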
