Energy Efficiency for Ultrascale Systems: Challenges and Trends from Nesus Project

Michel Bagein, Jorge Barbosa, Vicente Blanco, Ivona Brandic, Samuel Cremer, Sebastien Fremal, Helen Karatza, Laurent Lefevre, Toni Mastelic, Ariel Oleksiak, Anne-Cecile Orgerie, Georgios L. Stavrinides, Sebastien Varrette


Energy consumption is one of the main limiting factors in designing and deploying ultrascale systems. This paper therefore presents the challenges and trends in energy efficiency for ultrascale systems, based on the current activities of the "Energy Efficiency" working group of the European COST Action Nesus IC1305. The analysis covers the major areas of energy-efficiency research for ultrascale systems: heterogeneous and low-power hardware architectures, power monitoring at large scale, modeling and simulation of ultrascale systems, energy-aware scheduling and resource management, and energy-efficient application design.

Publishing Center of South Ural State University (454080, Lenin prospekt, 76, Chelyabinsk, Russia)