Doctoral dissertation

Fast, Accurate, and Efficient Performance Modeling of Computer Systems: From Processors to Datacenters (빠르고 정확하고 효율적인 컴퓨터 시스템 성능 모델링: 프로세서부터 데이터센터까지)

Jaewon Lee (이재원), 2018
Thesis details
Paper impact by topic
Paper impact summary
Topics
  • Modeling
  • Performance
  • Computer systems
  • Evaluation
Total papers on the same topics: 899
Total citations of this thesis: 0
Average paper impact by topic: 0.0%

References

  • memcached - a distributed memory object caching system. http://www.memcached.org/.
  • memaslap - benchmark suite for memcached. http://docs.libmemcached.org/memaslap.html. Accessed: 2017-08-11.
  • Zhangxi Tan, Andrew Waterman, Rimas Avizienis, Yunsup Lee, Henry Cook, David Patterson, and Krste Asanovic. Ramp gold: an fpga-based architecture simulator for multiprocessors. In Design Automation Conference (DAC), 2010 47th ACM/IEEE, pages 463-468. IEEE, 2010.
  • Yunqi Zhang, David Meisner, Jason Mars, and Lingjia Tang. Treadmill: Attributing the source of tail latency through precise load testing and statistical inference. In Computer Architecture (ISCA), 2016 ACM/IEEE 43rd Annual International Symposium on, pages 456-468. IEEE, 2016.
  • Youngtaek Kim, Lizy Kurian John, Sanjay Pant, Srilatha Manne, Michael Schulte, William Lloyd Bircher, and Madhu Saravana Sibi Govindan. Audit: Stress testing the automatic way. In Microarchitecture (MICRO), 2012 45th Annual IEEE/ACM International Symposium on, pages 212-223. IEEE, 2012.
  • Wim Heirman, Trevor E Carlson, Shuai Che, Kevin Skadron, and Lieven Eeckhout. Using cycle stacks to understand scaling bottlenecks in multithreaded workloads. In Workload Characterization (IISWC), 2011 IEEE International Symposium on, pages 38-49. IEEE, 2011.
  • William Feller. An introduction to probability theory and its applications: volume I, volume 3. John Wiley & Sons New York, 1968.
  • Vinicius Petrucci, Michael A Laurenzano, John Doherty, Yunqi Zhang, Daniel Mosse, Jason Mars, and Lingjia Tang. Octopus-man: Qos-driven task management for heterogeneous multicores in warehouse-scale computers. In High Performance Computer Architecture (HPCA), 2015 IEEE 21st International Symposium on, pages 246-258. IEEE, 2015.
  • Tse-Yu Yeh and Yale N Patt. Alternative implementations of two-level adaptive branch prediction. In ACM SIGARCH Computer Architecture News, volume 20, pages 124-134. ACM, 1992.
  • Trevor E Carlson, Wim Heirman, and Lieven Eeckhout. Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation. In High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for, pages 1-12. IEEE, 2011.
  • Trevor E Carlson, Wim Heirman, and Lieven Eeckhout. Sampled simulation of multi-threaded applications. In Performance Analysis of Systems and Software (ISPASS), 2013 IEEE International Symposium on, pages 2-12. IEEE, 2013.
  • Trevor E Carlson, Wim Heirman, Kenzo Van Craeynest, and Lieven Eeckhout. Barrierpoint: Sampled simulation of multi-threaded applications. In Performance Analysis of Systems and Software (ISPASS), 2014 IEEE International Symposium on, pages 2-12. IEEE, 2014.
  • Transaction Processing Performance Council. http://www.tpc.org/default.asp. Accessed: 2017-06-15.
  • Timothy Sherwood, Erez Perelman, Greg Hamerly, and Brad Calder. Automatically characterizing large scale program behavior. In ACM SIGARCH Computer Architecture News, volume 30, pages 45-57. ACM, 2002.
  • Tianshi Chen, Qi Guo, Olivier Temam, Yue Wu, Yungang Bao, Zhiwei Xu, and Yunji Chen. Statistical performance comparisons of computers. IEEE Transactions on Computers, 64(5):1442-1455, 2015.
  • Tianshi Chen, Qi Guo, Ke Tang, Olivier Temam, Zhiwei Xu, Zhi-Hua Zhou, and Yunji Chen. Archranker: A ranking approach to design space exploration. In Computer Architecture (ISCA), 2014 ACM/IEEE 41st International Symposium on, pages 85-96. IEEE, 2014.
  • Thomas Grass, Alejandro Rico, Marc Casas, Miquel Moreto, and Eduard Ayguadé. Taskpoint: Sampled simulation of task-based programs. In Performance Analysis of Systems and Software (ISPASS), 2016 IEEE International Symposium on, pages 296-306. IEEE, 2016.
  • Thomas F Wenisch, Roland E Wunderlich, Michael Ferdman, Anastassia Ailamaki, Babak Falsafi, and James C Hoe. Simflex: statistical sampling of computer system simulation. IEEE Micro, 26(4):18-31, 2006.
  • Tejas S Karkhanis and James E Smith. A first-order superscalar processor model. In Computer Architecture, 2004. Proceedings. 31st Annual International Symposium on, pages 338-349. IEEE, 2004.
  • Svilen Kanev, Kim Hazelwood, Gu-Yeon Wei, and David Brooks. Tradeoffs between power management and tail latency in warehouse-scale applications. In Workload Characterization (IISWC), 2014 IEEE International Symposium on, pages 31-40. IEEE, 2014.
  • Svilen Kanev, Juan Pablo Darago, Kim Hazelwood, Parthasarathy Ranganathan, Tipp Moseley, Gu-Yeon Wei, and David Brooks. Profiling a warehouse-scale computer. In Computer Architecture (ISCA), 2015 ACM/IEEE 42nd Annual International Symposium on, pages 158-169. IEEE, 2015.
  • Stijn Eyerman, Lieven Eeckhout, Tejas Karkhanis, and James E Smith. A performance counter architecture for computing accurate cpi components. In ACM SIGOPS Operating Systems Review, volume 40, pages 175-184. ACM, 2006.
  • Stijn Eyerman, Lieven Eeckhout, Tejas Karkhanis, and James E Smith. A mechanistic performance model for superscalar out-of-order processors. ACM Transactions on Computer Systems (TOCS), 27(2):3, 2009.
  • Steven Pelley, David Meisner, Pooya Zandevakili, Thomas F Wenisch, and Jack Underwood. Power routing: dynamic power provisioning in the data center. In ACM Sigplan Notices, volume 45, pages 231-242. ACM, 2010.
  • Steven Hart, Eitan Frachtenberg, and Mateusz Berezecki. Predicting memcached throughput using simulation and modeling. In Proceedings of the 2012 Symposium on Theory of Modeling and Simulation-DEVS Integrative M&S Symposium, page 40. Society for Computer Simulation International, 2012.
  • Standard performance evaluation corporation (spec). http://www.spec.org/.
  • Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst., 30(1-7):107-117, April 1998.
  • Samantika Subramaniam, Anne Bracy, Hong Wang, and Gabriel H Loh. Criticality-based optimizations for efficient load processing. In High Performance Computer Architecture, 2009. HPCA 2009. IEEE 15th International Symposium on, pages 419-430. IEEE, 2009.
  • SPECVirt. https://www.spec.org/virt_sc2013/. Accessed: 2017-06-15.
  • Roland E Wunderlich, Thomas F Wenisch, Babak Falsafi, and James C Hoe. Smarts: Accelerating microarchitecture simulation via rigorous statistical sampling. In Computer Architecture, 2003. Proceedings. 30th Annual International Symposium on, pages 84-95. IEEE, 2003.
  • Rodrigo Fonseca, George Porter, Randy H Katz, Scott Shenker, and Ion Stoica. X-trace: A pervasive network tracing framework. In Proceedings of the 4th USENIX conference on Networked systems design & implementation, pages 20-20. USENIX Association, 2007.
  • Richard Hankins, Trung Diep, Murali Annavaram, Brian Hirano, Harald Eri, Hubert Nueckel, and John Paul Shen. Scaling and characterizing database workloads: Bridging the gap between research and practice. In Microarchitecture, 2003. MICRO-36. Proceedings. 36th Annual IEEE/ACM International Symposium on, pages 151-162. IEEE, 2003.
  • Quan Chen, Hailong Yang, Minyi Guo, Ram Srivatsa Kannan, Jason Mars, and Lingjia Tang. Prophet: Precise qos prediction on non-preemptive accelerators to improve utilization in warehouse-scale computers. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, pages 17-32. ACM, 2017.
  • Quan Chen, Hailong Yang, Jason Mars, and Lingjia Tang. Baymax: Qos awareness and increased utilization for non-preemptive accelerators in warehouse scale computers. In ACM SIGPLAN Notices, volume 51, pages 681-696. ACM, 2016.
  • Pierre Salverda and Craig Zilles. A criticality analysis of clustering in superscalar processors. In Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, pages 55-66. IEEE Computer Society, 2005.
  • Philip G Emma. Understanding some simple processor-performance limits. IBM journal of Research and Development, 41(3):215-232, 1997.
  • PerfKit Benchmarker. http://googlecloudplatform.github.io/PerfKitBenchmarker/. Accessed: 2017-06-15.
  • Paul Barham, Rebecca Isaacs, Richard Mortier, and Dushyanth Narayanan. Magpie: Online modelling and performance-aware systems. In HotOS, pages 85-90, 2003.
  • Patrick Reynolds, Charles Edwin Killian, Janet L Wiener, Jeffrey C Mogul, Mehul A Shah, and Amin Vahdat. Pip: Detecting the unexpected in distributed systems. In NSDI, volume 6, pages 115-128, 2006.
  • PJ Joseph, Kapil Vaswani, and Matthew J Thazhuthaveetil. Construction and use of linear regression models for processor performance analysis. In High-Performance Computer Architecture, 2006. The Twelfth International Symposium on, pages 99-108. IEEE, 2006.
  • Nathan Binkert, Bradford Beckmann, Gabriel Black, Steven K Reinhardt, Ali Saidi, Arkaprava Basu, Joel Hestness, Derek R Hower, Tushar Krishna, Somayeh Sardashti, et al. The gem5 simulator. ACM SIGARCH Computer Architecture News, 39(2):1-7, 2011.
  • Nadav Chachmon, Daniel Richins, Robert Cohn, Magnus Christensson, Wenzhi Cui, and Vijay Janapa Reddi. Simulation and analysis engine for scale-out workloads. In Proceedings of the 2016 International Conference on Supercomputing, page 22. ACM, 2016.
  • Mike Burrows. The chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th symposium on Operating systems design and implementation, pages 335-350. USENIX Association, 2006.
  • Michael Pellauer, Michael Adler, Michel Kinsy, Angshuman Parashar, and Joel Emer. Hasim: Fpga-based high-detail multicore simulation using time-division multiplexing. In High Performance Computer Architecture (HPCA), 2011 IEEE 17th International Symposium on, pages 406-417. IEEE, 2011.
  • Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi. Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In ACM SIGPLAN Notices, volume 47, pages 37-48. ACM, 2012.
  • Mayank Agarwal, Nitin Navale, Kshitiz Malik, and Matthew I Frank. Fetch-criticality reduction through control independence. In Computer Architecture, 2008. ISCA'08. 35th International Symposium on, pages 13-24. IEEE, 2008.
  • Marco Zagha, Brond Larson, Steve Turner, and Marty Itzkowitz. Performance analysis using the mips r10000 performance counters. In Supercomputing, 1996. Proceedings of the 1996 ACM/IEEE Conference on, pages 16-16. IEEE, 1996.
  • Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, and John Wilkes. Omega: flexible, scalable schedulers for large compute clusters. In Proceedings of the 8th ACM European Conference on Computer Systems, pages 351-364. ACM, 2013.
  • Luiz André Barroso, Jimmy Clidaras, and Urs Hölzle. The datacenter as a computer: An introduction to the design of warehouse-scale machines. Synthesis lectures on computer architecture, 8(3):1-154, 2013.
  • Luiz André Barroso, Jeffrey Dean, and Urs Hölzle. Web search for a planet: The google cluster architecture. IEEE micro, 23(2):22-28, 2003.
  • Lingjia Tang, Jason Mars, Xiao Zhang, Robert Hagmann, Robert Hundt, and Eric Tune. Optimizing google's warehouse scale computers: The numa experience. In High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th International Symposium on, pages 188-197. IEEE, 2013.
  • Lingjia Tang, Jason Mars, Neil Vachharajani, Robert Hundt, and Mary Lou Soffa. The impact of memory subsystem resource sharing on datacenter applications. In ACM SIGARCH Computer Architecture News, volume 39, pages 283-294. ACM, 2011.
  • Lei Wang, Jianfeng Zhan, Chunjie Luo, Yuqing Zhu, Qiang Yang, Yongqiang He, Wanling Gao, Zhen Jia, Yingjie Shi, Shujie Zhang, et al. Bigdatabench: A big data benchmark suite from internet services. In High Performance Computer Architecture (HPCA), 2014 IEEE 20th International Symposium on, pages 488-499. IEEE, 2014.
  • Kishore Kumar Pusukuri, Rajiv Gupta, and Laxmi N Bhuyan. Thread reinforcer: Dynamically determining number of threads via os level monitoring. In Workload Characterization (IISWC), 2011 IEEE International Symposium on, pages 116-125. IEEE, 2011.
  • Karthik Ganesan, Jungho Jo, and Lizy K John. Synthesizing memory-level parallelism aware miniature clones for spec cpu2006 and implantbench workloads. In Performance Analysis of Systems & Software (ISPASS), 2010 IEEE International Symposium on, pages 33-44. IEEE, 2010.
  • Karthik Ganesan and Lizy K John. Maximum multicore power (mampo): an automatic multithreaded synthetic power virus generation framework for multicore systems. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, page 53. ACM, 2011.
  • John L Hennessy and David A Patterson. Computer architecture: a quantitative approach. Elsevier, 2011.
  • Johann Hauswald, Yiping Kang, Michael A Laurenzano, Quan Chen, Cheng Li, Trevor Mudge, Ronald G Dreslinski, Jason Mars, and Lingjia Tang. Djinn and tonic: Dnn as a service and its implications for future warehouse scale computers. In ACM SIGARCH Computer Architecture News, volume 43, pages 27-40. ACM, 2015.
  • Jiawei Han, Micheline Kamber, and Jian Pei. Data mining: concepts and techniques. Morgan kaufmann, 2006.
  • Jennifer M Anderson, Lance M Berc, Jeffrey Dean, Sanjay Ghemawat, Monika R Henzinger, Shun-Tak A Leung, Richard L Sites, Mark T Vandevoorde, Carl A Waldspurger, and William E Weihl. Continuous profiling: Where have all the cycles gone? ACM Transactions on Computer Systems (TOCS), 15(4):357-390, 1997.
  • Jeffrey Dean, James E Hicks, Carl A Waldspurger, William E Weihl, and George Chrysos. Profileme: Hardware support for instruction-level profiling on out-of-order processors. In Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, pages 292-302. IEEE Computer Society, 1997.
  • Jeffrey Dean and Sanjay Ghemawat. Mapreduce: a flexible data processing tool. Communications of the ACM, 53(1):72-77, 2010.
  • Jason Mars, Lingjia Tang, Robert Hundt, Kevin Skadron, and Mary Lou Soffa. Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations. In Proceedings of the 44th annual IEEE/ACM International Symposium on Microarchitecture, pages 248-259. ACM, 2011.
  • Jason Mars and Lingjia Tang. Whare-map: heterogeneity in homogeneous warehouse-scale computers. In ACM SIGARCH Computer Architecture News, volume 41, pages 619-630. ACM, 2013.
  • Jason E Miller, Harshad Kasture, George Kurian, Charles Gruenwald, Nathan Beckmann, Christopher Celio, Jonathan Eastep, and Anant Agarwal. Graphite: A distributed parallel simulator for multicores. In High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on, pages 1-12. IEEE, 2010.
  • James C Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, Jeffrey John Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, et al. Spanner: Google's globally distributed database. ACM Transactions on Computer Systems (TOCS), 31(3):8, 2013.
  • Jaewon Lee, Hanhwi Jang, and Jangwoo Kim. Rpstacks: Fast and accurate processor design space exploration using representative stall-event stacks. In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, pages 255-267. IEEE Computer Society, 2014.
  • Jaewon Lee, Hanhwi Jang, Jae-eon Jo, Gyu-hyeon Lee, and Jangwoo Kim. Stressright: Finding the right stress for accurate in-development system evaluation. In Performance Analysis of Systems and Software (ISPASS), 2017 IEEE International Symposium on, pages 205-216. IEEE, 2017.
  • Harshad Kasture and Daniel Sanchez. Tailbench: a benchmark suite and evaluation methodology for latency-critical applications. In Workload Characterization (IISWC), 2016 IEEE International Symposium on, pages 1-10. IEEE, 2016.
  • Hailong Yang, Alex Breslow, Jason Mars, and Lingjia Tang. Bubble-flux: Precise online qos management for increased utilization in warehouse scale computers. In ACM SIGARCH Computer Architecture News, volume 41, pages 607-618. ACM, 2013.
  • Guanying Wang, Ali R Butt, Prashant Pandey, and Karan Gupta. A simulation approach to evaluating design decisions in mapreduce setups. In Modeling, Analysis & Simulation of Computer and Telecommunication Systems, 2009. MASCOTS'09. IEEE International Symposium on, pages 1-11. IEEE, 2009.
  • Greg Hamerly, Erez Perelman, Jeremy Lau, and Brad Calder. Simpoint 3.0: Faster and more flexible program phase analysis. Journal of Instruction Level Parallelism, 7(4):1-28, 2005.
  • Google Cloud Platform. https://cloud.google.com/. Accessed: 2017-06-15.
  • Google Cloud Platform Customers. https://cloud.google.com/customers/. Accessed: 2017-06-15.
  • Gang Ren, Eric Tune, Tipp Moseley, Yixin Shi, Silvius Rus, and Robert Hundt. Google-wide profiling: A continuous profiling infrastructure for data centers. IEEE micro, 30(4):65-79, 2010.
  • Gabriel Southern and Jose Renau. Deconstructing parsec scalability. In Proc. of the Annual Workshop on Duplicating, Deconstructing, and Debunking (WDDD), 2015.
  • Gábor J Székely, Maria L Rizzo, Nail K Bakirov, et al. Measuring and testing dependence by correlation of distances. The annals of statistics, 35(6):2769-2794, 2007.
  • Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C Hsieh, Deborah A Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E Gruber. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2):4, 2008.
  • Fabrice Bellard. Qemu, a fast and portable dynamic translator. In USENIX Annual Technical Conference, FREENIX Track, pages 41-46, 2005.
  • Eric Tune, Dongning Liang, Dean M Tullsen, and Brad Calder. Dynamic prediction of critical path instructions. In High-Performance Computer Architecture, 2001. HPCA. The Seventh International Symposium on, pages 185-195. IEEE, 2001.
  • Eric S Tune, Dean M Tullsen, and Brad Calder. Quantifying instruction criticality. In Parallel Architectures and Compilation Techniques, 2002. Proceedings. 2002 International Conference on, pages 104-113. IEEE, 2002.
  • Eric S Chung, Michael K Papamichael, Eriko Nurvitadhi, James C Hoe, Ken Mai, and Babak Falsafi. Protoflex: Towards scalable, full-system multiprocessor simulations using fpgas. ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2(2):15, 2009.
  • Erez Perelman, Greg Hamerly, Michael Van Biesbrouck, Timothy Sherwood, and Brad Calder. Using simpoint for accurate and efficient simulation. In ACM SIGMETRICS Performance Evaluation Review, volume 31, pages 318-319. ACM, 2003.
  • Ehsan K Ardestani and Jose Renau. Esesc: A fast multicore simulator using time-based sampling. In High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th International Symposium on, pages 448-459. IEEE, 2013.
  • Eduardo Argollo, Ayose Falcón, Paolo Faraboschi, Matteo Monchiero, and Daniel Ortega. Cotson: infrastructure for full system simulation. ACM SIGOPS Operating Systems Review, 43(1):52-61, 2009.
  • Derek Chiou, Dam Sunwoo, Joonsoo Kim, Nikhil A Patil, William Reinhart, Darrel Eric Johnson, Jebediah Keefe, and Hari Angepat. Fpga-accelerated simulation technologies (fast): Fast, full-system, cycle-accurate simulators. In Proceedings of the 40th Annual IEEE/ACM international Symposium on Microarchitecture, pages 249-261. IEEE Computer Society, 2007.
  • David Xinliang Li, Raksit Ashok, and Robert Hundt. Lightweight feedback-directed cross-module optimization. In Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization, pages 53-61. ACM, 2010.
  • David Meisner, Christopher M Sadler, Luiz André Barroso, Wolf-Dietrich Weber, and Thomas F Wenisch. Power management of online data-intensive services. In Computer Architecture (ISCA), 2011 38th Annual International Symposium on, pages 319-330. IEEE, 2011.
  • David Meisner, Brian T Gold, and Thomas F Wenisch. Powernap: eliminating server idle power. In ACM Sigplan Notices, volume 44, pages 205-216. ACM, 2009.
  • David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, and Christos Kozyrakis. Heracles: improving resource efficiency at scale. In ACM SIGARCH Computer Architecture News, volume 43, pages 450-462. ACM, 2015.
  • David A Patterson. The data center is the computer. Communications of the ACM, 51(1):105-105, 2008.
  • Daniel Sanchez and Christos Kozyrakis. Zsim: fast and accurate microarchitectural simulation of thousand-core systems. In ACM SIGARCH Computer Architecture News, volume 41, pages 475-486. ACM, 2013.
  • Christos Kozyrakis, Aman Kansal, Sriram Sankar, and Kushagra Vaid. Server engineering insights for large-scale online services. IEEE micro, 30(4):8-19, 2010.
  • Christina Delimitrou, Sriram Sankar, Kushagra Vaid, and Christos Kozyrakis. Decoupling datacenter studies from access to large-scale applications: A modeling approach for storage workloads. In Workload Characterization (IISWC), 2011 IEEE International Symposium on, pages 51-60. IEEE, 2011.
  • Christina Delimitrou, Daniel Sanchez, and Christos Kozyrakis. Tarcil: reconciling scheduling speed and quality in large shared clusters. In Proceedings of the Sixth ACM Symposium on Cloud Computing, pages 97-110. ACM, 2015.
  • Christina Delimitrou and Christos Kozyrakis. ibench: Quantifying interference for datacenter applications. In Workload Characterization (IISWC), 2013 IEEE International Symposium on, pages 23-33. IEEE, 2013.
  • Christina Delimitrou and Christos Kozyrakis. Quasar: resource-efficient and qos-aware cluster management. In ACM SIGPLAN Notices, volume 49, pages 127-144. ACM, 2014.
  • Christina Delimitrou and Christos Kozyrakis. Paragon: Qos-aware scheduling for heterogeneous datacenters. In ACM SIGPLAN Notices, volume 48, pages 77-88. ACM, 2013.
  • Christina Delimitrou and Christos Kozyrakis. Hcloud: Resource-efficient provisioning in shared cloud systems. In ACM SIGOPS Operating Systems Review, volume 50, pages 473-488. ACM, 2016.
  • Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. The parsec benchmark suite: Characterization and architectural implications. In Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pages 72-81. ACM, 2008.
  • Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In Acm sigplan notices, volume 40, pages 190-200. ACM, 2005.
  • Charlie Curtsinger and Emery D Berger. Stabilizer: statistically sound performance evaluation. In ACM SIGARCH Computer Architecture News, volume 41, pages 219-228. ACM, 2013.
  • Chang-Hong Hsu, Yunqi Zhang, Michael A Laurenzano, David Meisner, Thomas Wenisch, Jason Mars, Lingjia Tang, and Ronald G Dreslinski. Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting. In High Performance Computer Architecture (HPCA), 2015 IEEE 21st International Symposium on, pages 271-282. IEEE, 2015.
  • Brian Fields, Shai Rubin, and Rastislav Bodík. Focusing processor policies via critical-path prediction. In Computer Architecture, 2001. Proceedings. 28th Annual International Symposium on, pages 74-85. IEEE, 2001.
  • Brian Fields, Rastislav Bodík, and Mark D Hill. Slack: Maximizing performance under technological constraints. In Computer Architecture, 2002. Proceedings. 29th Annual International Symposium on, pages 47-58. IEEE, 2002.
  • Brian A Fields, Rastislav Bodik, Mark D Hill, and Chris J Newburn. Interaction cost and shotgun profiling. ACM Transactions on Architecture and Code Optimization (TACO), 1(3):272-304, 2004.
  • Benjamin H Sigelman, Luiz André Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan, and Chandan Shanbhag. Dapper, a large-scale distributed systems tracing infrastructure. Technical report, Google, 2010.
  • Benjamin C Lee, Jamison Collins, Hong Wang, and David Brooks. Cpr: Composable performance regression for scalable multiprocessor models. In Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture, pages 270-281. IEEE Computer Society, 2008.
  • Benjamin C Lee and David M Brooks. Illustrative design space studies with microarchitectural regression models. In High Performance Computer Architecture, 2007. HPCA 2007. IEEE 13th International Symposium on, pages 340-351. IEEE, 2007.
  • Benjamin C Lee and David M Brooks. Accurate and efficient regression modeling for microarchitectural performance and power prediction. In ACM SIGOPS Operating Systems Review, volume 40, pages 185-194. ACM, 2006.
  • Benjamin C Lee and David Brooks. Efficiency trends and limits from comprehensive microarchitectural adaptivity. In ACM SIGARCH Computer Architecture News, volume 36, pages 36-47. ACM, 2008.
  • Avadh Patel, Furat Afram, Shunfei Chen, and Kanad Ghose. Marss: a full system simulator for multicore x86 cpus. In Proceedings of the 48th Design Automation Conference, pages 1050-1055. ACM, 2011.
  • Andy Georges, Dries Buytaert, and Lieven Eeckhout. Statistically rigorous java performance evaluation. ACM SIGPLAN Notices, 42(10):57-76, 2007.
  • Allan Hartstein and Thomas R Puzak. The optimum pipeline depth for a microprocessor. In ACM Sigarch Computer Architecture News, volume 30, pages 7-13. IEEE Computer Society, 2002.
  • Ali G Saidi, Nathan L Binkert, Steven K Reinhardt, and Trevor Mudge. Full-system critical path analysis. In Performance Analysis of Systems and software, 2008. ISPASS 2008. IEEE International Symposium on, pages 63-74. IEEE, 2008.
  • Ali G Saidi, Nathan L Binkert, Steven K Reinhardt, and Trevor Mudge. End-to-end performance forecasting: finding bottlenecks before they happen. ACM SIGARCH Computer Architecture News, 37(3):361-370, 2009.
  • Alejandro Rico, Alejandro Duran, Felipe Cabarcas, Yoav Etsion, Alex Ramirez, and Mateo Valero. Trace-driven simulation of multithreaded applications. In Performance Analysis of Systems and Software (ISPASS), 2011 IEEE International Symposium on, pages 87-96. IEEE, 2011.
  • Alaa R Alameldeen, Milo MK Martin, Carl J Mauer, Kevin E Moore, Min Xu, Mark D Hill, David A Wood, and Daniel J Sorin. Simulating a $2 m commercial server on a $2 k pc. Computer, 36(2):50-57, 2003.
  • Alaa R Alameldeen and David A Wood. Variability in architectural simulations of multi-threaded workloads. In High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. The Ninth International Symposium on, pages 7-18. IEEE, 2003.
  • Ajay M Joshi, Lieven Eeckhout, Lizy K John, and Ciji Isen. Automated microprocessor stressmark generation. In High Performance Computer Architecture, 2008. HPCA 2008. IEEE 14th International Symposium on, pages 229-239. IEEE, 2008.
  • Ahmad Yasin. A top-down method for performance analysis and counters architecture. In Performance Analysis of Systems and Software (ISPASS), 2014 IEEE International Symposium on, pages 35-44. IEEE, 2014.
  • Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, David Oppenheimer, Eric Tune, and John Wilkes. Large-scale cluster management at google with borg. In Proceedings of the Tenth European Conference on Computer Systems, page 18. ACM, 2015.
  • Aashish Phansalkar, Ajay Joshi, and Lizy K John. Analysis of redundancy and application balance in the spec cpu2006 benchmark suite. ACM SIGARCH Computer Architecture News, 35(2):412-423, 2007.