A Survey of Techniques for Architecting and Managing Asymmetric Multicore Processors

Published: 08 February 2016

Abstract

To meet the needs of a diverse range of workloads, asymmetric multicore processors (AMPs) have been proposed, which feature cores of different microarchitectures or ISAs. However, given the diversity inherent in their design and application scenarios, several challenges must be addressed to architect AMPs effectively and to leverage their potential for optimizing both sequential and parallel performance. Several recent techniques address these challenges. In this article, we present a survey of architectural and system-level techniques proposed for designing and managing AMPs. By classifying the techniques based on several key characteristics, we underscore their similarities and differences. We clarify the terminology used in this research field and identify challenges that are worthy of future investigation. We hope that, beyond synthesizing the existing work on AMPs, this survey will spark novel ideas for architecting future AMPs that can make a definite impact on the landscape of next-generation computing systems.

References

  1. Arunachalam Annamalai, Rance Rodrigues, Israel Koren, and Sandip Kundu. 2013. An opportunistic prediction-based thread scheduling to maximize throughput/watt in AMPs. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’13). 63--72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Murali Annavaram, Ed Grochowski, and John Shen. 2005. Mitigating Amdahl’s law through EPI throttling. In Proceedings of the International Symposium on Computer Architecture (ISCA’05). 298--309. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Amin Ansari, Shuguang Feng, Shantanu Gupta, Josep Torrellas, and Scott Mahlke. 2013. Illusionist: Transforming lightweight cores into aggressive cores on demand. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’13). 436--447. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. ARM. 2015a. big.LITTLE Technology. Retrieved December 29, 2015, from http://www.arm.com/products/processors/technologies/biglittleprocessing.php.Google ScholarGoogle Scholar
  5. ARM. 2015b. Cortex-A Series Processors. Retrieved December 29, 2015, from http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.set.cortexa/index.html.Google ScholarGoogle Scholar
  6. Saisanthosh Balakrishnan, Ravi Rajwar, Mike Upton, and Konrad Lai. 2005. The impact of performance asymmetry in emerging multicore architectures. In Proceedings of the International Symposium on Computer Architecture (ISCA’05). 506--517. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Antonio Barbalace, Marina Sadini, Saif Ansary, Christopher Jelesnianski, Akshay Ravichandran, Cagil Kendir, Alastair Murray, and Binoy Ravindran. 2015. Popcorn: Bridging the programmability gap in heterogeneous-ISA platforms. In Proceedings of the European Conference on Computer Systems (EuroSys’15). 29:1--29:16. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Michela Becchi and Patrick Crowley. 2006. Dynamic thread assignment on heterogeneous multiprocessor architectures. In Proceedings of the Computing Frontiers Conference (CF’06). 29--40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Jeffery Brown, Leo Porter, and Dean M. Tullsen. 2011. Fast thread migration via cache working set prediction. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’11). 193--204. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Ting Cao, Stephen M. Blackburn, Tiejun Gao, and Kathryn S. McKinley. 2012. The yin and yang of power and performance for asymmetric hardware and managed software. In Proceedings of the International Symposium on Computer Architecture (ISCA’12). 225--236. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Jian Chen and Lizy Kurian John. 2008. Energy-aware application scheduling on a heterogeneous multi-core system. In Proceedings of the International Symposium on Workload Characterization (IISWC’08). 5--13.Google ScholarGoogle Scholar
  12. Jian Chen and Lizy Kurian John. 2009. Efficient program scheduling for heterogeneous multi-core processors. In Proceedings of the Design Automation Conference (DAC’09). 927--930. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Quan Chen and Minyi Guo. 2014. Adaptive workload-aware task scheduling for single-ISA asymmetric multicore architectures. ACM Transactions on Architecture and Code Optimization 11, 1, 8:1--8:25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Nagabhushan Chitlur, Ganapati Srinivasa, Scott Hahn, Pragya K. Gupta, Dheeraj Reddy, David Koufaty, Paul Brett, Abirami Prabhakaran, Li Zhao, Nelson Ijih, Suchit Subhaschandra, Sabina Grover, Xiaowei Jiang, and Ravi Iyer. 2012. QuickIA: Exploring heterogeneous architectures on real prototypes. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’12). 1--8. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Jih-Ching Chiu, Yu-Liang Chou, and Po-Kai Chen. 2010. Hyperscalar: A novel dynamically reconfigurable multi-core architecture. In Proceedings of the International Conference on Parallel Processing (ICPP’10). 277--286. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. CNXSoft. 2014. ARM Cortex A15/A17 SoCs Comparison—Nvidia Tegra K1 vs Samsung Exynos 5422 vs Rockchip RK3288 vs AllWinner A80. Retrieved December 29, 2015, from http://www.cnx-software.com/2014/05/21/comparison-nvidia-tegra-k1-samsung-exynos-5422-rockchip-rk3288-allwinner-a80/.Google ScholarGoogle Scholar
  17. Jason Cong and Bo Yuan. 2012. Energy-efficient scheduling on heterogeneous multi-core architectures. In Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED’12). 345--350. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Matthew DeVuyst, Ashish Venkat, and Dean M. Tullsen. 2012. Execution migration in a heterogeneous-ISA chip multiprocessor. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’12). 261--272. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Stijn Eyerman and Lieven Eeckhout. 2010. Modeling critical sections in Amdahl’s law and its implications for multicore design. In Proceedings of the International Symposium on Computer Architecture (ISCA’10). 362--370. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Stijn Eyerman and Lieven Eeckhout. 2014. The benefit of SMT in the multi-core era: Flexibility towards degrees of thread-level parallelism. ACM SIGARCH Computer Architecture News 42, 1, 591--606. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Chris Fallin, Chris Wilkerson, and Onur Mutlu. 2014. The heterogeneous block architecture. In Proceedings of the International Conference on Computer Design (ICCD’14). 386--393.Google ScholarGoogle ScholarCross RefCross Ref
  22. Andrei Frumusanu and Ryan Smith. 2015. ARM A53/A57/T760 Investigated—Samsung Galaxy Note 4 Exynos Review. Retrieved December 29, 2015, from http://www.anandtech.com/show/8718/the-samsung-galaxy-note-4-exynos-rev iew/6.Google ScholarGoogle Scholar
  23. Giorgis Georgakoudis, Dimitrios S. Nikolopoulos, and Spyros Lalis. 2013. Fast dynamic binary rewriting to support thread migration in shared-ISA asymmetric multicores. In Proceedings of the International Workshop on Code Optimisation for Multi and Many Cores (COSMIC’13). 4:1--4:10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Dan Gibson and David A. Wood. 2010. Forwardflow: A scalable core for power-constrained CMPs. ACM SIGARCH Computer Architecture News 38, 14--25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Lori Gil. 2015. NVIDIAs Tegra X1 Crushes the Competition. Retrieved December 29, 2015, from http://liliputing.com/2015/02/nvidias-tegra-x1-crushes-the-competition.html.Google ScholarGoogle Scholar
  26. Ryan E. Grant and Ahmad Afsahi. 2006. Power-performance efficiency of asymmetric multiprocessors for multi-threaded scientific applications. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS’06). Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Ed Grochowski, Ronny Ronen, John Shen, and Hong Wang. 2004. Best of both latency and throughput. In Proceedings of the IEEE International Conference on Computer Design (ICCD’04). 236--243. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Michael Gschwind, H. Peter Hofstee, Brian Flachs, Martin Hopkins, Yukio Watanabe, and Takeshi Yamazaki. 2006. Synergistic processing in Cell’s multicore architecture. IEEE Micro 26, 2, 10--24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Divya P. Gulati, Changkyu Kim, Simha Sethumadhavan, Stephen W. Keckler, and Doug Burger. 2008. Multitasking workload scheduling on flexible-core chip multiprocessors. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’08). 187--196. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Shantanu Gupta, Shuguang Feng, Amin Ansari, and Scott Mahlke. 2010. Erasing core boundaries for robust and configurable performance. In Proceedings of the International Symposium on Microarchitecture (MICRO’10). 325--336. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Vishal Gupta and Ripal Nathuji. 2010. Analyzing performance asymmetric multicore processors for latency sensitive datacenter applications. In Proceedings of the Workshop on Power Aware Computing and Systems (HotPower’10). 1--8. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Anthony Gutierrez, Ronald G. Dreslinski, and Trevor Mudge. 2014. Evaluating private vs. shared last-level caches for energy efficiency in asymmetric multi-cores. In Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS’14). 191--198.Google ScholarGoogle ScholarCross RefCross Ref
  33. Mark D. Hill and Michael R. Marty. 2008. Amdahl’s law in the multicore era. IEEE Computer 7, 33--38. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Houman Homayoun, Vasileios Kontorinis, Amirali Shayan, Ta-Wei Lin, and Dean M. Tullsen. 2012. Dynamically heterogeneous cores through 3D resource pooling. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’12). 1--12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Tomas Hruby, Herbert Bos, and Andrew S. Tanenbaum. 2013. When slower is faster: On heterogeneous multicores for reliable systems. In Proceedings of the USENIX Annual Technical Conference (ATC’13). 255--266. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Ineda. 2015. Ineda Dhanush Wearable Processing Unit.Google ScholarGoogle Scholar
  37. Engin Ipek, Meyrem Kirman, Nevin Kirman, and Jose F. Martinez. 2007. Core fusion: Accommodating software diversity in chip multiprocessors. In Proceedings of the International Symposium on Computer Architecture (ISCA’07). 186--197. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Brian Jeff. 2012. Big.LITTLE system architecture from ARM: Saving power through heterogeneous multiprocessing and task context migration. In Proceedings of the ACM Design Automation Conference (DAC’12).Google ScholarGoogle ScholarCross RefCross Ref
  39. José A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt. 2012. Bottleneck identification and scheduling in multithreaded applications. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’12). 223--234. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. José A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt. 2013. Utility-based acceleration of multithreaded applications on asymmetric CMPs. In Proceedings of the International Symposium on Computer Architecture (ISCA’13). 154--165. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. B. H. H. Juurlink and C. H. Meenderinck. 2012. Amdahl’s law for predicting the future of multicores considered harmful. ACM SIGARCH Computer Architecture News 40, 2, 1--9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Vahid Kazempour, Ali Kamali, and Alexandra Fedorova. 2010. AASH: An asymmetry-aware scheduler for hypervisors. ACM SIGPLAN Notices 45, 7, 85--96. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Omer Khan and Sandip Kundu. 2010. A self-adaptive scheduler for asymmetric multi-cores. In Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI’10). 397--400. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Khubaib Khubaib, M. Aater Suleman, Milad Hashemi, Chris Wilkerson, and Yale N. Patt. 2012. MorphCore: An energy-efficient microarchitecture for high performance ILP and high throughput TLP. In Proceedings of the International Symposium on Microarchitecture (MICRO’12). 305--316. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Changkyu Kim, Simha Sethumadhavan, Madhu S. Govindan, Nitya Ranganathan, Divya Gulati, Doug Burger, and Stephen W. Keckler. 2007. Composable lightweight processors. In Proceedings of the International Symposium on Microarchitecture (MICRO’07). 381--394. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Jun Kim, Joonwon Lee, and Jinkyu Jeong. 2015. Exploiting asymmetric CPU performance for fast startup of subsystem in mobile smart devices. IEEE Transactions on Consumer Electronics 61, 1, 103--111.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Myungsun Kim, Kibeom Kim, James R. Geraci, and Seongsoo Hong. 2014. Utilization-aware load balancing for the energy efficient operation of the big.LITTLE processor. In Proceedings of the Conference on Design, Automation, and Test in Europe (DATE’14). 223:1--223:4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Byeong-Moon Ko, Joonwon Lee, and Heeseung Jo. 2012. AMP aware core allocation scheme for mobile devices. In Proceedings of the IEEE Spring Congress on Engineering and Technology (S-CET’12). 1--4.Google ScholarGoogle ScholarCross RefCross Ref
  49. David Koufaty, Dheeraj Reddy, and Scott Hahn. 2010. Bias scheduling in heterogeneous multi-core architectures. In Proceedings of the European Conference on Computer Systems (EuroSys’10). 125--138. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi, Parthasarathy Ranganathan, and Dean M. Tullsen. 2003. Single-ISA heterogeneous multi-core architectures: The potential for processor power reduction. In Proceedings of the International Symposium on Microarchitecture (MICRO’03). 81--92. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Rakesh Kumar, Norman P. Jouppi, and Dean M. Tullsen. 2004a. Conjoined-core chip multiprocessing. In Proceedings of the International Symposium on Microarchitecture (MICRO’04). 195--206. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Rakesh Kumar, Dean M. Tullsen, and Norman P. Jouppi. 2006. Core architecture optimization for heterogeneous chip multiprocessors. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’06). 23--32. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Rakesh Kumar, Dean M. Tullsen, Parthasarathy Ranganathan, Norman P. Jouppi, and Keith I. Farkas. 2004b. Single-ISA heterogeneous multi-core architectures for multithreaded workload performance. ACM SIGARCH Computer Architecture News 32, 64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Youngjin Kwon, Changdae Kim, Seungryoul Maeng, and Jaehyuk Huh. 2011. Virtualizing performance asymmetric multi-core systems. In Proceedings of the International Symposium on Computer Architecture (ISCA’11). 45--56. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Nagesh B. Lakshminarayana and Hyesoon Kim. 2008. Understanding performance, power and energy behavior in asymmetric multiprocessors. In Proceedings of the International Conference on Computer Design (ICCD’08). 471--477.Google ScholarGoogle Scholar
  56. Nagesh B. Lakshminarayana, Jaekyu Lee, and Hyesoon Kim. 2009. Age based scheduling for asymmetric multiprocessors. In Proceedings of the Conference on High Performance Computing Networking, Storage, and Analysis (SC’09). 25:1--25:12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Tong Li, Dan Baumberger, David A. Koufaty, and Scott Hahn. 2007. Efficient operating system scheduling for performance-asymmetric multi-core architectures. In Proceedings of the ACM/IEEE Conference on Supercomputing (SC’07). 53:1--53:11. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Tong Li, Paul Brett, Rob Knauerhase, David Koufaty, Dheeraj Reddy, and Scott Hahn. 2010. Operating system support for overlapping-ISA heterogeneous multi-core architectures. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’10). 1--12.Google ScholarGoogle Scholar
  59. Felix Xiaozhu Lin, Zhen Wang, Robert LiKamWa, and Lin Zhong. 2012. Reflex: Using low-power processors in smartphones without knowing them. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’12). 13--24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. Felix Xiaozhu Lin, Zhen Wang, and Lin Zhong. 2014. K2: A mobile operating system for heterogeneous coherence domains. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’14). 285--300. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. Guangshuo Liu, Jinpyo Park, and Diana Marculescu. 2013. Dynamic thread mapping for high-performance, power-efficient heterogeneous many-core systems. In Proceedings of the International Conference on Computer Design (ICCD’13). 54--61.Google ScholarGoogle ScholarCross RefCross Ref
  62. Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Ronald Dreslinski Jr., Thomas F. Wenisch, and Scott Mahlke. 2014. Heterogeneous microarchitectures trump voltage scaling for low-power cores. In Proceedings of the International Conference on Parallel Architectures and Compilation (PACT’14). 237--250. Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Faissal M. Sleiman, Ronald Dreslinski, Thomas F. Wenisch, and Scott Mahlke. 2012. Composite cores: Pushing heterogeneity into a core. In Proceedings of the International Symposium on Microarchitecture (MICRO’12). 317--328. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Yangchun Luo, Venkatesan Packirisamy, Wei-Chung Hsu, and Antonia Zhai. 2010. Energy efficient speculative threads: Dynamic thread allocation in same-ISA heterogeneous multicore systems. In Proceedings of the International Conference on Parallel Architectures and Compilation (PACT’10). 453--464. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. Daniel Lustig, Caroline Trippel, Michael Pellauer, and Margaret Martonosi. 2015. ArMOR: Defending against memory consistency model mismatches in heterogeneous architectures. In Proceedings of the International Symposium on Computer Architecture (ISCA’15). 388--400. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Felipe Lopes Madruga, Henrique C. Freitas, and Philippe Olivier Alexandre Navaux. 2010. Parallel shared-memory workloads performance on asymmetric multi-core architectures. In Proceedings of the Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP’10). 163--169. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. N. Markovic, D. Nemirovsky, O. Unsal, M. Valero, and A. Cristal. 2014. Thread lock section-aware scheduling on asymmetric single-ISA multi-core. IEEE Computer Architecture Letters 14, 2, 160--163. DOI:http://dx.doi.org/10.1109/LCA.2014.2357805 Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. Sparsh Mittal. 2014a. A survey of techniques for improving energy efficiency in embedded computing systems. International Journal of Computer Aided Engineering and Technology 6, 4, 440--459.Google ScholarGoogle ScholarCross RefCross Ref
  69. Sparsh Mittal. 2014b. Power Management Techniques for Data Centers: A Survey. Technical Report ORNL/TM-2014/381. Oak Ridge National Laboratory, Oak Ridge, TN.Google ScholarGoogle Scholar
  70. Sparsh Mittal, Matthew Poremba, Jeffrey Vetter, and Yuan Xie. 2014. Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool. Technical Report ORNL/TM-2014/636. Oak Ridge National Laboratory, Oak Ridge, TN.Google ScholarGoogle Scholar
  71. Sparsh Mittal and Jeffrey Vetter. 2015. A survey of CPU-GPU heterogeneous computing techniques. ACM Computing Surveys 47, 4, 69:1--69:35. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. Jeffrey C. Mogul, Jayaram Mudigonda, Nathan Binkert, Parthasarathy Ranganathan, and Vanish Talwar. 2008. Using asymmetric single-ISA CMPs to save energy on operating systems. IEEE Micro 28, 3, 26--41. Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. Tomer Y. Morad, Avinoam Kolodny, and Uri C. Weiser. 2010. Scheduling multiple multithreaded applications on asymmetric and symmetric chip multiprocessors. In Proceedings of the International Symposium on Parallel Architectures, Algorithms, and Programming (PAAP’10). 65--72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. Tomer Y. Morad, Uri C. Weiser, Avinoam Kolodny, Mateo Valero, and Eduard Ayguade. 2006. Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors. Computer Architecture Letters 5, 1, 14--17. Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. Tobias Mühlbauer, Wolf Rödiger, Robert Seilbeck, Alfons Kemper, and Thomas Neumann. 2014. Heterogeneity-conscious parallel query execution: Getting a better mileage while driving faster! In Proceedings of the International Workshop on Data Management on New Hardware (DaMoN’14). 2:1--2:10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. Janani Mukundan, Saugata Ghose, Robert Karmazin, Engin Ipek, and José F. Martínez. 2012. Overcoming single-thread performance hurdles in the core fusion reconfigurable multicore architecture. In Proceedings of the International Conference on Supercomputing (ICS’12). 101--110. Google ScholarGoogle ScholarDigital LibraryDigital Library
  77. Thannirmalai Somu Muthukaruppan, Anuj Pathania, and Tulika Mitra. 2014. Price theory based power management for heterogeneous multi-cores. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’14). 161--176. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. Thannirmalai Somu Muthukaruppan, Mihai Pricopi, Vanchinathan Venkataramani, Tulika Mitra, and Sanjay Vishin. 2013. Hierarchical power management for asymmetric multi-core in dark silicon era. In Proceedings of the Design Automation Conference (DAC’13). 174. Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. Hashem Hashemi Najaf-Abadi, Niket Kumar Choudhary, and Eric Rotenberg. 2009. Core-selectability in chip multiprocessors. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’09). 113--122. Google ScholarGoogle ScholarDigital LibraryDigital Library
  80. Hashem H. Najaf-Abadi and Eric Rotenberg. 2009. Architectural contesting. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’09). 189--200.Google ScholarGoogle Scholar
  81. Sandeep Navada, Niket K. Choudhary, Salil V. Wadhavkar, and Eric Rotenberg. 2013. A unified view of non-monotonic core selection and application steering in heterogeneous chip multiprocessors. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques. 133--144. Google ScholarGoogle ScholarDigital LibraryDigital Library
  82. Rajiv Nishtala, Daniel Mossé, and Vinicius Petrucci. 2013. Energy-aware thread co-location in heterogeneous multicore processors. In Proceedings of the International Conference on Embedded Software (EMSOFT’13). 1--9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  83. NVIDIA. 2011. Variable SMP—A Multi-Core CPU Architecture for Low Power and High Performance. Retrieved December 29, 2015, from http://www.nvidia.com/content/PDF/tegra_white_papers/tegra-whitepaper-0 911b.pdf.Google ScholarGoogle Scholar
  84. Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, and Scott Mahlke. 2013. Trace based phase prediction for tightly-coupled heterogeneous cores. In Proceedings of the International Symposium on Microarchitecture. 445--456. Google ScholarGoogle ScholarDigital LibraryDigital Library
  85. Sankaralingam Panneerselvam and Michael M. Swift. 2012. Chameleon: Operating system support for dynamic processors. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’12). 99--110. Google ScholarGoogle ScholarDigital LibraryDigital Library
  86. George Patsilaras, Niket K. Choudhary, and James Tuck. 2012. Efficiently exploiting memory level parallelism on asymmetric coupled cores in the dark silicon era. ACM Transactions on Architecture and Code Optimization 8, 4, 28:1--28:21. Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. Miquel Pericas, Adrian Cristal, Francisco J. Cazorla, Ruben Gonzalez, Daniel A. Jimenez, and Mateo Valero. 2007. A flexible heterogeneous multi-core architecture. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques (PACT’07). 13--24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  88. Vinicius Petrucci, Orlando Loques, and Daniel Mossé. 2012. Lucky scheduling for energy-efficient heterogeneous multi-core systems. In Proceedings of the USENIX Conference on Power-Aware Computing and Systems (HotPower’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  89. Dmitry Ponomarev, Gurhan Kucuk, and Kanad Ghose. 2001. Reducing power requirements of instruction scheduling through dynamic allocation of multiple datapath resources. In Proceedings of the International Symposium on Microarchitecture. 90--101. Google ScholarGoogle ScholarDigital LibraryDigital Library
  90. Mihai Pricopi and Tulika Mitra. 2012. Bahurupi: A polymorphic heterogeneous multi-core architecture. ACM Transactions on Architecture and Code Optimization 8, 4, 22:1--22:21. Google ScholarGoogle ScholarDigital LibraryDigital Library
  91. Mihai Pricopi and Tulika Mitra. 2014. Task scheduling on adaptive multi-core. IEEE Transactions on Computers 63, 10, 2590--2603. Google ScholarGoogle ScholarDigital LibraryDigital Library
  92. Mihai Pricopi, Thannirmalai Somu Muthukaruppan, Vanchinathan Venkataramani, Tulika Mitra, and Sanjay Vishin. 2013. Power-performance modeling on asymmetric multi-cores. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’13). 1--10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  93. Moo-Ryong Ra, Bodhi Priyantha, Aman Kansal, and Jie Liu. 2012. Improving energy efficiency of personal sensing applications with heterogeneous multi-processors. In Proceedings of the ACM Conference on Ubiquitous Computing (Ubicomp’12). 1--10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  94. M. Mustafa Rafique, Benjamin Rose, Ali R. Butt, and Dimitrios S. Nikolopoulos. 2009. Supporting MapReduce on large-scale asymmetric multi-core clusters. ACM SIGOPS Operating Systems Review 43, 2, 25--34. Google ScholarGoogle ScholarDigital LibraryDigital Library
  95. Behnam Robatmili, Dong Li, Hadi Esmaeilzadeh, Sibi Govindan, Aaron Smith, Andrew Putnam, Doug Burger, and Stephen W. Keckler. 2013. How to implement effective prediction and forwarding for fusable dynamic multicore architectures. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’13). 460--471. Google ScholarGoogle ScholarDigital LibraryDigital Library
  96. Rance Rodrigues, Arunachalam Annamalai, Israel Koren, Sandip Kundu, and Omer Khan. 2011. Performance per watt benefits of dynamic core morphing in asymmetric multicores. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’11). 121--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. Rance Rodrigues, Israel Koren, and Sandip Kundu. 2014. Performance and power benefits of sharing execution units between a high performance core and a low power core. In Proceedings of the International Conference on VLSI Design (VLSID’14). 204--209. Google ScholarGoogle ScholarDigital LibraryDigital Library
  98. Juan Carlos Saez, Alexandra Fedorova, David Koufaty, and Manuel Prieto. 2012. Leveraging core specialization via OS scheduling to improve performance on asymmetric multicore systems. ACM Transactions on Computer Systems 30, 2, 6:1--6:38. Google ScholarGoogle ScholarDigital LibraryDigital Library
  99. Juan Carlos Saez, Alexandra Fedorova, Manuel Prieto, and Hugo Vegas. 2010. Operating system support for mitigating software scalability bottlenecks on asymmetric multicore processors. In Proceedings of the Computing Frontiers Conference (CF’10). 31--40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. Juan Carlos Saez, Adrian Pousa, Fernando Castro, Daniel Chaver, and Manuel Prieto-Matias. 2015. ACFS: A completely fair scheduler for asymmetric single-ISA multicore systems. In Proceedings of the ACM Symposium on Applied Computing (SAC’15). Google ScholarGoogle ScholarDigital LibraryDigital Library
  101. Pierre Salverda and Craig Zilles. 2008. Fundamental performance constraints in horizontal fusion of in-order cores. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’08). 252--263.Google ScholarGoogle ScholarCross RefCross Ref
  102. Samsung. 2013. SAMSUNG Highlights Innovations in Mobile Experiences Driven by Components, in CES Keynote. Retrieved December 29, 2015, from http://www.samsung.com/us/news/20353.Google ScholarGoogle Scholar
  103. Karthikeyan Sankaralingam, Ramadass Nagarajan, Haiming Liu, Changkyu Kim, Jaehyuk Huh, Doug Burger, Stephen W. Keckler, and Charles R. Moore. 2003. Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture. In Proceedings of the International Symposium on Computer Architecture (ISCA’03). 422--433. Google ScholarGoogle ScholarDigital LibraryDigital Library
  104. Lina Sawalha and Ronald D. Barnes. 2012. Energy-efficient phase-aware scheduling for heterogeneous multicore processors. In Proceedings of the IEEE Green Technologies Conference. 1--6.Google ScholarGoogle Scholar
  105. Daniel Shelepov, Juan Carlos Saez Alcaide, Stacey Jeffery, Alexandra Fedorova, Nestor Perez, Zhi Feng Huang, Sergey Blagodurov, and Viren Kumar. 2009. HASS: A scheduler for heterogeneous multicore systems. ACM SIGOPS Operating Systems Review 43, 2, 66--75. Google ScholarGoogle ScholarDigital LibraryDigital Library
  106. Tyler Sondag and Hridesh Rajan. 2009. Phase-guided thread-to-core assignment for improved utilization of performance-asymmetric multi-core processors. In Proceedings of the ICSE Workshop on Multicore Software Engineering. 73--80.
  107. Sudarshan Srinivasan, Nithesh Kurella, Israel Koren, and Sandip Kundu. 2015. Exploring heterogeneity within a core for improved power efficiency. IEEE Transactions on Parallel and Distributed Systems PP, 99, 1.
  108. Sudarshan Srinivasan, Rance Rodrigues, Arunachalam Annamalai, Israel Koren, and Sandip Kundu. 2013. A study on polymorphing superscalar processor dynamically to improve power efficiency. In Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI’13). 46--51.
  109. Sadagopan Srinivasan, Li Zhao, Ramesh Illikkal, and Ravishankar Iyer. 2011. Efficient interaction between OS and architecture in heterogeneous platforms. ACM SIGOPS Operating Systems Review 45, 1, 62--72.
  110. Richard Strong, Jayaram Mudigonda, Jeffrey C. Mogul, Nathan Binkert, and Dean Tullsen. 2009. Fast switching of threads between cores. ACM SIGOPS Operating Systems Review 43, 2, 35--45.
  111. M. Aater Suleman, Onur Mutlu, José A. Joao, Khubaib, and Yale Patt. 2010. Data marshaling for multi-core architectures. In Proceedings of the International Symposium on Computer Architecture (ISCA’10). 441--450.
  112. M. Aater Suleman, Onur Mutlu, Moinuddin K. Qureshi, and Yale N. Patt. 2009. Accelerating critical section execution with asymmetric multi-core architectures. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’09). 253--264.
  113. M. Aater Suleman, Yale N. Patt, Eric Sprangle, Anwar Rohillah, Anwar Ghuloum, and Doug Carmean. 2007. Asymmetric Chip Multiprocessors: Balancing Hardware Efficiency and Programmer Efficiency. TR-HPS-2007-001. University of Texas, Austin, TX.
  114. Hsin-Ching Sun, Bor-Yeh Shen, Wuu Yang, and Jenq-Kuen Lee. 2011. Migrating Java threads with fuzzy control on asymmetric multicore systems for better energy delay product. In Proceedings of the International Conference on Computing and Security.
  115. Tao Sun, Hong An, Tao Wang, Haibo Zhang, and Xiufeng Sui. 2012. CRQ-based fair scheduling on composable multicore architectures. In Proceedings of the International Conference on Supercomputing (ICS’12). 173--184.
  116. Ibrahim Takouna, Wesam Dawoud, and Christoph Meinel. 2011. Efficient virtual machine scheduling-policy for virtualized heterogeneous multicore systems. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’11).
  117. David Tarjan, Michael Boyer, and Kevin Skadron. 2008. Federation: Repurposing scalar cores for out-of-order instruction issue. In Proceedings of the Design Automation Conference (DAC’08). 772--775.
  118. Kenzo Van Craeynest, Shoaib Akram, Wim Heirman, Aamer Jaleel, and Lieven Eeckhout. 2013. Fairness-aware scheduling on single-ISA heterogeneous multi-cores. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’13). 177--187.
  119. Kenzo Van Craeynest and Lieven Eeckhout. 2013. Understanding fundamental design choices in single-ISA heterogeneous multicore architectures. ACM Transactions on Architecture and Code Optimization 9, 4, 32.
  120. Kenzo Van Craeynest, Aamer Jaleel, Lieven Eeckhout, Paolo Narvaez, and Joel Emer. 2012. Scheduling heterogeneous multi-cores through performance impact estimation (PIE). In Proceedings of the International Symposium on Computer Architecture (ISCA’12). 213--224.
  121. Ashish Venkat and Dean M. Tullsen. 2014. Harnessing ISA diversity: Design of a heterogeneous-ISA chip multiprocessor. In Proceedings of the International Symposium on Computer Architecture (ISCA’14). 121--132.
  122. Jeffrey Vetter and Sparsh Mittal. 2015. Opportunities for nonvolatile memory systems in extreme-scale high performance computing. Computing in Science and Engineering 17, 2, 73--82.
  123. Carl A. Waldspurger and William E. Weihl. 1994. Lottery scheduling: Flexible proportional-share resource management. In Proceedings of the USENIX Conference on Operating Systems Design and Implementation (OSDI’94).
  124. Yasuko Watanabe, John D. Davis, and David A. Wood. 2010. WiDGET: Wisconsin decoupled grid execution tiles. In Proceedings of the International Symposium on Computer Architecture (ISCA’10), Vol. 38. 2--13.
  125. Ryan Whitwam. 2014. Qualcomm Unveils 64-Bit Snapdragon 808 and 810 SoCs: The Apple A7 Stop-Gap Measures Continue. Retrieved December 29, 2015, from http://goo.gl/v4ywMW.
  126. Youfeng Wu, Shiliang Hu, Edson Borin, and Cheng Wang. 2011. A HW/SW co-designed heterogeneous multi-core virtual machine for energy-efficient general purpose computing. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’11). 236--245.
  127. Ying Zhang, Lide Duan, Bin Li, Lu Peng, and Srinivasan Sadagopan. 2014a. Energy efficient job scheduling in single-ISA heterogeneous chip-multiprocessors. In Proceedings of the International Symposium on Quality Electronic Design (ISQED’14). 660--666.
  128. Ying Zhang, Li Zhao, Ramesh Illikkal, Ravi Iyer, Andrew Herdrich, and Lu Peng. 2014b. QoS management on heterogeneous architecture for parallel applications. In Proceedings of the IEEE International Conference on Computer Design (ICCD’14). 332--339.
  129. Hongtao Zhong, Steven A. Lieberman, and Scott A. Mahlke. 2007. Extending multicore architectures to exploit hybrid parallelism in single-thread applications. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’07). 25--36.
  130. Yuhao Zhu and Vijay Janapa Reddi. 2013. High-performance and energy-efficient mobile web browsing on big/little systems. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA’13). 13--24.

      • Published in

        ACM Computing Surveys  Volume 48, Issue 3
        February 2016
        619 pages
        ISSN:0360-0300
        EISSN:1557-7341
        DOI:10.1145/2856149
        • Editor: Sartaj Sahni

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 8 February 2016
        • Accepted: 1 November 2015
        • Revised: 1 August 2015
        • Received: 1 April 2015

        Qualifiers

        • survey
        • Research
        • Refereed
