research-article

Toward understanding compiler bugs in GCC and LLVM

Authors:
Chengnian Sun, University of California at Davis, USA
Vu Le, University of California at Davis, USA
Qirun Zhang, University of California at Davis, USA
Zhendong Su, University of California at Davis, USA

ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and Analysis, July 2016, Pages 294–305
https://doi.org/10.1145/2931037.2931074

Published: 18 July 2016


ABSTRACT

Compilers are critical, widely used, complex software. Bugs in them have significant impact and can cause serious damage when they silently miscompile a safety-critical application. An in-depth understanding of compiler bugs can help detect and fix them. To this end, we conduct the first empirical study on the characteristics of the bugs in two mainstream compilers, GCC and LLVM. Our study is significant in scale: it exhaustively examines about 50K bugs and 30K bug-fix revisions over more than a decade. This paper details our systematic study. Summary findings include: (1) in both compilers, C++ is the most buggy component, accounting for around 20% of the total bugs and twice as many as the second most buggy component; (2) the bug-revealing test cases are typically small, with 80% having fewer than 45 lines of code; (3) most of the bug fixes touch a single source file with small modifications (43 lines for GCC and 38 for LLVM on average); (4) the average lifetime of bugs is 200 days for GCC and 111 days for LLVM; and (5) high priority tends to be assigned to optimizer bugs; most notably, 30% of the bugs in GCC’s inter-procedural analysis component are labeled P1 (the highest priority). This study deepens our understanding of compiler bugs. For application developers, it shows that even mature production compilers still have many bugs, which may affect development. For researchers and compiler developers, it sheds light on interesting characteristics of compiler bugs, and highlights challenges and opportunities to more effectively test and debug compilers.
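
The kinds of summary statistics reported in the abstract (component share, test-case size distribution, average bug lifetime) can be illustrated with a minimal sketch. This is not the authors' tooling: the record layout, the helper names, and all values below are hypothetical toy data chosen only to show the arithmetic.

```python
from datetime import date
from statistics import mean

# Hypothetical bug records: (component, date reported, date fixed).
# These are illustrative values, not data from the study.
bugs = [
    ("c++", date(2015, 1, 1), date(2015, 1, 11)),
    ("c++", date(2015, 2, 1), date(2015, 3, 3)),
    ("optimizer", date(2015, 3, 1), date(2015, 3, 21)),
]

# Hypothetical sizes (lines of code) of bug-revealing test cases.
test_case_sizes = [10, 20, 30, 44, 60]


def lifetimes_in_days(records):
    """Days between report and fix for each bug record."""
    return [(fixed - reported).days for _, reported, fixed in records]


def component_share(records, component):
    """Fraction of all bug records filed against the given component."""
    return sum(1 for c, _, _ in records if c == component) / len(records)


def fraction_below(values, threshold):
    """Fraction of values strictly smaller than the threshold."""
    return sum(1 for v in values if v < threshold) / len(values)


lifetimes = lifetimes_in_days(bugs)
average_lifetime = mean(lifetimes)
cpp_share = component_share(bugs, "c++")
small_tests = fraction_below(test_case_sizes, 45)
```

On real bug-tracker exports the same three helpers would apply unchanged, provided each record carries a component label and report and fix dates; how the study itself extracted and cleaned those fields is not described in the abstract.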


Index Terms

1. General and reference
  1. Cross-computing tools and techniques
    1. Empirical studies
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        1. Software testing and debugging



Published in
ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and Analysis

July 2016, 452 pages

ISBN: 9781450343909

DOI: 10.1145/2931037

General Chair: Andreas Zeller, Saarland University, Germany

Program Chair: Abhik Roychoudhury, National University of Singapore, Singapore

Copyright © 2016 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher

Association for Computing Machinery

New York, NY, United States
Author Tags
compiler bugs

compiler testing

empirical studies
Qualifiers
- research-article
Acceptance Rates

Overall acceptance rate: 58 of 213 submissions (27%)