An Empirical Study on Common Bugs in Deep Learning Compilers

Xiaoting Du, Zheng Zheng, Lei Ma, Jianjun Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The highly diversified deep learning (DL) frameworks and target hardware architectures pose significant challenges for deploying DL models in industrial production. To date, continuous efforts have been made to develop DL compilers, and several state-of-the-art ones are available, e.g., TVM, Glow, nGraph, PlaidML, and Tensor Comprehensions (TC). Unlike traditional compilers, DL compilers take a DL model built by a DL framework as input and generate optimized code for a particular target device as output. Like other software, DL compilers are error-prone. Buggy DL compilers can generate incorrect code and cause unexpected model behaviors. To better understand the current status and common bug characteristics of DL compilers, we performed a large-scale empirical study of five popular DL compilers (TVM, Glow, nGraph, PlaidML, and TC), collecting a total of 2,717 actual bug reports submitted by users and developers. We manually investigated these bug reports and classified them by root cause, identifying five root causes: environment, compatibility, memory, document, and semantic. After labeling the bug types, we examined the important consequences of each type and analyzed the correlation between bug types and impacts. We also studied the time required to fix different types of bugs in DL compilers. The study yields seven important findings, with practical implications for both DL compiler developers and users.

Original language: English
Title of host publication: Proceedings - 2021 IEEE 32nd International Symposium on Software Reliability Engineering, ISSRE 2021
Editors: Zhi Jin, Xuandong Li, Jianwen Xiang, Leonardo Mariani, Ting Liu, Xiao Yu, Naghmeh Ivaki
Publisher: IEEE Computer Society
Pages: 184-195
Number of pages: 12
ISBN (Electronic): 9781665425872
Publication status: Published - 2021
Event: 32nd IEEE International Symposium on Software Reliability Engineering, ISSRE 2021 - Wuhan, China
Duration: Oct 25 2021 → Oct 28 2021

Publication series

Name: Proceedings - International Symposium on Software Reliability Engineering, ISSRE
Volume: 2021-October
ISSN (Print): 1071-9458

Conference

Conference: 32nd IEEE International Symposium on Software Reliability Engineering, ISSRE 2021
Country/Territory: China
City: Wuhan
Period: 10/25/21 → 10/28/21

All Science Journal Classification (ASJC) codes

  • Software
  • Safety, Risk, Reliability and Quality
