- Title
Soft error resilience in Big Data kernels through modular analysis.
- Authors
Chen, Sui; Bronevetsky, Greg; Peng, Lu; Li, Bin; Fu, Xin
- Abstract
Shrinking feature sizes and operating voltages are making processor circuits increasingly vulnerable to soft faults, which calls for fault resilience techniques at both the software and hardware levels in the big data context. To assist software developers in writing fault-resilient big data applications, we propose the tool ErrorSight, which helps them focus their efforts on the code regions and data structures that are most vulnerable to soft errors, understand how numerical errors propagate through the program, and apply fault resilience techniques effectively. ErrorSight achieves this through efficient generation of error profiles, leveraging the predictive power of the Boosted Regression Tree model. We use four big data kernels to illustrate the modular analysis mechanism of ErrorSight and show its usefulness in developing numerical fault resilience in big data.
- Subjects
SOFT errors; HIGH performance computing research; BIG data; COMPUTER software developers; COMPUTER software development
- Publication
Journal of Supercomputing, 2016, Vol 72, Issue 4, p1570
- ISSN
0920-8542
- Publication type
Article
- DOI
10.1007/s11227-016-1682-2