- Title
Soft-error mitigation by means of decoupled transactional memory threads.
- Authors
Sánchez, Daniel; Cebrián, Juan; García, José; Aragón, Juan
- Abstract
CMOS scaling exacerbates hardware errors, making reliability a major concern for recent and future microarchitecture designs. Mechanisms that provide fault tolerance in architectures must meet several objectives, such as low performance degradation, low power consumption and low area overhead. Several studies have already proposed fault tolerance for parallel codes. However, these proposals are usually implemented in unrealistic environments that rely on shared buses among processors or modify highly optimized hardware designs such as caches. Our attempt to address these multiple challenges is an architectural design called LBRA (Log-Based Redundant Architecture). Based on a Hardware Transactional Memory architecture, LBRA executes redundant threads that communicate through a pair-shared virtual memory log allocated in cache. Our initial version of LBRA executes these redundant threads in SMT cores. To avoid the performance penalty inherent to this architecture, we propose decoupling their execution across different cores, handling inter-core communication by means of a log buffer supported by a simple prefetch strategy. Simulation results using a variety of scientific and multimedia applications show that the execution time overhead of our best design is less than 7% over a base case without fault tolerance. Additionally, we show that LBRA outperforms previous proposals that we have implemented and evaluated in the same framework.
- Subjects
CMOS memory circuits; FAULT-tolerant computing; SOFT errors; CACHE memory; COMPUTER simulation
- Publication
Distributed Computing, 2015, Vol 28, Issue 2, p75
- ISSN
0178-2770
- Publication type
Article
- DOI
10.1007/s00446-014-0215-6