- Title
On using MapReduce to scale algorithms for Big Data analytics: a case study.
- Authors
Kijsanayothin, Phongphun; Chalumporn, Gantaphon; Hewett, Rattikorn
- Abstract
Introduction: Many data analytics algorithms are originally designed for in-memory data. Parallel and distributed computing is a natural first remedy to scale these algorithms to "Big algorithms" for large-scale data. Advances in many Big Data analytics algorithms are contributed by MapReduce, a programming paradigm that enables parallel and distributed execution of massive data processing on large clusters of machines. Much research has focused on building efficient naive MapReduce-based algorithms or extending MapReduce mechanisms to enhance performance. However, we argue that these should not be the only research directions to pursue. We conjecture that when naive MapReduce-based solutions do not perform well, it could be because certain classes of algorithms are not amenable to the MapReduce model, and one should find a fundamentally different approach to a new MapReduce-based solution. Case description: This paper investigates a case study of a scaling problem of "Big algorithms" for a popular association rule-mining algorithm, particularly the development of the Apriori algorithm in the MapReduce model. Discussion and evaluation: Formal and empirical illustrations are explored to compare our proposed MapReduce-based Apriori algorithm with previous solutions. The findings support our conjecture, and our study shows promising results compared to the state-of-the-art performer, with a 7% increase in performance on average for datasets ranging from 10,000 to 120,000 transactions. Conclusions: The results confirm that an effective MapReduce implementation should avoid dependent iterations, such as those of the original sequential Apriori algorithm. These findings could lead to many more alternative non-naive MapReduce-based "Big algorithms".
- Subjects
BIG data; APRIORI algorithm; ASSOCIATION rule mining; TRANSACTION systems (Computer systems); ALGORITHMS; DATA analysis; PARALLEL programming
- Publication
Journal of Big Data, 2019, Vol 6, Issue 1, pN.PAG
- ISSN
2196-1115
- Publication type
Article
- DOI
10.1186/s40537-019-0269-1
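
The "dependent iterations" mentioned in the abstract can be illustrated with a minimal single-process sketch of a naive, level-wise Apriori in which each level is simulated as one map/reduce pass. This is not the authors' proposed algorithm, and it does not use Hadoop or any real MapReduce API. The function names (`map_phase`, `reduce_phase`, `next_candidates`, `naive_mapreduce_apriori`) and the sample transactions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transactions, candidates):
    """Mapper: emit (candidate, 1) for every candidate contained in a transaction."""
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate itemset is a subset of the transaction
                yield c, 1


def reduce_phase(pairs, min_support):
    """Reducer: sum the counts per candidate and keep those meeting min_support."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return {c: n for c, n in counts.items() if n >= min_support}


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({i for s in frequent for i in s})
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    ]


def naive_mapreduce_apriori(transactions, min_support):
    """Run one simulated map/reduce job per itemset size and count the jobs."""
    transactions = [frozenset(t) for t in transactions]
    candidates = [frozenset([i]) for i in sorted({i for t in transactions for i in t})]
    result, k, jobs = {}, 1, 0
    while candidates:
        # Level k cannot start until level k-1 has finished: a dependent iteration.
        frequent = reduce_phase(map_phase(transactions, candidates), min_support)
        jobs += 1
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result, jobs


# Hypothetical sample data, for illustration only.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result, jobs = naive_mapreduce_apriori(sample, min_support=3)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
print("simulated jobs:", jobs)
```

In this sketch the candidates for each level are derived from the previous level's reducer output, so the passes must run strictly one after another. On a real cluster each pass would presumably incur its own job start-up and data-scan cost, which is the kind of overhead the abstract argues an effective MapReduce design should avoid.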