- Title
Big data pre-processing methods with vehicle driving data using MapReduce techniques.
- Authors
Choi, Eunmi; Cho, Wonhee
- Abstract
A huge amount of sensing data is generated by a large number of pervasive IoT devices. To find meaningful information in this big data, it is essential to perform pre-processing, in which many outlier data points need to be removed, because they deteriorate as time passes. Although pre-processing is essential in the big data field, there has been a significant lack of research with case studies. In this paper, big data pre-processing methods are investigated and proposed. To evaluate the pre-processing methods for accurate analysis, we used a collection of digital tachograph (DTG) data. We obtained DTG sensing data from 6198 vehicles driven over a year. We studied five kinds of pre-processing methods: filtering ranges, excluding meaningless values, comparing filters from variables, applying statistical techniques, and finding driving patterns. In addition, we developed a MapReduce program using the Hadoop ecosystem and deployed the big data on it to perform the pre-processing analysis. Through the pre-processing steps, we confirmed that the proportion of DTG sensing data points containing any error was up to 27.09%. Compared to the traditional brute-force detection approach, ours had a 71.1% additional detection effect. In addition, we confirmed that outlier data points, which are difficult to detect through simple range-error pre-processing, could be detected well.
- Subjects
BIG data; SENSOR networks; INTERNET of things; TACHOGRAPHS; HADOOP (Company); COMPUTER network resources
- Publication
Journal of Supercomputing, 2017, Vol 73, Issue 7, p3179
- ISSN
0920-8542
- Publication type
Article
- DOI
10.1007/s11227-017-2014-x
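
The first method named in the abstract, filtering ranges, can be illustrated with a minimal sketch written in a map/reduce style. This is not the authors' Hadoop program: the field names (`vehicle_id`, `speed_kmh`, `rpm`), the thresholds in `VALID_RANGES`, and the sample records are all hypothetical, chosen only to show the shape of the computation.

```python
from collections import defaultdict

# Hypothetical valid ranges; the abstract does not state the actual thresholds.
VALID_RANGES = {"speed_kmh": (0, 200), "rpm": (0, 8000)}


def map_record(record):
    """Map step: emit (vehicle_id, is_error) for one DTG-like record."""
    is_error = any(
        not (low <= record[field] <= high)
        for field, (low, high) in VALID_RANGES.items()
    )
    return record["vehicle_id"], is_error


def reduce_counts(pairs):
    """Reduce step: per vehicle, count total records and out-of-range records."""
    counts = defaultdict(lambda: [0, 0])  # vehicle_id -> [total, errors]
    for vehicle_id, is_error in pairs:
        counts[vehicle_id][0] += 1
        counts[vehicle_id][1] += int(is_error)
    return {vehicle_id: tuple(c) for vehicle_id, c in counts.items()}


# Made-up sample records, for illustration only.
records = [
    {"vehicle_id": "A", "speed_kmh": 60, "rpm": 2000},
    {"vehicle_id": "A", "speed_kmh": 350, "rpm": 2100},  # speed out of range
    {"vehicle_id": "B", "speed_kmh": 80, "rpm": -5},     # rpm out of range
    {"vehicle_id": "B", "speed_kmh": 40, "rpm": 1500},
    {"vehicle_id": "B", "speed_kmh": 45, "rpm": 1600},
]

summary = reduce_counts(map_record(r) for r in records)
for vehicle_id, (total, errors) in sorted(summary.items()):
    print(vehicle_id, total, errors)
```

A plain range check like this corresponds only to the simple range-error baseline; per the abstract, the other four methods are what catch outliers that such a check misses.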