- Title
Research on load-balancing optimization for joins in MapReduce (original title: MapReduce中连接负载均衡优化研究).
- Authors
ZHAI Hon-min; LIU Guo-hua; ZHAO Wei; LIU Yuan-yuan; ZHAI Hong-kun
- Abstract
Data analysis and processing is one of the most important tasks in large-scale distributed data processing applications. Owing to its simplicity and scalability, the MapReduce programming model has gradually become the crucial model for large-scale distributed data processing systems (e.g., Hadoop). Because the data may be non-uniformly distributed, data skew occurs when the MapReduce programming model joins data, severely degrading join performance. To address data skew, its cause is analyzed, a load-balancing cost model is established, and a range partitioner algorithm is proposed to control data skew and thereby achieve load balancing. Experimental results demonstrate that our method clearly improves the efficiency of joins.
- Publication
Computer Engineering & Science / Jisuanji Gongcheng yu Kexue, 2014, Vol 36, Issue 10, p1860
- ISSN
1007-130X
- Publication type
Article
- DOI
10.3969/j.issn.1007-130X.2014.10.004