
International Journal of Biomedical Data Mining

Open Access

ISSN: 2090-4924

Abstract

Implementation of Decision Tree Using Hadoop MapReduce

Tianyi Yang and Anne Hee Hiong Ngu

Hadoop is one of the most popular general-purpose computing platforms for the distributed processing of big data. HDFS is Hadoop's implementation of a distributed file system, designed to store huge amounts of data reliably while also serving Hadoop's data-processing components. MapReduce is the main processing engine of Hadoop. In this study, we use HDFS and MapReduce to implement a well-known learning algorithm, the decision tree, in a way that scales to large input problem sizes. We evaluate how computational performance varies with node count and problem size.
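
The abstract does not describe how the tree-building work is divided between map and reduce tasks. As an illustration only, the following sketch shows one common way a single split decision can be phrased in MapReduce terms: the map step emits a count for each (attribute, value, class label) triple, the reduce step sums those counts, and the driver picks the attribute with the highest information gain. It runs in a single Python process rather than on Hadoop, and every name in it (`map_record`, `reduce_counts`, `best_split`, the `label` field, the toy records) is a hypothetical choice made for this example, not the authors' implementation.

```python
from collections import defaultdict
from math import log2


def map_record(record, label_key="label"):
    """Map step (assumed design): emit ((attribute, value, label), 1)."""
    label = record[label_key]
    for attr, value in record.items():
        if attr != label_key:
            yield (attr, value, label), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts)


def best_split(counts):
    """Driver step: choose the attribute with the highest information gain."""
    # Regroup the reduced counts as attribute -> value -> label -> count.
    by_attr = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for (attr, value, label), n in counts.items():
        by_attr[attr][value][label] += n

    gains = {}
    for attr, values in by_attr.items():
        # Class distribution over the whole data set.
        label_totals = defaultdict(int)
        for labels in values.values():
            for label, n in labels.items():
                label_totals[label] += n
        total = sum(label_totals.values())
        base = entropy(list(label_totals.values()))
        # Weighted entropy of the partitions induced by this attribute.
        remainder = sum(
            sum(labels.values()) / total * entropy(list(labels.values()))
            for labels in values.values()
        )
        gains[attr] = base - remainder
    return max(gains, key=gains.get), gains


# Toy data, invented for this example.
records = [
    {"outlook": "sunny", "windy": "no", "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "play"},
    {"outlook": "rainy", "windy": "no", "label": "stay"},
    {"outlook": "rainy", "windy": "yes", "label": "stay"},
]

emitted = [pair for record in records for pair in map_record(record)]
attribute, gains = best_split(reduce_counts(emitted))
print(attribute, gains)
```

In this formulation only the small table of counts has to reach the driver, which is why the counting step can be spread over many nodes; a full tree would repeat the procedure for each new node on the subset of records that reaches it.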
