Handling Big Data Efficiently by Using Map Reduce Technique

Today's organizations capture extremely large amounts of data, and the volume continues to increase. Analyzing such huge data sets becomes computationally inefficient. Researchers have addressed the problem of discovering knowledge from these continuously growing data sets. The quantity of available raw data has been increasing at a very high rate, and precious information is concealed in large databases. Data mining has become an interesting area for extracting this embedded information. Over many years it has taken root in all kinds of application areas, which gave rise to many data mining methods that came to be applied in several real-life fields. But not all of these methods are capable of handling huge collections of data. In recent years, a number of computation- and data-intensive scientific data analyses have been established. To perform large-scale data mining analyses that meet the scalability and performance requirements of big data, several efficient parallel and concurrent algorithms have been applied. Many parallel algorithms are implemented using different parallelization techniques, such as threads, MPI, and MapReduce, which yield different performance and usability characteristics. The MPI model works efficiently for compute-intensive problems, but it is complicated to bring into practical use. There is currently considerable enthusiasm around the MapReduce paradigm for large-scale data analysis. It is inspired by functional programming, which allows distributed computations on massive amounts of data to be expressed. It is designed for large-scale data processing because it runs on clusters of commodity hardware. As a prominent parallel data processing tool, MapReduce is gaining significant momentum in both industry and academia as the volume of data to analyze grows rapidly.
In this paper, we examine MapReduce, its advantages and disadvantages, and how it can be used in integration with other technologies.
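
To make the functional-programming origin of MapReduce concrete, the following is a minimal single-process sketch of the paradigm in Python, using word counting as the example task. It is an illustration only, not the implementation of any particular framework: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and a real deployment would distribute the map and reduce calls across the nodes of a cluster.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Fold the list of values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially in one process."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each call to `map_phase` depends only on its own record and each call to `reduce_phase` depends only on the values for its own key, both phases can in principle be run in parallel on separate machines, which is the property that lets the model scale on commodity clusters.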