Enhancing performance of Hadoop and MapReduce for scientific data using NoSQL database

Scientific data sets are often subject to similar jobs that different users run on them repeatedly. In addition, many of these data sets are unstructured and complex, and require fast and simple processing. To improve the performance of the existing Hadoop and MapReduce approach, it is necessary to develop an algorithm tailored to the type of data set and the requirements of the jobs. Genomic and biological data are an example of unstructured data, because they consist only of very long sequences of letters that are not human-readable and have no relational structure. In this paper, we present an overview of a MapReduce algorithm that we developed and its simulation using HBase as a NoSQL database.