Enhancing security of Hadoop in a public cloud

Enhancing security of Hadoop in a public cloud Hadoop has become increasingly popular as it rapidly processes data in parallel. Cloud computing gives reliability, flexibility, scalability, elasticity and cost saving to cloud users. Deploying Hadoop in cloud can benefit Hadoop users. Our evaluation exhibits that various internal cloud attacks can bypass current Hadoop security mechanisms, and compromised Hadoop components can be used to threaten overall Hadoop. It is urgent to improve compromise resilience, Hadoop can maintain a relative high security level when parts of Hadoop are compromised. Hadoop has two vulnerabilities that can dramatically impact its compromise resilience. The vulnerabilities are the overloaded authentication key, and the lack of fine-grained access control at the data access level. We developed a security enhancement for a public cloud-based Hadoop, named SEHadoop, to improve the compromise resilience through enhancing isolation among Hadoop components and enforcing least access privilege for Hadoop processes. We have implemented the SEHadoop model, and demonstrated that SEHadoop fixes the above vulnerabilities with minimal or no run-time overhead, and effectively resists related attacks.