Apache™ Hadoop® is an open-source software project that enables distributed processing of large data sets across clusters of commodity servers. It is designed to scale from a single server to thousands of machines, with a very high degree of fault tolerance. Hadoop is composed of four core components: Hadoop Common, Hadoop Distributed File System (HDFS), MapReduce and YARN.
Hadoop Common
A module containing the utilities that support the other Hadoop components.
Hadoop Distributed File System (HDFS)
A file system that provides reliable data storage and access across all the nodes in a Hadoop cluster. It links together the file systems on many local nodes to create a single file system.
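As a rough illustration of how one logical file can be spread over many nodes, the sketch below splits a file into fixed-size blocks and assigns each block to several nodes. The function names, the node names and the round-robin placement are assumptions made for this example; real HDFS uses its own placement policy, and the block size and replication factor are configurable.

```python
def split_into_blocks(file_size, block_size):
    """Return the size in bytes of each block for a file of file_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    full_blocks, remainder = divmod(file_size, block_size)
    # The last block holds whatever is left over and may be smaller.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes (simple round-robin).

    This is a toy policy for illustration, not the policy HDFS itself uses.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + offset) % len(nodes)]
            for offset in range(replication)
        ]
    return placement


if __name__ == "__main__":
    mb = 1024 * 1024
    blocks = split_into_blocks(300 * mb, 128 * mb)  # assumed 128 MB blocks
    layout = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"], 3)
    for block_id, holders in layout.items():
        print(block_id, blocks[block_id] // mb, "MB ->", holders)
```

Because every block lives on more than one node in this sketch, losing a single node still leaves a copy of each block, which is the intuition behind the reliability claim above.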
MapReduce
A framework for writing applications that process large amounts of structured and unstructured data in parallel across a cluster of thousands of machines, in a reliable, fault-tolerant manner.
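The MapReduce programming model can be sketched in plain Python with the classic word-count example. This runs in a single process and only mimics the map, shuffle and reduce steps; the function names are chosen for this illustration and are not part of the Hadoop API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data on Hadoop", "Hadoop stores big data"]))
```

On a real cluster the map calls run in parallel on different machines and the framework performs the shuffle over the network; the sketch keeps only the data flow.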
Yet Another Resource Negotiator (YARN)
The resource-management layer of next-generation MapReduce, which assigns CPU, memory and storage to applications running on a Hadoop cluster. It lets application frameworks other than MapReduce run on Hadoop, so the same cluster can host other kinds of workloads.
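The idea of assigning cluster resources to applications can be illustrated with a toy first-fit allocator. The `Node` class, its field names and the first-fit rule are assumptions made for this sketch; YARN's actual schedulers are considerably more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of one worker node's free resources."""
    name: str
    free_vcores: int
    free_memory_mb: int


def allocate(nodes, vcores, memory_mb):
    """Place one container on the first node with enough free resources.

    Returns the chosen node's name, or None if no node can host the request.
    """
    for node in nodes:
        if node.free_vcores >= vcores and node.free_memory_mb >= memory_mb:
            # Reserve the resources so later requests see the reduced capacity.
            node.free_vcores -= vcores
            node.free_memory_mb -= memory_mb
            return node.name
    return None


if __name__ == "__main__":
    cluster = [Node("node1", 2, 4096), Node("node2", 4, 8192)]
    for request in [(2, 2048), (1, 1024), (4, 1024)]:
        print(request, "->", allocate(cluster, *request))
```

The allocator knows nothing about what runs inside a container, which mirrors the point that frameworks other than MapReduce can share the same cluster.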
3) Hadoop: Beginner’s Guide
4) Hadoop: The Definitive Guide, 4th Edition
5) The Hadoop Ecosystem Table
RHadoop is a collection of R packages for connecting R to Hadoop and running R on Hadoop nodes. It lets users manage and analyze Hadoop data from within R, including creating MapReduce jobs.
The R packages in the RHadoop toolkit are rmr2, rhdfs, rhbase, plyrmr, and ravro.
rmr2: functions providing Hadoop MapReduce functionality in R.
rhdfs: functions providing file management of the HDFS from within R.
rhbase: functions providing database management for the HBase distributed database from within R.
plyrmr: higher-level, plyr-like data processing for structured data, powered by rmr2.
ravro: functions for reading and writing files in Avro format.
1) Step by Step Guide to Setting Up an R-Hadoop System
2) RHadoop Tutorial
3) Big Data Analytics with R and Hadoop
4) R and Hadoop Data Analysis – RHadoop
5) Big Data Analysis using RHadoop
6) Hadoop MapReduce Cookbook