Overview
Apache Hadoop is an open-source framework for distributed processing of large datasets across clusters using simple programming models. It scales from single servers to thousands of machines, each offering local computation and storage. Core modules include Hadoop Common, HDFS, YARN, and MapReduce.
Features
- Hadoop Common: shared libraries and utilities that support the other modules
- HDFS (Hadoop Distributed File System): fault-tolerant, high-throughput distributed storage
- YARN (Yet Another Resource Negotiator): cluster resource management and job scheduling
- MapReduce: a YARN-based framework for parallel processing of large datasets
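To make the MapReduce model concrete, here is a minimal word-count sketch in Python. The helper functions (`map_line`, `shuffle`, `reduce_word`) are hypothetical illustrations, not Hadoop APIs; in a real job (e.g. via Hadoop Streaming) the map and reduce steps would run as separate processes across the cluster, with the framework handling the shuffle.

```python
from collections import defaultdict

def map_line(line):
    """Map phase: emit (word, 1) pairs for each word in a line of input."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle phase: group values by key, as the framework
    does between the map and reduce stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_word(word, counts):
    """Reduce phase: sum the counts emitted for one word."""
    return (word, sum(counts))

# Tiny in-memory stand-in for input splits stored on HDFS.
lines = ["hadoop stores data", "hadoop processes data"]
mapped = [pair for line in lines for pair in map_line(line)]
result = dict(reduce_word(w, c) for w, c in shuffle(mapped).items())
print(result)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

The key idea the sketch shows: because each `map_line` call and each `reduce_word` call is independent, the framework can run them on different machines and only needs to coordinate the shuffle in between.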