Coding With Fun

Hadoop ApplicationMaster


May 26, 2021 Hadoop




YARN - ApplicationMaster

The ApplicationMaster handles resource management and task monitoring for an individual job.

Its specific responsibilities are:

  1. Calculate the application's resource requirements. The calculation can be static or dynamic: static requirements are generally specified when the request is made, while dynamic requirements are decided by the ApplicationMaster according to the running state of the application.
  2. Request resources based on where the data is located (data locality).
  3. Request resources from the ResourceManager, interact with NodeManagers to launch and monitor programs, monitor the use of the requested resources, and monitor job progress.
  4. Track task status and progress, send periodic heartbeat messages to the ResourceManager, and report resource usage and application progress.
  5. Handle fault tolerance for the tasks within the job.
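
The tracking, heartbeat, and fault-tolerance duties in the list can be illustrated with a small, self-contained sketch. This is a toy model, not the YARN API: the class `ToyApplicationMaster`, its methods, the state names, and the `max_attempts` parameter are all hypothetical names chosen for illustration.

```python
class ToyApplicationMaster:
    """Toy model of per-job task tracking (hypothetical, not the YARN API)."""

    def __init__(self, task_ids, max_attempts=2):
        # max_attempts is an assumed retry limit for this sketch.
        self.max_attempts = max_attempts
        self.attempts = {t: 0 for t in task_ids}
        self.state = {t: "PENDING" for t in task_ids}

    def on_task_finished(self, task_id, succeeded):
        """Record the outcome of one task attempt."""
        self.attempts[task_id] += 1
        if succeeded:
            self.state[task_id] = "DONE"
        elif self.attempts[task_id] < self.max_attempts:
            # Fault tolerance: the task goes back to PENDING so that a
            # new container is requested on the next heartbeat.
            self.state[task_id] = "PENDING"
        else:
            self.state[task_id] = "FAILED"

    def heartbeat(self):
        """Build the report a real AM would send to the ResourceManager."""
        total = len(self.state)
        done = sum(1 for s in self.state.values() if s == "DONE")
        pending = sum(1 for s in self.state.values() if s == "PENDING")
        return {
            "progress": done / total if total else 1.0,
            "containers_needed": pending,
        }


am = ToyApplicationMaster(["m0", "m1"])
am.on_task_finished("m0", succeeded=True)
am.on_task_finished("m1", succeeded=False)  # first failure: retried
report = am.heartbeat()
```

In this sketch the heartbeat doubles as the resource request: any task that is still pending (including a retried one) counts as one more container to ask for.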

An ApplicationMaster can be a program written in any language. It interacts with the ResourceManager and NodeManagers through Protocol Buffers-based RPC. These duties were previously the responsibility of a single global JobTracker; now each job has its own ApplicationMaster. This scales better, because a large number of jobs no longer turns the JobTracker into a bottleneck. At the same time, placing the logic of the job in a separate ApplicationMaster makes the design more flexible: each job can handle its work in its own way without being bound to MapReduce's processing model.

How to calculate resource requirements

A typical MapReduce job derives the number of Map tasks from the number of input blocks (the number of Reduce tasks is usually set in the job configuration), and each Map or Reduce task generally occupies one Container.
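
As a rough worked example of this calculation, the sketch below assumes one Map task per input block and one Container per task. The function name `estimate_containers` and its parameters are hypothetical; a real job computes input splits, which can differ from raw block counts.

```python
def estimate_containers(file_size_bytes, block_size_bytes, num_reduces):
    """Return (map_tasks, reduce_tasks, containers) for a simple job.

    Assumes one Map task per block and one Container per task.
    """
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    if file_size_bytes < 0 or num_reduces < 0:
        raise ValueError("sizes and counts must be non-negative")
    # Integer ceiling division: a partial last block still needs a Map task.
    map_tasks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    return map_tasks, num_reduces, map_tasks + num_reduces


MIB = 1024 * 1024
# A 1 GiB input with 128 MiB blocks and 2 Reduce tasks.
print(estimate_containers(1024 * MIB, 128 * MIB, 2))  # prints (8, 2, 10)
```

The total does not include the Container that the ApplicationMaster itself runs in.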

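How to discover data locality

Data locality is discovered from HDFS block information: the block locations show which nodes hold a replica of each input block, so Containers can be requested on or near those nodes.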
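
A minimal sketch of how block locations could drive placement is shown below. It assumes the replica hosts of a block and a host-to-rack mapping are already known; the function `choose_host`, the host names, and the rack names are hypothetical, and the fallback order (same node, then same rack, then anywhere) is a simplification of what a real scheduler does.

```python
def choose_host(replica_hosts, free_hosts, rack_of):
    """Pick a host for a task that reads one block.

    replica_hosts: hosts holding a replica of the block (from HDFS).
    free_hosts: set of hosts that currently have spare capacity.
    rack_of: dict mapping host name to rack name.
    Returns (host, locality_level).
    """
    # 1. Node-local: run where a replica already lives.
    for host in replica_hosts:
        if host in free_hosts:
            return host, "node-local"
    # 2. Rack-local: run in a rack that holds a replica.
    replica_racks = {rack_of.get(h) for h in replica_hosts}
    replica_racks.discard(None)
    for host in sorted(free_hosts):
        if rack_of.get(host) in replica_racks:
            return host, "rack-local"
    # 3. Otherwise take any free host; the data crosses racks.
    for host in sorted(free_hosts):
        return host, "off-rack"
    return None, "none"


rack_of = {"n1": "r1", "n2": "r1", "n3": "r2", "n4": "r2"}
print(choose_host(["n1", "n3"], {"n3", "n4"}, rack_of))  # prints ('n3', 'node-local')
print(choose_host(["n1"], {"n2", "n3"}, rack_of))        # prints ('n2', 'rack-local')
```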