Coding With Fun

Hadoop ResourceManager


May 26, 2021 Hadoop



YARN - ResourceManager

The ResourceManager is responsible for global resource management and task scheduling. It treats the entire cluster as a single pool of computing resources and focuses only on allocation: it is not concerned with what the applications do, and it is not responsible for their fault tolerance.

Resource management

  1. Previously, the resources of each node were divided into Map slots and Reduce slots. Now they are divided into Containers, each of which can run an ApplicationMaster, a Map task, a Reduce task, or any other program as needed
  2. Resource allocation used to be static. It is now dynamic, which gives higher resource utilization
  3. The Container is the unit of a resource request. A request has the form <resource-name, priority, resource-requirement, number-of-containers>, where resource-name is a host name, a rack name, or * (meaning any machine), and resource-requirement currently supports only CPU and memory
  4. The user submits a job to the ResourceManager, which allocates a Container on a NodeManager to run the ApplicationMaster. The ApplicationMaster then requests resources from the ResourceManager according to the needs of its program
  5. YARN provides a Container lifecycle management mechanism, while the management between an ApplicationMaster and its Containers is defined by the application itself
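
The request format in item 3 can be modelled in a few lines. The following is only an illustrative Python sketch, not YARN's own Java API: the class names, field names, and the host name `host-01` are assumptions chosen to mirror the four fields listed above.

```python
from dataclasses import dataclass

ANY = "*"  # wildcard resource name: any machine in the cluster


@dataclass(frozen=True)
class Resource:
    """Per-container requirement; only memory and CPU, as in the text."""
    memory_mb: int
    vcores: int


@dataclass(frozen=True)
class ResourceRequest:
    """Hypothetical model of <resource-name, priority, requirement, count>."""
    resource_name: str    # host name, rack name, or ANY
    priority: int
    capability: Resource  # what each container needs
    num_containers: int

    def total(self) -> Resource:
        """Aggregate resources asked for by this request."""
        return Resource(
            self.capability.memory_mb * self.num_containers,
            self.capability.vcores * self.num_containers,
        )


# Three containers of 2048 MB and 1 vcore each, preferably on host-01.
request = ResourceRequest("host-01", priority=1,
                          capability=Resource(2048, 1), num_containers=3)
fallback = ResourceRequest(ANY, priority=1,
                           capability=Resource(2048, 1), num_containers=3)
```

An ApplicationMaster would typically send several such requests for the same containers at different levels (host, rack, any), so that the scheduler has somewhere to fall back to.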

Task scheduling

  1. The scheduler is concerned only with resource usage and allocates resources according to what applications request
  2. The Scheduler can allocate specific resources on a particular machine according to the needs of the application. The ApplicationMaster is responsible for taking data locality into account when it requests resources, and the ResourceManager tries to satisfy the request by allocating Containers on the designated machines, which reduces data movement
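
The locality preference in item 2 can be sketched as a simple fallback: try the requested host, then another node in the same rack, then any node. This is a simplified assumption about the idea, not the logic of any real YARN scheduler; `pick_node` and its parameters are hypothetical names, and only memory is considered.

```python
from typing import Dict, Optional


def pick_node(preferred_host: str, preferred_rack: str,
              free_mb: Dict[str, int], rack_of: Dict[str, str],
              needed_mb: int) -> Optional[str]:
    """Return a node with enough free memory: host first, then rack, then any."""

    def fits(node: str) -> bool:
        return free_mb.get(node, 0) >= needed_mb

    # 1. Node-local: the machine that holds the data.
    if fits(preferred_host):
        return preferred_host
    # 2. Rack-local: another machine in the same rack.
    same_rack = sorted(n for n in free_mb
                       if rack_of.get(n) == preferred_rack and fits(n))
    if same_rack:
        return same_rack[0]
    # 3. Off-rack: any machine that has room.
    anywhere = sorted(n for n in free_mb if fits(n))
    return anywhere[0] if anywhere else None


free_mb = {"n1": 512, "n2": 4096, "n3": 8192}
rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
```

With this sample cluster, a 1024 MB request that prefers `n1` lands on `n2`, because `n1` is full and `n2` shares its rack.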

The internal structure

Hadoop ResourceManager

  • Client Service: Handles application submission and termination, and returns information (status of applications, queues, the cluster, etc.)
  • Administration Service: Manages queues, nodes, and client permissions
  • ApplicationMasterService: Registers and terminates ApplicationMasters, receives resource allocation and release requests from them, and passes these asynchronously to the Scheduler, processing them in a single thread
  • ApplicationMaster Liveliness Monitor: Receives heartbeat messages from ApplicationMasters. If an ApplicationMaster does not send a heartbeat for a certain period of time, it is considered failed, its resources are reclaimed, and the ResourceManager allocates a new ApplicationMaster to run the application (2 attempts by default)
  • Resource Tracker Service: Registers nodes and receives heartbeat messages from each registered node
  • NodeManagers Liveliness Monitor: Monitors the heartbeat messages of each node. If no heartbeat is received from a node for a long time, the node is considered invalid, all Containers on it are marked as invalid, and no further tasks are scheduled on it
  • ApplicationManager: Manages applications and keeps a record of completed applications
  • ApplicationMaster Launcher: Once an application is submitted, interacts with the NodeManager to allocate a Container and launch the ApplicationMaster in it, and is also responsible for terminating or destroying it
  • Yarn Scheduler: Resource scheduling and allocation. Available schedulers are FIFO (with priority), Fair, and Capacity
  • ContainerAllocationExpirer: Tracks Containers that have been allocated but not yet launched, and reclaims them after a certain amount of time
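
The two liveliness monitors and the ContainerAllocationExpirer share one pattern: remember when each item was last seen, and expire it once a timeout passes. The sketch below illustrates that pattern in Python under stated assumptions. The class and method names are hypothetical, and timestamps are passed in explicitly so that the behaviour is easy to follow. It is not the ResourceManager's actual implementation.

```python
from typing import Dict, List


class LivelinessMonitor:
    """Tracks last-heartbeat times and reports items that have gone silent."""

    def __init__(self, expiry_seconds: float) -> None:
        self.expiry_seconds = expiry_seconds
        self._last_seen: Dict[str, float] = {}

    def register(self, name: str, now: float) -> None:
        """Start tracking a node, ApplicationMaster, or allocated container."""
        self._last_seen[name] = now

    def heartbeat(self, name: str, now: float) -> None:
        """Record a heartbeat; unknown names are ignored."""
        if name in self._last_seen:
            self._last_seen[name] = now

    def expire(self, now: float) -> List[str]:
        """Remove and return every item silent for longer than the timeout."""
        dead = sorted(name for name, seen in self._last_seen.items()
                      if now - seen > self.expiry_seconds)
        for name in dead:
            del self._last_seen[name]
        return dead


monitor = LivelinessMonitor(expiry_seconds=10.0)
monitor.register("node-a", now=0.0)
monitor.register("node-b", now=0.0)
monitor.heartbeat("node-a", now=8.0)
expired = monitor.expire(now=12.0)  # node-b has been silent for 12 s
```

The caller decides what expiry means: for a node, its Containers are marked invalid; for an ApplicationMaster, a new attempt is started; for an allocated but unlaunched Container, the allocation is reclaimed.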