Coding With Fun

What does hdfs dfsadmin -report (Cloudera) do?


Asked by Adele Stout on Dec 05, 2021 FAQ



hdfs dfsadmin -report prints a brief report on the overall state of the HDFS filesystem. It is a quick way to see how much disk space is available, how many DataNodes are running, how many blocks are corrupt, and so on. Note: This article explains the disk space calculations as seen by HDFS.
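As a rough illustration of reading those figures programmatically, here is a minimal Python sketch that parses the cluster-wide summary at the top of the report. It assumes the summary is printed as "Key: value" lines such as "DFS Used: <bytes> (<human-readable size>)", which is how recent Hadoop releases format it; the exact layout can differ between versions. The helper names parse_report_summary and run_report and the SAMPLE text are hypothetical, and the numbers are invented rather than taken from a real cluster.

```python
import re
import subprocess


def parse_report_summary(report_text):
    """Collect 'Key: value' pairs from the cluster-wide summary of the report.

    Parsing stops at the first per-DataNode entry (a line starting with
    'Name:'), so only the cluster totals are returned.
    """
    summary = {}
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Name:"):
            break
        key, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # skip blank lines, separators, and headings
        # Byte counts look like '32212254720 (30 GB)'; keep the leading integer.
        match = re.match(r"(\d+)\s*\(", value)
        summary[key.strip()] = int(match.group(1)) if match else value
    return summary


def run_report():
    """Run the real command; needs the hdfs client on PATH and NameNode access."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample text with made-up values, used instead of a live cluster.
SAMPLE = """\
Configured Capacity: 107374182400 (100 GB)
Present Capacity: 96636764160 (90 GB)
DFS Remaining: 64424509440 (60 GB)
DFS Used: 32212254720 (30 GB)
DFS Used%: 33.33%
"""

summary = parse_report_summary(SAMPLE)
print(summary["DFS Used"], summary["DFS Used%"])
```

On a machine with a configured HDFS client, parse_report_summary(run_report()) would apply the same parsing to live output.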
Next, the DFS Used line of hdfs dfsadmin -report shows actual disk usage, taking data replication into account. It should therefore be several times larger than the number reported by the hdfs dfs -du command. How HDFS storage works in brief: If you are on a recent version of Hadoop, try the following command.
In fact, the hdfs dfsadmin command lets you administer HDFS from the command line. While the hdfs dfs commands help you manage HDFS files and directories, the dfsadmin command is useful for performing general HDFS-specific administrative tasks.
In addition, you can set access permissions and browse the file system to get cluster information such as the number of live and dead nodes, the space used, etc. To read any file from HDFS, a client first has to contact the NameNode, which stores the file system metadata, including which DataNodes hold each block.
Keeping this in consideration, the command you use determines the value you get for the space occupied by HDFS: for example, 10 GB from hdfs dfs -du versus 30 GB from the DFS Used line of hdfs dfsadmin -report.
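The 10 GB versus 30 GB gap is what a replication factor of 3 (the HDFS default for dfs.replication) would produce. The arithmetic can be sketched in Python as below. This is a simplified estimate that assumes every file uses the same replication factor and ignores any other overhead; the function name raw_usage_bytes is hypothetical.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def raw_usage_bytes(logical_bytes, replication_factor=3):
    """Estimate raw disk usage across DataNodes for a given logical data size.

    Assumes a uniform replication factor for all files (3 is the HDFS default).
    """
    return logical_bytes * replication_factor


logical = 10 * GIB              # size of the data without replicas
raw = raw_usage_bytes(logical)  # size once every block is stored three times
print(f"logical: {logical // GIB} GB, raw: {raw // GIB} GB")
```

On a real cluster the ratio between the two numbers will rarely be exactly 3, since individual files can have their own replication factor.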