How To Check Disk Space



In this tutorial, we will introduce how to get the total, used, and free space of a disk or directory using Python, which is very helpful if you want to save some files on your computer.

Import library

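Python's standard library ships the shutil module, whose disk_usage() function does exactly what we need, so there is nothing extra to install:

    import shutil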

Get total, used and free space information

We can read the disk usage information with the shutil.disk_usage() function:
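A minimal sketch; the path "/" is just an example (on Windows you might pass a drive root such as "C:\\"):

    # "/" is an example path; any disk or directory path works
    total, used, free = shutil.disk_usage("/")
    print("total =", total)
    print("used  =", used)
    print("free  =", free)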

Note that shutil.disk_usage() returns a named tuple with total, used, and free attributes, all expressed in bytes.

On Windows, path must be a directory; on Unix, it can be a file or a directory.

However, these raw byte counts are not user friendly, so we can format them.

Format disk usage
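shutil has no built-in size formatter, so we can write a small helper ourselves; the name format_size is our own choice, not part of the library:

    def format_size(num_bytes):
        # Divide by 1024 until the value fits the unit, then show two
        # decimals, e.g. 15255093248 bytes -> '14.21G'.
        for unit in ("B", "K", "M", "G", "T"):
            if num_bytes < 1024:
                return "%.2f%s" % (num_bytes, unit)
            num_bytes /= 1024
        return "%.2f%s" % (num_bytes, "P")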

As to the free space, we can print it in a readable form like this:
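A one-line sketch, assuming the free variable and the format_size() helper from the snippets above:

    print("The free space is: %s" % format_size(free))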

The free space is: 14.21G

Of course, you can also use other ways to get the free disk space; you can read this tutorial.

This chapter is from the book Expert Hadoop Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS.

Managing HDFS Storage

You deal with very large amounts of data in a Hadoop cluster, often ranging over multiple petabytes. However, your cluster is also going to use a lot of that space, sometimes with several terabytes of data arriving daily. This section shows you how to check for used and free space in your cluster, and manage HDFS space quotas. The following section shows how to balance HDFS data across the cluster.

The following subsections show how to

  • Check HDFS disk usage (used and free space)

  • Allocate HDFS space quotas

Checking HDFS Disk Usage

Throughout this book, I show how to use various HDFS commands in their appropriate contexts. Here, let's review some HDFS space and file related commands. You can view the help facility for any individual HDFS file command by issuing the following command first:
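For example, to display the help text for the du command covered below:

    $ hdfs dfs -help du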

Let's review some of the most useful file system commands that let you check the HDFS usage in your cluster. The following sections explain how to

  • Use the df command to check free space in HDFS

  • Use the du command to check space usage

  • Use the dfsadmin command to check free and used space

Finding Free Space with the df Command

You can check the free space in an HDFS directory with a couple of commands. The -df command shows the configured capacity, available free space and used space of a file system in HDFS.
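For example, to report on the root of the file system:

    $ hdfs dfs -df /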

You can specify the -h option with the df command for more readable and concise output:
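A representative run; the NameNode URI and the numbers here are illustrative, chosen to match the figures discussed next:

    $ hdfs dfs -df -h /
    Filesystem                     Size   Used  Available  Use%
    hdfs://nn01.cluster.com:8020  1.8 P  1.4 P    388.9 T   78%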

The df -h command shows that this cluster's currently configured HDFS storage is 1.8PB, of which 1.4PB have been used so far.

Finding the Used Space with the du Command

You can view the size of the files and directories in a specific directory with the du command. The command will show you the space (in bytes) used by the files that match the file pattern you specify. If it's a file, you'll get the length of the file. The usage of the du command is as follows:
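A sketch of the general form (the -s and -h flags are optional and are covered below):

    $ hdfs dfs -du [-s] [-h] <path>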

Here's an example:
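For instance, using the /user/alapati home directory that appears throughout this chapter:

    $ hdfs dfs -du /user/alapati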

You can view the used storage in the entire HDFS file system with the following command:
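That is, run the du command against the root directory:

    $ hdfs dfs -du /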

The following command uses the -h option to get more readable output:
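A representative run; the directory names come from the discussion below, and the sizes are illustrative:

    $ hdfs dfs -du -h /
    431.8 T  1.2 P    /data
    126.3 G  378.9 G  /schema
    27.2 M   81.6 M   /tmp
    98.6 T   217.0 T  /user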

Note the following about the output of the du -h command shown here:

  • The first column shows the actual size (raw size) of the files that users have placed in the various HDFS directories.

  • The second column shows the actual space consumed by those files in HDFS.

The values shown in the second column are much higher than the values shown in the first column. Why? The reason is that the second column's value is derived by multiplying the size of each file in a directory by its replication factor, to arrive at the actual space occupied by that file.

As you can see, directories such as /schema and /tmp reveal that the replication factor for all files in these two directories is three. However, not all files in the /data and the /user directories are being replicated three times. If they were, the second column's value for these two file systems would also be three times the value of its first column.

If you sum up the sizes in the second column of the dfs -du command, you'll find that the total is identical to the value shown in the Used column of the dfs -df command.

Getting a Summary of Used Space with the du -s Command

The du -s command lets you summarize the used space in all files instead of giving individual file sizes as the du command does.
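For example, to get a single human-readable total for a directory tree:

    $ hdfs dfs -du -s -h /user/alapati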

How to Check Whether Hadoop Can Use More Storage Space

If you're under severe space pressure and you can't add additional DataNodes right away, you can see if there's additional space left on the local file system that you can commandeer for HDFS use immediately. In Chapter 3, I showed how to configure the HDFS storage directories by specifying multiple disks or volumes with the dfs.data.dir configuration parameter in the hdfs-site.xml file. Here's an example:
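A sketch of such an entry in hdfs-site.xml; the mount points are hypothetical:

    <property>
      <!-- comma-separated list of local directories holding HDFS blocks -->
      <name>dfs.data.dir</name>
      <value>/disk1/hdfs/data,/disk2/hdfs/data,/disk3/hdfs/data</value>
    </property>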

There's another configuration parameter you can specify in the same file, named dfs.datanode.du.reserved, which determines how much space Hadoop can use from each disk you list as a value for the dfs.data.dir parameter. The dfs.datanode.du.reserved parameter specifies the space reserved for non-HDFS use per DataNode volume: Hadoop can use all the space on a disk above this reserved amount, leaving the reserved portion for non-HDFS uses. Here's how you set the dfs.datanode.du.reserved configuration property:
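A sketch matching the 10GB example discussed next (10GB = 10737418240 bytes):

    <property>
      <!-- bytes reserved for non-HDFS use on each volume -->
      <name>dfs.datanode.du.reserved</name>
      <value>10737418240</value>
    </property>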

In this example, the dfs.datanode.du.reserved parameter is set to 10GB (the value is specified in bytes). HDFS will keep storing data in the data directories you assigned to it with the dfs.data.dir parameter, until the Linux file system reaches a free space of 10GB on a node. By default, this parameter is set to 10GB. You may consider lowering the value for the dfs.datanode.du.reserved parameter if you think there's plenty of unused space lying around on the local file system on the disks configured for Hadoop's use.

Storage Statistics from the dfsadmin Command

You've seen how you can get storage statistics for the entire cluster, as well as for each individual node, by running the dfsadmin -report command. The Used, Available and Use% statistics from the dfs -df command match the disk storage statistics from the dfsadmin -report command.

In the following example, the top portion of the output generated by the dfsadmin -report command shows the cluster's storage capacity:
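An illustrative excerpt; the figures are invented, chosen to be consistent with the 1.8PB/1.4PB cluster shown earlier:

    $ hdfs dfsadmin -report
    Configured Capacity: 2026619832316723 (1.80 PB)
    Present Capacity: 2003859363424240 (1.78 PB)
    DFS Remaining: 427599493844567 (388.87 TB)
    DFS Used: 1576259869579673 (1.40 PB)
    DFS Used%: 78.66%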

You can see that both the dfs -df command and the dfsadmin -report command show identical information regarding the used and available HDFS space.

Testing for Files

You can check whether a certain HDFS file path exists and whether that path is a directory or a file with the test command:
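A sketch with a hypothetical path; the command returns exit status 0 if the path exists:

    $ hdfs dfs -test -e /user/alapati/data
    $ echo $?
    0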

This command uses the -e option to check whether the specified path exists.

You can create a file of zero length with the touchz command, which is similar to the Linux touch command:
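For example, with a hypothetical file name:

    $ hdfs dfs -touchz /user/alapati/test.txt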

Allocating HDFS Space Quotas

You can configure quotas on HDFS directories, thus allowing you to limit how much HDFS space users or applications can consume. HDFS space allocations don't have a direct connection to the space allocations on the underlying Linux file system. Hadoop lets you actually set two types of quotas:

  • Space quotas: Allow you to set a ceiling on the amount of space used for an individual directory

  • Name quotas: Let you specify the maximum number of file and directory names in the tree rooted at a directory

The following sections cover

  • Setting name quotas

  • Setting space quotas

  • Checking name and space quotas

  • Clearing name and space quotas

Setting Name Quotas

You can set a limit on the number of files and directory names in any directory by specifying a name quota. If the user tries to create files or directories that go beyond the specified numerical quota, the file/directory creation will fail. Use the dfsadmin command -setQuota to set the HDFS name quota for a directory. Here's the syntax for this command:
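The general form, where N is the maximum number of file and directory names allowed in each listed directory tree:

    $ hdfs dfsadmin -setQuota <N> <directory>...<directory>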

For example, you can set the maximum number of files and directories that a user can create under a specific directory by doing this:
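A sketch; the limit of 100000 names is a hypothetical value:

    # 100000 is a hypothetical name quota
    $ hdfs dfsadmin -setQuota 100000 /user/alapati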

This command sets a limit on the number of files user alapati can create under that user's home directory, which is /user/alapati. If you grant user alapati privileges on other directories, of course, the user can create files in those directories, and those files won't count against the name quota you set on the user's home directory. In other words, name quotas (and space quotas) aren't user specific; rather, they are directory specific.

Setting Space Quotas on HDFS Directories

A space quota lets you set a limit on the storage assigned to a specific directory under HDFS. This quota is the number of bytes that can be used by all files in a directory. Once the directory uses up its assigned space quota, users and applications can't create files in the directory.

A space quota sets a hard limit on the amount of disk space that can be consumed by all files within an HDFS directory tree. You can restrict a user's space consumption by setting limits on the user's home directory or other directories that the user shares with other users. If you don't set a space quota on a directory, the disk space quota is unlimited for that directory; it can potentially use the entire HDFS.

Hadoop checks disk space quotas recursively, starting at a given directory and traversing up to the root. The quota on any directory is the minimum of the following:

  • Directory space quota

  • Parent space quota

  • Grandparent space quota

  • Root space quota

Managing HDFS Space Quotas

It's important to understand that in HDFS, there must be enough quota space to accommodate an entire block. If the user has, let's say, 200MB free in their allocated quota, they can't create a new file, regardless of the file size, if the HDFS block size happens to be 256MB. You can set the HDFS space quota for a user by executing the setSpaceQuota command. Here's the syntax:
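The general form, where N is the quota in bytes (or with a unit suffix, as described below):

    $ hdfs dfsadmin -setSpaceQuota <N> <directory>...<directory>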

The space quota you set acts as the ceiling on the total size of all files in a directory. You can set the space quota in bytes (b), megabytes (m), gigabytes (g), terabytes (t) and even petabytes (by specifying p; yes, this is big data!). And here's an example that shows how to set a user's space quota to 60GB:
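Using the home directory from the earlier examples:

    $ hdfs dfsadmin -setSpaceQuota 60g /user/alapati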

You can set quotas on multiple directories at a time, as shown here:
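This sketch matches the 10GB example described next:

    $ hdfs dfsadmin -setSpaceQuota 10g /user/alapati /test/alapati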


This command sets a quota of 10GB on two directories, /user/alapati and /test/alapati. Both directories must already exist; if they do not, you can create them with the dfs -mkdir command.

You use the same command, -setSpaceQuota, both for setting the initial limits and modifying them later on. When you create an HDFS directory, by default, it has no space quota until you formally set one.

You can remove the space quota for any directory by issuing the -clrSpaceQuota command, as shown here:
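For example:

    $ hdfs dfsadmin -clrSpaceQuota /user/alapati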

If you remove the space quota for a user's directory, that user can, theoretically speaking, use up all the space you have in HDFS. As with the -setSpaceQuota command, you can specify multiple directories in the -clrSpaceQuota command.

Things to Remember about Hadoop Space Quotas


Both the Hadoop block size you choose and the replication factor in force are key determinants of how a user's space quota works. Let's suppose that you grant a new user a space quota of 30GB and the user has more than 500MB still free. If the user tries to load a 500MB file into one of his directories, the attempt will fail with an error similar to the following, even though the directory had a bit over 500MB of free space.
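The exact text varies; an illustrative DSQuotaExceededException for a 30GB quota, with invented byte counts, looks roughly like this:

    org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace
    quota of /user/alapati is exceeded: quota = 32212254720 B = 30 GB but
    diskspace consumed = 33822867456 B = 31.50 GB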

In this case, the user had enough free space to load a 500MB file but still received the error indicating that the file system quota for the user was exceeded. This is so because the HDFS block size was 128MB, so the file needed 4 blocks. Hadoop tried to replicate the file three times, since the default replication factor was three, and so was looking for 4 * 3 * 128MB = 1536MB of space, which clearly was over the space quota left for this user.

The administrator can reduce the space quota for a directory to a level below the combined disk space usage under a directory tree. In this case, the directory is left in an indefinite quota violation state until the administrator or the user removes some files from the directory. The user can continue to use the files in the overfull directory but, of course, can't store any new files there since their quota is violated.

Checking Current Space Quotas

You can check the size of a user's HDFS space quota by using the dfs -count -q command, as shown in Figure 9.7.

Figure 9.7 How to check a user's current space usage in HDFS against their assigned storage limits
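A representative command and output; the numbers are illustrative, chosen to be consistent with the usage described below (none and inf indicate that no name quota has been set on this directory):

    $ hdfs dfs -count -q /user/bdaldr
    none  inf  109951162777600  73667279060992  361  14294  12094627905536  /user/bdaldr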


When you issue a dfs -count -q command, you'll see eight different columns in the output. This is what each of the columns stands for:

  • QUOTA: The name quota (the limit on the number of files and directories)

  • REMAINING_QUOTA: Remaining number of files and directories in the quota that can be created by this user

  • SPACE_QUOTA: Space quota granted to this user

  • REMAINING_SPACE_QUOTA: Space quota remaining for this user

  • DIR_COUNT: The number of directories

  • FILE_COUNT: The number of files

  • CONTENT_SIZE: The total size of the files

  • PATH_NAME: The path for the directories

The -count -q command shows that the space quota for user bdaldr is about 100TB. Of this, the user has about 67TB left as free space.

Clearing Current Space Quotas

You can clear the current space quota for a user by issuing the clrSpaceQuota command as shown here:
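The general form:

    $ hdfs dfsadmin -clrSpaceQuota <directory>...<directory>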

Here's an example showing how to clear the space quota for a user:
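A sketch, clearing the quota on the bdaldr home directory from Figure 9.7:

    $ hdfs dfsadmin -clrSpaceQuota /user/bdaldr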

Until you clear the quota, a user whose space quota is exhausted can still use HDFS to read files but won't be able to create any files in that user's HDFS 'home' directory. If the user has sufficient privileges, however, she can create files in other HDFS directories. It's a good practice to set HDFS quotas on a per-user basis. You may also want to set quotas for data directories on a per-project basis.




