Difference between revisions of "File system"

From HPC users

Revision as of 09:15, 10 May 2019

User directories

By default, each user of FLOW and HERO has two directories for storage: the home directory

 /user/<user_group>/<user_name>

and the working directory

 /data/work/<user_group>/<user_name>

The home directory is intended for files that need long-term storage, e.g. the source code of your programs. The working directory is intended for data that is produced during computations and needs no long-term storage (e.g. because the data still has to be post-processed).

Both directories have different properties:

  • home directory
    • fully backed up
    • limited size (usually 110 GB per user)
    • usage can be determined by the command iquota
    • mountable via CIFS on your workstation (see Mounting Directories of FLOW and HERO)
    • place for long term storage
  • working directory
    • not backed up
    • huge (but limited) size (usually 3.5 TB per user)
    • so far there is no dedicated command for checking the quota or usage (lastquota, described under Checking disk usage and quota, reports the current usage)
    • mountable via CIFS on your workstation (see Mounting Directories of FLOW and HERO)
    • place for data used or produced during the computations
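
The two path patterns above can be filled in for the current account with a short shell sketch. It assumes that <user_group> is your primary Unix group and <user_name> is your login name; the wiki does not state this, so verify the printed paths before relying on them. The function names hpc_home_dir and hpc_work_dir are made up for this illustration.

```shell
# Sketch: build the default storage paths from a group and a user name.
# Assumption: <user_group> is the primary Unix group (id -gn) and
# <user_name> is the login name (id -un). Check this on your account.
hpc_home_dir() { printf '/user/%s/%s\n' "$1" "$2"; }
hpc_work_dir() { printf '/data/work/%s/%s\n' "$1" "$2"; }

group=$(id -gn)
user=$(id -un)
echo "home: $(hpc_home_dir "$group" "$user")"
echo "work: $(hpc_work_dir "$group" "$user")"
```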


High performance directory on FLOW

On FLOW there is an additional high-performance GPFS file system. It is accessible under

 /data/work/gpfs/<user_group>/<user_name>

FLOW users should use this as the working directory for huge data sets. The properties of the GPFS file system are:

  • high transfer rates (about 4 times higher than the other directories)
  • parallel file system
  • not backed up
  • huge size
  • no quota so far (command to check: /usr/lpp/mmfs/bin/mmlsquota)
  • only mountable via sshfs (SFTP) on your workstation (see Mounting Directories of FLOW and HERO)
  • place for data used or produced during the computations
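
Since the GPFS directory exists only on FLOW, a job script that should run on both clusters might choose its working directory at run time. The following is a minimal sketch under the same assumption as before (group and user name taken from id); the helper first_writable_dir and the fallback to a temporary directory are illustrative choices, not site policy.

```shell
# Print the first directory from the argument list that exists and is writable.
first_writable_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Assumption: <user_group> and <user_name> match id -gn and id -un.
group=$(id -gn)
user=$(id -un)

# Prefer GPFS (FLOW only), then the regular working directory.
# The last-resort fallback to a temporary directory is hypothetical.
workdir=$(first_writable_dir "/data/work/gpfs/$group/$user" "/data/work/$group/$user") \
    || workdir=${TMPDIR:-/tmp}
echo "using working directory: $workdir"
```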


Checking disk usage and quota

To check your current disk usage in all three file systems (home, work and gpfs), simply use the command

 lastquota

For technical reasons, the numbers for the data and home directories are updated only once a day (every morning at 1 o'clock). The numbers for the GPFS file system (only on FLOW) are updated hourly.


The output should look something like this:

(Screenshot: Lastquota.png)
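
Because the lastquota figures for the home and data directories are refreshed only once a day, a live figure for a single directory can be obtained with the standard du tool. The wrapper below is a sketch: report_usage is a hypothetical name, lastquota is called without arguments as shown above, and du is used only when lastquota is not installed.

```shell
# Report disk usage: use lastquota when it is available, otherwise fall back
# to du for one directory (default: the home directory).
report_usage() {
    target=${1:-$HOME}
    if command -v lastquota >/dev/null 2>&1; then
        lastquota
    else
        # du walks the whole tree, so this can be slow on large directories.
        du -sh "$target"
    fi
}

# Example call for one directory (path pattern taken from the sections above):
#   report_usage /data/work/<user_group>/<user_name>
```

Note that du reports what is on disk at the moment it runs, so its result can differ from the last lastquota update.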

Snapshot functionality

Within your home directory, several states of your data at particular points in time, termed snapshots, are available. If you accidentally deleted or overwrote files, you can use the snapshots to restore those files as they existed at an earlier time. Note that the term version is not fully accurate: what is stored is the state of the files at the times the snapshots were taken, and the available snapshots are distinguished by their timestamps. As a technical aside, a snapshot is not a full copy of all your data. Instead, from one snapshot to the next, only the changes you made are recorded.

Each folder in your home directory contains such a snapshot directory. It is hidden and does not even show up when you list the content of the folder via

 ls -la

To see the snapshots you have to enter the respective hidden directory by typing

 cd .snapshot

Within that directory you can find a list of different snapshots, taken at different points in time (as evident from the appended timestamps):

 hpc_user_daily_2013-06-28_01-00
 hpc_user_daily_2013-06-29_01-00
 hpc_user_daily_2013-06-30_01-00
 hpc_user_daily_2013-07-01_01-00
 hpc_user_daily_2013-07-02_01-00
 hpc_user_daily_2013-07-03_01-00
 hpc_user_daily_2013-07-04_01-00
 hpc_user_daily_recent
 hpc_user_hourly_2013-07-03_20-00
 hpc_user_hourly_2013-07-04_00-00
 hpc_user_hourly_2013-07-04_04-00
 hpc_user_hourly_2013-07-04_08-00
 hpc_user_hourly_2013-07-04_12-00
 hpc_user_hourly_2013-07-04_16-00
 hpc_user_hourly_recent
 hpc_user_weekly_2013-06-10_02-00
 hpc_user_weekly_2013-06-17_02-00
 hpc_user_weekly_2013-06-24_02-00
 hpc_user_weekly_2013-07-01_02-00
 hpc_user_weekly_recent

Within each of these subdirectories you find the file structure of the current parent directory as it was at the time the snapshot was taken. Although it is not possible to copy a snapshot directory itself (since you have only read access to those directories), you can enter a given snapshot directory and copy the contained files and subdirectories to any location you have write access to.
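
Restoring a lost file therefore comes down to picking a snapshot and copying the file out of it. The sketch below assumes the naming scheme shown in the listing above (hpc_user_<kind>_<timestamp>), in which the timestamps sort chronologically. The helper names latest_snapshot and restore_from_snapshot and the default .restored suffix are made up for this example.

```shell
# Print the name of the newest timestamped snapshot of one kind
# (daily, hourly or weekly) found in the given .snapshot directory.
latest_snapshot() {
    snapdir=$1
    kind=$2
    latest=
    # Glob results are sorted, and the timestamps sort chronologically,
    # so the last directory that matches is the newest snapshot.
    for path in "$snapdir"/hpc_user_"$kind"_[0-9]*; do
        [ -d "$path" ] || continue
        latest=$path
    done
    [ -n "$latest" ] || return 1
    basename "$latest"
}

# Copy one file or subdirectory out of a snapshot of the current directory.
# The copy is written next to the original and never overwrites anything.
restore_from_snapshot() {
    snapshot=$1
    name=$2
    dest=${3:-$name.restored}
    src=.snapshot/$snapshot/$name
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing $dest" >&2
        return 1
    fi
    cp -Rp "$src" "$dest"
}

# Example (run inside the folder that contained the lost file; results.dat
# is a placeholder file name):
#   snap=$(latest_snapshot .snapshot daily) && restore_from_snapshot "$snap" results.dat
```

Writing the copy to a new name first lets you compare it with the current file before replacing anything.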