Difference between revisions of "File system"

(11 intermediate revisions by 4 users not shown)
Latest revision as of 08:19, 17 May 2019

User directories

By default, every user of FLOW and HERO has two directories for storage. These are the home directory

 /user/<user_group>/<user_name>

and the working directory

 /data/work/<user_group>/<user_name>

The home directory is intended for files that need long-term storage, e.g. the source code of your programs. The working directory is intended for data that is produced during computations and needs no long-term storage (e.g. because the data still has to be post-processed).

Both directories have different properties:

  • home directory
    • fully backed up
    • limited size (usually 110 GB per user)
    • usage can be determined by the command iquota
    • mountable via CIFS on your workstation (see Mounting Directories of FLOW and HERO)
    • place for long term storage
  • working directory
    • not backed up
    • huge (but limited) size (usually 3.5 TB per user)
    • so far there is no dedicated command for checking the quota or usage (the lastquota command described under "Checking disk usage and quota" reports the usage)
    • mountable via CIFS on your workstation (see Mounting Directories of FLOW and HERO)
    • place for data used or produced during the computations
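
The path patterns above can be filled in with a small shell sketch. It assumes that <user_group> matches your primary Unix group and <user_name> your login name, which this page does not state; the function names home_dir and work_dir are made up for illustration.

```shell
#!/bin/sh
# Sketch only: build the two storage paths described above.
# ASSUMPTION: <user_group> is the primary Unix group and <user_name>
# is the login name. Adjust if your account is set up differently.

# home_dir <user_group> <user_name>: path of the backed-up home directory
home_dir() {
    printf '/user/%s/%s\n' "$1" "$2"
}

# work_dir <user_group> <user_name>: path of the working directory (no backup)
work_dir() {
    printf '/data/work/%s/%s\n' "$1" "$2"
}

group=$(id -gn)   # primary group name of the current user
user=$(id -un)    # login name of the current user

echo "home (backed up, long-term storage): $(home_dir "$group" "$user")"
echo "work (not backed up, job data):      $(work_dir "$group" "$user")"
```

Source code and other files worth keeping would go below the first path, job output below the second.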


High performance directory on FLOW

On FLOW there is an additional high-performance GPFS file system. It is accessible under

 /data/work/gpfs/<user_group>/<user_name>

FLOW users should use this as their working directory for huge data sets. The properties of the GPFS file system are:

  • high transfer rates (about 4 times higher than the other directories)
  • parallel file system
  • not backed up
  • huge size
  • no quota so far (command to check: /usr/lpp/mmfs/bin/mmlsquota)
  • only mountable via sshfs (SFTP) on your workstation (see Mounting Directories of FLOW and HERO)
  • place for data used or produced during the computations
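
Because the GPFS directory exists only on FLOW, a job script that should run on both clusters could pick its output location at run time. The following is a minimal sketch, not an official recipe: the helper first_existing_dir is hypothetical, the group and user names are assumed to equal the primary Unix group and the login name, and the final fallback to a temporary directory is an assumption as well.

```shell
#!/bin/sh
# Sketch only: prefer the GPFS directory (FLOW) and fall back to the
# regular working directory if it does not exist.

# first_existing_dir <dir>...: print the first argument that is an
# existing directory; return 1 if none of them exists.
first_existing_dir() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            printf '%s\n' "$d"
            return 0
        fi
    done
    return 1
}

group=$(id -gn)   # ASSUMPTION: equals <user_group>
user=$(id -un)    # ASSUMPTION: equals <user_name>

target=$(first_existing_dir \
    "/data/work/gpfs/$group/$user" \
    "/data/work/$group/$user") || target=${TMPDIR:-/tmp}   # assumed fallback

echo "writing large data sets to: $target"
```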


Checking disk usage and quota

To check your current disk usage in all three file systems (home, work and gpfs), simply use the command

 lastquota

For technical reasons, the numbers for the home and working (data) directories are updated only once a day (every morning at 1 o'clock). The numbers for the GPFS file system (only on FLOW) are updated hourly.


The output should look something like this:

[Image: Lastquota.png]
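
Since the lastquota numbers for the home and working directories can be up to a day old, a current figure for a single directory can be obtained with the standard du utility. This is only a sketch: usage_kib is a hypothetical helper name, and du has to walk the whole directory tree, so it can be slow on large directories.

```shell
#!/bin/sh
# Sketch only: current size of one directory tree, independent of the
# daily lastquota update. du walks the whole tree, so avoid running it
# on very large directories more often than needed.

# usage_kib <dir>: print the size of <dir> in KiB as reported by du
usage_kib() {
    du -sk "$1" | awk '{print $1}'
}

echo "current directory uses $(usage_kib .) KiB"
```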

Snapshot functionality

Within your home directory, several states of your files at particular points in time, termed snapshots, are available. If you accidentally deleted or overwrote files, you can use the snapshots to restore them as they were at an earlier time. Strictly speaking, these are not versions: a snapshot records how the files looked when it was taken, and the available snapshots are distinguished by their timestamps. As a technical aside, a snapshot is not a full copy of all your data. Instead, each snapshot stores only the changes made since the previous one.

Each directory within your home directory contains such a snapshot directory. It is a hidden directory that does not appear even if you list the contents of your directory via

  ls -la

To see the snapshots you have to enter the respective hidden directory by typing

 cd .snapshots

Within that directory you can find a list of different snapshots, taken at different points in time (as evident from the appended timestamps):

 
@GMT-2019.04.24-21.00.00-hpc_user-daily
@GMT-2019.04.25-21.00.00-hpc_user-daily
@GMT-2019.04.26-21.00.00-hpc_user-daily
@GMT-2019.04.27-21.00.00-hpc_user-daily
@GMT-2019.04.28-21.00.00-hpc_user-daily
@GMT-2019.04.30-21.00.00-hpc_user-daily
@GMT-2019.05.01-21.00.00-hpc_user-daily
@GMT-2019.05.02-21.00.00-hpc_user-daily
@GMT-2019.05.03-21.00.00-hpc_user-daily
@GMT-2019.05.04-21.00.00-hpc_user-daily
@GMT-2019.05.05-21.00.00-hpc_user-daily
@GMT-2019.05.06-21.00.00-hpc_user-daily
@GMT-2019.05.07-21.00.00-hpc_user-daily
@GMT-2019.05.08-21.00.00-hpc_user-daily
@GMT-2019.05.09-21.00.00-hpc_user-daily
@GMT-2019.05.10-21.00.00-hpc_user-daily


hpc_user_weekly_recent
   

Each of these subdirectories contains the file structure of the current parent directory as it was when the particular snapshot was taken. Although it is not possible to copy a snapshot directory itself (you have only read access to those directories), you can enter a given snapshot directory and copy the files and subdirectories it contains to any location where you have write access.
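
The restore procedure described above can be sketched as two small shell helpers. The function names and the file name results.dat are hypothetical; the sketch assumes that you are in the directory the file used to live in and that listing .snapshots by name works in the same way as entering it with cd.

```shell
#!/bin/sh
# Sketch only: copy a file or directory back out of a snapshot.

# list_snapshots [dir]: print the snapshot names available for dir
# (default: the current directory)
list_snapshots() {
    ls -1 "${1:-.}/.snapshots"
}

# restore_from_snapshot <snapshot> <name> <dest>
# Copy <name> from ./.snapshots/<snapshot> to <dest>, which must be a
# location you have write access to. The snapshot itself is read-only
# and is left untouched.
restore_from_snapshot() {
    src="./.snapshots/$1/$2"
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    cp -Rp "$src" "$3"   # -R: also copy directories, -p: keep timestamps and modes
}

# Example call (results.dat is a made-up file name):
#   restore_from_snapshot @GMT-2019.05.10-21.00.00-hpc_user-daily \
#       results.dat ./results.dat.restored
```

Copying to a new name such as results.dat.restored avoids overwriting the current file before you have compared the two.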