Difference between revisions of "File system"






Revision as of 09:26, 5 July 2013

User directories

By default, each user of FLOW and HERO has two directories for storage. These are the home directory

 /user/<user_group>/<user_name>

and the working directory

 /data/work/<user_group>/<user_name>

The home directory is intended for data that must be kept long term, e.g. the source code of your programs. The working directory is intended for data that is produced during computations and needs no long-term storage (e.g. because the data still has to be post-processed).

The two directories have different properties:

  • home directory
    • fully backed up
    • limited size (usually 110 GB per user)
    • usage can be determined with the command iquota
    • mountable via CIFS on your workstation (see Mounting Directories of FLOW and HERO)
    • place for long term storage
  • working directory
    • not backed up
    • large but limited size (usually 3.5 TB per user)
    • there is currently no command for checking the quota or usage
    • mountable via CIFS on your workstation (see Mounting Directories of FLOW and HERO)
    • place for data used/produced during the computations
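
Because no dedicated command is listed for checking usage of the working directory, a rough substitute is to total the directory tree with the standard du utility. The following is only a sketch: the function name dir_usage_kb is hypothetical, and du measures what is on disk, which may differ from how the file system accounts for quota.

```shell
# Report the total size of a directory tree in kilobytes.
# Note: du walks the whole tree, so this can be slow on large directories.
dir_usage_kb() {
    du -sk "$1" | cut -f1
}

# Hypothetical usage: substitute your own group and user name.
# dir_usage_kb "/data/work/<user_group>/<user_name>"
```

The result can be compared by hand against the size limit given above.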


High performance directory on FLOW

On FLOW there is an additional high-performance GPFS file system. It is accessible under

 /data/work/gpfs/<user_group>/<user_name>

FLOW users should use this as their working directory for huge data sets. The properties of the GPFS file system are:

  • high transfer rates (about 4 times higher than the other directories)
  • parallel file system
  • not backed up
  • huge size
  • only mountable via sshfs (SFTP) on your workstation (see Mounting Directories of FLOW and HERO)
  • place for data used/produced during the computations
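
Since neither the working directory nor the GPFS file system is backed up, results worth keeping have to be copied to the home directory by hand. A minimal sketch of such a step, assuming plain cp is sufficient for the amount of data involved; the function name archive_results and the example paths are hypothetical:

```shell
# Copy a results directory from working storage into a long-term location.
# Arguments: $1 = source directory (no trailing slash), $2 = destination directory.
# The source ends up as a subdirectory of the destination.
archive_results() {
    mkdir -p "$2" && cp -R -p "$1" "$2/"
}

# Hypothetical usage: substitute your own group, user name, and run directory.
# archive_results "/data/work/gpfs/<user_group>/<user_name>/run1" \
#                 "/user/<user_group>/<user_name>/results"
```

Keep the size limit of the home directory in mind and copy only post-processed results, not raw output.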


Snapshot functionality

Within your home directory, several states of your data at particular points in time, termed snapshots, are available. If you have accidentally deleted or overwritten files, you can use the snapshots to restore the files as they existed at an earlier time. Strictly speaking, snapshots are not versions of a file: what is stored is the state of the files at given points in time, and the available snapshots are distinguished by their timestamps. As a technical aside, a snapshot is not a full copy of all your data. Instead, from one snapshot to the next, only the changes you made are recorded.

The folder that contains your snapshots exists in every directory and subdirectory of your home directory. It is a hidden directory that does not show up even if you list the contents of a directory with

  ls -la

To see the snapshots, you have to enter the hidden directory explicitly:

 cd .snapshot

Within that directory the snapshots with timestamps are visible, e.g.

 hpc_user_daily_2013-06-28_01-00
 hpc_user_daily_2013-06-29_01-00
 hpc_user_daily_2013-06-30_01-00
 hpc_user_daily_2013-07-01_01-00
 hpc_user_daily_2013-07-02_01-00
 hpc_user_daily_2013-07-03_01-00
 hpc_user_daily_2013-07-04_01-00
 hpc_user_daily_recent
 hpc_user_hourly_2013-07-03_20-00
 hpc_user_hourly_2013-07-04_00-00
 hpc_user_hourly_2013-07-04_04-00
 hpc_user_hourly_2013-07-04_08-00
 hpc_user_hourly_2013-07-04_12-00
 hpc_user_hourly_2013-07-04_16-00
 hpc_user_hourly_recent
 hpc_user_weekly_2013-06-10_02-00
 hpc_user_weekly_2013-06-17_02-00
 hpc_user_weekly_2013-06-24_02-00
 hpc_user_weekly_2013-07-01_02-00
 hpc_user_weekly_recent

That is, the snapshots are distinguished by the precise points in time at which they were taken. Within each of these folders you find the file structure of the directory you are in (the parent of .snapshot) as it was at the time given by the timestamp.
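
Restoring a file then amounts to copying it out of the chosen snapshot folder. The following sketch assumes the layout described above, i.e. that the file sits directly under .snapshot/<snapshot name>/ in the directory it was deleted from. The function names list_snapshots and restore_from_snapshot are hypothetical helpers, not commands provided on the cluster.

```shell
# List the snapshots available for the current directory.
list_snapshots() {
    ls -1 .snapshot
}

# Restore a file from a snapshot of the current directory.
# Arguments: $1 = snapshot name, $2 = file name.
# The copy gets a ".restored" suffix so that an existing file is not overwritten.
restore_from_snapshot() {
    cp -p ".snapshot/$1/$2" "$2.restored"
}

# Hypothetical usage, run from the directory that contained the file:
# restore_from_snapshot hpc_user_daily_2013-07-04_01-00 notes.txt
```

Writing to a separate file name lets you compare the restored copy with the current file before replacing anything.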