.. index:: filesystem
.. index:: cluster; filesystem

Filesystems
===========

There are several options for storing files on the cluster. Understanding how
they work and the policies in operation is important both for keeping your work
secure and for obtaining good input/output performance when running
calculations.

In addition to the options discussed below, you can also store files in
``/tmp``. ``/tmp`` is local to each workstation and can be up to 30GB in size.
However, as its name suggests, ``/tmp`` should only be used for storing
temporary files. ``/tmp`` is erased by the operating system on a regular basis
and without warning. In particular, if the root filesystem requires more space
or the computer is rebooted, files in ``/tmp`` are likely to be deleted.

All workstations can read and write CDs and DVDs, and tools for burning discs
are installed. USB drives are perhaps the best solution for external storage.
These are mounted automatically when connected to a workstation on which a user
is logged in via Xfce.

Some thought should be given to the format of the USB drive. Most USB drives
come formatted as FAT32 (which can be read by Linux, OSX and Windows). FAT32 is
limited: in particular, it cannot store individual files larger than 4GB, which
is a major issue in computational science. Other formats exist which don't have
this limitation. Linux and Windows can both read and write to NTFS drives. I
believe OSX has limited support. Ext2, ext3 and (more recently) ext4 are Linux
filesystems which can also handle large files but require third-party drivers
for OSX and Windows. The best format to use depends upon which operating
systems you wish to use the drive with and the file sizes you need to store.

.. index:: filesystem; home

.. _home:

home
----

Your home directory is ``/home/username`` and is mounted from the NFS server
(thomson or ramsay).
This means that when you log into any of the CMTH or TYC workstations (which
can be identified from the login screen), you will find the same environment
with the same home directory and personal customisations.

Whilst the NFS server performs extremely well, running heavy i/o operations
(e.g. reading and/or writing files of the order of several GB) can have severe
consequences for the performance of the cluster. In practice, this is only an
issue when :ref:`running several calculations `.

Your home directory is subject to a quota. The default soft limit is 10GB; if
you go above this, then you will have 7 days to reduce your file usage. The
default hard limit is 10.2GB; exceeding this limit (or being above the soft
limit for more than 7 days) will result in you no longer being able to log into
the cluster. It may be possible to increase quota limits for sufficiently good
reasons, though this can't be done for everyone.

The quota limits are quite small. This is mainly due to limitations in how much
data we can safely :ref:`back up `. Nevertheless, they should be sufficient for
important files (e.g. personal settings, mail, papers, etc). You can see your
usage and quota limits by running:

.. code-block:: bash

    $ quota

A good way to save space and still keep important files is to compress them.
The ``gzip`` and ``bzip2`` tools can reduce file size substantially and work
especially well for text files. Alternatively, files can be archived to
external media or stored in ``/workspace``.

data
----

Every user has a directory located at ``/data/users/username`` which is
NFS-mounted on all computers from maxwell. As it is an NFS-mounted drive, the
same warnings about intensive i/o operations for ``/home`` also apply to
``/data``. This is especially true as maxwell is intended primarily for
computational workloads.

Your data directory is also subject to a quota: the default soft limit is 50GB
and the default hard limit is 50.2GB.

``/data`` is :ref:`backed up `.
However, the backup system does not have space to handle hundreds of GB
changing each day; ``/workspace`` is a better place to store such temporary and
frequently changing files.

Some research groups also have a directory under ``/data/groups/groupname``,
where ``groupname`` is the surname of the group leader, to which all members of
that group have write access.

.. index:: filesystem; workspace

.. _workspace:

workspace
---------

You have a directory under ``/workspace/username``. This is **local** to each
workstation, so any files you store there will only be accessible on that
workstation. A symbolic link in your home directory (``~/workspace``) is
provided for convenience.

This filesystem is not subject to quota and is much larger than ``/home``;
``/workspace`` is the space left on the local hard drive after room has been
allocated for the operating system and network backup. ``/workspace`` is of the
order of 130GB on older machines and well over 300GB on newer machines. As
accessing files on ``/workspace`` does not involve network communications, it
is substantially faster than the NFS-mounted filesystems.

.. important::

    The workspace directories are **not** backed up. If you have crucial data
    stored in the workspace on a particular machine and that machine fails,
    your data will be lost. Crucial data stored in ``/workspace`` should also
    be stored elsewhere.

``/workspace`` is shared by all users, though most people only need to use
``/workspace`` on their own workstation. Please be considerate. ``/workspace``
is monitored and heavy users are asked and expected to clean up their files on
a regular basis.

If you accidentally delete the link from your home directory to your workspace
directory, you can recreate it using:

.. code-block:: bash

    $ ln -s /workspace/username ~/workspace

As ``/workspace`` is local to each machine, you will see a different workspace
directory if you log onto a different workstation. This is particularly
important when submitting batch jobs using :ref:`slurm`.
Since you cannot predict in advance where your batch job will run, you cannot
rely on the contents of the workspace directory. The first part of your job
script should copy the necessary data files from your home directory (visible
everywhere) into the workspace directory (visible only on the machine running
the job), and the last part should copy the results back to your home directory
and clean up the workspace.

.. index:: filesystem; common

common
------

The directory ``/common`` is also mounted from the NFS server on all
workstations. This is a read-only filesystem and contains programs and
libraries that are not part of a standard Linux distribution but have been
installed especially for the cluster. Please see :ref:`here ` for more details.
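
The workspace section above advises that a batch job script should copy its
input into the workspace, run there, then copy the results back and clean up.
The following is a minimal sketch of that pattern, not a cluster-provided tool:
the function name ``stage_and_run`` and its argument order are hypothetical,
and it assumes the scratch root (e.g. your workspace directory) already exists
on the machine running the job.

.. code-block:: bash

    #!/bin/bash
    # Sketch only: stage_and_run is a hypothetical helper, not a cluster tool.
    # Usage: stage_and_run INPUT_DIR SCRATCH_ROOT OUTPUT_DIR COMMAND [ARGS...]

    stage_and_run() {
        local input_dir=$1 scratch_root=$2 output_dir=$3
        shift 3

        # Create a private scratch directory on the local disk of whichever
        # machine the job is running on.
        local scratch
        scratch=$(mktemp -d "${scratch_root}/job.XXXXXX") || return 1

        # Stage in: copy the input files from the NFS-mounted directory.
        if ! cp -r "${input_dir}/." "${scratch}/"; then
            rm -rf "${scratch}"
            return 1
        fi

        # Run the command inside the scratch directory, remembering its exit
        # status so that output is still copied back if it fails.
        local status=0
        ( cd "${scratch}" && "$@" ) || status=$?

        # Stage out: copy everything back, and only then clean up.
        if mkdir -p "${output_dir}" && cp -r "${scratch}/." "${output_dir}/"; then
            rm -rf "${scratch}"
        else
            echo "stage-out failed; files left in ${scratch}" >&2
            status=1
        fi
        return "${status}"
    }

Inside a job script this might be called as
``stage_and_run ~/project/input /workspace/$USER ~/project/output ./my_calculation``,
where the paths and the program name are placeholders. Copying the whole
scratch directory back is the simplest choice; for large inputs you may prefer
to copy back only the result files.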