.. index:: modules

.. _modules:

modules
=======

Description
-----------

Modules make it possible to switch easily between different versions of compilers and other software without having to set various environment variables by hand each time.

User instructions
-----------------

It is extremely complicated to set up multiple versions of programs, libraries and the like, especially when some exist in different builds (for example, the Open MPI libraries for GNU and Intel compiler platforms and FFTW libraries for different compilers). The purpose of the modules system is to simplify this for users and, hopefully, reduce the need to use long, hard-coded paths on the command line, in scripts, in makefiles and so on.

This system is common on compute clusters, including the cx1 and cx2 clusters at Imperial and ARCHER, the national supercomputer. However, unlike most systems, the CMTH cluster uses `Lmod `_, which offers some compelling advantages for active software developers.

You can load a default set of modules in your bash login files.

The main command is ``module``, with several subcommands. There is also an ``ml`` command, which provides shorter forms of the same commands. For example, ``ml help`` will output help text on the various module commands.

To list the modules you currently have loaded, run:

.. code-block:: bash

    $ module list
    $ # or equivalently you can just use 'ml'

Perhaps more usefully, you can also see the list of available modules:

.. code-block:: bash

    $ module avail
    $ # or equivalently 'ml av'

Each module represents a version of a package (i.e. a program or library) that is installed on the NFS server and can be added to your environment by loading that module. Loading certain modules (e.g. a compiler or MPI implementation) may result in additional modules becoming available (see below for more details). You can see the **entire** set of modules using

.. code-block:: bash

    $ module spider
    $ # or equivalently 'ml spider'

If you don't know what a module is, then you can check:

.. code-block:: bash

    $ module whatis mathematica/10.2.0

You can see what effect loading a module has:

.. code-block:: bash

    $ module show mathematica/10.2.0

You can load a module using the ``load`` subcommand:

.. code-block:: bash

    $ module load mathematica/10.2.0
    $ # or equivalently 'ml mathematica/10.2.0'

Leaving off the version loads the default version for that module (often, but not always, the latest version of that module).

.. note::

    In addition to loading a package into the environment, all modules also export an environment variable, MODULE_ROOT, where MODULE is the name of the program/library (with all hyphens replaced by underscores), which points to the root directory where the program/library is installed.

You can now use Mathematica 10.2.0 (using either the command ``math`` or ``mathematica`` from the command line). Note that the environment is only changed in the current terminal, so you won't be able to use Mathematica from other terminals without first loading the mathematica module.

Some modules conflict with other modules. For example, an ``fftw`` module may not be loaded if an ``fftw-mpi`` module is already loaded. The solution is to unload the conflicting module before loading the new module.

.. code-block:: bash

    $ module unload fftw-mpi/3.3.6
    $ module load fftw/3.3.6

Further, some modules require other modules to be loaded. For example, any software compiled to use the MKL libraries needs to have an ``mkl`` module loaded. If this condition is not met, an error will be given when loading the software module.

Finally, loading a library module extends your LD_LIBRARY_PATH variable. Unless you compile your program using static linking, you will need the relevant library module(s) to be loaded both at compile time and at run time.

The modules system can do much more than what I've covered above.
The module command has built-in help (run ``module help``) and there is good documentation on the `Lmod website `_. Note that tab completion in bash works on both the module subcommands and the module names.

Hierarchical modules
^^^^^^^^^^^^^^^^^^^^

Modules are arranged in a hierarchical scheme (see `Module Hierarchies `_ for details). The default set of modules available contains utilities and compilers. Loading a specific compiler adds a further set of modules that were compiled for that specific compiler and that specific compiler version. Loading a module providing an MPI implementation (currently just Open MPI is installed) makes a further set of modules available for that compiler and MPI implementation. This makes it clearer which programs and libraries are compatible with each compiler and MPI implementation and their versions. Note that at most one compiler and at most one MPI implementation can be loaded at once.

In order to provide this cleanly, the ``gcc/6.3.0`` module is a dummy module which simply makes available modules compiled against gcc 6.3.0, which is already installed from the Debian repositories. Python packages compiled with both (repository) versions of Python 2 or Python 3 are made available in the same way.

This provides a very nice advantage when changing compiler or MPI implementation (or even just a version): modules compiled against the compiler and/or MPI implementation will be automatically reloaded with the versions compiled against the new compiler and/or MPI implementation. For example:

.. code-block:: bash

    $ module list
    Currently Loaded Modules:
      1) intel-suite/2017.4.196   2) mkl/2017.4.196   3) fftw/3.3.6   4) openmpi/2.1.1   5) hdf5-mpi/1.8.19

means that the Intel compilers and Intel MKL are loaded along with fftw and Open MPI (compiled for Intel 2017.4) and hdf5-mpi (compiled for Intel 2017.4 and Open MPI 2.1).

If I wish to change to the gcc compiler, I don't have to unload all the libraries first, but instead just load the relevant module:

.. code-block:: bash

    $ module load gcc/6.3.0
    Lmod is automatically replacing "intel/2017.4.196" with "gcc/6.3.0".
    Due to MODULEPATH changes, the following have been reloaded:
      1) fftw/3.3.6   2) hdf5-mpi/1.8.19   3) openmpi/2.1.1
    $ module list
    Currently Loaded Modules:
      1) mkl/2017.4.196   2) gcc/6.3.0   3) fftw/3.3.6   4) openmpi/2.1.1   5) hdf5-mpi/1.8.19

MKL is unaffected as it is compatible with both GCC and Intel compilers. The fftw and Open MPI modules are automatically switched for ones compiled against gcc instead of Intel, and hdf5-mpi is automatically switched for one compiled against gcc and the gcc version of Open MPI.

.. note::

    Many (scientific) programs are only compiled using the ``intel-suite/2017.4.196`` module and (if relevant) the ``openmpi/2.1.1`` module. Programs and libraries can be compiled against other compiler and MPI versions as required.

Use in scripts
^^^^^^^^^^^^^^

Bash scripts inherit the current environment, but child shells cannot affect the parent shell. This means that changing modules in a script will only affect the environment the script runs in and won't affect the parent environment unless the script is sourced. Compare:

.. code-block:: bash

    $ cat test_modules.sh
    module load gcc mkl
    $ bash test_modules.sh
    $ module list
    No Modulefiles Currently Loaded.
    $ source test_modules.sh
    $ module list
    Currently Loaded Modules:
      1) gcc/6.3.0   2) mkl/2017.4.196

However, running module commands from cron will fail because cron runs with a very minimal environment (specifically, no startup scripts are read, including ``/etc/profile``, ``/etc/bash.bashrc``, ``~/.bashrc`` and ``~/.bash_profile``). In order to use the modules system in such a case, you must first source the initialisation script in ``/common/lmod/lmod/init/bash``. The same folder contains initialisation scripts for several languages, including perl, which can be used in a similar fashion.
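
A cron job could guard against the missing environment along the following lines. This is a minimal sketch rather than a tested recipe: the init script path is the one given above, while the ``ensure_module`` function and the ``LMOD_INIT`` variable are hypothetical names introduced only for illustration.

.. code-block:: bash

    #!/bin/bash
    # Path of the Lmod bash init script named in the text above.
    # LMOD_INIT is a hypothetical override variable, not something Lmod reads.
    LMOD_INIT=${LMOD_INIT:-/common/lmod/lmod/init/bash}

    # Hypothetical helper: make sure a 'module' command exists in this shell.
    ensure_module() {
        # Already initialised (e.g. an interactive shell): nothing to do.
        if type module >/dev/null 2>&1; then
            return 0
        fi
        # Under cron no startup files are read, so source the init script by hand.
        if [ -r "$LMOD_INIT" ]; then
            . "$LMOD_INIT"
            # Report success only if sourcing really defined 'module'.
            type module >/dev/null 2>&1
            return
        fi
        echo "cannot read Lmod init script: $LMOD_INIT" >&2
        return 1
    }

    if ensure_module; then
        module load gcc mkl
    fi

Checking that the init script is readable before sourcing it means the job fails with a clear message, rather than a "command not found" error, when the NFS server is unavailable.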

Personal modules
^^^^^^^^^^^^^^^^

You can create and use your own modulefiles. Place your modulefile in e.g. ``~/privatemodules`` and then run

.. code-block:: bash

    $ module use $HOME/privatemodules

to add your private modules to the module search path. Module files can be written in either Tcl or Lua; Lua module files must have filenames with a ``.lua`` suffix.

For example, a simple module file in Lua for adding ``$HOME/local/bin`` (i.e. a directory containing executable files) to ``$PATH`` would be:

.. code-block:: lua

    prepend_path('PATH', '/home/USERNAME/local/bin')

where USERNAME is your own username. See the advanced user guide at https://www.tacc.utexas.edu/research-development/tacc-projects/lmod/advanced-user-guide for more details.

Source
------

https://github.com/TACC/Lmod

License
-------

MIT

Admin notes
-----------

I compiled to ``/common`` as everyone needs access:

.. code-block:: bash

    $ ./configure --prefix=/common/
    $ make install

This places lmod in ``/common/lmod/``, where ```` is the version of Lmod. ``/common/lmod/lmod`` is a symbolic link to the most recent version installed (updating the link can be avoided by using ``make pre-install`` instead of ``make install``). This makes having multiple versions of Lmod installed relatively straightforward. We use, by default, ``/common/lmod/lmod``.

Modulefile location
^^^^^^^^^^^^^^^^^^^

Given the above install location, Lmod defaults to looking in the following directories for module files::

    /common/modulefiles/Linux
    /common/modulefiles/Core
    /common/lmod/lmod/modulefiles/Core

I ignored the first and left the third for Lmod-specific modules (typically supplied by Lmod). That leaves the second, ``/common/modulefiles/Core``, for our own modulefiles. Desktop machines have additional modules (i.e. GUI programs) available under ``/common/modulefiles/Desktop``.

Compiling software
^^^^^^^^^^^^^^^^^^

The key is to compile software into directory trees for each compiler and MPI version. Note that minor versions are typically compatible with each other, so I just compile against major versions of each.
I place software into different directory trees based on this:

``/common/debian/9.1/Core``
    Generic programs, libraries and compilers. Programs and libraries here either don't need to be compiled or are not computationally intensive enough to benefit from being compiled with Intel compilers.

``/common/debian/9.1/Desktop``
    Programs specific to desktop usage (e.g. mail clients, visualisation software).

``/common/debian/9.1/Compiler//``
    Programs and libraries compiled for the specific compiler and compiler version.

``/common/debian/9.1/MPI////``
    Programs and libraries compiled for the specific compiler and compiler version and MPI implementation and implementation version.

``<...>`` is replaced with the appropriate value. Under the directory tree I install a package to the ``package/package-version`` subdirectory (e.g. ``fftw/fftw-3.3.6``).

Zero effort modulefiles
^^^^^^^^^^^^^^^^^^^^^^^

I placed several useful functions into ``/common/modulefiles/cmth.site/SitePackage.lua`` (following the Lmod documentation). The strategy is to:

#. Convert the module filename into the 'root' directory of the package, based upon the naming scheme used above.
#. Given the root directory, add the ``bin`` subdirectory to PATH (if ``bin/`` exists), the ``include`` subdirectory to CPATH (if ``include/`` exists), ``lib`` to LIBRARY_PATH and LD_LIBRARY_PATH (if ``lib/`` exists), and so on.

This means that the layout of the modulefiles needs to be at least close to that of the packages. I took the approach of placing all module files in ``/common/modulefiles/module_templates``. The tree of modulefiles is then a set of symlinks to files in this directory; if ``package.lua`` exists, then that package uses that file, otherwise it uses a 'standard' base file which just does the environment manipulations described above. I created package-specific module files where other variables need to be set or prerequisites need to be declared.
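
The two steps of the strategy can be illustrated with a short sketch. The real helper functions live in ``SitePackage.lua`` and are written in Lua; the bash below only mirrors the logic and is not the actual implementation. The function names (``package_root``, ``prepend_if_dir``, ``setup_package_env``) are hypothetical, and step 1 is simplified to the ``package/package-version`` layout described above.

.. code-block:: bash

    # Step 1 (simplified): build the package root from a directory tree,
    # a package name and a version, e.g. TREE/fftw/fftw-3.3.6.
    package_root() {
        local tree=$1 name=$2 version=$3
        printf '%s/%s/%s-%s\n' "$tree" "$name" "$name" "$version"
    }

    # Prepend DIR to the colon-separated variable VAR, but only if DIR exists.
    prepend_if_dir() {
        local var=$1 dir=$2
        [ -d "$dir" ] || return 0
        local current=${!var:-}
        export "$var=$dir${current:+:$current}"
    }

    # Step 2: apply the standard environment changes for one package root.
    setup_package_env() {
        local root=$1
        prepend_if_dir PATH "$root/bin"
        prepend_if_dir CPATH "$root/include"
        prepend_if_dir LIBRARY_PATH "$root/lib"
        prepend_if_dir LD_LIBRARY_PATH "$root/lib"
    }

Because missing subdirectories are skipped silently, one base file of this kind can serve every package that follows the standard layout, which is what makes the symlink approach workable.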

Initialisation issues
^^^^^^^^^^^^^^^^^^^^^

We must be very careful when initialising the modules system and loading a default set of modules in the system startup files. This is for two reasons:

#) If the NFS server is unavailable, then attempting to access and load modules can make root logins problematic.
#) Loading default modules can conflict with modules users then load in their shell startup files.

Thus the safe thing to do is to only initialise the module system if the NFS server is accessible (always a good strategy with any script which accesses the NFS server) and to only load modules from user-level startup scripts.

The modules system is initialised from ``/etc/profile.d/modules.sh``, which is managed by puppet to be a symlink to ``/common/lmod/lmod/init/bash``. In order to detect the CMTH-specific functions used in modulefiles, ``/etc/profile.d/lmod_pkg.sh`` is sourced first and contains:

.. code-block:: bash

    export LMOD_PACKAGE_PATH=/common/lmod/cmth

Some modulefiles add to MANPATH. Unfortunately, it seems that if MANPATH is set, then man *only* searches paths in MANPATH. If MANPATH is empty, then man searches the paths in the output from ``manpath``. This sucks, as it is nice to add (e.g.) the Intel compiler manpages to MANPATH if an Intel module is loaded. My solution is to set MANPATH in ``/etc/profile.d/manpath.sh``:

.. code-block:: bash

    # Only directories listed in MANPATH are searched for man pages if MANPATH is not empty.
    # This creates problems if MANPATH is dynamically altered (eg by module).
    # Quote the variable so the test still works if MANPATH contains spaces.
    if [ -z "$MANPATH" ]; then
        MANPATH=$(manpath)
    fi
    export MANPATH

This is also managed by puppet and sourced during the shell startup process, both in ``/etc/profile.d`` (login interactive shells) and in ``/etc/bash.bashrc`` (interactive shells, with the appropriate guards to prevent modules being initialised multiple times). Similarly, init scripts are also present for csh/tcsh.
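
The guards described above might look something like the following. This is a sketch under stated assumptions, not a copy of the puppet-managed scripts: the function name ``init_modules_once`` and the marker variable ``CMTH_LMOD_INITIALISED`` are hypothetical, and the simple readability test stands in for whatever NFS availability check is actually used.

.. code-block:: bash

    # Hypothetical helper: source an init script at most once per environment,
    # and only if it can actually be read.
    init_modules_once() {
        local init_script=$1
        # Guard 1: skip if an earlier startup file already did the work.
        [ -z "${CMTH_LMOD_INITIALISED:-}" ] || return 0
        # Guard 2: skip quietly if the script is not readable.
        [ -r "$init_script" ] || return 0
        . "$init_script"
        export CMTH_LMOD_INITIALISED=1
    }

    init_modules_once /common/lmod/lmod/init/bash

Note that a plain ``-r`` test can itself block on a hard-mounted NFS share that has gone away, so a production check may need a timeout around it.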

Library modules
^^^^^^^^^^^^^^^

There is a problem with setting LD_LIBRARY_PATH to point to a location on an NFS server: if the NFS server becomes unavailable then LD_LIBRARY_PATH will cause the shell to hang. However, as we have the home directories mounted on the same NFS server as the libraries, this is the least of our worries in such a situation...

Module load tracking
^^^^^^^^^^^^^^^^^^^^

It is useful to be able to tell which older modules are still in use, and whether some are no longer needed so their space can be reclaimed on ``/common``. One way to do this is documented at http://lmod.readthedocs.io/en/latest/300_tracking_module_usage.html and I have implemented the first part of this on our system.

This is set up so that module load commands generate a log message, and this message is then passed to rayleigh1, where all messages are collected in ``/var/log/moduleUsage.log``. I had intended to set up a database for this, but it was a bit fiddly and not really necessary: the information can be found in the logs in a straightforward way.

The setup involves the following:

#. Add a section to ``/common/modulefiles/cmth.site/SitePackage.lua`` that generates a log message.
#. Add ``/etc/puppet/code/environments/production/modules/lmod/files/etc/rsyslog.d/module_usage_tracking.conf`` and set it to be propagated to all machines via puppet.
#. Add ``/etc/rsyslog.d/moduleTracking.conf`` on rayleigh1 to tell it where to put the log messages it gets sent.
#. Add ``/etc/logrotate.d/moduleUsage`` on rayleigh1 to tell it to rotate monthly and keep 12 older logs. A year of information should be plenty.
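
As an illustration of pulling usage figures out of the log, the sketch below tallies loads per module. It assumes that each log line carries a ``module=NAME/VERSION`` field, similar to the example message format in the Lmod tracking documentation; the actual format depends on what the ``SitePackage.lua`` hook writes, so the pattern may need adjusting. The function name ``count_module_loads`` is hypothetical.

.. code-block:: bash

    # Hypothetical helper: print "module count" pairs, most frequently loaded first.
    # Assumes log lines contain a field of the form module=NAME/VERSION.
    count_module_loads() {
        local log=$1
        grep -o 'module=[^ ]*' "$log" \
            | sed 's/^module=//' \
            | sort \
            | uniq -c \
            | sort -k1,1nr -k2,2 \
            | awk '{ print $2, $1 }'
    }

    # Example (on rayleigh1): count_module_loads /var/log/moduleUsage.log

Modules that never appear in a year of logs are then reasonable candidates for removal from ``/common``.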