.. index:: CMTH PyML Stack

.. _CMTH PyML Stack:

CMTH PyML Stack
===============

Description
-----------

As we have a small number of workstations with GPUs that can be used for
calculations, we have put together a python stack with a set of libraries that
may be useful for people experimenting with machine learning. If you are
interested in using this, I suggest checking with me to see if your workstation
will support it. I believe the software in the module will still work on any
workstation, but only some will be able to use GPU acceleration.

CUDA libraries are a little picky about compiler and library versions, so this
module contains python compiled against (and depending on) gcc/5.3.0.

This is a full SciPy stack. It contains python 3.6.4 as well as recent versions
of the following packages and tools:

- numpy
- scipy
- matplotlib
- cython
- pandas
- ipython
- sympy
- nose
- jupyter
- numba
- bokeh
- scikit-learn
- pytorch
- mxnet
- keras
- fastai
- CUDA toolkit (8.0)

User instructions
-----------------

The module can be loaded as:

.. code-block:: bash

    module load cmth-pyml-stack/2018.01

Note that the gcc/5.3.0 module must be loaded first.

If you need a different version of a package, or need to compile a package
yourself, this module can be used as the base for a virtual environment, as it
provides a recent python version and a compatible CUDA installation.

Source
------

- http://www.python.org
- https://www.scipy.org
- https://pypi.python.org
- http://www.pytorch.org
- https://developer.nvidia.com/cuda-80-ga2-download-archive

License
-------

Python is distributed under the PSF (Python Software Foundation) License, which
is GPL compatible from version 2.2 on. The CUDA toolkit has its own license.

Admin notes
-----------

Compilation was all done on a workstation with a bind mount. The additional
dependencies installed previously for the scipy stack were likely also needed,
but were already present: ``libssl-dev``, ``libsqlite3-dev`` and ``tk-dev``.

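
If the build has to be repeated on a freshly installed machine, it may be worth
confirming that these development packages are present before compiling. The
following is a minimal sketch, assuming a Debian system where ``dpkg-query`` is
available; the function name ``missing_build_deps`` is hypothetical and was not
part of the original setup.

.. code-block:: bash

    # Hypothetical helper: print each given Debian package that is not
    # fully installed according to dpkg-query.
    missing_build_deps() {
        for pkg in "$@"; do
            # Unknown packages make dpkg-query fail; treat that as "missing".
            status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null || true)
            case "$status" in
                "install ok installed") ;;
                *) echo "$pkg" ;;
            esac
        done
    }

Calling ``missing_build_deps libssl-dev libsqlite3-dev tk-dev`` prints nothing
when all three packages are installed, and otherwise lists the ones to install.
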

First, python was compiled with gcc/5.3.0 as follows:

.. code-block:: bash

    # Start from a clean environment with only the required compiler loaded.
    ml purge
    ml gcc/5.3.0
    # configure and make were run in the background with their output logged;
    # each step must finish before the next one is started.
    ./configure \
        --prefix=/common/debian/9.1/Compiler/gcc/5.3/cmth-pyml-stack/cmth-pyml-stack-2018.01 \
        --enable-optimizations > configure.log &
    make -j4 &> make.log &
    make install

The install was synchronized with the main server and made into a module before
proceeding with the rest.

The CUDA toolkit and bugfix patch were installed with:

.. code-block:: bash

    export PERL5LIB=.
    sh cuda_8.0.61_375.26_linux-run --override
    sh cuda_8.0.61.2_linux-run

The directory
``/common/debian/9.1/Compiler/gcc/5.3/cmth-pyml-stack/cmth-pyml-stack-2018.01``
was selected for the installation, and the subdirectory ``cuda-8.0-samples`` for
the samples. For this to work correctly, the module template also sets the
``CUDA_HOME`` environment variable when loaded.

The various "basic" python packages were installed with:

.. code-block:: bash

    PYTHONUSERBASE=/common/debian/9.1/Compiler/gcc/5.3/cmth-pyml-stack/cmth-pyml-stack-2018.01 pip3 install --upgrade packagename

This was done for:

- pip
- wheel
- numpy
- scipy
- matplotlib
- cython
- pandas
- ipython
- sympy
- nose
- jupyter
- numba
- bokeh
- scikit-learn

Pytorch was then installed as follows:

.. code-block:: bash

    PYTHONUSERBASE=/common/debian/9.1/Compiler/gcc/5.3/cmth-pyml-stack/cmth-pyml-stack-2018.01 pip3 install http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp36-cp36m-linux_x86_64.whl
    PYTHONUSERBASE=/common/debian/9.1/Compiler/gcc/5.3/cmth-pyml-stack/cmth-pyml-stack-2018.01 pip3 install torchvision

I also wanted to install tensorflow, but it seems to have an issue where it
installs a package that breaks pip, so it was omitted. Users can install it in a
virtual environment if needed.

MXNet was installed via the ``mxnet-cu80`` package. This unfortunately
downgrades the numpy version, but hopefully doesn't break anything.

The ``fastai`` package also pulled in quite a number of other packages.
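
Since several of these installs pull in or downgrade other packages, a quick
import check after the last step can help spot anything that was broken along
the way. The following is a minimal sketch rather than something that was part
of the original installation; the function name ``check_imports`` is
hypothetical, and it assumes the module is already loaded so that ``python3``
refers to the stack's interpreter.

.. code-block:: bash

    # Hypothetical helper: try to import each given Python module and print
    # the names of those that fail. Returns non-zero if any import failed.
    check_imports() {
        failed=0
        for mod in "$@"; do
            if ! python3 -c "import $mod" >/dev/null 2>&1; then
                echo "$mod"
                failed=1
            fi
        done
        return "$failed"
    }

For example, ``check_imports numpy scipy sklearn torch mxnet`` should print
nothing if those packages import cleanly. Note that the import name can differ
from the package name, as with ``sklearn`` for scikit-learn.
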