Python 2016
Introduction
Python is a widely used high-level programming language for general-purpose programming, created by Guido van Rossum and first released in 1991. An interpreted language, Python has a design philosophy which emphasizes code readability (notably using whitespace indentation to delimit code blocks rather than curly braces or keywords), and a syntax which allows programmers to express concepts in fewer lines of code than possible in languages such as C++ or Java. The language provides constructs intended to enable writing clear programs on both a small and large scale.
Installed versions
The following versions of Python are currently installed on the cluster:
- 2.7.12
- 3.5.2
- 3.7.0 (hpc-env/6.4)
In addition, modules for Biopython (version 1.68) and ScientificPython (version 2.9.4, based on Python 2.7.12) are installed. Biopython is a set of freely available tools for biological computation written in Python. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics. ScientificPython is a collection of Python modules for scientific computing. It contains support for geometry, mathematical functions, statistics, physical units, I/O, visualization, and parallelization.
Additional packages are installed as well, for example:
- numpy
- scipy
- nose
- ..
A complete list, including the version numbers of the installed packages, can be obtained with the command
pip list
Python has to be loaded for this command to work. If a package you need is missing, you can either contact Scientific Computing and we will install it, or you can install it yourself in your $HOME directory (see the instructions at the end of this article).
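To check from within Python whether a particular package is available, a small sketch along the following lines may help. It uses only the standard library. The helper name `is_installed` is made up for this illustration, and the sketch assumes that the import name equals the package name, which is not true for every package.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package of this name can be imported."""
    # find_spec() looks the name up on the import path without importing it.
    return importlib.util.find_spec(package_name) is not None


# The package names below are the ones listed in this article.
for name in ("numpy", "scipy", "nose"):
    status = "available" if is_installed(name) else "missing"
    print(name, status)
```

Because nothing is imported, this check is cheap, but it does not report version numbers; `pip list` remains the way to see those.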
Using Python
To use Python on the cluster, load the corresponding module with the following command:
module load Python
Since more than one version of Python is installed, you can also specify which version to load, e.g.
module load Python/2.7.12
This loads Python 2.7.12. Loading Python without specifying a version loads version 3.5.2.
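To confirm which interpreter is active after loading a module, a short script such as the following sketch can be run. The function name `python_version_string` is hypothetical and exists only for this example.

```python
import sys


def python_version_string():
    """Return the running interpreter's version as 'major.minor.micro'."""
    return "{}.{}.{}".format(*sys.version_info[:3])


print("Running Python", python_version_string())
# The path shows which installation the interpreter was started from.
print("Interpreter path:", sys.executable)
```

Running it once after each `module load` makes it easy to see whether the intended version was picked up.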
Using Python and MPI
For parallel scripts, the Python installation contains the package mpi4py. To launch a parallel Python script inside a SLURM job script, use the command line
mpirun python YOUR_PYTHON_SCRIPT.py
An easy example could look like this:
1. Create a new file called hello_world.py and add the following lines of code:
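from mpi4py import MPI

# Every process started by mpirun belongs to the world communicator
comm = MPI.COMM_WORLD
# The rank is the index of this process within the communicator
rank = comm.Get_rank()
print("hello world from process %d" % rank)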
2. Create a new file called hello_world.job and add the following lines of code:
#!/bin/bash
# Request as many tasks as mpirun starts below
#SBATCH --ntasks=5
#SBATCH --mem=2G
#SBATCH --time=0-2:00
#SBATCH --job-name PYTHON-MPI-TEST
#SBATCH --output=python-mpi-test.%j.out
#SBATCH --error=python-mpi-test.%j.err

module load Python
mpirun -n 5 python hello_world.py
3. Make sure both files are in the same directory
4. Submit the job with the command
sbatch -p carl.p hello_world.job
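Beyond printing the rank, a typical next step is to split work across the processes and combine the results. The following is a minimal sketch of that idea, not a recommended pattern for any particular workload. The helper `block_range` and the toy problem (summing the integers 0 to 99) are made up for this illustration; the mpi4py calls used are `Get_rank`, `Get_size`, and `reduce`.

```python
def block_range(n_items, size, rank):
    """Return the half-open index range [start, stop) owned by `rank`."""
    base, extra = divmod(n_items, size)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def main():
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Each process sums only its own block of the integers 0..99.
    start, stop = block_range(100, size, rank)
    partial = sum(range(start, stop))

    # Combine the partial sums; only the root process receives the total.
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print("sum of 0..99 =", total)


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("mpi4py is not available in this environment")
```

The script can be launched in the same way as hello_world.py, for example by replacing the script name in the job file.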
Installing Packages in your HOME-Directory
Most Python packages can be installed easily using the pip command. The --user option performs a local, per-user installation. For example, to install the package PYPACKAGE, use the following command (after loading the Python module):
pip install --user PYPACKAGE
If the installation is successful, the corresponding files are installed in
$HOME/.local/lib/pythonx.y/site-packages
where x.y corresponds to the Python version currently loaded. Also, pip list should show the new package in the list.
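The per-user installation directory can also be queried from Python itself, which may help when it is unclear where a package ended up. The sketch below uses the standard-library `site` module; the wrapper name `user_site_dir` is hypothetical.

```python
import site
import sys


def user_site_dir():
    """Return the per-user site-packages directory of this interpreter."""
    return site.getusersitepackages()


path = user_site_dir()
print("User site-packages:", path)
# The directory is only added to sys.path if it exists at interpreter startup.
print("On sys.path:", path in sys.path)
```

Since the path contains the Python version, packages installed with --user under one Python module are not visible after loading a different version.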
Documentation
The full documentation of Python can be found on the following website: Python Documentation