Difference between revisions of "Python 2016"

From HPC users
Latest revision as of 14:28, 9 August 2023

Introduction

Python is a widely used high-level programming language for general-purpose programming, created by Guido van Rossum and first released in 1991. An interpreted language, Python has a design philosophy which emphasizes code readability (notably using whitespace indentation to delimit code blocks rather than curly braces or keywords), and a syntax which allows programmers to express concepts in fewer lines of code than possible in languages such as C++ or Java. The language provides constructs intended to enable writing clear programs on both a small and large scale.

Installed version

The following versions of Python are currently installed on the cluster.

In environment hpc-uniol-env:

  • Python/2.7.3-goolf-5.2.01
  • Python/2.7.11-intel-2016b
  • Python/2.7.12
  • Python/3.5.2
  • Python/3.6.1-goolf-5.2.01

In environment hpc-env/6.4:

  • Python/2.7.14-foss-2017b
  • Python/2.7.14-intel-2018a
  • Python/3.6.3-foss-2017b
  • Python/3.6.3-intel-2018a
  • Python/3.7.0-foss-2017b
  • Python/3.7.0-intel-2018a


Further, there are modules for Biopython and ScientificPython (version 2.9.4, based on Python 2.7.12) installed. Biopython is a set of freely available tools for biological computation written in Python. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics. ScientificPython is a collection of Python modules for scientific computing. It contains support for geometry, mathematical functions, statistics, physical units, I/O, visualization, and parallelization.

Further packages are installed as well, for example:

  • numpy
  • scipy
  • nose
  • ..

A complete list of the installed packages, including their version numbers, can be obtained with the command

pip list

A Python module has to be loaded for this command to work. If any packages that you need are missing, you can either contact Scientific Computing and we will install them, or you can install them yourself in your $HOME directory (see the instructions at the end of this article).
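The same information can also be queried from inside Python. The following is a minimal sketch, assuming Python 3.8 or newer (where importlib.metadata is part of the standard library); the helper name installed_packages is only illustrative and not part of any cluster tooling.

```python
from importlib.metadata import distributions  # standard library since Python 3.8


def installed_packages():
    """Return a dict mapping each installed distribution name to its version."""
    packages = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            packages[name] = dist.version
    return packages


if __name__ == "__main__":
    # Print the packages sorted by name, ignoring case.
    for name, version in sorted(installed_packages().items(),
                                key=lambda item: item[0].lower()):
        print(name, version)
```

With older interpreters such as 2.7 or 3.5, pip list remains the simplest way to get this list.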

Using Python

To use Python on the cluster, load the module with the following command:

module load Python

Since more than one version of Python is installed, you can also specify which version you would like to load, e.g.

module load Python/2.7.12 

This loads Python version 2.7.12. Loading Python without specifying a version loads version 3.5.2.

To use a newer version of Python, such as 3.7.0, you have to switch to the hpc-env/6.4 environment first:

module load hpc-env/6.4

After that, choose the underlying toolchain, which is either intel or foss:

module load Python/3.7.0-intel-2018a
or
module load Python/3.7.0-foss-2017b
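To confirm which interpreter is actually active after loading a module, a short check like the following can help. It is only a sketch using the standard sys module; the function name describe_interpreter is made up for this example.

```python
import sys


def describe_interpreter():
    """Return the version string and the path of the running interpreter."""
    version = "%d.%d.%d" % sys.version_info[:3]
    return version, sys.executable


version, executable = describe_interpreter()
print("Python", version, "at", executable)
```

The printed path shows which installation the python command resolves to, which is useful when several modules or environments have been loaded in one session.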

Using Python and MPI

For parallel scripts, the Python installation contains the package mpi4py. To launch a parallel Python script inside a SLURM job script, use the command line

mpirun python YOUR_PYTHON_SCRIPT.py

An easy example could look like this:

1. Create a new file called hello_world.py and add the following lines of code:

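from mpi4py import MPI

# communicator containing all processes started by mpirun
comm = MPI.COMM_WORLD
# rank (ID) of this process within the communicator
rank = comm.Get_rank()
# Python 3 syntax, since "module load Python" loads version 3.5.2 by default
print("hello world from process", rank)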

2. Create a new file called hello_world.job and add the following lines of code:

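#!/bin/bash

# number of MPI processes; must match the value passed to mpirun -n
#SBATCH --ntasks=5
#SBATCH --mem=2G
# time limit in the format days-hours:minutes
#SBATCH --time=0-2:00
#SBATCH --job-name=PYTHON-MPI-TEST
# %j is replaced by the job ID
#SBATCH --output=python-mpi-test.%j.out
#SBATCH --error=python-mpi-test.%j.err

module load Python

mpirun -n 5 python hello_world.py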

3. Make sure both files are in the same directory

4. Submit the job with the command

sbatch -p carl.p hello_world.job
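Building on the hello-world example, the following sketch shows one common pattern: each process works on its own block of indices and the partial results are combined on rank 0. The helper block_range, the problem size of 1000, and the serial fallback are assumptions made for illustration; only MPI.COMM_WORLD, Get_rank, Get_size, and reduce come from mpi4py.

```python
def block_range(n_items, rank, size):
    """Return the half-open index range [start, stop) handled by `rank`."""
    base, extra = divmod(n_items, size)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


try:
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Serial fallback so the script also runs where mpi4py is not installed.
    comm, rank, size = None, 0, 1

n_items = 1000  # hypothetical problem size
start, stop = block_range(n_items, rank, size)
partial = sum(range(start, stop))  # each process sums only its own block

# Combine the partial sums; only rank 0 receives the result.
if comm is not None:
    total = comm.reduce(partial, op=MPI.SUM, root=0)
else:
    total = partial

if rank == 0:
    print("sum of 0..%d using %d process(es): %d" % (n_items - 1, size, total))
```

Saved as a script, this could be started with mpirun in the same way as hello_world.py. The lowercase reduce method works on arbitrary Python objects, which keeps the example short; for large NumPy arrays the buffer-based methods of mpi4py are usually the faster choice.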

Installing Packages in your HOME-Directory

Using pip

Most Python packages can be installed easily with the pip command. Its --user option performs a local installation for the current user only. For example, to install the package PYPACKAGE, use the following command (after loading the Python module):

 pip install --user PYPACKAGE

If the installation is successful, the corresponding files are installed in

 $HOME/.local/lib/pythonx.y/site-packages

where x.y corresponds to the Python version currently loaded. Afterwards, pip list should show the new package. Note that you can only manage one set of packages this way. If you need separate environments, for example because different packages require different versions of the same dependency, use virtual environments (venv, see below) or conda (see Anaconda).
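The directory that a --user installation writes to can be looked up from Python itself. This is a small sketch based on the standard site module; the exact path it prints depends on the loaded Python version and the platform.

```python
import site
import sys

# Directory that "pip install --user" installs into for this interpreter.
user_site = site.getusersitepackages()

print("user site-packages:", user_site)
# ENABLE_USER_SITE is False when the user directory is disabled,
# which is the case inside a virtual environment.
print("user site enabled: ", site.ENABLE_USER_SITE)
print("on sys.path:       ", user_site in sys.path)
```

If the directory is not on sys.path, packages installed with --user will not be importable from the current interpreter.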

Virtual Environments

The venv module supports creating lightweight virtual environments, each with its own independent set of Python packages (see the venv documentation at https://docs.python.org/3/library/venv.html for more details). To use venv, run the following commands:

module load hpc-env/8.3                        # optional, select the hpc-env you want to use to load Python
module load Python                             # load the most recent Python available in hpc-env
python -m venv $HOME/venvs/my_env              # create a virtual environment in the given path
source $HOME/venvs/my_env/bin/activate         # activate the environment
python -m pip install -U pip setuptools wheel  # update pip, setuptools, and wheel

After that you are ready to install the packages you need with pip. For example, to install scipy (although a module called SciPy-bundle already provides it):

python -m pip install scipy

This will automatically install all required dependencies as well. Other useful commands are:

python -m pip list                             # show all installed packages
python -m pip uninstall scipy                  # uninstall packages
deactivate                                     # leave the virtual environment
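Whether a script is running inside an activated virtual environment can be checked from Python. The sketch below relies on the documented behaviour of venv that sys.prefix points to the environment while sys.base_prefix points to the base installation (Python 3.3 or newer); the function name in_virtualenv is only illustrative.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter runs inside a venv-style environment."""
    # Outside a virtual environment both prefixes are identical.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


print("inside a virtual environment:", in_virtualenv())
print("environment prefix:", sys.prefix)
```

Running this before and after the activate command should show the value switching from False to True.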

Note that there is a limit on the number of files you can have in $HOME, so you should not create too many virtual environments in your account. If you no longer need an environment, you can remove it completely by deleting the corresponding folder:

rm -r $HOME/venvs/my_env

If you do this accidentally, you can still recover the directory from the snapshots for up to 30 days.
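To get a rough idea of how many files an environment contributes towards the limit in $HOME, the files below a directory can be counted. This is a simple sketch using os.walk; the function name count_files is made up here, the path is the example path used above, and the way the cluster itself accounts for files may differ (directories, for instance, are not counted by this helper).

```python
import os


def count_files(root):
    """Count the files below `root`, without following symbolic links."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


if __name__ == "__main__":
    path = os.path.expanduser("~/venvs/my_env")  # example environment path
    print(path, "contains", count_files(path), "files")
```

A directory that does not exist simply yields a count of 0.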

Documentation

The full documentation of Python can be found on the following website: Python Documentation (https://docs.python.org/3/)