KNIME 2016
Introduction
KNIME Analytics Platform is the open source software for creating data science applications and services.
KNIME stands for KoNstanz Information MinEr.
Installed version
The currently installed version is available on the environment hpc-env/6.4:
KNIME/3.6.2
KNIME Module
If you want to find out more about KNIME on the HPC Cluster, you can use the command
module spider KNIME
This will show you basic information, e.g. a short description and the currently installed version.
To load the desired version of the module, use a command such as
module load KNIME
Always remember: this command is case sensitive!
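To pin the exact version instead of relying on the default, the version listed above can be given explicitly. The following is a minimal sketch, assuming that the environment hpc-env/6.4 is itself loaded with module load and that the commands are typed in a login shell on the cluster; the variable names are only for illustration.

```shell
# Versions taken from the "Installed version" section above.
HPC_ENV="hpc-env/6.4"
KNIME_MODULE="KNIME/3.6.2"

if type module >/dev/null 2>&1; then
    module load "$HPC_ENV"        # select the software environment first (assumption)
    module load "$KNIME_MODULE"   # module names are case sensitive
    module list                   # show which modules are loaded now
else
    # Outside the cluster the module command usually does not exist.
    echo "module command not found; run this on the cluster" >&2
fi
```

Pinning the version keeps job scripts reproducible if the default KNIME module changes later.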
Using KNIME on the HPC Cluster
Basically, there are two ways to use KNIME on the HPC cluster: 1) you can start KNIME within a job script and execute a prepared workflow, or 2) you can use the SLURM Cluster Execution plugin from your local workstation to offload selected nodes of your workflow to the cluster. Both options are described briefly below.
Using KNIME with a job script
This approach is straightforward once you have prepared a workflow for execution on the cluster. That means you need to copy all required files to a directory on the cluster (the workflowDir). After that, you need to write a job script that calls KNIME and runs your workflow. A minimal example is
#!/bin/bash
#SBATCH --partition carl.p
knime -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION -workflowDir="$HOME/knime-workspace/Example Workflows/Basic Examples/Simple Reporting Example"
The workflow used in this example becomes available once you have started the KNIME GUI on the cluster (which is recommended to do once). Additional SLURM options may be used to request memory, run time, and other resources (see elsewhere in this wiki for details). Furthermore, KNIME offers other options for batch mode, e.g. to request memory for the Java Virtual Machine. Please refer to the KNIME documentation for details.
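As an illustration of such additional options, the sketch below writes an extended job script to a file named knime_job.sh (a hypothetical name) and prints the command to submit it. All resource values (2 hours, 8G, 4 CPUs, 6g heap) are placeholders, not recommendations. The options -consoleLog, -reset, -nosave and the trailing -vmargs -Xmx6g are assumed here from the usual KNIME batch and Eclipse launcher options, so please check them against the KNIME documentation for your version. The two module load lines assume the environment and version named earlier on this page.

```shell
# Write the job script; the quoted 'EOF' keeps $HOME unexpanded until the job runs.
cat > knime_job.sh <<'EOF'
#!/bin/bash
#SBATCH --partition carl.p
#SBATCH --job-name knime-batch
#SBATCH --time 0-02:00:00
#SBATCH --mem 8G
#SBATCH --cpus-per-task 4
#SBATCH --output knime-%j.out

# Assumption: environment and KNIME version as listed on this page.
module load hpc-env/6.4
module load KNIME/3.6.2

# -vmargs must come last; keep the JVM heap below the memory requested from SLURM.
knime -nosplash -consoleLog -reset -nosave \
    -application org.knime.product.KNIME_BATCH_APPLICATION \
    -workflowDir="$HOME/knime-workspace/Example Workflows/Basic Examples/Simple Reporting Example" \
    -vmargs -Xmx6g
EOF

echo "Wrote knime_job.sh; submit it with: sbatch knime_job.sh"
```

Requesting slightly more memory from SLURM than the JVM heap leaves room for the overhead of the Java process itself.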
Using KNIME with the SLURM Cluster Execution Plugin
The SLURM Cluster Execution Plugin allows you to offload some nodes in your workflow to the cluster. To use the plugin, you first need to install and configure it as described here. Please note that the plugin is not officially supported by KNIME. It can be used as is; however, if you have questions, please send them to Scientific Computing.
Prerequisites
You need to install the same version of KNIME locally that you want to use on the cluster (older versions might work, but newer versions may fail). You also need to be able to connect to the cluster with ssh; optionally, you can prepare an identity file for the login. It is also recommended to start the KNIME GUI once on the cluster to create the default workflowDir in your HOME directory. Alternatively, you can create one manually.
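Creating the directory manually could look like the sketch below, run on the cluster. It assumes that the default location is $HOME/knime-workspace, as suggested by the path in the job script example, and that an empty directory is sufficient; KNIME_WORKSPACE is a hypothetical variable used only to make the path easy to change.

```shell
# Assumption: default workspace location, overridable via KNIME_WORKSPACE.
KNIME_WORKSPACE="${KNIME_WORKSPACE:-$HOME/knime-workspace}"

# -p creates missing parent directories and does not fail if the directory exists.
mkdir -p "$KNIME_WORKSPACE"

# Show the directory entry to confirm it is in place.
ls -ld "$KNIME_WORKSPACE"
```

Note that a manually created directory does not contain the example workflows that the KNIME GUI provides on its first start.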
Installing the Plugin
- Download the plugin (zip): Cluster-exec-slurm-20190807-directory.zip
Documentation
To find out more about KNIME Analytics Platform, you can take a look at this overview.
The full documentation and more learning material can be found here.