Difference between revisions of "STATA 2016"

From HPC users
   mkdir stata
you might thus create the folder <tt>stata</tt> in the top level of your home directory for this purpose (you might even go further and create a subdirectory <tt>mp13</tt> specifying the precise version of STATA).  
=== Using STATA in batch mode ===
On the local HPC system the convention is to use applications in ''batch'' mode rather than in ''interactive'' mode, as you would on your local workstation. This requires you to
list the commands you would otherwise type interactively in a file, called a ''do-file'' in STATA jargon, and to call STATA with the <tt>-b</tt> option on that do-file.
To illustrate how to use STATA in batch mode on the HPC system, consider the basic linear regression example contained in
[http://www.ats.ucla.edu/stat/stata/webbooks/reg/chapter1/statareg1.htm Chapter 1] of the STATA Web Book [http://www.ats.ucla.edu/stat/stata/webbooks/reg/ Regression with STATA].
For this linear regression example you might further create the subdirectory <tt>linear_regression</tt> and put the data sets on which you would like to work and all further supplementary files and scripts there.
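The directory layout described above could be created in one step. The following is a minimal sketch, assuming the folder names from the text; <tt>STATA_ROOT</tt> is a hypothetical variable used only for this illustration, and the optional <tt>mp13</tt> level is left out:

```shell
# STATA_ROOT is a hypothetical variable for this sketch; it defaults to
# the folder "stata" in the top level of your home directory.
STATA_ROOT="${STATA_ROOT:-$HOME/stata}"

# -p creates missing parent directories and does not fail if they already exist.
mkdir -p "$STATA_ROOT/linear_regression"

# Work inside the example directory so that relative file names resolve there.
cd "$STATA_ROOT/linear_regression"
```

Data sets, do-files and job scripts for the example would then all be placed in this directory.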
A do-file corresponding to the basic [http://www.ats.ucla.edu/stat/stata/webbooks/reg/chapter1/statareg1.htm linear regression example], here called <tt>linReg.do</tt>, reads:
  <nowiki>
use elemapi
regress api00 acs_k3 meals full
  </nowiki>
For the do-file to run properly, the data file available as http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi needs to be stored in the directory <tt>linear_regression</tt>.
Further, if you have not yet loaded the [[STATA 2016#Using Stata on the HPC cluster| STATA module]], you need to load it via
  module load stata
before you attempt to use the STATA application.
In principle you could now call STATA in batch mode by typing
  stata -b linReg.do
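A small wrapper can catch the two most common mistakes (a missing do-file, or a STATA module that has not been loaded) before the run starts. The sketch below is only an illustration: <tt>run_do_file</tt> is a hypothetical helper name, the call to STATA follows the form shown above, and it assumes that batch mode writes its output to a <tt>.log</tt> file named after the do-file in the working directory:

```shell
# Hypothetical helper (not part of STATA): run a do-file in batch mode.
run_do_file() {
    do_file="$1"
    if [ ! -f "$do_file" ]; then
        echo "do-file not found: $do_file" >&2
        return 1
    fi
    if ! command -v stata >/dev/null 2>&1; then
        echo "stata not found; did you run 'module load stata'?" >&2
        return 2
    fi
    do_dir=$(dirname "$do_file")
    do_name=$(basename "$do_file")
    # Run from the do-file's directory so that 'use elemapi' finds the data file.
    (cd "$do_dir" && stata -b "$do_name") || return 3
    # Assumption: batch mode writes its output to <name>.log in that directory.
    log_file="$do_dir/${do_name%.do}.log"
    echo "Log written to $log_file"
}
```

For the example above, the call would be <tt>run_do_file linear_regression/linReg.do</tt>, after which the regression output could be inspected in <tt>linear_regression/linReg.log</tt>.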
Calling STATA directly on the command line is fine for small test programs that consume few resources (in terms of running time and memory). The convention on the HPC system, however, is to submit your job to
the scheduler (here we use [[SGE_Job_Management_(Queueing)_System| Sun Grid Engine (SGE)]] as scheduler), which
assigns it to a suitable execution host on which the actual computations are carried out. You therefore have to set up a job submission file in which
you request the resources your job needs. This is common practice on HPC systems, where multiple users access the available resources at the same time.
Examples of such job submission scripts for both single-core and multi-core usage are detailed below.
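As a rough illustration of what a single-core job submission file might look like, the following sketch writes one to disk. It is not the site's official template: the file name <tt>linReg.sge</tt>, the job name and the resource values (<tt>h_rt</tt>, <tt>h_vmem</tt>) are placeholders, and the parallel environment name mentioned in the comment is an assumption that differs between sites:

```shell
# Write a single-core job script; the quoted 'EOF' keeps the text literal.
cat > linReg.sge <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -N linReg
#$ -cwd
#$ -l h_rt=0:10:0
#$ -l h_vmem=1G
# For multi-core runs with stata-mp a parallel environment would be requested
# as well, e.g. "-pe smp 4"; the name "smp" is an assumption and site-specific.

module load stata
stata -b linReg.do
EOF
```

The script would then be submitted from the directory containing <tt>linReg.do</tt> with <tt>qsub linReg.sge</tt>; the <tt>-cwd</tt> option makes the job run in that directory, so the do-file and the data file are found.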


== Documentation ==


A user guide for STATA version 13 can be found [http://www.stata.com/manuals13/u.pdf here] (PDF viewer required).

Revision as of 08:08, 16 March 2017

Introduction

STATA is a complete software package offering statistical tools for data analysis, data management and graphics. On the local HPC system we offer STATA/MP 13, the multiprocessor variant of STATA, licensed for up to 12 cores. The license allows up to 5 users to work with STATA at the same time. STATA/MP uses symmetric multiprocessing (SMP) to exploit the parallel capabilities of modern computers and HPC systems and thereby speed up computations.

Installed version

The currently installed version of STATA is 13.0.

Using Stata on the HPC cluster

Like every module on the cluster, STATA can be loaded by typing

module load stata

Then you can find the following STATA variants in your user environment:

  • stata: a version of STATA that handles small datasets
  • stata-se: a version of STATA for large datasets
  • stata-mp: a fast version of STATA for multicore/multiprocessor machines
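To check which of the variants listed above are actually available in your environment after loading the module, something like the following could be used. This is a minimal sketch; <tt>list_stata_variants</tt> is a hypothetical helper name, and the executable names are taken from the list above:

```shell
# Hypothetical helper: report where each STATA variant is found on the PATH.
list_stata_variants() {
    for variant in stata stata-se stata-mp; do
        if path=$(command -v "$variant" 2>/dev/null); then
            echo "$variant: $path"
        else
            echo "$variant: not found (module not loaded?)"
        fi
    done
}

list_stata_variants
```

If all three are reported as not found, the module has most likely not been loaded in the current shell.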

More details on the different versions can be found here.

To facilitate bookkeeping, a good first step towards using STATA on the HPC system is to create a directory in which all STATA related computations are carried out. Using the command

 mkdir stata

you might thus create the folder stata in the top level of your home directory for this purpose (you might even go further and create a subdirectory mp13 specifying the precise version of STATA).

Documentation

A user guide for STATA version 13 can be found here (PDF viewer required).