Non-public templates: Coordination of Scientific Computing
Revision as of 18:11, 24 September 2013
For documentation and completeness, this page lists templates of e-mails and other material that I used on a regular basis.
Application for a new user account
To apply for a new user account, an eligible user needs to specify three things:
- his/her anonymous user name in the form abcd1234,
- the working group (or ideally the unix group) he/she will be associated with, and
- an approximate date until which the user account will be needed.
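The three items above can be checked mechanically before a request is forwarded. The following is only a minimal sketch: the class and function names are hypothetical, and the pattern assumes that "in the form abcd1234" means four lowercase letters followed by four digits, which the page itself does not state explicitly.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List

# Assumption: "abcd1234" stands for four lowercase letters plus four digits.
LOGIN_PATTERN = re.compile(r"[a-z]{4}[0-9]{4}")


@dataclass
class AccountRequest:
    login: str          # anonymous university user name, e.g. "abcd1234"
    unix_group: str     # working group or, ideally, the unix group
    needed_until: date  # approximate date until which the account is needed


def missing_items(request: AccountRequest, today: date) -> List[str]:
    """Return a list of problems; an empty list means the request is complete."""
    problems = []
    if not LOGIN_PATTERN.fullmatch(request.login):
        problems.append("login is not of the form abcd1234")
    if not request.unix_group.strip():
        problems.append("working group / unix group is missing")
    if request.needed_until <= today:
        problems.append("end date is not in the future")
    return problems
```

An empty result means all three details are present and plausible; anything else can be sent back to the user as a list of open points.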
No university user account yet
If the user does not yet have a university-wide anonymous user account, he/she first needs to apply for one. An example e-mail with advice on how to get such a (guest) user account is listed below:
Dear Mr NAME,

to obtain a user account for the HPC system you must already have a university-wide, anonymous user account. As a guest of a working group you can apply for a corresponding guest account at the IT services. To do so, please visit the page

http://www.uni-oldenburg.de/itdienste/services/nutzerkonto/gaeste-der-universitaet/

and select the option "Gastkonto einrichten" (set up guest account). Start the workflow for creating a guest account. As the responsible person, enter the head of the university organizational unit who supports your project. Ask that person to open the e-mail he or she receives, follow the link it contains, and approve the application. The account is then created automatically. Your anonymous user account will have the form "abcd1234".

So that your user account can be activated for the HPC system, please send me the following details:

1) the anonymous user name for which the HPC account is to be created,
2) the name of the working group you are to be assigned to,
3) an expected validity period for the required HPC account.

As soon as your HPC account is activated I will get back to you with further information.

With kind regards
Oliver Melchert
User account HPC system: Mail to IT-Services
Once the user has supplied the above information, you can apply for an HPC user account at the IT services using an e-mail similar to:
Mail to: felix.thole@uni-oldenburg.de; juergen.weiss@uni-oldenburg.de
Subject: [HPC-HERO] Creation of a user account

Dear Mr Thole, dear Mr Weiss,

I hereby request the creation of an HPC account for Mr NAME

abcd1234; UNIX-GROUP

The account is expected to be needed until DATUM.

With kind regards
Oliver Melchert
If no suitable unix group exists yet, send an email similar to the following instead:
Mail to: felix.thole@uni-oldenburg.de; juergen.weiss@uni-oldenburg.de
Subject: [HPC-HERO] Creation of a user account

Hello Felix, hello Jürgen,

I hereby request the creation of an HPC account for Mr NAME

abcd1234

The account is expected to be needed until DATUM.

Mr NAME is a member of the working group "AG-NAME" (AG-URL) of Prof. NAME AG-LEITER. This working group does not have its own unix group yet! Could a new unix group therefore be created for the working group and integrated into the existing group hierarchy? I suggest the name

agUNIX-GROUP-NAME

for the unix group. The working group belongs to faculty FAKULTAET.

With kind regards
Oliver Melchert
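The two requests above differ only in whether a unix group already exists. As an illustration, the choice between them could be sketched with Python's `string.Template`. The template texts here are shortened, hypothetical stand-ins rather than the wording of the real mails, and the function name and parameters are assumptions made for this example.

```python
from string import Template

# Hypothetical short forms of the two requests; the real wording is given above.
WITH_GROUP = Template(
    "HPC account request for $name ($login; $unix_group), needed until $until."
)
NEW_GROUP = Template(
    "HPC account request for $name ($login), needed until $until. "
    "No unix group exists yet; proposed name: $proposed_group."
)


def render_request(name, login, until, unix_group=None, proposed_group=None):
    """Pick the variant depending on whether a unix group already exists."""
    if unix_group:
        return WITH_GROUP.substitute(
            name=name, login=login, unix_group=unix_group, until=until
        )
    if not proposed_group:
        raise ValueError("either unix_group or proposed_group is required")
    return NEW_GROUP.substitute(
        name=name, login=login, until=until, proposed_group=proposed_group
    )
```

Raising an error when neither group is given mirrors the rule that a request without a unix group must at least suggest a name for a new one.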
User account HPC system: Mail back to user
As soon as you get feedback from the IT-Services that the account was created, send an email to the user similar to the following:
Betreff: [HPC-HERO] HPC user account

Sehr geehrter Herr NAME,

die IT-Dienste haben Ihren HPC Account bereits freigeschaltet. Ihr Loginname ist

abcd1234

und Sie sind der Unix-Gruppe

UNIX-GROUP-NAME

zugeordnet. Sie verfügen über 100GB Plattenspeicher auf dem lokalen Filesystem (mit vollem Backup). Wenn Sie über einen begrenzten Zeitraum mehr Speicherplatz benötigen, können Sie mich gerne diesbezüglich anschreiben. Ihren aktuellen Speicherverbrauch auf dem HPC System können Sie mittels "iquota" einsehen. An jedem Sonntag werden Sie eine Email mit dem Betreff "Your weekly HPC Quota Report" erhalten, die Ihren aktuellen Speicherverbrauch zusammenfasst.

Anbei sende ich Ihnen einen Link zu unserem HPC user wiki, auf dem Sie weitere Details über das lokale HPC System erhalten:

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Main_Page

Der Beitrag "Brief Introduction to HPC Computing" unter

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Brief_Introduction_to_HPC_Computing

illustriert einige einfache Beispiele zur Nutzung der verschiedenen (hauptsächlich parallelen) Anwendungsumgebungen, die auf HERO zur Verfügung stehen, und ist daher besonders zu empfehlen. Er diskutiert außerdem einige andere Themen, wie z.B. geeignetes Allozieren von Ressourcen und Debugging.

Wenn Sie planen, die parallelen Ressourcen von MATLAB auf HERO zu nutzen, kann ich Ihnen die Beiträge "MATLAB Distributed Computing Server" (MDCS) unter

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=MATLAB_Distributing_Computing_Server

und "MATLAB Examples using MDCS" unter

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Matlab_Examples_using_MDCS

empfehlen. Der erste Beitrag zeigt, wie man das lokale Nutzerprofil für die Nutzung von MATLAB auf HERO konfigurieren kann, und der zweite beinhaltet einige Beispiele und diskutiert gelegentlich auftretende Probleme im Umgang mit MDCS.

Viele Grüße
Oliver Melchert
English variant of the above email:
Subject: [HPC-HERO] HPC user account

Dear NAME,

the IT services have now been able to activate your HPC account. Your login name for the HPC system is

abcd1234

and you are a member of the group

UNIX-GROUP-NAME

By default you have 100GB of storage on the local filesystem, which is fully backed up. If you need more storage for a limited period of time, you can contact me. Note that you can check your disk usage on the HPC system via the command "iquota". In addition, each Sunday you will receive an email titled "Your weekly HPC Quota Report", summarizing your current disk usage.

Below I send you a link to the HPC user wiki, where you can find further details on the HPC system:

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Main_Page

In particular I recommend the "Brief Introduction to HPC Computing" at

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Brief_Introduction_to_HPC_Computing

which illustrates several basic examples related to the different (mostly parallel) environments the HPC system HERO offers and discusses a variety of other topics, such as proper resource allocation and debugging.

Further, if you plan to use the parallel capabilities of MATLAB on HERO, I recommend the "MATLAB Distributed Computing Server" (MDCS) page at

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=MATLAB_Distributing_Computing_Server

and the "MATLAB Examples using MDCS" wiki page at

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Matlab_Examples_using_MDCS

These pages summarize how to set up your profile for using MATLAB on HERO and discuss some of the frequently occurring problems.

With kind regards
Oliver
User account HPC system: Mail back to user; Fak 2 (STATA users)
New users from Fak 2 most likely want to use the STATA software. An adapted version of the above email reads:
Dear MY_NAME,

the IT services have already activated your HPC account. Your login name for the HPC system is

LOGIN_NAME

and you are associated with the unix group

UNIX_GROUP

This is also reflected by the structure of the filesystem on the HPC system.

By default you have 100GB of storage on the local filesystem, which is fully backed up. If you need more storage for a limited period of time, you can contact me. Note that you can check your disk usage on the HPC system via the command "iquota". In addition, each Sunday you will receive an email titled "Your weekly HPC Quota Report", summarizing your current disk usage.

Below I send you a link to the HPC user wiki, where you can find further details on the HPC system:

http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Main_Page

If you plan to use the parallel capabilities of STATA on HERO, I recommend the "STATA" entry at

Main Page > Application Software and Libraries > Mathematics/Scripting > STATA

see: http://wiki.hpcuser.uni-oldenburg.de/index.php?title=STATA

The above page summarizes how to access the HPC system and how to successfully submit a STATA job.

With kind regards
Dr. Oliver Melchert
Temporary extension of disk quota
Sometimes a user from the theoretical chemistry group needs a temporary extension of the available backed-up disk space. Ask the user to provide
- the total amount of disk space needed (the current limit can be checked with the unix command iquota)
- an estimated date until which the extension is required
Mail to IT-Services
Then send an email similar to the one listed below to the IT services:
Mail to: felix.thole@uni-oldenburg.de; juergen.weiss@uni-oldenburg.de
Subject: [HPC-HERO] Increase of a user's available disk space

Hello Felix, hello Jürgen,

the HPC user NAME

abcd1234; UNIX-GROUP

has asked for a temporary increase of his disk quota. He asks for an increase to a total volume of 500GB, which is needed until the end of December 2013. After that he can archive the data accordingly and the disk quota can be reset.

Best regards,
Oliver
List of users with nonstandard quota
Users that currently enjoy an extended disk quota:
NAME                              ID        MEM    LIMIT
jan.mitschker@uni-oldenburg.de    dumu7717  1TB    no limit given
hendrik.spieker@uni-oldenburg.de  rexi0814  300GB  end of September 2013
wilke.dononelli@uni-oldenburg.de  juro9204  700GB  end of December 2013
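Because each extension is only meant to be temporary, the list above needs to be checked from time to time for entries whose limit has passed. A minimal sketch of such a check follows. It assumes that "end of <month>" means the last calendar day of that month and that an entry without a limit never expires; the data layout and function names are hypothetical.

```python
import calendar
from datetime import date

# IDs and limits taken from the list above; a limit is (year, month) or None.
EXTENSIONS = [
    ("dumu7717", "1TB", None),
    ("rexi0814", "300GB", (2013, 9)),
    ("juro9204", "700GB", (2013, 12)),
]


def end_of_month(year, month):
    """Last calendar day of the given month."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


def expired(extensions, today):
    """IDs whose extension ended before `today`; open-ended entries never expire."""
    return [
        user_id
        for user_id, _size, limit in extensions
        if limit is not None and end_of_month(*limit) < today
    ]
```

For example, `expired(EXTENSIONS, date(2013, 10, 1))` yields only `rexi0814`, whose quota could then be reset after a check with the user.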
Cluster downtime
If the cluster needs a maintenance downtime, send an email similar to the following to the mailing list of the HPC users:
Mail to: hpc-hero@listserv.uni-oldenburg.de
Subject: [HPC-HERO] Maintenance downtime 11-13 June 2013 (announcement)

Dear Users of the HPC facilities,

this is to inform you about an overdue

THREE-DAY MAINTENANCE DOWNTIME
FROM: Tuesday 11th June 2013, 7 am
TO: Thursday 13th June 2013, 4 pm

This downtime window is required for essential maintenance work on particular hardware components of HERO. Ultimately, the scheduled downtime will fix longstanding issues caused by malfunctioning network switches.

Please note that all running jobs will be killed if they have not finished by 11th June, 7 am. During the scheduled downtime, all queues and filesystems will be unavailable. We expect the HPC facilities to resume service on Thursday afternoon.

I will remind you about the upcoming three-day maintenance downtime at irregular intervals.

Please accept my apologies for any inconvenience caused
Oliver Melchert
If the downtime needs to be extended, send an email similar to:
Mail to: hpc-hero@listserv.uni-oldenburg.de
Subject: [HPC-HERO] Delay returning the HPC system HERO to production status

Dear Users of the HPC Facilities,

we are currently experiencing a

DELAY RETURNING THE HPC SYSTEM TO PRODUCTION STATUS

since the necessary replacement of the hardware components took longer than originally expected. The HPC facilities are expected to finally resume service by

Friday 14th June 2013, 15:00

We will notify you as soon as everything is back online.

With kind regards
Oliver Melchert
At this point you do not need to supply many details. However, if another extension is necessary, you should provide some details; otherwise, be prepared for complaints from the users. Your email could then look similar to:
Mail to: hpc-hero@listserv.uni-oldenburg.de
Subject: [HPC-HERO] Further delay returning the HPC system HERO to production status

Dear Users of the HPC Facilities,

as already communicated yesterday, we are currently experiencing a

DELAY RETURNING THE HPC SYSTEM TO PRODUCTION STATUS.

The delay results from difficulties related to the maintenance work on the hardware components of HERO. The original schedule for the maintenance work could not be kept. Some details of the maintenance process are listed below:

According to the IT services, the replacement of the old (malfunctioning) network switches by IBM engineers worked out well (with no delay). However, the configuration of the components by pro-com engineers took longer than the previously estimated single day, causing the current delay. Once the configuration process is completed, the IT service staff needs to perform several tests, firmware updates and application tests, which will take approximately one day. After the last step is completed, the HPC facilities will finally return to production status.

In view of the above difficulties we ask for your understanding that the HPC facilities will not be up by today, 15:00. We hope that the HPC facilities will resume service by

Monday 17th June 2013, 16:00

We will notify you as soon as everything is back online and apologize for the inconvenience.

With kind regards
Oliver Melchert
Once the HPC system is up and running again, send an email similar to:
Mail to: hpc-hero@listserv.uni-oldenburg.de
Subject: [HPC-HERO] HPC systems have returned to production

Dear Users of the HPC Facilities,

this is to inform you that the maintenance work on the HPC systems has been completed and the HPC component HERO has returned to production: HERO accepts logins and has already started to process jobs.

Thank you for your patience, and please accept my apologies for the extension of the maintenance downtime and any inconvenience this might have caused
Oliver Melchert
List of user wiki pages
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Brief_Introduction_to_HPC_Computing
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Matlab_Examples_using_MDCS
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Queues_and_resource_allocation
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Unix_groups
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Mounting_Directories_of_FLOW_and_HERO#OSX
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=File_system (Snapshot functionality)
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=STATA
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Memory_Overestimation
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Debugging
http://wiki.hpcuser.uni-oldenburg.de/index.php?title=Profiling_using_gprof