Number crunching at ESO Vitacura

For scientific computing, ESO offers two number crunching machines in Vitacura, chapman and chapman2. Their main purpose is to run workflows that process data coming from Paranal and ALMA. They are optimized for large files, such as those produced by the MUSE workflow. This document describes the available software and the usage policies.

Note that chapman and chapman2 share the same user home directories, but each machine has its own system directories. In addition, chapman has 38 TB of fast disks, reachable only from chapman through a link in every user's home directory. Use them for processes that need access to fast disks, and move all files on them to the gluster soon after the computation finishes. chapman2 does not have these fast disks, so there the link appears broken.
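
Because the link only resolves on chapman, a script meant to run on both machines can test it before use. The following is a minimal sketch, not an official tool: the function name fast_disk_available and the link name fastdisk are assumptions made for illustration, so check the actual link name in your home directory.

```shell
#!/bin/bash
# Return 0 if the given path resolves to an existing directory.
# The -d test follows symbolic links, so a dangling link (as seen on
# chapman2) makes it fail.
fast_disk_available() {
    local link="$1"
    [[ -d "$link" ]]
}

# "fastdisk" is a hypothetical link name used for illustration only.
if fast_disk_available "$HOME/fastdisk"; then
    echo "fast disks available"
else
    echo "fast disks not available, using the home directory"
fi
```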

Usage policy

In the near future we expect to implement htcondor for better process allocation. Until then we count on users being considerate with their usage:

Who to ask what

Requesting an account for number crunching: cl-servicedesk@eso.org

To request an account please follow the steps:

Questions regarding system and high level software, e.g. installation: cl-servicedesk@eso.org

Please approach a member of the IT staff.

Astronomical or ESO software

Please send an email to chapman@eso.org

Questions regarding an ESO pipeline or CASA

Contact the Instrument Fellow or the Instrument Scientist.

Quick start instructions

# User specific aliases and functions
# Place gasgano in the PATH
export PATH=/opt/gasgano/bin:$PATH

# Java 1.8 development environment
export LD_LIBRARY_PATH=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64:$LD_LIBRARY_PATH:/opt/esopipes/lib

# to use casa
export PATH=$PATH:/opt/casa/bin

Transferring large amounts of data to the gluster

IMPORTANT: no large download jobs should be run during the day. Start them after 6 PM and stop them before 7 AM the next day!

Large amount: > 5 GB in one day

To help users follow this policy, we have created a script that can be used to transfer ESO telescope data from the archive.

The script vitacura_wget_timewindow, found on both chapman machines at /usr/local/bin, should replace wget in the archive download scripts. When requesting data from the archive, users should take the provided script downloadRequestXXXXscript.sh and edit it as follows:

The line in downloadRequestXXXXscript.sh that reads

else xargs $xargsopts wget $download_opts

should be changed to

else xargs $xargsopts vitacura_wget_timewindow $download_opts
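
The same edit can be made without opening an editor. This is a sketch only: the helper name use_timewindow_wget is hypothetical, and it assumes the line in your download script reads exactly as quoted above.

```shell
#!/bin/bash
# Replace wget with vitacura_wget_timewindow on the xargs line of a
# download script. The original file is kept with the suffix .bak.
use_timewindow_wget() {
    local script="$1"
    # \$ matches a literal dollar sign in the pattern; in the replacement
    # a plain $ is already literal.
    sed -i.bak 's/\$xargsopts wget \$download_opts/$xargsopts vitacura_wget_timewindow $download_opts/' "$script"
}
```

Call it as use_timewindow_wget downloadRequestXXXXscript.sh, then compare the result with the .bak copy before running the download.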

We support only the transfer of data from the ESO archive via this script. Users who transfer data from other sources should take care to do it in a similar way. The important point is that before transferring each file the script should check whether the current time is inside the transfer window; if it is not, the script should stop, or wait until the window opens again. We provide the body of the function used in the bash script:

# Sleep until the given HH:MM time today; do nothing if that time has already passed.
function sleepuntil() {
  local target_time="$1"
  local today current_epoch target_epoch sleep_seconds
  today=$(date +"%m/%d/%Y")
  current_epoch=$(date +%s)
  target_epoch=$(date -d "$today $target_time" +%s)
  sleep_seconds=$(( target_epoch - current_epoch ))

  # Compare arithmetically: inside [[ ]] the > operator compares strings.
  if (( sleep_seconds > 0 )); then
     echo "${bold}vitacura_wget: sleeping $sleep_seconds seconds until $target_time${normal}"
     sleep "$sleep_seconds"
  fi
}

# Run wget with the given arguments, but only inside the allowed time window.
function my_wget_timewindow()
{
   bold=$(tput bold)
   normal=$(tput sgr0)
   start_time="18:00"
   end_time="07:00"
   DONE=0
   while [ $DONE -eq 0 ]; do
       currenttime=$(date +%H:%M)
       # HH:MM strings sort like times. The second branch handles a window that
       # spans midnight. The start time itself counts as inside the window.
       if [[ ("$start_time" < "$end_time" && ! ("$currenttime" < "$start_time") && "$currenttime" < "$end_time") ||
             ("$start_time" > "$end_time" && (! ("$currenttime" < "$start_time") || "$currenttime" < "$end_time")) ]]; then
         echo
         echo "${bold}vitacura_wget: Current time $currenttime in allowed time window $start_time to $end_time${normal}"
         echo "${bold}vitacura_wget: wget $*${normal}"
         # Quote "$@" so arguments containing spaces reach wget unchanged.
         wget "$@"
         echo "${bold}--------------------${normal}"
         echo
         sleep 1s
         DONE=1
       else
         echo "${bold}vitacura_wget: Current time $currenttime not in allowed time window $start_time to $end_time${normal}"
         echo "${bold}vitacura_wget: WAITING FOR wget $*${normal}"
         sleepuntil "$start_time"
       fi
   done
}

my_wget_timewindow "$@"
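
Users writing their own transfer script for other sources may want to try the window check on its own, without waiting for the evening. The sketch below is an illustration and not part of vitacura_wget_timewindow: the names to_minutes and in_window are hypothetical. It compares minutes since midnight instead of strings, and it treats the start time as inside the window and the end time as outside.

```shell
#!/bin/bash
# Convert HH:MM to minutes since midnight.
# The 10# prefix forces base 10, so "08" and "09" are not read as octal.
to_minutes() {
    local hh="${1%%:*}" mm="${1##*:}"
    echo $(( 10#$hh * 60 + 10#$mm ))
}

# Return 0 if the time "now" lies in [start, end).
# Usage: in_window <now> <start> <end>, all as HH:MM.
# The window may wrap past midnight, for example 18:00 to 07:00.
in_window() {
    local now start end
    now=$(to_minutes "$1")
    start=$(to_minutes "$2")
    end=$(to_minutes "$3")
    if (( start < end )); then
        (( now >= start && now < end ))
    else
        (( now >= start || now < end ))
    fi
}
```

For example, in_window "$(date +%H:%M)" 18:00 07:00 succeeds only during the allowed transfer window.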

Hardware

gluster box

A decision was taken early that the most important consideration was to give users the illusion of infinite space. This decision was motivated by the shift work schedule of most astronomers at ESO Chile, which does not permit continuity in their work. Large disk space was essential for this; otherwise we would need to ask users to continually move large amounts of data in and out of the system. The system is optimized for handling very large files (hundreds of MB, or GBs), and the response time when operating on many small files is long.
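
Since many small files are slow to handle on this storage, one common way to cope is to pack a directory of small files into a single archive before leaving it on the gluster. This is a suggestion rather than a stated policy, and the helper name bundle_dir is hypothetical.

```shell
#!/bin/bash
# Pack a directory of many small files into one compressed archive,
# so the storage holds a single large file instead of many small ones.
# Usage: bundle_dir <source_dir> <archive.tar.gz>
bundle_dir() {
    local src="$1" archive="$2"
    # -C changes into the parent directory so that the archive contains
    # relative paths starting at the directory name.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}
```

The contents can be listed later with tar -tzf and unpacked with tar -xzf.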

The current hardware consists of a gluster box with 250 TB of space, upgradable in increments of 250 TB.

Chapman

RAM: 512 GBytes

* [science@chapman ~]$ cat /proc/meminfo
* MemTotal:       528336468 kB
* MemFree:        36697824 kB
* MemAvailable:   336734620 kB

CPU: Intel(R) Xeon(R) CPU E7-4850 v2 @ 2.30GHz

4 CPUs with 12 cores/24 threads each, for a total of 96 threads

Chapman2

RAM: 512 GBytes

CPU: Intel® Xeon® Processor E5-2697 v4 @ 2.30GHz

2 CPUs with 18 cores/36 threads each, for a total of 72 threads

Software

System tools

chapman.sc.eso.org

chapman.sc.eso.org currently has an outdated OS and configuration. Since this will change soon, I will not put much information here. chapman2 is the reference installation.

* [science@chapman ~]$ cat /etc/*-release
* NAME="Scientific Linux"
* VERSION="7.5 (Nitrogen)"
* ID="rhel"
* ID_LIKE="scientific centos fedora"
* VERSION_ID="7.5"
* PRETTY_NAME="Scientific Linux 7.5 (Nitrogen)"
* ANSI_COLOR="0;31"
* CPE_NAME="cpe:/o:scientificlinux:scientificlinux:7.5:GA"
* HOME_URL="http://www.scientificlinux.org//"
* BUG_REPORT_URL="mailto:scientific-linux-devel@listserv.fnal.gov"

* REDHAT_BUGZILLA_PRODUCT="Scientific Linux 7"
* REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
* REDHAT_SUPPORT_PRODUCT="Scientific Linux"
* REDHAT_SUPPORT_PRODUCT_VERSION="7.5"
* Scientific Linux release 7.5 (Nitrogen)
* Scientific Linux release 7.5 (Nitrogen)
* Scientific Linux release 7.5 (Nitrogen)

Development environment

High level software

Astronomical tools

ESO tools

chapman2.sc.eso.org

We use the ESO standard CentOS operating system for chapman2:

* [science@chapman2 sof_chapman2]$ cat /etc/*-release
* CentOS Linux release 7.5.1804 (Core) 
* NAME="CentOS Linux"
* VERSION="7 (Core)"
* ID="centos"
* ID_LIKE="rhel fedora"
* VERSION_ID="7"
* PRETTY_NAME="CentOS Linux 7 (Core)"
* ANSI_COLOR="0;31"
* CPE_NAME="cpe:/o:centos:centos:7"
* HOME_URL="https://www.centos.org/"
* BUG_REPORT_URL="https://bugs.centos.org/"
* CENTOS_MANTISBT_PROJECT="CentOS-7"
* CENTOS_MANTISBT_PROJECT_VERSION="7"
* REDHAT_SUPPORT_PRODUCT="centos"
* REDHAT_SUPPORT_PRODUCT_VERSION="7"
* CentOS Linux release 7.5.1804 (Core) 
* CentOS Linux release 7.5.1804 (Core)

Development environment

* yum installed development environment: 
* Group: development
* Group-Id: development
* Installed Packages:
*   =autoconf
*   =automake
*   =bison
*   =byacc
*   =cscope
*   =ctags
*   =diffstat
*   =doxygen
*   =elfutils
*   =flex
*   =gcc
*   =gcc-c++
*   =gcc-gfortran
*   =git
*   =indent
*   =intltool
*   =libtool
*   =patch
*   =patchutils
*   =rcs
*   =redhat-rpm-config
*   =rpm-build
*   =rpm-sign
*   =subversion
*   =swig
*   =systemtap

High level software

Astronomical tools

ESO tools