AMiCO – Centro di Calcolo

HELP: amico-troubles@fisica.unimi.it
MANAGEMENT: amico-resp@fisica.unimi.it
HTCONDOR: condor-troubles@mi.infn.it
 ACCOUNT
 JOBS SUBMISSION
 CLUSTERS OVERVIEW
 HOWTO
USE CASE
CLUSTERS DETAILS
Acknowledge the use of AMiCO

AMiCO is “Apparato MIlanese per il Calcolo Opportunistico”
Implemented by Dipartimento di Fisica and INFN, AMiCO is a project aiming to federate heterogeneous computing clusters at Physics Department of Università degli Studi and INFN Milano.
Clusters can

spill out their excess jobs to unused resources when they are overloaded
preserving the possibility for cluster owners to preempt alien jobs.

We use the starting and flocking features of HTCondor. We added support for:

dynamic slots
parallel scheduling
Docker jobs

For jobs needing data access while running outside their home cluster we provide:

a CEPH readable/writeable storage accessible via S3 on RADOS gateway
CVMFS, mounted on worker nodes and inside Docker containers.

ACCOUNT: requirement and howto request access

You need to have an “idefix” (INFN – Sezione di Milano) or an UNIMI account (@ unimi.it or @ studenti.unimi.it). An “idefix” account request must be forwarded to the INFN Administration: (on this web site) follow menu ACCOUNTING E POSTA, menu item RICHIESTA DI ACCOUNT.
To request access to submit jobs with HTCondor you can

contact one of the following Cluster managers:
Leonardo Carminati
Particle Physics –> David Rebatto
Nuclear Physics –> Giovanna Benzoni
Cosmology-LSS (Large-Scale Structure) –> Luigi Guzzo
Theoretical Physics –> Alessandro Vicini / Marco Zaro
Department’s cluster (“MAGI”) –> Franco Leveraro
Undergraduates and graduate students and guests without a research group manager can ask to amico-troubles@mi.infn.it (specifying email, role and research activities) an access authorization to the clusters “MAGI” (network point of entry: “gaspare”) with a user-quota of 50 GB and a storage quota on CEPH.

SUBMISSION of the JOBS:

TUTORIAL:
AMiCO: practical introduction di Francesco Prelz (INFN – MI)
AMiCO: special cases di Francesco Prelz (INFN – MI)
Shortly…
The computing resources are organised in a number of privately owned and operated computing clusters.
Inside each cluster, one “head node” is usually charged with co-ordinating the cluster, and sometimes also acts as a single network point of entry.
Other nodes in the cluster execute jobs (“execution nodes”). In the AMiCO’s infrastructure, all executing nodes in any cluster can communicate directly over the local area network.
In the TABLE below we’ll shortly take an overall look at the list of the available clusters. Just an idea about the computer power.
Nodes where jobs are submitted and queued are called “submit nodes”.
Typically, users who need to submit jobs share some interests with cluster owners, so they have priority access to some cluster.
Interactive execution and (possibly) various batch systems are used to organise the workload in each cluster.
Typically with less than 100% resource occupancy.
AMiCO wants to be friendly to local cluster owners, and will suspend, then migrate jobs, when local workload appears.
Current default policy:
Suspend after 2 minutes of local activity.
Vacate and migrate if the job cannot be restarted within 10 minutes.
An upper-tier service (or “Central Manager“, codename: superpool-cm) matching available computing resources with pending job requests can compensate load peaks across clusters and increase goodput.
The semantics of this resource sharing service is opportunistic: HTCondor is a specialised solution for this scenario.
If HTCondor is also used as a local cluster ‘batch system’, then local and AMiCO’s jobs can be handled in a uniform way.
This scenario cannot be serviced with any number of FIFO (first-in-first-out) queues.

Main characteristics of the AMICO’s federate CLUSTER

Pool name	Nodes	Total memory	Total cpus	Total disk	Max memory	Max cpus	Max disk
doraemon	4	201G	60	2262G	62G	24	1074G
proof-pool	9	1171G	240	1054G	755G	40	323G
magi-pool	3	312G	160	406G	125G	64	141G
teor-pool	54	2989G	3088	42982G	125G	128	1815G
gamma-pool	1	126G	32	2G	126G	32	2G

Pool name	Research group	Head node name (central manager)
doraemon	Cosmology-LSS
proof-pool	HEP - ATLAS
magi-pool	Department of Physics	gaspare
teor-pool	Theoretical Physics
gamma-pool	Nuclear Physics

HOWTO:

1) HOWTO AMICO
SLIDE di Francesco Prelz (INFN – MI)
1a) Conceptual introduction: Distributed computing and storage, Opportunistic computing Practical introduction: available tools, AMICO distributed storage (CEPH), AMICO distributed computing (HTCondor).
Job examples: “Hello world”, File transfer via sandbox, Multiple/parametric job submission and control, File access via Object Storage, Script submission, Object Storage file staging, Interactive Jobs
AMiCO: practical introduction
1b) More complex cases: Common dependences and how to require them, Docker and HTCondor, Parallel jobs (MPI)
AMiCO: special cases
2) HOWTO HTCondor
HTCondor – web site:
http://research.cs.wisc.edu/htcondor/
HTCondor – readthedocs:
https//htcondor.readthedocs.io/en/latest/
Howto submit, monitor and manage a job by Miguel Villaplana (Dip. di Fisica e INFN – MI): howto_condor.pdf (Sept. 2017 – download PDF)
3) An overview: POSTER di David Rebatto (INFN – MI)

Use Case:

Guide for future users of the Milan computational cluster
(version 2, October 2019) by courtesy of Enrico Ragusa and Benedetta Veronesi

Clusters details:

CLUSTER GALILEO: page with some info HW / SW by LCM’s admins (LCM: “Laboratorio di Calcolo & Multimedia”, Department of Physics, UNIMI).

Acknowledge the use of AMiCO:

If you wish to acknowledge the use of AMiCO in a paper or report we suggest the following text:
We gratefully acknowledge the computing resources provided by AMiCO (http://amico.fisica.unimi.it, http://amico.mi.infn.it), an opportunistic resource cluster operated by the IT service of the Physics Department of Università degli Studi and INFN Milano – Italy.

AMiCO
PROJECT MANAGEMENT: Leonardo Carminati
TEAM: David Rebatto, Franco Leveraro, Francesco Prelz
Collaborators: Francesca Milanini
Previous collaborators: Paolo Salvestrini, Miguel Villaplana