HNSciCloud use cases
The HNSciCloud Early Adopter group of research organisations tested the HNSciCloud pilot platform services in a range of use cases. The goals, challenges and benefits of each use case are presented below.
LOFAR (Low Frequency Array) is the first of a new kind of telescope that uses an array of simple omnidirectional antennas instead of a mechanically steered dish antenna, with signal processing performed electronically. The electronic signals obtained from the antennas are digitised and transported to a central digital processor. The antennas themselves are simple, but there are many of them: about 7,000 in the full LOFAR design. The main goal of the LOFAR use case is to test, and later put into production, the separation of data storage from computing resources distributed throughout Europe. The Helix Nebula science cloud offers this opportunity thanks to the high-speed network connections of the GÉANT Cloud VRF infrastructure and a pricing model based on compute and storage alone. Transparent data access functionality would further increase ease of use.
Scientific domain: Astronomy
The CrystFEL framework supports the technique of Serial Femtosecond Crystallography (SFX) and comprises programs for data processing, simulation and visualisation. It is part of a complex, non-redistributable software stack that is free for use by academia and non-profit organisations. CrystFEL is increasingly used at various synchrotrons and free-electron lasers (FELs) to analyse data from serial femtosecond X-ray crystallography. The nature of these experiments makes a cloud-based distributed pipeline particularly appealing, since the framework can fully exploit large computational resources with tunable demands. The objective of the CrystFEL use case is to run medium-scale, data-intensive analysis tasks from one of the more demanding photon science experiments.
Scientific domain: Photon/Neutron Sciences
The 3DIX project produces 3D imaging with X-rays. The objective is to make 3D images (volumes) of nanoscale objects in order to study their characteristics at the nanoscale. These studies can then be applied to a wide variety of objects and scientific fields, such as chemical studies, life sciences and the structure of materials. The calculations done by the FDMNES programme require very little data transfer and storage; they are, however, often very compute intensive. Running such computations therefore uses considerable computing resources, but the load varies considerably over time depending on the experiments being done and the research interests of the scientists. Instead of expanding the computing facilities of an institute to cover a rarely occurring peak demand, the 3DIX project wants to offload such peak loads into the cloud.
Scientific domain: Photon/Neutron Sciences
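The offloading pattern described above is often called "cloud bursting": jobs run locally while institute capacity remains and spill over to the cloud at peak demand. A minimal sketch of the routing decision is shown below; the names (`route_job`, `LOCAL_CAPACITY`) are illustrative assumptions, not part of FDMNES or the HNSciCloud platform.

```python
# Hypothetical sketch of the peak-load offloading ("cloud bursting")
# pattern: use the institute cluster while it has free capacity,
# and route overflow jobs to cloud resources.

LOCAL_CAPACITY = 100  # cores available on the institute cluster (assumed)

def route_job(job_cores, local_load):
    """Decide where a job should run given current local utilisation."""
    if local_load + job_cores <= LOCAL_CAPACITY:
        return "local"
    return "cloud"

# During a quiet period the institute cluster absorbs the work...
assert route_job(16, local_load=40) == "local"
# ...while a compute-intensive campaign spills over to the cloud.
assert route_job(64, local_load=80) == "cloud"
```

The appeal for 3DIX is that the institute only pays for cloud capacity during the rare peaks, rather than sizing its own facilities for the maximum demand.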
The primary goal of the Pan-Cancer initiative is to compare 12 tumor types profiled in the context of The Cancer Genome Atlas Research Network. Cancer can take hundreds of different forms, depending on factors such as localisation and cell type. Pan-Cancer currently represents the most comprehensive computational study dealing with cancer genomics, with roughly 1 PB of data to be processed. This has forced researchers to implement new pipelines able to cope with the massive quantity of data, with a focus on leveraging resources provided by public and commercial clouds, including the EMBL-EBI Embassy Cloud. Using HNSciCloud, the Pan-Cancer project will be able to determine genetic variation for more than 5,000 tumor samples, with more arriving every month.
Scientific domain: Life Sciences
CMS is one of the LHC experiments in the Worldwide LHC Computing Grid (WLCG) collaboration, which links up 170 computing centres in 42 countries. CERN does not have the computing or financial resources to process all of the data on site. Recently, cloud computing has come to public attention due to its promise of providing as much computing power as users require, simplifying management and reducing total cost of ownership (TCO). Demonstrating that HNSciCloud can satisfy the requirements of the LHC experiments would have an enormous impact on cloud adoption in the particle physics community. It could also be a driver for non-LHC experiments that lack the human resources to move their computing to the cloud, by providing guidelines and best practices that simplify the process.
Scientific domain: High Energy Physics
HADDOCK (High Ambiguity Driven protein-protein DOCKing) uses an information-driven, flexible docking approach for the modelling of biomolecular complexes. Given the computational demands of the HADDOCK user group, the availability of easily accessible, affordable compute resources is important. The portal is used by a large worldwide community, so the availability and reliability of the service are crucial. The Helix Nebula science cloud would provide the opportunity to add these resources in a way that is seamless and invisible to the end user. To achieve this, the resources in the Helix Nebula cloud would need to be made available to the current job scheduler of the HADDOCK project.
Scientific domain: Life Sciences
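One way to make cloud capacity "seamless and invisible" to portal users is for the existing scheduler simply to see a longer worker list that mixes local and cloud nodes. The sketch below illustrates this idea with round-robin dispatch; all names (`pick_worker`, the worker hostnames) are hypothetical and do not describe the actual HADDOCK scheduler.

```python
# Hypothetical sketch: cloud nodes appear to the scheduler as ordinary
# workers, so end users never see which pool served their job.

local_workers = ["cluster-node-1", "cluster-node-2"]
cloud_workers = ["hnsc-vm-1", "hnsc-vm-2", "hnsc-vm-3"]  # Helix Nebula VMs

def pick_worker(workers, next_index):
    """Round-robin dispatch over a combined worker pool."""
    return workers[next_index % len(workers)], next_index + 1

# Cloud capacity is added transparently by extending the pool.
pool = local_workers + cloud_workers
worker, idx = pick_worker(pool, 0)  # first job goes to a local node
```

The design choice here is that scaling up means appending to `pool`, with no change visible at the portal or in the scheduling logic itself.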