Training life scientists to analyse genomic data – can cloud make life easier?
Bioinformatics trainers learn to use cloud
Bioinformatics trainers teach life scientists like doctors and biologists to analyse data computationally. Bioinformatics skills are currently in high demand, because novel high-throughput measurement technologies like genomic sequencing produce large amounts of data. Trainers typically travel to different institutes and even abroad to give courses. It is challenging to set up a suitable computing environment at every site in a very short time.
"Typical analyses like finding genomic variants or detecting genes activated in cancer require a large set of software packages and sufficient computing resources to run them. During a training course, 20–30 participants need to run their analyses simultaneously, so the computational demands are much harder", says Dr Eija Korpelainen, bioinformatics specialist from CSC and the ELIXIR Finland training coordinator.
None of the usual options for setting up a bioinformatics environment at a training site is ideal. Installing the required software on the classroom computers or on the host institute's cluster is time-consuming and requires local IT support. Asking participants to install everything on their own laptops may waste precious training time on sorting out installation problems, and the laptops might not be powerful enough anyway. Trainers need a way to wrap all the software in an easily portable package, and an elastic computing environment that can temporarily grow to handle tens of analysis jobs at the same time.
"It seems that cloud computing and virtual machine images of the software sets could provide the answer, so I invited bioinformatics trainers and technical specialists to discuss the possibilities and needs", says Korpelainen. This ELIXIR-EXCELERATE funded workshop "Using clouds and VMs in bioinformatics training" at CSC in May gathered over 30 people from 13 different countries, including Australia. It is important to compare experiences from different countries, as we can all learn from each other.
Solutions virtualized in the cloud
Cloud services can make life a whole lot easier for bioinformatics trainers as well as for system administrators. When all the software and data needed in the course can be wrapped in a virtual machine or a container, it saves both time and nerves.
"Local environments work just fine, if you need to teach only one tool. But think about the situation where you would have to install ten different software to twenty different student computers, as well as the big data sets needed for analysis… in that kind of situation virtualization and cloud services are much more convenient", explains Dr Allegra Via, ELIXIR Italy training coordinator.
"When I use virtual machines in training, I feel I'm in control: I know that everything is going to work, and I can control the environment myself", says Dr Nicolas Delhomme, the bioinformatics facility Manager from Umeå Plant Science Center.
Using virtual machines and cloud services also benefits the students. They can return to the course assignments and materials after the course, because they can keep the virtual machine image with all the software and data.
Sharing knowledge and seeking best practices
This was the first time the workshop was organized, and the participants seemed very satisfied with its discussions and outcomes.
"At the moment we're using mostly local resources, so I was very interested in getting to know virtual machine and cloud services developed by others", says Bert Overduin, Training and Outreach Bioinformatician at the Edinburgh Genomics.
"During the course I understood that creating virtual machines isn't necessarily that hard. You can learn it yourself, and you don't always have to ask help from the IT experts", Allegra Via says.
"It is really important that you don't have to be afraid of breaking stuff, so you can try things more easily yourself", continues Fotis E. Psomopoulos, Academic Fellow at the Aristotle University of Thessaloniki.
Based on the experiences shared and discussions in the workshop, ELIXIR-EXCELERATE will provide guidelines for bioinformatics trainers on how to use virtual machines and cloud services. All the presentations are available as slides and videos on the course website.
Allegra Via, Fotis E. Psomopoulos, Nicolas Delhomme and Bert Overduin participated in the ELIXIR-EXCELERATE workshop at CSC. Photo: Heta Koski / CSC
ELIXIR (www.elixir-europe.org) is a European distributed infrastructure for life science information. It coordinates, integrates and sustains bioinformatics resources across its member states and enables users in academia and industry to access vital data, tools, standards, compute and training services for their research.
EXCELERATE funding (www.elixir-europe.org/excelerate) helps ELIXIR coordinate and extend national and international data resources to ensure the delivery of world-leading life-science data services. It supports a pan-European training programme, anchored in national infrastructures, to increase bioinformatics capacity and competency.