Storing biological research information has changed considerably during the past 25 years. The notebooks and academic publications that used to be the means for storage have been replaced with world-wide open-access databases, indispensable resources in modern bioscience. Until now, however, the databases have been subject to uncertainties in their funding, and the valuable data has been endangered at the end of each funding period. The price of arranging storage and handling for biological information is quite small compared with the cost of reproducing the data – we cannot afford to let this information vanish.
The infrastructure being built by ELIXIR supports research into bioscience and its applications for the needs of medicine, agriculture, the environment, biotechnology, and society. The preliminary phase of this EU-funded project will continue to the end of 2010. The preparations involve 32 partners from 13 countries, including CSC – IT Center for Science, Ltd. from Finland.
“The meetings organized by ELIXIR for stakeholders have brought together funding agencies, suppliers of data resources and tools, data producers and users, all striving to attain a shared vision and create recommendations to build a sustainable model for funding and organization”, says Development Manager Tommi Nyrönen, head of the ELIXIR project in Finland.
The planned organizational structure of ELIXIR will be based on a network consisting of various research institutions and national centers, coordinated by the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) in the UK.
Professor Janet Thornton, project coordinator for ELIXIR, visited Finland in autumn 2008 and met representatives of science, scientific funding and administration. Broad-based discussions provided some indication about the kind of role Finland could aim for within the European community.
Project costs to exceed EUR 220 million
ELIXIR is one of the significant infrastructures that the European Strategy Forum on Research Infrastructures (ESFRI) has identified as needing to be built in Europe. However, creating sustainable structures will require the member states to commit to funding the infrastructures. The estimated costs of establishing ELIXIR will rise beyond EUR 220 million.
The UK has already awarded funding amounting to ten million pounds (approximately EUR 11 million) for the EMBL-EBI. The confirmation of this funding is the first step for the EMBL-EBI towards its planned role as the central hub of the emerging ELIXIR. Sweden also appears close to confirming long-term funding for ELIXIR, because the Swedish Government is planning a substantial increase in its investments in research funding in several fields, including national infrastructures. Final decisions on this are expected during 2009.
In addition to the European Roadmap for Research Infrastructures created by ESFRI, the results of a survey on the research infrastructures in Finland and Finland’s role in international research infrastructures were reported this year. The survey, by the Federation of Finnish Learned Societies, was funded by the Ministry of Education. The survey explored the existing infrastructures and created a roadmap on the needs for new national-level infrastructures and participation in international research infrastructures or initiatives to build new ones. The building phase of ELIXIR is included in the national roadmap proposal. A similar recommendation was also made for the projects relating to biobanks for biosciences and translational medicine.
“In Finland the objectives of ELIXIR have now been tied to two other ESFRI projects of bioscience, namely biobanks (BBMRI) and medicine (EATRIS). In this umbrella project, the work is carried out by CSC, the National Institute for Health and Welfare (THL) and the Institute for Molecular Medicine Finland (FIMM). The objective is to transfer expertise from these Finnish organizations to international life science communities by presenting activities that are reliable and sufficiently broad-scale to allow, for example, building EU-level IT solutions,” Nyrönen emphasizes.
Measuring methods generating enormous data
For bioscientists such as Sampsa Hautaniemi, adjunct professor in systems biology and bioinformatics, and his research group, the successful completion of projects like ELIXIR is almost a matter of life and death.
“Research into biology and medicine has changed so radically in recent years. The Director of the British Institute of Cancer Research in Cell Biology, Chris Marshall, stated some time ago that during the past two years he had seen more changes in their operations than the sum total of changes during the previous 20 years,” says Hautaniemi in his office at Biomedicum at the University of Helsinki.
“Today we increasingly use measuring methods like microchip analysis and sequencing, which generate enormous amounts of data. One study can involve making more than a billion individual observations. The change is considerable compared with the situation 10–20 years ago, when a few tens or hundreds of observations were processed,” reckons Hautaniemi.
Research has accumulated huge amounts of data, hence the need to manage all that information systematically, which is exactly what many of the ESFRI projects aim at. The objective of ELIXIR is to harmonize data storage, handling, and analyses. Hautaniemi knows the plans within ELIXIR, because he has participated in the stakeholder meetings and the working group activities.
“In biomedical research, especially at the results analysis phase, databases are essential. Since there are so many and different types of these databases, it is extremely awkward for the user if communication in each of them occurs with a different protocol,” explains Hautaniemi.
In many respects, databases have become vitally important for bioscience research, but in a way, their maintenance has been sustained as a sideline job, based on fixed-term research funding. In the worst scenarios, there is a danger that at the end of the fixed-term funding period critical resources will disappear, which would leave researchers, including Hautaniemi and his group, in trouble.
“If, for example, some of the specialized databases, such as KEGG for annotating cell signaling pathways, were to become slower, it might not put a stop to the research but it would considerably hinder results interpretation. However, there are a few critical resources, such as the reference database of medical publications, PubMed, or genome databases Ensembl and NCBI; if they were to disappear, we would be out of work fairly soon,” says Hautaniemi.
One of the most significant goals of ELIXIR is to secure funding for the most critical databases containing biological research information. A more sustainable infrastructure will enable a continuous evaluation process, which is needed not only to ensure that critical resources remain up-to-date and available but also to eliminate out-dated services. The resources that are selected for ELIXIR will be managed in a professional manner.
A survey of user needs conducted by ELIXIR explored, for example, which services and databases are needed by European bioscientists. The inquiry clearly indicated that the use of resources is scattered. Indeed, the user numbers of a few services and databases were clearly higher than others, but 193 services and databases were classified as having been indispensable for the research only once. Does this mean that research would not be carried out without these resources? According to Hautaniemi, it is not a question of research being impossible but rather, inefficient.
“Certainly, even today research is being performed where databases do not play such a central role. Without databases and other modern resources, however, many studies would simply take an unreasonable amount of time.”
Should the sustainable infrastructure have been created earlier? “Ten years ago we did not have this type of problem. At that point databases like this were only being planned and dreamt of. Changes have occurred so rapidly only during the past few years, so it is no wonder that a shared infrastructure has not been built. After all, ELIXIR is only at the preliminary phase, but of course, it is common to wish that things would happen faster. But we are not too late”, says Hautaniemi.
There are many challenges to be faced when building an infrastructure. “ELIXIR is burdened with, for example, a fairly heavy organization and several stakeholders, and reconciliation of their objectives is not easy. The research data is currently collected centrally by the EMBL-EBI infrastructure and will later be collected by the geographically distributed ELIXIR infrastructure; for its distribution we should agree on common standards, and having a shared European structure will facilitate decision-making. Though standardization helps the reconciliation and provides scaling benefits, it has a negative side: it slows down the development work”, says Tommi Nyrönen.
When the system of data collecting and distributing is inherently sustainable, it will serve as the foundation to support research groups’ own activities.
It is not necessary for everyone to do everything
The human genome, representing an individual’s starting point for life, is an important source of information for health care, because in the future the impact of lifestyle choices on health will be easier to predict. According to Nyrönen, establishing the link between molecular biology data and medical registers is one goal that will affect the decision on how the IT infrastructure should be built.
In the future, significant international data could be processed with CSC’s data and computing resources.
“Scalable application-specific information technology can represent a totally new advantage for research groups in their collaborative work and in developing analysis tools. To guarantee optimal utilization of the sizable resources invested in the infrastructures, high-level training on the use of databases and tools must also be offered to researchers. When implemented, ELIXIR will also offer a system and funding mechanism for services developed in Finland with cross-European significance,” says Tommi Nyrönen, listing future possibilities.
These visions can be realized provided that Finland can build an infrastructure that is compatible with the framework plans of ELIXIR and other ESFRI projects. The specialists at Biocenter Finland, the national network of biocenters, amongst others, have taken into account the results of the user survey conducted by ELIXIR as to which services are considered important. Through wider international collaboration the idea is to offer high-quality services to the scientific community, but also to avoid duplicating activities that are already performed elsewhere.
“The field of bioinformatics is so extensive that no single laboratory is able to offer a full range of services. The implementation of the infrastructures of Finland and ESFRI will bring a certain clarity and improved information flow. This will keep us informed of what is being planned and done elsewhere,” concludes Sampsa Hautaniemi.
Heta Kero
Additional information
Most of ELIXIR’s preliminary-phase proposals are public and available for viewing on the ELIXIR website.