Metagenomics data analysis
Date: 03.04.2017 9:00 - 06.04.2017 17:00
Location details: The course takes place at the training facilities of CSC. The address is Keilaranta 14, Espoo, Finland. You can reach us by public transportation.
Language: English
Lecturers: Nils Willassen (ELIXIR-Norway)
Espen Robertsen (ELIXIR-Norway)
Erik Hjerde (ELIXIR-Norway)
Inge Alexander Raknes (ELIXIR-Norway)
Rob Finn (EBI)
Anu Mikkonen (University of Jyväskylä)
Jenni Hultman (University of Helsinki)
Petri Auvinen (University of Helsinki)
Jarno Tuimala (RS training)
Maria Lehtivaara (CSC)
Kimmo Mattila (CSC)
Eija Korpelainen (CSC)
  • Free for Finnish universities, polytechnics and governmental research institutes.
  • Free for others.
Registration is closed.
Participants have been selected based on their motivation description in the registration form.
Additional information

Metagenomics investigates the composition and function of microbial communities in different environments based on direct isolation of genetic material. The field has been accelerated by advances in high-throughput sequencing technologies, and the increasing data sizes require efficient analysis methods and advanced computing approaches.

This international workshop covers metagenomics analysis from quality control, filtering and assembly to taxonomic classification, functional assignment and comparative metagenomics. In addition to covering the analysis of whole-genome shotgun sequencing data, the workshop also has an optional day on community analysis of amplicon sequencing data. Finally, international databases and standards for storing the data are introduced.

The workshop is organized in collaboration with the ELIXIR EXCELERATE project and PRACE, and it is part of the PRACE Advanced Training Centre activity.

The workshop consists of lectures and hands-on exercises. The exercises use the user-friendly analysis platforms META-pipe and Chipster, so no programming skills are required and the workshop is suitable for everybody.

  • META-pipe has been developed by ELIXIR-Norway, and it offers tools for pre-processing (Prinseq, FastQC), assembly (MetaRay/Ray, Mira), taxonomic classification (LC-classifier), gene prediction (Glimmer/MGA), functional analysis (Blast, Priam, InterProScan) and visualization (Metarep). META-pipe analysis jobs use a Spark cluster in the cloud for computation.
  • Chipster offers over 360 analysis tools for different kinds of high-throughput data. For community analysis of amplicon sequencing data, it has tools for pre-processing (Prinseq, Trimmomatic, FastQC, Mothur), taxonomic classification (Mothur), and statistical analysis for marker gene studies (R packages vegan, rich, BiodiversityR, pegas and labdsv).



Please note that small changes to the schedule are still possible.

Monday 3.4.2017: OPTIONAL MODULE: Community analysis of amplicon sequencing data

9:00-10:45 Lectures

  • Who's there? Community analysis by amplicon sequencing (Anu Mikkonen)
  • Short demo: Using QIIME in parallel fashion in HPC environments (Kimmo Mattila)

10:45-11:00 Coffee break

11:00-12:30 Hands-on session using the Chipster software to analyze 16S rRNA data (Eija Korpelainen, Maria Lehtivaara, Anu Mikkonen)

  • Introduction to Chipster
  • Data preprocessing with Mothur-based tools and FastQC

12:30-13:30 Lunch

13:30-16:30 Hands-on session using Chipster continues, includes a coffee break (Eija Korpelainen, Maria Lehtivaara, Anu Mikkonen and Jarno Tuimala)

  • Data preprocessing with Mothur-based tools
  • Taxonomic classification with Mothur-based tools
  • Statistical analysis of marker gene data: Comparing diversity and abundance between groups
  • Visualization

Tuesday 4.4.2017: Sample preparation, sequencing, quality control, filtering and assembly

9:00-12:30 Lectures

  • 9:00-10:15 Selecting and preparing samples for metaomics (Anu Mikkonen)

    Your samples are only as good as your sample preparation routines, and your results are only as valid as your experimental design. This talk will discuss, with plenty of real-life examples, all the things that can go wrong before you even send your DNA samples out for sequencing. The talk includes lots of practical recommendations on what to do before, or sometimes instead of, metagenomics analysis.

  • 10:15-10:45 Sequencing technologies for metagenomics (Petri Auvinen)

  • 10:45-11:00 Coffee break

  • 11:00-11:30 Metagenomics analysis – an overview (Nils Willassen)

  • 11:30-12:00 META-pipe analysis platform overview (Espen Robertsen, Erik Hjerde)

  • 12:00-12:30 Metagenomics in the cloud (TBD)

12:30-13:15 Lunch

13:15-16:30 Hands-on exercises using the META-pipe platform, includes a coffee break (Nils Willassen, Espen Robertsen, Erik Hjerde)

  • Quality control and filtering
  • Assembly
  • Validation

Wednesday 5.4.2017: Taxonomic and functional analysis

9:00-11:30 Lectures

  • 9:00-9:30 Providing the generalist EBI metagenomics platform: challenges and issues (Rob Finn) 
  • 9:30-9:50 The metagenomics data life-cycle: standards and best practice (Rob Finn)
  • 9:50-10:20 New data resources for marine metagenomics (Nils Willassen)
  • 10:20-10:35 Coffee break 
  • 10:35-11:15 Taxonomic classification and functional assignment (Espen Robertsen, Erik Hjerde)


11:15-12:00 Hands-on exercises using the META-pipe platform (Nils Willassen, Espen Robertsen, Erik Hjerde)

  • Introduction, taxonomic classification

12:00-13:00 Lunch

13:00-16:30 Hands-on exercises using the META-pipe platform continued, includes a coffee break (Nils Willassen, Espen Robertsen, Erik Hjerde)

  • Taxonomic classification continued
  • Functional assignment

Thursday 6.4.2017: Comparative metagenomics and other analysis

9:00-12:00 Hands-on exercises using the META-pipe platform continued, includes a coffee break (Nils Willassen, Espen Robertsen, Erik Hjerde)

  • Visualization of data
  • Comparative metagenomics

12:00-13:00 Lunch

13:00-14:30 Lectures and wrap-up

  • 13:00-13:35 Arctic metagenomes as a scaffold for understanding metatranscriptomic data (Jenni Hultman)
  • 13:35-14:00 Genome assembly from metagenomic reads (Jenni Hultman)
  • 14:00-14:30 Feedback and wrap-up