Metagenomics data analysis
Date: 03.04.2017 9:00 - 06.04.2017 17:00
The course takes place at the training facilities of CSC in Espoo, Finland.
Language: English
Lecturers: Nils Willassen (ELIXIR-Norway)
Espen Robertsen (ELIXIR-Norway)
Erik Hjerde (ELIXIR-Norway)
Inge Alexander Raknes (ELIXIR-Norway)
Rob Finn (EBI)
Anu Mikkonen (University of Jyväskylä)
Jenni Hultman (University of Helsinki)
Petri Auvinen (University of Helsinki)
Jarno Tuimala (RS training)
Maria Lehtivaara (CSC)
Kimmo Mattila (CSC)
Eija Korpelainen (CSC)
Additional information

NEW: Course materials and videos are now available. You can also run the analysis tools of the META-pipe pipeline on CSC's Taito cluster; see the instructions.

Metagenomics investigates the composition and function of microbial communities in different environments based on direct isolation of genetic material. The field has been accelerated by advances in high-throughput sequencing technologies, and the growing data volumes require efficient analysis methods and advanced computing approaches.

This international workshop covers metagenomics analysis from quality control, filtering and assembly to taxonomic classification, functional assignment and comparative metagenomics. In addition to covering the analysis of whole genome shotgun sequencing data, the workshop also has an optional day on community analysis of amplicon sequencing data. Finally, international databases and standards for storing the data are introduced.

The workshop is organized in collaboration with the ELIXIR EXCELERATE project and PRACE, and it is part of the PRACE Advanced Training Centre activity.

The workshop consists of lectures and hands-on exercises. User-friendly analysis platforms META-pipe and Chipster are used in the exercises, so no programming skills are required, and the workshop is thus suitable for everybody.

  • META-pipe has been developed by ELIXIR-Norway, and it offers tools for pre-processing (Trimmomatic, FastQC), assembly (MegaHit), taxonomic classification (LCA-classifier), gene prediction (Glimmer/MGA), functional analysis (Blast, Priam, InterProScan) and visualization (Metarep). META-pipe analysis jobs use a Spark cluster in the cloud for computation.
  • Chipster offers over 360 analysis tools for different kinds of high-throughput data. For community analysis of amplicon sequencing data it has tools for pre-processing (Prinseq, Trimmomatic, FastQC, Mothur), taxonomic classification (Mothur), and statistical analysis for marker gene studies (R packages vegan, rich, biodiversityR, pegas and labdsv).



Monday 3.4.2017: OPTIONAL MODULE: Community analysis of amplicon sequencing data 

9:00-10:45 Lectures

  • Who's there? Community analysis by amplicon sequencing (Anu Mikkonen)
  • Short demo: Using QIIME in parallel fashion in HPC environments (Kimmo Mattila)

10:45-11:00 Coffee break

11:00-12:30 Hands-on session using the Chipster software to analyze 16S rRNA data (Eija Korpelainen, Maria Lehtivaara, Anu Mikkonen)

  • Introduction to Chipster
  • Data preprocessing with Mothur-based tools and FastQC

12:30-13:30 Lunch

13:30-16:30 Hands-on session using Chipster continues, includes a coffee break (Eija Korpelainen, Maria Lehtivaara, Anu Mikkonen and Jarno Tuimala)

  • Data preprocessing with Mothur-based tools
  • Taxonomic classification with Mothur-based tools
  • Statistical analysis of marker gene data: Comparing diversity and abundance between groups
  • Visualization

Tuesday 4.4.2017: Sample preparation, sequencing, quality control, filtering and assembly

9:00-12:30 Lectures (auditorium)

  • 9:00-10:15 Selecting and preparing samples for metaomics (Anu Mikkonen)

    Your samples are only as good as your sample preparation routines, and your results are only as valid as your experimental design. Using plenty of real-life examples, this talk discusses everything that can go wrong before you even send your DNA samples out for sequencing. It includes many practical recommendations on what to do before, or sometimes instead of, metagenomics analysis.

  • 10:15-10:45 Sequencing technologies for metagenomics (Petri Auvinen)

  • 10:45-11:00 Coffee break

  • 11:00-11:30 Metagenomics analysis – an overview (Nils Willassen)

  • 11:30-12:00 META-pipe analysis platform overview (Espen Robertsen)

  • 12:00-12:30 Metagenomics in the cloud (Inge Alexander Raknes)

12:30-13:15 Lunch

13:15-16:30 Hands-on exercises using the META-pipe platform, includes a coffee break (Espen Robertsen, Erik Hjerde)

  • Quality control and filtering
  • Assembly
  • Validation

Wednesday 5.4.2017: Taxonomic and functional analysis

9:00-12:30 Lectures (auditorium)

  • 9:00-9:30 Providing the generalist EBI metagenomics platform: challenges and issues (Rob Finn) 
  • 9:30-9:50 The metagenomics data life-cycle: standards and best practice (Rob Finn)
  • 9:50-10:20 New data resources for marine metagenomics (Nils Willassen)
  • 10:20-10:35 Coffee break
  • 10:35-11:10 Arctic metagenomes as a scaffold for understanding metatranscriptomic data (Jenni Hultman)
  • 11:10-11:35 Genome assembly from metagenomic reads (Jenni Hultman)
  • 11:35-12:15 Taxonomic classification and functional assignment (Erik Hjerde)

12:15-13:00 Lunch

13:00-16:30 Hands-on exercises using the META-pipe platform, includes a coffee break (Espen Robertsen, Erik Hjerde)

  • Taxonomic classification
  • Functional assignment

Thursday 6.4.2017: Comparative metagenomics 

9:00-12:00 Hands-on exercises using the META-pipe platform continued, includes a coffee break (Espen Robertsen)

  • Visualization of data
  • Comparative metagenomics

12:00-13:00 Lunch

13:00-14:30 Hands-on exercises using the META-pipe platform continued and wrap-up

  • 13:00-14:00 Comparative metagenomics
  • 14:00-14:30 Feedback and wrap-up
Course materials