
Variant analysis with GATK
Date: 16.06.2016 9:00 - 17.06.2016 17:00
Location details: The event takes place at CSC, Keilaranta 14, Espoo, Finland. The lecture day (16.6) is held in the auditorium, and the hands-on day (17.6) in the computer classroom. You can reach CSC by public transportation; more detailed travel tips are available.
Language: English
Lecturers: Trainers from the Broad Institute:
Geraldine Van der Auwera
Kate Noblett
Charlotte Tolonen
Price: The registration fee is 100 euros + 24% VAT per day. You can cancel your attendance free of charge up to 5 business days before the course; otherwise the full fee will be invoiced.
The fee includes morning and afternoon coffee and lunch.

This course covers the core steps involved in calling variants with the Broad Institute's Genome Analysis Toolkit, using the Best Practices developed by the GATK team. You will learn why each step is essential to the variant discovery process, what operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset. Please note that on 13.-15.6 CSC hosted an international Variant analysis workshop that covered several tools for variant discovery, effect prediction and prioritization. Course videos from both courses are now available.

This course highlights GATK's key functionalities, such as the GVCF workflow for joint variant discovery in cohorts, RNAseq-specific processing, and somatic variant discovery using MuTect2. The trainers also preview capabilities of the upcoming GATK version 4, including a new workflow for CNV discovery. Please note that this course is focused on human data analysis. Most of the material presented applies equally to non-human data, and the trainers will address some questions about the adaptations needed for analysis of non-human data, but they will not go into much detail on those points.

The course materials are available here.

The course consists of a lecture day and an optional hands-on day:

16.6.2016 Lectures (including many opportunities for Q&A)
The trainers explain the rationale, theory and application of the Best Practices for Variant Discovery in high-throughput sequencing data. The lecture day is aimed at a mixed audience: people who are new to variant discovery or to GATK and want an introduction to the tools, and existing GATK users who want to improve their understanding of and proficiency with the tools. The only prerequisite for attending this day is familiarity with the basic terms and concepts of genetics and genomics. Lecture abstracts.

17.6.2016 Hands-on exercises
The participants learn how to manipulate the standard data formats involved in variant discovery and how to apply GATK tools appropriately to various use cases and data types. In the course of these exercises, the trainers demonstrate useful tips and tricks for interacting with GATK, dealing with problems, and using third-party tools such as Samtools, IGV and RStudio. The hands-on day is aimed at novice to intermediate users who are seeking detailed guidance with GATK and related tools. The prerequisite for this part is basic familiarity with the command line environment. You can get these skills, for example, by attending the CSC course Introduction to Linux.
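As a rough illustration of what working with one of these standard formats can look like, the sketch below splits a single VCF data line into its eight fixed columns. It is not taken from the course materials. The function name `parse_vcf_line` and the example record are assumptions made for illustration, and real analyses would normally use an established VCF library.

```python
# Minimal sketch (not from the course materials): parse one VCF data line.
# parse_vcf_line and the example record are illustrative assumptions.

def parse_vcf_line(line):
    """Return a dict for one tab-separated VCF data line (not a header line)."""
    fields = line.rstrip("\n").split("\t")
    # The first eight columns of a VCF data line are fixed.
    chrom, pos, var_id, ref, alt, qual, flt, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    for item in info.split(";"):
        if item == ".":
            continue
        key, sep, value = item.partition("=")
        info_dict[key] = value if sep else True

    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),  # several ALT alleles are comma-separated
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }


# Illustrative record: a G>A substitution on chromosome 20.
example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2"
record = parse_vcf_line(example)
print(record["chrom"], record["pos"], record["ref"], record["alt"])
```

Values in the INFO column are kept as strings here; converting them to numbers would require the type declarations from the VCF header, which this sketch does not read.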

Aims: During this course you will learn about:

  • Variant detection for next-generation DNA sequencing
  • Data pre-processing
  • Variant discovery
  • Variant functional annotation and evaluation

Objectives: After this course you should be able to:

  • Understand the overall variant discovery workflow rationale and requirements
  • Understand key methods and functionalities in light of the latest research
  • Apply the Best Practices tools and workflow to a real data set
  • Interpret analysis results and troubleshoot common problems

16.6.2016 Lectures  

  • Introduction to variant discovery analysis and GATK Best Practices  

Coffee break

  • Marking Duplicates 
  • Indel Realignment 
  • Base Recalibration  

Lunch break  

  • Variant Calling and Joint Genotyping 
  • Filtering variants with VQSR
  • Genotype Refinement Workflow  

Coffee break  

  • Callset Evaluation 
  • Somatic variant discovery with MuTect2 
  • Preview of CNV discovery with GATK4  


17.6.2016 Hands-on exercises  

  • Working with standard data formats and data types: BAM, VCF, WGS, WEx, RNAseq 
  • Running Picard and GATK tools to process sequence data and collect QC metrics  

Coffee break  

  • Variant calling with HaplotypeCaller and the GVCF workflow  

Lunch break  

  • Variant callset evaluation and filtering 1  

Coffee break  

  • Variant callset evaluation and filtering 2
