This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) instances. All the software and data used in the workshop are hosted on an Amazon Machine Image (AMI).
To run your own instance of the server used for this workshop, launch a t2.medium
instance in the N. Virginia region with AMI ami-373ab74d “Data Carpentry Genomics release 1.0”, available under “Community
AMIs” in the Amazon EC2 Management Console.
If you are taking a Genomics Data Carpentry workshop, instances will be set up for you. Follow the instructions on connecting to Data Carpentry Genomics Amazon instances to connect to the instance.
If you’re an instructor or maintainer or want to contribute to these lessons, please get in touch with us at team@carpentries.org and we will start instances for you.
You can also start your own instance if you’re using these lessons for self-guided learning. Follow the instructions on creating an Amazon instance. Running this AMI for a few days on the t2.medium instance type costs very little.
While not recommended, it is possible to work through the lessons on your local machine (i.e. without using AWS). To do this, you will need to install all of the software used in the workshop and obtain a copy of the dataset. Instructions for doing this are below.
The data used in this workshop is available on the Open Science Framework (OSF). Because this workshop works with real data, be aware that file sizes for the data are large.
This includes the data used in the exercises, as well as solutions to the exercises. The solutions are useful if you start the lessons at a later module and need the results of earlier exercises.
There are two directories:
You can also access the data by starting the Amazon AMI that has the data.
| Software | Install | Manual | Available for | Description |
|---|---|---|---|---|
| FastQC | Link | Link | Linux, MacOS, Windows | Quality control tool for high throughput sequence data. |
| Trimmomatic | Link | Link | Linux, MacOS, Windows | A flexible read trimming tool for Illumina NGS data. |
| BWA | Link | Link | Linux, MacOS | Mapping DNA sequences against a reference genome. |
| SAMtools | Link | Link | Linux, MacOS | Utilities for manipulating alignments in the SAM format. |
| BCFtools | Link | Link | Linux, MacOS | Utilities for variant calling and manipulating VCFs and BCFs. |
| IGV | Link | Link | Linux, MacOS, Windows | Visualization and interactive exploration of large genomics datasets. |
These are the QuickStart installation instructions. They assume familiarity with the command line and with installation in general. As there are different operating systems and many different versions of operating systems and environments, these may not work on your computer. If an installation doesn’t work for you, please refer to the installation instructions for that software, listed in the table above.
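Before installing anything, it can help to see which of the tools in the table are already on your path. The following is a minimal sketch rather than part of the lesson: `check_tools` is a hypothetical helper name, and the executable names passed to it (`fastqc`, `trimmomatic`, `bwa`, `samtools`, `bcftools`) are assumptions that may differ depending on how each tool was installed. IGV is left out because its launcher name varies by install method.

```shell
# Report which of the given commands are available on PATH.
# check_tools is a hypothetical helper, not part of the lesson materials.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero status when at least one tool was not found.
    return "$missing"
}

# Executable names below are assumed; adjust them to match your install.
check_tools fastqc trimmomatic bwa samtools bcftools \
    || echo "Some tools still need to be installed."
```

Any tool reported as missing can then be installed with the instructions for that tool below.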
MacOS
To install FastQC, type:
$ brew install fastqc
or
$ conda install -y fastqc
FastQC Source Code Installation
If you prefer to install from source, follow the directions below:
$ cd ~/src
$ curl -O http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.5.zip
$ unzip fastqc_v0.11.5.zip
Link the fastqc executable to the ~/bin folder that you have already added to the path.
$ ln -sf ~/src/FastQC/fastqc ~/bin/fastqc
Due to what seems to be a packaging error, the executable flag on the fastqc program is not set. We need to set it ourselves.
$ chmod +x ~/bin/fastqc
Test your installation by running:
$ fastqc -h
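The source installation above assumes that ~/bin is already on your path. If you are unsure, a sketch like the following adds it for the current shell session only. `path_prepend` is a hypothetical helper name, not something the lesson provides, and the snippet assumes the ~/bin folder itself already exists.

```shell
# Prepend a directory to PATH unless it is already listed.
# path_prepend is a hypothetical helper, not part of the lesson materials.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$1:$PATH" ;;       # otherwise put it at the front
    esac
}

path_prepend "$HOME/bin"
export PATH
```

Because the function checks for an existing entry first, running it repeatedly does not keep lengthening the path.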
MacOS
$ brew install trimmomatic
or
$ conda install -y trimmomatic
Trimmomatic Source Code Installation
If you prefer to install from source, follow the directions below:
$ cd ~/src
$ curl -O http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.36.zip
$ unzip Trimmomatic-0.36.zip
The program can be invoked via:
$ java -jar ~/src/Trimmomatic-0.36/trimmomatic-0.36.jar
The ~/src/Trimmomatic-0.36/adapters/ directory contains Illumina-specific adapter sequences.
$ ls ~/src/Trimmomatic-0.36/adapters/
Test your installation by running (assuming things are installed in ~/src):
$ java -jar ~/src/Trimmomatic-0.36/trimmomatic-0.36.jar
Simplify the Invocation
To simplify the invocation you could also create a script in the ~/bin folder:
$ echo '#!/bin/bash' > ~/bin/trimmomatic
$ echo 'java -jar ~/src/Trimmomatic-0.36/trimmomatic-0.36.jar "$@"' >> ~/bin/trimmomatic
$ chmod +x ~/bin/trimmomatic
Test your script by running:
$ trimmomatic
MacOS
$ brew install bwa
or
$ conda install -y bwa
BWA Source Code Installation
If you prefer to install from source, follow the instructions below:
$ cd ~/src
$ curl -OL http://sourceforge.net/projects/bio-bwa/files/bwa-0.7.15.tar.bz2
$ tar jxvf bwa-0.7.15.tar.bz2
$ cd bwa-0.7.15
$ make
$ export PATH=~/src/bwa-0.7.15:$PATH
Test your installation by running:
$ bwa
MacOS
$ brew install samtools
or
$ conda install -y samtools
SAMtools Versions
SAMtools has changed its command-line invocation (for the better). As a result, most of the tutorials on the web show an older, obsolete usage.
Use only SAMtools 1.3 or later.
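One way to check this requirement from the command line is sketched below. `version_ge` is a hypothetical helper that compares dot-separated version numbers field by field using awk. The snippet assumes that the first line of `samtools --version` has the form `samtools 1.3`; releases that do not support `--version` are treated as too old.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Fields are compared numerically, so 1.10 counts as newer than 1.9.
# version_ge is a hypothetical helper, not part of the lesson materials.
version_ge() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? h[i] + 0 : 0   # missing fields count as 0
            b = (i <= m) ? w[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0                             # all fields equal
    }'
}

if command -v samtools >/dev/null 2>&1; then
    # Assumed output format: first line reads "samtools <version>".
    installed=$(samtools --version 2>/dev/null | awk 'NR == 1 { print $2 }')
    if [ -n "$installed" ] && version_ge "$installed" 1.3; then
        echo "samtools $installed is recent enough"
    else
        echo "samtools is older than 1.3 or did not report a version; please upgrade"
    fi
else
    echo "samtools not found on PATH"
fi
```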
SAMtools Source Code Installation
If you prefer to install from source, follow the instructions below:
$ mkdir -p ~/src
$ cd ~/src
$ curl -OkL https://github.com/samtools/samtools/releases/download/1.3/samtools-1.3.tar.bz2
$ tar jxvf samtools-1.3.tar.bz2
$ cd samtools-1.3
$ make
Add the directory to the path if necessary:
$ echo export 'PATH=~/src/samtools-1.3:$PATH' >> ~/.bashrc
$ source ~/.bashrc
Test your installation by running:
$ samtools
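Running the `echo ... >> ~/.bashrc` step more than once appends a duplicate line each time. One way to avoid that is to add the line only when the file does not already contain it. The sketch below uses a hypothetical helper named `append_once` and demonstrates it on a temporary file, so your real ~/.bashrc is not touched.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper, not part of the lesson materials.
append_once() {
    line=$1
    file=$2
    touch "$file"    # make sure the file exists before searching it
    # -x matches whole lines only, -F treats the pattern as a literal string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a temporary file rather than the real ~/.bashrc.
demo_rc=$(mktemp)
append_once 'export PATH=~/src/samtools-1.3:$PATH' "$demo_rc"
append_once 'export PATH=~/src/samtools-1.3:$PATH' "$demo_rc"   # no-op the second time
cat "$demo_rc"
rm -f "$demo_rc"
```

To apply it for real, pass `"$HOME/.bashrc"` as the second argument and then run `source ~/.bashrc` as before.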
MacOS
$ brew install bcftools
or
$ conda install bcftools
BCFtools Source Code Installation
If you prefer to install from source, follow the instructions below:
$ cd ~/src
$ curl -OkL https://github.com/samtools/bcftools/releases/download/1.5/bcftools-1.5.tar.bz2
$ tar jxvf bcftools-1.5.tar.bz2
$ cd bcftools-1.5
$ make
Add the directory to the path if necessary:
$ echo export 'PATH=~/src/bcftools-1.5:$PATH' >> ~/.bashrc
$ source ~/.bashrc
Test your installation by running:
$ bcftools