CoLIde

Available from Version: 3.0

This tool infers the location of significant biological units, known as sRNA loci, by combining genomic location with other information such as variation in expression levels (the expression pattern) and size class distribution. In CoLIde, a locus is defined as a union of regions that share the same expression pattern and lie in close proximity on the genome. The biological relevance of each locus, detected through analysis of its size class distribution, is reported alongside it.

This tool can be used on ordered (e.g. time-dependent) or unordered (e.g. organ, mutant) series of samples, both with and without biological/technical replicates. The tool reliably identifies known types of loci and shows improved performance on sequencing data from both plants (e.g. A. thaliana, S. lycopersicum) and animals (e.g. D. melanogaster) when compared to existing locus detection techniques.

Setup

The tool first requires information on how many samples form the experiment.

First select the sample count

Next, provide the input files that relate to each sample:

Required Parameters:

  • Genome File: The location of the genome file in FASTA format.
  • Sample Names: The locations of the sRNA samples and their optional replicates.

Input files are entered using the box displayed below:

Either use the history browser, file browser or type the path to the list of files for each sample

Each sample can be modified (i.e. have files added and removed) individually by selecting the desired sample number from the table below:

The tabbed interface allows you to access all the samples and modify them individually

Series Type Parameters

  • Ordered Series: (select this option if order is important to the experiment e.g. time series)
  • Unordered Series: (select this option if order is not important to the experiment e.g. organ series)

The confidence intervals (CIs) are controlled using the following parameters, which determine the percentage of replicated measurements to be included in each CI.

Non Replicate Data – Confidence Interval Control

  • Percentage CI: The percentage added to either side of the normalised expression value to form the CI
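
As an illustration of the Percentage CI setting, the sketch below widens a single normalised expression value by a fixed percentage on each side. It assumes the percentage is taken relative to the expression value itself; the function name percentage_ci is hypothetical and is not part of CoLIde.

```python
def percentage_ci(expression, percentage):
    """Return (lower, upper) bounds around one normalised expression value.

    Assumption: the margin is `percentage` percent of the value itself.
    """
    margin = expression * percentage / 100.0
    return (expression - margin, expression + margin)


# A normalised expression of 200 with a 10% CI on either side.
lower, upper = percentage_ci(200.0, 10.0)
print(lower, upper)
```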

Replicate Data – Confidence Interval Control

  • Min Max: Use the minimum and maximum normalised expression values as the bounds of the CI (100%)
  • +-SD: CI is mean ± 1 standard deviation (67%)
  • +-r(2)SD: CI is mean ± standard deviation divided by the square root of 2 (50%)
  • +-2SD: CI is mean ± 2 × standard deviation
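
The four replicate options can be sketched as follows. This is a minimal illustration rather than the tool's own code: the function name replicate_ci and the method keys are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption.

```python
import math
import statistics


def replicate_ci(values, method):
    """Return (lower, upper) CI bounds for replicate expression values.

    `method` is one of "minmax", "sd", "sd_over_sqrt2" or "2sd"
    (hypothetical keys mirroring the four options listed above).
    """
    if method == "minmax":
        return (min(values), max(values))

    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation

    if method == "sd":
        half_width = sd
    elif method == "sd_over_sqrt2":
        half_width = sd / math.sqrt(2)
    elif method == "2sd":
        half_width = 2 * sd
    else:
        raise ValueError(f"unknown method: {method}")
    return (mean - half_width, mean + half_width)


replicates = [8.0, 10.0, 12.0]
for name in ("minmax", "sd", "sd_over_sqrt2", "2sd"):
    print(name, replicate_ci(replicates, name))
```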

Percentage Overlap: Controls how much two confidence intervals must overlap for the pattern to be considered straight
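
One way to picture the Percentage Overlap setting is sketched below. The exact definition used by CoLIde is not given here, so this example assumes the overlap is measured relative to the shorter of the two intervals. The function names and the "U"/"D" labels for non-straight patterns are illustrative assumptions; only the straight pattern ("S") is described above.

```python
def overlap_percentage(ci_a, ci_b):
    """Overlap of two (lower, upper) intervals as a percentage of the shorter one."""
    lo = max(ci_a[0], ci_b[0])
    hi = min(ci_a[1], ci_b[1])
    shorter = min(ci_a[1] - ci_a[0], ci_b[1] - ci_b[0])
    if shorter == 0:
        # A zero-width interval either lies inside the other one or it does not.
        return 100.0 if lo <= hi else 0.0
    return max(0.0, hi - lo) / shorter * 100.0


def classify(ci_a, ci_b, threshold):
    """Label the change from ci_a to ci_b: "S" (straight), "U" (up) or "D" (down)."""
    if overlap_percentage(ci_a, ci_b) >= threshold:
        return "S"
    mid_a = (ci_a[0] + ci_a[1]) / 2
    mid_b = (ci_b[0] + ci_b[1]) / 2
    return "U" if mid_b > mid_a else "D"


print(overlap_percentage((10.0, 20.0), (15.0, 30.0)))
print(classify((10.0, 20.0), (15.0, 30.0), threshold=50.0))
```

With a higher threshold, fewer pairs of intervals qualify as straight, so more changes between samples are reported as differences.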

The results are presented in a table, as shown in the image below.

The table is presented as a tree structure; click the icon on each chromosome you are interested in to view its loci

The header of each column contains a description of the data and the name of the sample file.
Locus data is shown in a table with the following columns:

  • ID: Locus identifier; loci are numbered separately for each chromosome/scaffold
  • Start: Start coordinate for locus
  • End: End coordinate for locus
  • Length: Locus length
  • P-Val: The probability value for the locus as calculated from the chi-square statistic
  • Sample 1-n: The expression series for this locus
  • Chromosome: The chromosome this locus resides on
  • Differential Expression: The absolute differential expression for this locus

The context menu operates on the currently selected result line.

CoLIde Right Click

  • Export individual sequences: Export the sequences that form the locus to FASTA
  • Output entire locus: Export the entire locus sequence from the genome to FASTA
  • Show locus in genome view: Display the selected locus using the standard arrow view in VisSR
  • Show locus in aggregate genome view: Display the selected locus as a compressed view in VisSR

Viewing Results

Users have two options when viewing loci predicted by CoLIde. The first is the standard arrow view, shown below:

The classic view of a small RNA alignment

The second is the aggregated view, shown below for the same data and location:

The aggregated view available from CoLIde

The aggregated view groups all small RNAs in the locus into windows of 100 nt and generates a histogram showing the abundance of all small RNAs within each window.
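
The windowing behind the aggregated view can be sketched as follows. This is an illustration only: it assumes each small RNA is assigned to a window by its start coordinate, and the function name aggregate_abundance and the (start, abundance) input format are hypothetical.

```python
from collections import defaultdict


def aggregate_abundance(srnas, locus_start, window=100):
    """Sum sRNA abundances per fixed-size window across a locus.

    `srnas` is an iterable of (start_coordinate, abundance) pairs.
    Returns a dict mapping window index (0 = first window) to total abundance.
    """
    bins = defaultdict(int)
    for start, abundance in srnas:
        bins[(start - locus_start) // window] += abundance
    return dict(sorted(bins.items()))


# Four sRNAs in a locus starting at position 1000.
reads = [(1000, 5), (1050, 3), (1100, 2), (1299, 7)]
print(aggregate_abundance(reads, locus_start=1000))
```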
