ta-si prediction

The trans-acting siRNA (ta-siRNA) prediction tool identifies phased 21nt sRNAs characteristic of ta-siRNA loci from a supplied sRNA dataset and an associated genome.

This tool requires an sRNA dataset and a genome file from a plant species as input (to the best of our knowledge, ta-siRNAs have not been found in animals so far). The paths to the files representing the sample and the genome are entered at the top of the main dialog box. In addition, the ta-si prediction tool requires the user to specify the p-value cutoff (to control the tool’s sensitivity) and the minimum sRNA abundance, although these text boxes are automatically filled with default values.

When the user has configured all the input parameters to their satisfaction, they can start the ta-si prediction tool by clicking on the “Start” button on the main dialog, or by selecting the “Start” menu item from the Run menu. Once running, the tool can be cancelled at any time by clicking on the “Cancel” button or menu item. Note: cancelling a run may not be instant, as execution must reach a safe position in the code before the run can stop cleanly.

The first step during execution aligns the sRNAs to the genome; sRNAs that do not match the genome are discarded. The tool then applies the algorithm described by Chen et al. to calculate the probability that the observed phasing is significant, based on the hypergeometric distribution (see figure below). Our implementation differs slightly in that it takes the length of the input sRNA sequences into account, using only 21nt sRNAs in the phasing analysis.
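To make the hypergeometric test concrete, the following is a minimal Python sketch of an upper-tail hypergeometric p-value for phasing. It is not the tool's own code: the function name, its parameters and the window sizes used in the example call are assumptions chosen for illustration, and the exact window definition should be taken from Chen et al.

```python
from math import comb


def phasing_p_value(n_distinct, k_phased, total_positions, phased_positions):
    """Probability of seeing at least k_phased phased sRNAs by chance.

    n_distinct       -- distinct sRNA start positions observed in the window
    k_phased         -- how many of those fall on phased positions
    total_positions  -- all possible start positions in the window (assumed)
    phased_positions -- positions that are in the phased register (assumed)
    """
    if not 0 <= k_phased <= n_distinct <= total_positions:
        raise ValueError("expected 0 <= k_phased <= n_distinct <= total_positions")

    non_phased = total_positions - phased_positions
    all_outcomes = comb(total_positions, n_distinct)

    # Sum the hypergeometric probabilities for k_phased, k_phased + 1, ...
    upper = min(n_distinct, phased_positions)
    tail = sum(
        comb(phased_positions, x) * comb(non_phased, n_distinct - x)
        for x in range(k_phased, upper + 1)
    )
    return tail / all_outcomes


# Illustrative call: 12 distinct sRNAs in the window, 8 of them phased.
# The window sizes (461 positions, 21 of them phased) are assumed values.
p = phasing_p_value(n_distinct=12, k_phased=8,
                    total_positions=461, phased_positions=21)
print(f"p-value: {p:.3e}")
```

A locus would be reported only when this value falls below the p-value cutoff entered in the main dialog, so a smaller cutoff gives fewer, more confident predictions.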

After the run has completed, the results are available in the scrollable table underneath the input boxes (as shown in the screenshot below). Each predicted TAS gene is displayed as a row in the table. The columns represent:

  • The chromosome from which the predicted TAS originated.
  • The start position of the predicted TAS gene in the chromosome.
  • The end position of the predicted TAS gene in the chromosome.
  • The number of distinct sRNAs that align to the region described in the previous columns.
  • The number of distinct phased sRNAs that were detected in the region described in the previous columns.
  • The p-value calculated by the algorithm described by Chen et al.

Each predicted TAS gene can be copied to the clipboard by raising the context menu for that row in the table and clicking on the “Copy to Clipboard” button. In addition, the phased sRNAs for each TAS gene can be displayed in a separate popup dialog by clicking “Show phased sRNAs” in the context menu for that row in the table.

All TAS loci can be visualised in VisSR using the “Show in VisSR” menu item located in the View menu. Alternatively, each locus can be sent individually to VisSR by bringing up a context menu on a particular locus and selecting the “Show in VisSR” button. An example of a TAS locus in VisSR is shown below.

Once the results are available, the tool can export two types of file: a results table detailing all predicted TAS loci in .csv format, and a list of the phased sRNAs for each TAS locus in .txt format. The .csv file can be loaded into any spreadsheet program.
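The exported results table can also be processed with a script. The sketch below reads a table in Python and keeps only the loci under a stricter p-value threshold. The column header names and the two sample rows are made up for illustration and are not taken from a real export, so they should be checked against the first line of the actual file before use.

```python
import csv
import io

# Made-up sample standing in for an exported results table.
# The header names are hypothetical; adjust them to match the real file.
SAMPLE = """chromosome,start,end,distinct_srnas,phased_srnas,p_value
1,1000,1230,14,9,0.00002
3,52000,52230,6,3,0.04
"""


def significant_loci(handle, max_p):
    """Return the rows whose p-value is at or below max_p."""
    reader = csv.DictReader(handle)
    return [row for row in reader if float(row["p_value"]) <= max_p]


# For a real export, replace io.StringIO(SAMPLE) with open(path, newline="").
hits = significant_loci(io.StringIO(SAMPLE), max_p=0.001)
for row in hits:
    print(row["chromosome"], row["start"], row["end"], row["p_value"])
```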
