The Version 4.0 Database model
Version 4.0 of the workbench introduced a new back end for the entire software, discarding the previous approach to handling the volume of sequence data run through it.
This system is based on an SQL database that can run either in memory (for high-memory computers) or with disk-based storage (where all tables are persisted to a set of files on the hard disk). More information on this will be available shortly. In the current alpha setup, only the high-memory (in-memory) version is available, although it still has a lower memory footprint than the old software setup.
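The difference between the two storage modes can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in: the database engine the workbench uses is not stated here, and the table name, column names, and the helpers open_store and add_read are hypothetical.

```python
import sqlite3

def open_store(path=None):
    """Open an in-memory database when path is None, otherwise a file-backed one.

    Illustrative only: the schema below is an assumption, not the workbench's.
    """
    conn = sqlite3.connect(":memory:" if path is None else path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq TEXT PRIMARY KEY, abundance INTEGER NOT NULL)"
    )
    return conn

def add_read(conn, seq):
    """Count one more occurrence of a read sequence."""
    cur = conn.execute(
        "UPDATE sequences SET abundance = abundance + 1 WHERE seq = ?", (seq,)
    )
    if cur.rowcount == 0:
        # First time this sequence is seen: create its row.
        conn.execute(
            "INSERT INTO sequences (seq, abundance) VALUES (?, 1)", (seq,)
        )

# In-memory mode: fast, but the tables vanish when the connection closes.
memory_db = open_store()
for read in ["ACGT", "ACGT", "TTGA"]:
    add_read(memory_db, read)
```

Passing a file path instead, for example open_store("workbench.db"), would persist the same tables to disk, trading speed for a smaller memory requirement.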
Setting up the Database
All workflows in the sRNA Workbench begin with the database node.
This allows users to organise their sequence data into samples, each consisting of replicate files. These files must already have been converted from raw FASTQ format to FASTA and had any remaining adapter fragments removed (a tool in the workbench is available to perform this task before a workflow is configured).
Users should set up their data using the Wizard.
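To make the expected input concrete, the following sketch shows roughly what the preprocessing step involves: converting FASTQ records to FASTA and cutting each read at its 3' adapter. It is not the workbench's own tool, which should be used in practice. The function names are hypothetical, and exact-match trimming is a simplifying assumption.

```python
from itertools import islice

def fastq_records(lines):
    """Yield (read_id, sequence) pairs from 4-line FASTQ records."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return  # end of input (or a truncated final record)
        header, seq = record[0].strip(), record[1].strip()
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        yield header[1:], seq

def trim_adapter(seq, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    if not adapter:
        return seq
    pos = seq.find(adapter)
    return seq if pos == -1 else seq[:pos]

def fastq_to_fasta(lines, adapter):
    """Return FASTA text for all reads that are non-empty after trimming."""
    out = []
    for read_id, seq in fastq_records(lines):
        trimmed = trim_adapter(seq, adapter)
        if trimmed:
            out.append(f">{read_id}\n{trimmed}")
    return "\n".join(out)
```

Each replicate file prepared this way can then be assigned to its sample in the database node.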
After file configuration, users can tune some options using the Database Settings menu in the sidebar. These options include the ability to filter out large numbers of repeated hits (default is 200) and to set the maximum number of gaps and mismatches allowed during the sequence alignment phase.
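One way to picture these settings is as a filter over alignment results. The sketch below is an interpretation, not the workbench's implementation. It assumes that "repeated hits" means the number of locations a single sequence aligns to. Only the 200-hit default comes from the text. The class names, field names, the zero defaults for gaps and mismatches, and the order of the two filters are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class DatabaseSettings:
    max_hits: int = 200       # default stated in the documentation
    max_mismatches: int = 0   # assumed default, not from the documentation
    max_gaps: int = 0         # assumed default, not from the documentation

@dataclass(frozen=True)
class Hit:
    seq: str          # the aligned read sequence
    position: int     # where it aligned on the reference
    mismatches: int = 0
    gaps: int = 0

def filter_hits(hits, settings=DatabaseSettings()):
    """Keep hits within the mismatch/gap limits, then drop over-repeated sequences."""
    acceptable = [
        h for h in hits
        if h.mismatches <= settings.max_mismatches and h.gaps <= settings.max_gaps
    ]
    # Count how many acceptable locations each sequence aligns to.
    counts = Counter(h.seq for h in acceptable)
    return [h for h in acceptable if counts[h.seq] <= settings.max_hits]
```

Under these assumptions, a sequence aligning to 201 locations would be removed entirely with the default settings, while raising max_hits would retain it.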