Config files
There are three types of config files: general use, module specific, and system parameters.
- General use
- Used by the main script to supply specific locations or general configuration parameters.
- Module specific
- Configurations used by an individual module. The names of most of these config files start with the wbench prefix.
- System parameters
- Hold color values and other miscellaneous variables for ease of access.
General use
These two config files must be configured correctly for the program to run. The install script will fill out all variables.
software_dirs.cfg:
    #Path where install script will install software
    SOFTWARE=
    #Path to workbench http://srna-workbench.cmp.uea.ac.uk/
    WBENCH_DIR=
    #Path to java use 1.7 or greater
    JAVA_DIR=
    #Number of times program has been run
    RUN=0

The SOFTWARE variable is the path to the directory where the install script will install all necessary dependencies. WBENCH_DIR
workdirs.cfg:
    #LINES THAT START WITH # ARE INFORMATIONAL ONLY
    #Workdir is the path to the directory where this program will run data
    #workdir must end with trailing "/"
    workdir=${HOME}/miRPursuit_Projects/miRtest/
    #Path to the mirbase database. Go to http://www.mirbase.org or download latest from: ftp://mirbase.org/pub/mirbase/CURRENT/
    MIRBASE=${source_data}/mirbase/mature.fa
    #Used by java
    MEMORY="4g"
    #Set this to the max number of processes that can be used
    THREADS=2
    #Path to the directory where input data is located
    #Test directory in [pathToMiRPursuit]/testDataset
    INSERTS_DIR=${SOURCE_DATA}/sRNA/
    #Path to the genome to be used
    #Test genome can be found here [pathToMiRPursuit]/testDataset/Genome/Arabidopsis_thaliana.TAIR10.dna_rm.chromosome.4.fa
    GENOME=${SOURCE_DATA}/genomes/my_genome.fa
    #Path to the genome to be used by mircat. Leave this as ${GENOME} if no memory restrictions apply to your case. Check manual on using parts
    GENOME_MIRCAT=${GENOME/.fa/part-1.fa}
    #The suffix of the filter to be used. Check /config/workbench_filter_*.cfg
    FILTER_SUF=18_26_5
    #Adaptor trimming
    #You must set the --trim flag
    ADAPTOR="TGGAATTCTCGGGTGCCAAGG"
    #Deprecated - Soon removed
    LCSCIENCE_LIB=
    #These variables are only used for target prediction (PAREsnip)
    TRANSCRIPTOME=
    DEGRADOME=
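Because both general-use files are plain NAME=value assignments, a quick sanity check before a run can catch the most common mistakes, such as an empty variable or a workdir without the trailing slash. The following is a minimal sketch, not part of miRPursuit: the helper names require_set and has_trailing_slash, the temporary files, and the example paths are all hypothetical, and it assumes the config files are loaded by sourcing them in the shell.

```shell
#!/bin/sh
# Sketch only: sanity-check general-use config values before a run.
# The file contents below are hypothetical stand-ins for software_dirs.cfg
# and workdirs.cfg; real paths will differ.

cfg_dir=$(mktemp -d)

cat > "$cfg_dir/software_dirs.cfg" <<'EOF'
SOFTWARE=/opt/software
WBENCH_DIR=/opt/software/srna-workbench
JAVA_DIR=/usr/bin
RUN=0
EOF

cat > "$cfg_dir/workdirs.cfg" <<'EOF'
workdir=${HOME}/miRPursuit_Projects/miRtest/
MEMORY="4g"
THREADS=2
EOF

# Assumption: the config files are loaded by sourcing them.
. "$cfg_dir/software_dirs.cfg"
. "$cfg_dir/workdirs.cfg"

# Hypothetical helper: return 0 if the named variable is non-empty.
require_set() {
    eval "value=\${$1:-}"
    [ -n "$value" ]
}

# Hypothetical helper: return 0 if the argument ends with "/".
has_trailing_slash() {
    case "$1" in
        */) return 0 ;;
        *)  return 1 ;;
    esac
}

status=ok
for name in SOFTWARE WBENCH_DIR JAVA_DIR workdir THREADS; do
    if ! require_set "$name"; then
        echo "Missing value for $name" >&2
        status=failed
    fi
done

if ! has_trailing_slash "$workdir"; then
    echo "workdir must end with a trailing /" >&2
    status=failed
fi

echo "Config check: $status"
rm -rf "$cfg_dir"
```

The same loop can be extended with any other variable a given run depends on, for example GENOME or MIRBASE.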
Module specific
There is a config file for each module in the miRPursuit/config directory. The default values are listed below; for further reference, please consult the website of the respective tool.
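Each module config is a flat list of key=value lines, so a single parameter can be inspected or changed with standard tools. The following is a minimal sketch working on a temporary copy; the helper names get_param and set_param and the new value 10 are hypothetical illustrations, not part of miRPursuit.

```shell
#!/bin/sh
# Sketch only: read and change one parameter in a key=value module config.
# Works on a temporary copy; the values mirror the documented filter defaults.

cfg=$(mktemp)
cat > "$cfg" <<'EOF'
#Broad range default values
min_length=18
max_length=26
min_abundance=5
EOF

# Hypothetical helper: print the value of key $2 in file $1.
get_param() {
    sed -n "s/^$2=//p" "$1"
}

# Hypothetical helper: set key $2 to value $3 in file $1.
# Note: values containing "/" would need a different sed delimiter.
set_param() {
    sed "s/^$2=.*/$2=$3/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

before=$(get_param "$cfg" min_abundance)
set_param "$cfg" min_abundance 10
after=$(get_param "$cfg" min_abundance)

echo "min_abundance: $before -> $after"
rm -f "$cfg"
```

Editing a copy rather than the file shipped in miRPursuit/config keeps the documented defaults available for comparison.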
wbench_filter.cfg - Filters your sRNA sequences by length, abundance, and t/rRNA matches:
    #Broad range default values
    min_length=18
    max_length=26
    min_abundance=5
    max_abundance=2147483647
    norm_abundance=false
    filter_low_comp=true
    filter_invalid=true
    trrna=true
    trrna_sense_only=false
    filter_genome_hits=false
    filter_norm_abund=false
    filter_kill_list=false
    add_discard_log=false
    genome=null
    kill_list=null
    discard_log=null

wbench_mircat.cfg - miRCat predicts novel miRNAs through alignment with the genome to find putative precursors:
    #Default values (Broad)
    extend=100.0
    min_energy=-25.0
    min_paired=17
    max_gaps=3
    max_genome_hits=16
    min_length=18
    max_length=26
    min_gc=20
    max_unpaired=60
    max_overlap_percentage=80
    min_locus_size=1
    orientation=80
    min_hairpin_len=60
    complex_loops=true
    pval=0.05
    min_abundance=1
    cluster_sentinel=200
    Thread_Count=12

    #Default (plants)
    extend=100.0
    min_energy=-25.0
    min_paired=17
    max_gaps=3
    max_genome_hits=16
    min_length=20
    max_length=22
    min_gc=20
    max_unpaired=50
    max_overlap_percentage=80
    min_locus_size=1
    orientation=80
    min_hairpin_len=60
    complex_loops=true
    pval=0.05
    min_abundance=1
    cluster_sentinel=200
    Thread_Count=20

wbench_mirprof.cfg - miRProf identifies conserved miRNAs through alignment to the miRBase database of miRNAs:
    #Default values
    mismatches=0
    overhangs=true
    group_mismatches=true
    group_organisms=true
    group_variant=true
    group_mature_and_star=false
    only_keep_best=true
    min_length=18
    max_length=26
    min_abundance=5

wbench_tasi.cfg - ta-si predictor, identifies phased 21nt sRNAs characteristic of ta-siRNA loci:
    #Default values
    p_val_threshold=1.0E-4
    min_abundance=2

paresnip.cfg - PAREsnip validates targets of regulation by sRNAs; it requires degradome and transcriptome sequences:
    #Default values
    min_sRNA_abundance=5
    subsequences_are_secondary_hits=false
    output_secondary_hits_to_file=false
    use_weighted_fragments_abundance=true
    category_0=true
    category_1=true
    category_2=true
    category_3=true
    category_4=false
    discard_tr_rna=true
    discard_low_complexity_srnas=false
    discard_low_complexity_candidates=false
    min_fragment_length=20
    max_fragment_length=21
    min_sRNA_length=19
    max_sRNA_length=24
    allow_single_nt_gap=false
    allow_mismatch_position_11=false
    allow_adjacent_mismatches=false
    max_mismatches=4.0
    calculate_pvalues=true
    number_of_shuffles=100
    pvalue_cutoff=0.05
    do_not_include_if_greater_than_cutoff=true
    number_of_threads=23
    auto_output_tplot_pdf=false

patman_genome.cfg - Patman, a pattern matcher for short sequences:
    #Default values
    #Set maximum edit distance to N (Default: 0)
    EDITS=0
    #Set maximum number of gaps to N (default: 0)
    GAPS=0
    #Do not match reverse-complements (default: FALSE)
    SINGLESTRAND=FALSE
    #Prefetch N nodes (default: 3) Related with performance
    PREFETCH=3
    #################
    #Not implemented#
    #################
    #Interpret ambiguity codes in patterns (Flag for using ambicodes)
    #ambicodes=FALSE
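To make the meaning of these variables concrete, the sketch below shows one way they could map onto PatMaN command-line options. How miRPursuit actually builds its PatMaN call is an assumption here; the short options (-e, -g, -s, -p, -D, -P, -o) are PatMaN's options as far as I am aware, so check them against the manual of your installed version. The file names are hypothetical placeholders, and the command is only printed, not run.

```shell
#!/bin/sh
# Sketch only: turn patman_genome.cfg values into PatMaN command-line options.
# How miRPursuit builds this command internally is an assumption; the input
# and output file names are hypothetical placeholders.

EDITS=0
GAPS=0
SINGLESTRAND=FALSE
PREFETCH=3

genome="genome.fa"   # hypothetical database (genome) file
reads="reads.fa"     # hypothetical pattern (sRNA) file
output="hits.txt"    # hypothetical output file

opts="-e $EDITS -g $GAPS -p $PREFETCH"

# -s asks PatMaN not to match reverse-complements.
if [ "$SINGLESTRAND" = "TRUE" ]; then
    opts="$opts -s"
fi

patman_cmd="patman $opts -D $genome -P $reads -o $output"

# Print the command instead of running it, so the sketch needs no PatMaN install.
echo "$patman_cmd"
```

With the default values this prints a command without -s, so both strands are searched.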
System parameters
These are generally hardcoded; don’t change them unless you know what you are doing.
- term-colors.cfg - Colors for terminal and other useful vars.