TelomereHunter is a software for the detailed characterization of telomere maintenance mechanism footprints in the genome. The tool is implemented for the analysis of large cancer genome cohorts and provides a variety of diagnostic diagrams as well as machine-readable output for subsequent analysis.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=16g --gres=lscratch:30 -c 8
[user@cn3200 ~]$module load telomerehunter
[user@cn3200 ~]$telomerehunter -h
TelomereHunter Copyright (C) 2015 Lina Sieverling, Philip Ginsbach, Lars Feuerbach
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions. For details see the GNU General Public License
in the license copy received with TelomereHunter or .
TelomereHunter 1.1.0
usage: telomerehunter [-h] [-ibt TUMOR_BAM] [-ibc CONTROL_BAM] -o OUTPUT_DIR
-p PID [-b BANDING_FILE] [-rt REPEAT_THRESHOLD_SET]
[-rl] [-mqt MAPQ_THRESHOLD] [-d]
[-r REPEATS [REPEATS ...]] [-con] [-gc1 LOWERGC]
[-gc2 UPPERGC] [-nf]
[-rc TVRS_FOR_CONTEXT [TVRS_FOR_CONTEXT ...]]
[-bp BP_CONTEXT] [-pl] [-pff {pdf,png,svg,all}] [-p1]
[-p2] [-p3] [-p4] [-p5] [-p6] [-p7] [-p8] [-prc]
Estimation of telomere content from WGS data of a tumor and/or a control
sample.
optional arguments:
-h, --help show this help message and exit
-ibt TUMOR_BAM, --inputBamTumor TUMOR_BAM
Path to the indexed input BAM file of the tumor
sample.
-ibc CONTROL_BAM, --inputBamControl CONTROL_BAM
Path to the indexed input BAM file of the control
sample.
-o OUTPUT_DIR, --outPath OUTPUT_DIR
Path to the output directory into which all results
are written.
-p PID, --pid PID Sample name used in output files and diagrams
(required).
-b BANDING_FILE, --bandingFile BANDING_FILE
Path to a tab-separated file with information on
chromosome banding. The first four columns of the
table have to contain the chromosome name, the start
and end position and the band name. The table should
not have a header. If no banding file is specified,
the banding information of hg19 will be used.
-rt REPEAT_THRESHOLD_SET, --repeatThreshold REPEAT_THRESHOLD_SET
The number of repeats needed for a read to be
classified as telomeric. If no repeat threshold is
defined, TelomereHunter will calculate the
repeat_threshold depending on the read length with the
following formula: repeat_threshold =
floor(read_length * 6/100)
-rl, --perReadLength Repeat threshold is set per 100 bp read length. The
used repeat threshold will be: floor(read_length *
repeat_threshold/100) E.g. Setting -rt 8 -rl means
that 8 telomere repeats are required per 100 bp read
length. If the read length is 50 bp, the threshold is
set to 4.
-mqt MAPQ_THRESHOLD, --mappingQualityThreshold MAPQ_THRESHOLD
The mapping quality needed for a read to be considered
as mapped (default = 8).
-d, --removeDuplicates
Reads marked as duplicates in the input bam file(s)
are removed in the filtering step.
-r REPEATS [REPEATS ...], --repeats REPEATS [REPEATS ...]
List of telomere repeat types to search for. Reverse
complements are automatically generated and do not
need to be specified! By default, TelomereHunter
searches for t-, g-, c- and j-type repeats (TTAGGG
TGAGGG TCAGGG TTGGGG).
-con, --consecutive Search for consecutive repeats.
-gc1 LOWERGC, --lowerGC LOWERGC
Lower limit used for GC correction of telomere
content. The value must be an integer between 0 and
100 (default = 48).
-gc2 UPPERGC, --upperGC UPPERGC
Upper limit used for GC correction of telomere
content. The value must be an integer between 0 and
100 (default = 52).
-nf, --noFiltering If the filtering step of TelomereHunter has already
been run previously, skip this step.
-rc TVRS_FOR_CONTEXT [TVRS_FOR_CONTEXT ...], --repeatsContext TVRS_FOR_CONTEXT [TVRS_FOR_CONTEXT ...]
List of telomere variant repeats for which to analyze
the sequence context. Reverse complements are
automatically generated and do not need to be
specified! Counts for these telomere variant repeats
(arbitrary and singleton context) will be added to the
summary table. Default repeats: TCAGGG TGAGGG TTGGGG
TTCGGG TTTGGG ATAGGG CATGGG CTAGGG GTAGGG TAAGGG).
-bp BP_CONTEXT, --bpContext BP_CONTEXT
Number of base pairs on either side of the telomere
variant repeat to investigate. Please use a number
that is divisible by 6.
-pl, --parallel The filtering, sorting and estimating steps of the
tumor and control sample are run in parallel. This
will speed up the computation time of TelomereHunter.
-pff {pdf,png,svg,all}, --plotFileFormat {pdf,png,svg,all}
File format of output diagrams. Choose from pdf
(default), png, svg or all (pdf, png and svg).
-p1, --plotChr Make diagrams with telomeric reads mapping to each
chromosome. If none of the options p1/p2/p3/p4/p5/p6
are chosen, all diagrams will be created.
-p2, --plotFractions Make a diagram with telomeric reads in each fraction
(intrachromosomal, subtelomeric, junction spanning,
intratelomeric). If none of the options
p1/p2/p3/p4/p5/p6 are chosen, all diagrams will be
created.
-p3, --plotTelContent
Make a diagram with the gc corrected telomere content
in the analyzed samples. If none of the options
p1/p2/p3/p4/p5/p6 are chosen, all diagrams will be
created.
-p4, --plotGC Make a diagram with GC content distributions in all
reads and in intratelomeric reads. If none of the
options p1/p2/p3/p4/p5/p6 are chosen, all diagrams
will be created.
-p5, --plotRepeatFreq
Make histograms of the repeat frequencies per
intratelomeric read. If none of the options
p1/p2/p3/p4/p5/p6 are chosen, all diagrams will be
created.
-p6, --plotTVR Make plots for telomere variant repeats.
-p7, --plotSingleton Make plots for singleton telomere variant repeats.
-p8, --plotNone Do not make any diagrams. If none of the options
p1/p2/p3/p4/p5/p6/p7/p8 are chosen, all diagrams will
be created.
-prc, --plotRevCompl Distinguish between forward and reverse complement
telomere repeats in diagrams.
Contact Lina Sieverling (l.sieverling@dkfz-heidelberg.de) for questions and
support.
[user@cn3200 ~]$ mkdir output_dir
[user@cn3200 ~]$ telomerehunter -o output_dir -p test_sample -ibt $TH_DATA/test_sample_control.bam -ibc $TH_DATA/test_sample_tumor.bam -p test_sample -b $TH_DATA/hg38_cytoband_telomerehunter.txt -pl 2> test_sample.err.txt
TelomereHunter Copyright (C) 2015 Lina Sieverling, Philip Ginsbach, Lars Feuerbach
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions. For details see the GNU General Public License
in the license copy received with TelomereHunter or .
TelomereHunter 1.1.0
Repeat threshold was not set by user. Setting it to 6 reads per 100 bp read length.
Calculating repeat threshold for tumor sample: Read length is 151. Repeat threshold is set to 9.
Calculating repeat threshold for control sample: Read length is 151. Repeat threshold is set to 9.
------ Tumor Sample: started filtering telomere reads ------
------ Control Sample: started filtering telomere reads ------
...
End the interactive session:
[user@cn3200 ~]$ exit salloc.exe: Relinquishing job allocation 46116226 [user@biowulf ~]$