Tractor: Local Ancestry Aware GWAS

Tractor is a statistical framework and software package that facilitates the inclusion of admixed individuals in association studies by leveraging local ancestry. It generates accurate ancestry-specific effect-size estimates and P values, can boost genome-wide association study (GWAS) power, and improves the resolution of association signals.
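
To make "ancestry-specific effect sizes" concrete, the following is a sketch of the two-ancestry regression model as it is described in the Tractor publication. The symbol names are illustrative and are not taken from this page; consult the paper for the authoritative form.

```latex
\[
  g\bigl(\mathbb{E}[Y]\bigr)
    = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \sum_{k \ge 4} \beta_k X_k
\]
% X_1 : number of haplotypes (0, 1 or 2) of the index ancestry at the locus
% X_2 : copies of the risk allele carried on haplotypes of the first ancestry
% X_3 : copies of the risk allele carried on haplotypes of the second ancestry
% X_k : other covariates (k >= 4), e.g. global ancestry proportion
% g   : identity for a quantitative phenotype, logit for a case/control phenotype
```

Under this reading, \beta_2 and \beta_3 are the ancestry-specific effect sizes, and \beta_1 absorbs the effect of local ancestry itself.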

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf ~]$ sinteractive
[user@cn0799 ~]$ module load Tractor
[+] Loading java 17.0.3.1  ...
[+] Loading singularity  4.2.2  on cn0799
[+] Loading nextflow  24.10.3
[+] Loading Tractor 1.4.0  ...
[user@cn0799 ~]$ ls $TRACTOR_BIN
flare      phase_common_static  rfmix     simulate_static  xcftools_static
gnomix.py  phase_rare_static    simulate  switch_static

[user@cn0799 ~]$ cp -r $TRACTOR_DATA . 
[user@cn0799 ~]$ tree test_data
test_data
├── admixed_cohort
│   ├── ASW.deconvoluted.fb.tsv
│   ├── ASW.deconvoluted.msp.tsv
│   ├── ASW.deconvoluted.rfmix.Q
│   ├── ASW.deconvoluted.sis.tsv
│   ├── ASW.phased.vcf.gz
│   └── ASW.phased.vcf.gz.csi
├── phenotype
│   ├── Phe_linear_covars.txt
│   ├── Phe_linear.txt
│   └── Phe_logistic.txt
└── references
    ├── chr22.b37.gmap.gz
    ├── chr22.genetic_map.modified.txt
    ├── TGP_HGDP_QC_hg19_chr22.vcf.gz
    ├── TGP_HGDP_QC_hg19_chr22.vcf.gz.csi
    └── YRI_GBR_samplemap.txt
3 directories, 14 files
[user@cn0799 ~]$ phase_common_static    \
                        --input test_data/admixed_cohort/ASW.phased.vcf.gz  \
                        --reference test_data/references/TGP_HGDP_QC_hg19_chr22.vcf.gz \
                        --region 22   \
                        --map test_data/references/chr22.b37.gmap.gz    \
                        --output test_data/admixed_cohort/ASW.phased.bcf 

[SHAPEIT5] phase_common (jointly phase multiple common markers)
  * Author        : Olivier DELANEAU, University of Lausanne
  * Contact       : olivier.delaneau@gmail.com
  * Version       : 5.1.1 / commit = 990ed0d / release = 2023-05-08
  * Run date      : 29/05/2025 - 11:19:37

Files:
  * Input         : [test_data/admixed_cohort/ASW.phased.vcf.gz]
  * Reference     : [test_data/references/TGP_HGDP_QC_hg19_chr22.vcf.gz]
  * Genetic Map   : [test_data/references/chr22.b37.gmap.gz]
  * Output        : [test_data/admixed_cohort/ASW.phased.bcf]
  * Output format : [bcf]

Parameters:
  * Seed    : 15052011
  * Threads : 8 threads
  * MCMC    : 15 iterations [5b + 1p + 1b + 1p + 1b + 1p + 5m]
  * PBWT    : [window = 4cM / depth = auto / modulo = auto / mac = 5 / missing = 0.1]
  * HMM     : [window = 4cM / Ne = 15000 / Recombination rates given by genetic map]

Reading genotype data:
[W::hts_idx_load3] The index file is older than the data file: test_data/references/TGP_HGDP_QC_hg19_chr22.vcf.gz.csi
  * VCF/BCF scanning done (9.04s)
      + Variants [#sites=182525 / region=22]
         - 11249 sites removed in reference panel [not in main panel]
      + Samples [#target=61 / #reference=1572]
[W::hts_idx_load3] The index file is older than the data file: test_data/references/TGP_HGDP_QC_hg19_chr22.vcf.gz.csi
  * VCF/BCF parsing done (10.94s)
      + Genotypes [0/0=76.122%, 0/1=16.366%, 1/1=7.512%, ./.=0.000%, 0|1=0.000%]
      + Reference haplotypes [0=85.090%, 1=14.910%]

Setting up genetic map:
  * GMAP parsing [n=45329] (0.02s)
  * cM interpolation [s=37946 / i=144579] (0.02s)
  * Region length [34861020 bp / 72.8 cM]
  * HMM parameters [Ne=15000 / Error=0.0001 / #rare=8659]

Initializing data structures:
  * Impute monomorphic [n=330498] (0.01s)
  * HAP update (0.04s)
  * H2V transpose (0.80s)
  * PBWT parameters auto setting : [modulo = 0.045 / depth = 6]
  * PBWT initialization [#eval=173447 / #select=1550 / #chunk=11] (0.01s)
  * PBWT phasing sweep (1.72s)
  * Build genotype graphs [seg=607426] (0.11s)

Burn-in iteration [1/5]
  * PBWT selection (0.90s)
  * HMM computations [K=716.5+/-235.1 / W=3.54Mb / US=0 / UP=0] (36.17s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.06s)

Burn-in iteration [2/5]
  * PBWT selection (0.90s)
  * HMM computations [K=635.5+/-222.1 / W=3.52Mb / US=0 / UP=0] (32.56s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.04s)

Burn-in iteration [3/5]
  * PBWT selection (0.90s)
  * HMM computations [K=629.1+/-207.3 / W=3.47Mb / US=0 / UP=0] (31.67s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.05s)

Burn-in iteration [4/5]
  * PBWT selection (0.90s)
  * HMM computations [K=645.1+/-227.3 / W=3.57Mb / US=0 / UP=0] (33.06s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.06s)

Burn-in iteration [5/5]
  * PBWT selection (0.90s)
  * HMM computations [K=631.5+/-212.7 / W=3.50Mb / US=0 / UP=0] (31.98s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.06s)

Pruning iteration [1/1]
  * PBWT selection (0.91s)
  * HMM computations [K=648.8+/-219.8 / W=3.61Mb / US=0 / UP=0] (34.58s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.04s)
  * Trimming [pc=43.57%]

Burn-in iteration [1/1]
  * PBWT selection (0.90s)
  * HMM computations [K=632.6+/-210.8 / W=3.50Mb / US=0 / UP=0] (29.10s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.05s)

Pruning iteration [1/1]
  * PBWT selection (0.90s)
  * HMM computations [K=642.6+/-213.5 / W=3.57Mb / US=0 / UP=0] (30.67s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.05s)
  * Trimming [pc=68.12%]

Burn-in iteration [1/1]
  * PBWT selection (0.87s)
  * HMM computations [K=635.6+/-225.6 / W=3.55Mb / US=0 / UP=0] (28.82s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.04s)

Pruning iteration [1/1]
  * PBWT selection (0.90s)
  * HMM computations [K=650.4+/-232.0 / W=3.66Mb / US=0 / UP=0] (29.86s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.05s)
  * Trimming [pc=80.64%]

Main iteration [1/5]
  * PBWT selection (0.90s)
  * HMM computations [K=619.3+/-210.4 / W=3.44Mb / US=0 / UP=0] (27.01s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.04s)

Main iteration [2/5]
  * PBWT selection (0.90s)
  * HMM computations [K=623.0+/-213.4 / W=3.48Mb / US=0 / UP=0] (27.55s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.05s)

Main iteration [3/5]
  * PBWT selection (0.90s)
  * HMM computations [K=633.2+/-209.4 / W=3.51Mb / US=0 / UP=0] (27.49s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.05s)

Main iteration [4/5]
  * PBWT selection (0.90s)
  * HMM computations [K=638.1+/-219.3 / W=3.59Mb / US=0 / UP=0] (28.06s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.06s)

Main iteration [5/5]
  * PBWT selection (0.90s)
  * HMM computations [K=627.7+/-213.9 / W=3.49Mb / US=0 / UP=0] (27.20s)
  * IBD2 tracks [#inds=0 / #tracks=0 / #merged = 0]
  * HAP update (0.03s)
  * H2V transpose (0.03s)

Finalization:
  * HAP solving (0.06s)
  * HAP update (0.03s)
  * H2V transpose (0.03s)
  * VCF/BCF writing [N=61 / L=182525] (0.52s)
  * Indexing files
  * Total running time = 493 seconds


[user@cn0799 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
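
The phasing step in the session above takes about eight minutes (493 seconds), so it can be worth confirming that the inputs exist before launching it. The following is a minimal sketch of such a check. The function name `phase_asw` and the `DRY_RUN` variable are hypothetical helpers introduced here, not part of Tractor or SHAPEIT5. The file paths and options are copied from the command shown in the session.

```shell
# Hypothetical helper: verify the test_data inputs, then print (default) or
# run the same phase_common_static command used in the sample session.
phase_asw() {
    data="${1:-test_data}"    # directory copied from $TRACTOR_DATA
    input="$data/admixed_cohort/ASW.phased.vcf.gz"
    ref="$data/references/TGP_HGDP_QC_hg19_chr22.vcf.gz"
    map="$data/references/chr22.b37.gmap.gz"

    # Fail fast if any input file is missing.
    for f in "$input" "$ref" "$map"; do
        if [ ! -e "$f" ]; then
            echo "missing input: $f" >&2
            return 1
        fi
    done

    # Rebuild the positional parameters as the command line to execute.
    set -- phase_common_static \
        --input "$input" \
        --reference "$ref" \
        --region 22 \
        --map "$map" \
        --output "$data/admixed_cohort/ASW.phased.bcf"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"     # dry run: only show the command
    else
        "$@"          # real run: requires `module load Tractor` first
    fi
}
```

With `DRY_RUN` left at its default of 1, `phase_asw test_data` only prints the command. Running `DRY_RUN=0 phase_asw test_data` inside an interactive session, after `module load Tractor`, would execute it.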