Inferring resolved species tree using fastoma

Consider a scenario that NCBI tree was used with FastOMA. Since such tree is not resolved (binary), we can use gene markers (Orthologous Groups) found with FastOMA to infer a high quality species tree.

This is based on the F1000 paper here and mostly codes from here. Updated codes are available in here.

step1: run fastoma

After running FasTOMA, you should see a folder OrthologousGroupsFasta in the output folder.

step2: multiple sequence alignment

Usually there many small OrthologousGroups (OGs), using the code filter_groups_fastoma_v2.py from here we can filter and select big ones (with more species).

You need to change min_species=50 in the following bash script, the goal is to have >100 or >1000 gene markers in the output folder OrthologousGroupFasta_gt*. One suggestion is to use 0.8*proteomes.

cp -r ../out/OrthologousGroupsFasta .
cd OrthologousGroupsFasta
gunzip -k * 
cd ..

min_species=64 
python filter_groups_fastoma_v2.py --min-nr-species ${min_species} --input OrthologousGroupsFasta --output OrthologousGroupFasta_gt${min_species}


cd OrthologousGroupFasta_gt${min_species}
mkdir ../aligned/

thread=20

for i in $(ls -1 *.fa); do 
mafft --maxiterate 1000 --localpair --thread ${thread} ${i}  > ../aligned/${i}.aln
done

Note that in code here I changed the function get_species_from_header to get fastoma species names in ||speciesNames||

concatenate MSA per gene marker (supermatrix) and trim it

cd ..
python concat_alignments_v2.py  aligned/*aln --format-output fasta > concat.fasta
trimal -in  concat.fasta -out concat_gappyout.fasta -gappyout

run iqtree

thread=48
iqtree  -T ${thread}  -s concat_gappyout.fasta -bb 1000 #  -m LG+I+G

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inferring resolved species tree using fastoma

step1: run fastoma

step2: multiple sequence alignment

concatenate MSA per gene marker (supermatrix) and trim it

run iqtree

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally