Protein alignments, ART, and buggy Nexus Files

Here are some notes about a strange problem with ART and Nexus files.

Suppose you attempt to run ART and see an error message like this:

ART is veriyfing that your tree file corresponds with your sequence file. .

-> ART says: some of the taxa in your treefile do not exist in your sequence file.

Missing taxa:
SpADH2 KlADH2 SkADH4 ScADH3 KmADH2 KLADH4 KwADH3 SkADH3 SpADH5 SbADH2 SbADH5 SpADH1 SkADH2 PsADH2 KlADH3 KwADH4 SbADH3 PsADH1 KmADH1 ScADH2 SpADH3 ScADH5 KlADH1 SkADH1 KwADH1 SbADh1 ScADH1

Error!
ART was unable to import your data.  See previous errors.

Notice that ART reports ALL of the taxa in your tree file as missing from your sequence file. You might feel like you're going crazy, because your alignment looks fine and the taxon names match.

SOLUTION: In your Nexus alignment, find a line which looks like this:

FORMAT DATATYPE=PROTEIN  SYMBOLS = " 1 2 3 4"  MISSING=? GAP=- ;

Remove the “SYMBOLS” parameter.  Apparently, this parameter trips up the BioNexus parser.  The line should then look like this:

FORMAT DATATYPE=PROTEIN MISSING=? GAP=- ;

. . . and hopefully that should solve the problem!
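
If you have many alignments to fix, the edit can be scripted. Here is a minimal sketch in Python, not part of ART itself. The function name strip_symbols is made up for illustration, and the sketch assumes the FORMAT command sits on a single line with the symbol list wrapped in straight double quotes:

```python
import re

# Matches SYMBOLS = "..." in any letter case, with optional spaces around '='.
# Assumption: the symbol list is enclosed in straight double quotes.
SYMBOLS_RE = re.compile(r'\s*SYMBOLS\s*=\s*"[^"]*"', re.IGNORECASE)


def strip_symbols(nexus_text):
    """Return the Nexus text with SYMBOLS removed from any FORMAT line.

    Hypothetical helper: it only looks at lines that begin with FORMAT,
    so a FORMAT command split across several lines is not handled.
    """
    cleaned = []
    for line in nexus_text.splitlines(keepends=True):
        if line.lstrip().upper().startswith("FORMAT"):
            line = SYMBOLS_RE.sub("", line)
        cleaned.append(line)
    return "".join(cleaned)
```

To use it, read your alignment file into a string, pass it to strip_symbols, and write the result to a new file, so the original alignment stays untouched in case something goes wrong.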
