NCL v2.1 Documentation

What is the NCL?

The NEXUS Class Library (NCL) is an integrated collection of C++ classes for parsing several file formats used in evolutionary biology (NEXUS, PHYLIP, relaxed PHYLIP, and FASTA). NCL does not detect a file's format for you, but a parser configured to read several formats lets you parse files in any of them and extract the data from NCL's data structures through the same API, regardless of the file format.

This documentation is written for C++ programmers.

Version 2.0 of NCL itself was published as:

Lewis, P. O. 2003. NCL: a C++ class library for interpreting data files in NEXUS format. Bioinformatics 19(17): 2330-2331.

See http://hydrodictyon.eeb.uconn.edu/ncl for documentation on version 2.0.

The NEXUS data file format was specified in the publication cited below. Please read that paper for further information about the format specification itself; the NCL documentation does not attempt to explain the structure of a NEXUS data file.

Maddison, D. R., D. L. Swofford, and W. P. Maddison. 1997. NEXUS: an extensible file format for systematic information. Systematic Biology 46(4): 590-621.

Versions of the library.

Backward compatibility

Despite several fundamental changes in the implementation of the library, we strive to keep NCL v2.1 backward compatible with version 2.0. A program that relied on version 2.0 should still work. If you discover that your client code works with version 2.0 of NCL but not with 2.1, please let us know.

Why is version 2.1 so different from version 2.0?

Version 2.1 extends the functionality significantly by allowing NCL to parse files that use extended forms of NEXUS. Both Mesquite and MrBayes rely on extensions to NEXUS. Particularly difficult to handle is Mesquite's support for multiple blocks of the same type within a file (accompanied by linking of blocks by title).

Version 2.0 of NCL followed a model of creating a NxsReader object and adding NxsBlock objects, each of which handles parsing of a particular type of NEXUS content. Client code would typically inherit from base classes such as NxsCharactersBlock, or would extract the information when a block was completely read. The NxsBlock instance would be reset (by a call to NxsBlock::Reset) before it was asked to handle another block.

Unfortunately, not all NEXUS blocks are autonomous (for example, commands in an ASSUMPTIONS block may rely on information in a CHARACTERS block). Combining inter-block dependencies with the need to store information from multiple blocks of the same type means that NCL's version 2.0 API can be quite cumbersome. To read a file with a single instance of a NxsCharactersBlock, the client code must carefully offload all of the information in a block before allowing parsing to proceed to the next block. Subsequent references between blocks then have to be corrected so that blocks refer to the new location of the information (rather than to the NxsBlock instance that originally held the data).
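
The offloading burden described above can be illustrated with a minimal, self-contained sketch. ToyCharactersBlock, StoredMatrix, and readAllWithSingleBlock are hypothetical names invented for this illustration; they are not part of NCL, and the toy Read call stands in for real parsing. The sketch only shows that a single reusable block has to be emptied into client-owned storage before it is reset for the next block.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Client-owned copy of one block's contents (hypothetical type).
struct StoredMatrix {
    std::string title;
    std::vector<std::string> rows;
};

// Toy stand-in for a reusable block reader; NOT NCL's NxsCharactersBlock.
class ToyCharactersBlock {
public:
    // Discards everything held from the previously read block.
    void Reset() {
        title_.clear();
        rows_.clear();
    }
    // Pretends to parse one block; real parsing is omitted.
    void Read(const StoredMatrix &content) {
        title_ = content.title;
        rows_ = content.rows;
    }
    const std::string &Title() const { return title_; }
    const std::vector<std::string> &Rows() const { return rows_; }

private:
    std::string title_;
    std::vector<std::string> rows_;
};

// Reads every block with ONE reader instance, copying the data out after
// each block because the next Reset() would otherwise destroy it.
std::vector<StoredMatrix> readAllWithSingleBlock(
        const std::vector<StoredMatrix> &fileBlocks) {
    ToyCharactersBlock block;
    std::vector<StoredMatrix> stored;
    for (const StoredMatrix &content : fileBlocks) {
        block.Reset();
        block.Read(content);
        stored.push_back(StoredMatrix{block.Title(), block.Rows()});
    }
    // Postcondition: one client-owned copy per block in the file.
    assert(stored.size() == fileBlocks.size());
    return stored;
}
```

Anything that referred to the reader instance itself (rather than to a StoredMatrix copy) would be left pointing at the contents of whichever block was read last, which is the cross-reference problem the paragraph above describes.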

A more natural design pattern for processing files which may have multiple blocks of the same type is to use a factory method. Users of NCL v2.1 can register NxsBlockFactory instances with a NxsReader. Then the reader can parse an entire file before having to pull the parsed information from the blocks.
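
The factory idea can be sketched in a few lines of standalone C++. This is not NCL's API: ToyBlock, ToyBlockFactory, ToyReader, and their member functions are hypothetical names made up for the illustration, and a list of (type, title) pairs stands in for the blocks found in a file. The sketch assumes only what the paragraph above states, namely that factories are registered with a reader and that the reader can finish the whole file before the client pulls any data out.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy parsed block; a real block would hold parsed data (hypothetical type).
struct ToyBlock {
    std::string typeId;
    std::string title;
};

// Toy factory: creates a fresh block object for every block it is asked
// to handle. NOT NCL's NxsBlockFactory.
class ToyBlockFactory {
public:
    explicit ToyBlockFactory(std::string typeId) : typeId_(std::move(typeId)) {
        assert(!typeId_.empty());
    }
    const std::string &TypeId() const { return typeId_; }
    std::unique_ptr<ToyBlock> Create(const std::string &title) const {
        std::unique_ptr<ToyBlock> block(new ToyBlock);
        block->typeId = typeId_;
        block->title = title;
        return block;
    }

private:
    std::string typeId_;
};

// Toy reader: owns every block it creates, so nothing is overwritten
// when a second block of the same type appears.
class ToyReader {
public:
    // Registers a factory; a second factory for the same type is ignored.
    void AddFactory(const ToyBlockFactory &factory) {
        factories_.emplace(factory.TypeId(), factory);
    }
    // Each (typeId, title) pair stands in for one block in a file.
    void ReadAll(
            const std::vector<std::pair<std::string, std::string>> &fileBlocks) {
        for (const auto &entry : fileBlocks) {
            auto it = factories_.find(entry.first);
            if (it == factories_.end()) {
                continue;  // no factory registered: skip this block
            }
            blocks_.push_back(it->second.Create(entry.second));
        }
    }
    // Returns all stored blocks of one type, in file order.
    std::vector<const ToyBlock *> BlocksOfType(const std::string &typeId) const {
        std::vector<const ToyBlock *> found;
        for (const auto &block : blocks_) {
            if (block->typeId == typeId) {
                found.push_back(block.get());
            }
        }
        return found;
    }

private:
    std::map<std::string, ToyBlockFactory> factories_;
    std::vector<std::unique_ptr<ToyBlock>> blocks_;
};
```

Because each block gets its own object, the client can call ReadAll once and query the blocks afterwards, and pointers between blocks stay valid for the lifetime of the reader.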

Version 2.2

The svn branch for version 2.2 is very similar to 2.1. It contains only the changes necessary to make NCL callable from other languages via SWIG. Some of those changes were not backward compatible, so they were placed on the 2.2 branch.

C++ programmers should probably use 2.1 (although their code will almost certainly work on v2.2 as well).

Block Type ID	Commands	NCL Block Reader type
ASSUMPTIONS	CharPartition, CharSet, CodeSet, CodonPosSet, Options, TaxSet, TaxPartition, TreeSet, TreePartition, TypeSet, UserType, WtSet	NxsAssumptionsBlock
CHARACTERS	Dimensions, Format, TaxLabels, CharStateLabels, CharLabels, StateLabels, Matrix	NxsCharactersBlock
CODONS	(same as ASSUMPTIONS)	NxsAssumptionsBlock
DATA	(same as CHARACTERS)	NxsCharactersBlock
DISTANCES	Dimensions, Format, TaxLabels, Matrix	NxsDistancesBlock
SETS	(same as ASSUMPTIONS)	NxsAssumptionsBlock
TAXA	Dimensions, TaxLabels	NxsTaxaBlock
TREES	Translate, Trees	NxsTreesBlock
UNALIGNED	Dimensions, Format, TaxLabels, Matrix	NxsUnalignedBlock
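
The table above can also be read as a lookup from block type ID to reader class. The following sketch encodes exactly those rows as strings; readerTypeForBlock and sameReader are hypothetical helper names that do not exist in NCL, and the lookup assumes the block ID is already upper case, as it is written in the table.

```cpp
#include <map>
#include <string>

// Returns the reader class name listed in the table for a block type ID,
// or an empty string if the ID is not in the table.
// Assumption: blockTypeId is upper case, matching the table's spelling.
std::string readerTypeForBlock(const std::string &blockTypeId) {
    static const std::map<std::string, std::string> table = {
        {"ASSUMPTIONS", "NxsAssumptionsBlock"},
        {"CHARACTERS", "NxsCharactersBlock"},
        {"CODONS", "NxsAssumptionsBlock"},
        {"DATA", "NxsCharactersBlock"},
        {"DISTANCES", "NxsDistancesBlock"},
        {"SETS", "NxsAssumptionsBlock"},
        {"TAXA", "NxsTaxaBlock"},
        {"TREES", "NxsTreesBlock"},
        {"UNALIGNED", "NxsUnalignedBlock"},
    };
    std::map<std::string, std::string>::const_iterator it =
        table.find(blockTypeId);
    return it == table.end() ? std::string() : it->second;
}

// True when both IDs are listed and are handled by the same reader class.
bool sameReader(const std::string &a, const std::string &b) {
    const std::string readerA = readerTypeForBlock(a);
    return !readerA.empty() && readerA == readerTypeForBlock(b);
}
```

For instance, sameReader("DATA", "CHARACTERS") is true because both rows name NxsCharactersBlock, while an ID that is absent from the table yields an empty string from readerTypeForBlock.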

Next, see The Basics of the NCL API for a discussion of how to get started using NCL.

Brief Directory:

Table of contents

What is the NCL?

Versions of the library.

Backward compatibility

Why is version 2.1 so different from version 2.0?

Version 2.2

Obtaining the NCL?

Portability

Cross-platform features

What parts of NEXUS are supported by NCL?