GeLL Driver

GeLL comes with a driver that should be capable of doing many basic analyses. The usage of the driver and the format of its associated files are described here.

Usage

where settings is the name of a settings file and var1 etc. are variables that can be "passed" to the settings file. A $ followed by a number in the settings file will be replaced by the corresponding command line argument. The format of the settings file is described below.
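
The substitution described above can be illustrated with a minimal Python sketch. This is not the driver's own code. The function name substitute_args is hypothetical, and the sketch assumes that numbering is 1-based (so $1 is the first variable after the settings file name) and that a placeholder with no matching argument is left unchanged.

```python
import re


def substitute_args(settings_text, variables):
    """Replace $1, $2, ... in the text with entries of `variables`.

    Assumptions of this sketch: numbering is 1-based, and a placeholder
    with no matching variable is left as it is.
    """
    def replace(match):
        index = int(match.group(1))
        if 1 <= index <= len(variables):
            return variables[index - 1]
        return match.group(0)

    return re.sub(r"\$(\d+)", replace, settings_text)


# The text is an arbitrary placeholder, not a real settings line.
print(substitute_args("in/$1.fa out/$2.txt", ["genes", "result"]))
```

With this approach the same settings file can be reused for several data sets by changing only the command line arguments.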


Settings File

A run of the driver is controlled by the settings file. The settings file has four optional sections. Each section should begin with the section's name in square brackets, e.g. [Control]. The control section contains general control settings, while the likelihood section controls likelihood optimisation. The ancestral and simulation sections control ancestral reconstruction and simulation respectively. Although each section is optional, a section that is present may have settings that must be set. These are shown by the darker background.
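
As a rough illustration of the section layout, the following Python sketch groups the lines of a settings file by their bracketed header. It is not GeLL's parser. The name split_sections is hypothetical, the [Likelihood] header is inferred from the section titles (only [Control] is given explicitly), and the layout of an individual setting line is deliberately not assumed, so lines are kept verbatim.

```python
import re

SECTION_HEADER = re.compile(r"^\[(\w+)\]$")


def split_sections(text):
    """Group settings lines under the [Section] header they follow.

    Lines before the first header and blank lines are ignored
    (an assumption of this sketch).
    """
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        match = SECTION_HEADER.match(stripped)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None and stripped:
            sections[current].append(stripped)
    return sections


# Placeholder bodies; real setting lines would go here.
example = """[Control]
<control settings>
[Likelihood]
<likelihood settings>
"""
print(split_sections(example))
```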

Control section

DebugLevel The amount of debug information that is displayed when an error occurs. Valid values are:
  • 1 -  Default. Just an error message is logged.
  • 2 -  An error message and stack trace is logged.
  • 3 - An error message and stack trace is logged along with the message and trace of any underlying exception.
DebugFile File to log debug information to. If no file is given debug information is printed to screen.
Distributions How stationary and quasi-stationary distributions are calculated. Valid values are:
  • Repeat -  Default. Distributions are calculated by repeated application of a P-matrix to a distribution.
  • Eigen  -  Distributions are calculated using Eigendecompositions.
MatrixExponentation How matrix exponentiations are calculated. Valid values are:
  • Taylor -  Default. Exponentiations are calculated by a Taylor expansion.
  • Eigen  -  Exponentiations are calculated using Eigendecompositions.
ForceSquare The minimum number of repeated squaring steps to use when calculating matrix exponentiations using the Taylor method. Defaults to 0.
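
To show how a Taylor expansion and repeated squaring fit together, here is a minimal pure-Python sketch. It is not GeLL's implementation. It relies on the identity exp(A) = exp(A / 2^s)^(2^s), and the force_square argument plays the role of a minimum value for s. The scaling threshold of 0.5 and the 20 series terms are arbitrary choices made for this sketch.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm_taylor(matrix, force_square=0, terms=20):
    """Approximate exp(matrix) with a truncated Taylor series.

    The matrix is divided by 2**s, the series is summed, and the result
    is squared s times. s is at least `force_square` and is increased
    until the largest absolute row sum of the scaled matrix is <= 0.5.
    """
    n = len(matrix)
    norm = max(sum(abs(x) for x in row) for row in matrix)
    s = force_square
    while norm / (2 ** s) > 0.5:
        s += 1
    scaled = [[x / (2 ** s) for x in row] for row in matrix]

    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        # After this step, term holds scaled**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]

    for _ in range(s):
        result = mat_mul(result, result)
    return result


# Two-state rate matrix; the exact answer is known in closed form.
p = expm_taylor([[-1.0, 1.0], [1.0, -1.0]])
print(p[0][0], 0.5 * (1 + math.exp(-2)))
```

Scaling first keeps the series short and numerically well behaved, which is the usual reason for forcing a minimum number of squaring steps.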

Likelihood section

AlignmentType The type of alignment input. See alignment files below for a description of the file formats. Valid values are:
  • Sequence    -  The "alignment" is in sequence format.
  • Duplication -  The "alignment" is in duplication format.
Alignment Path to the alignment file.
TreeInput Path to the input tree file. This file should contain one line containing a tree in Newick format.
Model Path to the model file. See the Model file description below for format.
ParameterInput Required unless Restart is used. Path to the parameters input file. See the Parameter file description below for format.
Ambig Path to a file describing any ambiguous states in the alignment.
Missing Path to an alignment that gives the unobserved data. In the same format as the alignment.
MissingAmbig Path to a file describing any ambiguous states in the missing alignment.
Optimizer The optimiser to use. Valid values are:
  • GoldenSection -  Golden section search.
  • NelderMead    -  Nelder-Mead optimisation.
Checkpoint File to write checkpoints to. This allows the optimization to be restarted using the Restart setting should it be interrupted.
CheckpointFreq How often (in minutes) the checkpoint file should be written.
Restart Checkpoint file to restart optimisation from.
TreeOutput File to output the estimated tree to. If this option is not given then no output is written.
ParameterOutput File to output the estimated parameters to. If this option is not given then no output is written.
Rescale Whether to rescale the matrix to one event per time unit. Any value beginning with f is false, all other values are true. Defaults to true.
OptimizeTree Whether to optimize the tree branch lengths or use those provided. Any value beginning with f is false, all other values are true. Defaults to true.
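
The true/false rule used by Rescale and OptimizeTree can be sketched in a few lines of Python. The function name parse_flag is hypothetical, and treating an upper-case "F" like a lower-case one and ignoring leading whitespace are assumptions of this sketch rather than documented behaviour.

```python
def parse_flag(value=None):
    """Interpret a Rescale / OptimizeTree style value.

    A missing value gives the default, True. A value beginning with
    "f" is False and anything else is True. Case-insensitivity and
    whitespace stripping are assumptions of this sketch.
    """
    if value is None:
        return True
    return not value.strip().lower().startswith("f")


for text in ["false", "f", "true", "no"]:
    print(text, parse_flag(text))
```

Note that under this rule a value such as "no" counts as true, because only a leading f selects false.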

Ancestral section

AlignmentType Required if no Likelihood section. Same meaning as in Likelihood section.
Alignment Required if no Likelihood section. Same meaning as in Likelihood section.
Tree Required if no Likelihood section. Same meaning as TreeInput in Likelihood section.
Model Required if no Likelihood section. Same meaning as in Likelihood section.
Parameters Required if no Likelihood section. Same meaning as ParameterInput in Likelihood section.
Type The type of reconstruction to do. Valid values are:
  • Joint    -  Default. Joint reconstruction.
  • Marginal -  Marginal reconstruction.
Output File to write the reconstructed alignment to.

Simulate section

AlignmentType Required if no Likelihood section. Same meaning as in Likelihood section.
Tree Required if no Likelihood section. Same meaning as TreeInput in Likelihood section.
Model Required if no Likelihood section. Same meaning as in Likelihood section.
Parameters Required if no Likelihood section. Same meaning as ParameterInput in Likelihood section.
Missing Path to an alignment that gives the unobserved data. In the same format as the alignment.
Length The length of the simulated alignment.
Output File to write the simulated alignment to.

Alignment Files

Alignment files can be in one of two different formats:

Parameters File

Each line represents a single parameter. Fields are tab separated. The first field is the type of the parameter and the second is the name of the parameter. Subsequent fields depend on the parameter type. Type values are:
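
Independently of the individual type values, the general line layout can be read with a short Python sketch like the one below. It is only an illustration. The name parse_parameter_lines is hypothetical, "SomeType" and the numbers are placeholders rather than real GeLL parameter types, and skipping blank lines is an assumption.

```python
def parse_parameter_lines(text):
    """Split each non-empty line into (type, name, remaining fields)."""
    parameters = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skipping blank lines is an assumption
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {line_number}: expected a type and a name")
        parameters.append((fields[0], fields[1], fields[2:]))
    return parameters


# "SomeType" is a placeholder, not a real GeLL parameter type.
example = "SomeType\tkappa\t2.0\nSomeType\tomega\t0.5\t1.0\n"
for entry in parse_parameter_lines(example):
    print(entry)
```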

Model File

The first line controls the type of model. Possible types and the subsequent format of the rest of the file are:

To use different models for different site classes the format of this file is different. In this instance each line of the file represents one class and contains two tab-separated fields. The first field is the class identifier, while the second is the name of a file in the normal model format (above) that defines the model for that class.

Rate Category File

The file format is described below. See Equation Format below for a description of the format of the equations that can be in the rate matrix and root distribution.

Ambiguous File

The file should be tab delimited, with one ambiguous character per line. The first field on each line is the ambiguous character, while each subsequent field is a character that it could represent.
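
A file in this layout could be read into a lookup table along the following lines. This is a sketch, not GeLL's reader. The name parse_ambiguous is hypothetical, and the example content uses the standard IUPAC nucleotide codes R (A or G) and Y (C or T) purely for illustration.

```python
def parse_ambiguous(text):
    """Map each ambiguous character to the characters it may stand for."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skipping blank lines is an assumption
        fields = line.split("\t")
        mapping[fields[0]] = fields[1:]
    return mapping


example = "R\tA\tG\nY\tC\tT\n"
print(parse_ambiguous(example))
```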

Equation Format

Variables are represented by a letter followed by any number of alphanumeric characters. Multiplication (represented by *) should be stated explicitly, e.g. a * b NOT a b or ab (the latter of which would be parsed as a single variable). Functions should be represented by f[a,b,...] where f is the function name and a,b etc. are inputs. Inputs cannot contain other functions but can otherwise contain an expression. The following functions are defined:
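
The variable naming rule can be made concrete with a small tokenizer sketch in Python, which shows why ab is read as one variable while a * b is read as two. It is not GeLL's equation parser. The name tokenize is hypothetical, and the set of recognised symbols beyond *, the square brackets and the comma (that is, +, -, /, parentheses and plain numbers) is an assumption.

```python
import re

TOKEN = re.compile(
    r"\s*(?:(?P<name>[A-Za-z][A-Za-z0-9]*)"   # letter, then alphanumerics
    r"|(?P<number>\d+(?:\.\d+)?)"             # assumed numeric literal
    r"|(?P<symbol>[-+*/()\[\],]))"            # assumed operator set
)


def tokenize(expression):
    """Split an equation string into names, numbers and symbols."""
    tokens = []
    position = 0
    expression = expression.rstrip()
    while position < len(expression):
        match = TOKEN.match(expression, position)
        if not match:
            raise ValueError(f"unexpected character near position {position}")
        tokens.append(match.group(match.lastgroup))
        position = match.end()
    return tokens


print(tokenize("a * b"))
print(tokenize("ab"))
print(tokenize("f[a,b2]"))
```

A tokenizer alone would still split a b into two names, so rejecting implicit multiplication is left to whatever parser consumes the tokens.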
