This is a putative overall requirements list for the Open Tree of Life project, focused on software competence. It is intended to complement, not replace, the Trello board, which is used for short-term planning and as a wish list.


This list is meant as a management and planning tool. Possible uses:
  1. Help set priorities as we plan work, and manage triage as we run out of time. Activities not directed toward meeting requirements should be given lower priority than those that are.
  2. Assist architectural design decisions. Designs need to accommodate items on the requirements list, and needn't deal with things not on the list.

This list will be filled in by JAR as he continues to learn about the project.

Key:

| Defender | Short description | How to test | URI(s) | Reason | Importance (must/should) | Urgency / phase | Notes | Consensus |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SS | Generate a tree of life | Comprehensive, connected, and no cycles? |  | Main point of project | Deal breaker | I |  |  |
| KC | Browse draft tree/graph | At web site, can find tree easily, see all of it? Bare minimum would be NCBI-like | FAQ |  | Deal breaker | I |  |  |
| KC | Download entire draft tree/graph | Working hyperlink(s) to file(s) in some format that anyone can download? | Collaborators | Essential service | Deal breaker | I | EOL acceptance would be nice |  |
| KC | Extract subtree / subgraph via UI | Is subtree induced by given leaf set visible/extractable by non-programmers? (minimal: Newick string or NeXML given list in textarea) |  | Community need | Must | I | phylotastic |  |
| KC | Extract subtree / subgraph via API | Can get/use subtrees using R (or equivalent language)? Do we show up on ROpen's list of data packages? |  | Community need | Must | I? | e.g. ROpen |  |
| KC | Provenance drill-down | For any node or edge, can easily find supporting trees/data/taxonomies/synthesis methods (in UI, API, and dump)? |  | Transparency | Must | I |  |  |
| KC | Synthesis tool | Synthesis tool/service - offline | FAQ, summary | Service | Must | I |  |  |
|  | Community contributions | High latency (manual / reviewed) input to opentree |  | Growth |  | I |  |  |
| KC | Track changes to external taxonomies | Are recent additions to NCBI, union4, CoL, etc. reflected in opentree with a latency of 6 months or less? |  | Freshness, credibility | Must | I |  |  |
| JR | Attribution | Is information attributed to its source per academic community standards? |  | Right Thing | Must to extent reasonable | I | potential problems around specificity and attribution stacking |  |
|  | Enable reproducibility | Keep a sequence of old versions of the graph and synthetic tree, and query software, for citation purposes (but not of the synthesis methods?) |  | Reproducibility |  | I? | but community standards around reproducibility are low |  |
| KC | Pull from Dryad | Do new phylogenies+data sets entered via Dryad show up (following request initiated in Dryad)? | Collaborators | NESCent | ? | I? |  |  |
| KC | Push to Dryad | Do data+phylogeny entered in opentree show up in Dryad (on request from opentree)? | Collaborators | NESCent | ? | I? | Risky to delay this one. Solves embargo issue |  |
| SS | Level of support within source tree | Is level of support available (in UI, API, dump)? |  |  |  | II? | bootstrap |  |
| SS | Level of conflict between source trees | Can distinguish agreed parts of graph / draft tree from conflict-ridden (in UI, API, and dump)? | proposal | Transparency | Must | II? |  |  |
|  | User interface for phylogeny ingest | Low-latency update of the total tree by biologists |  | Scalability |  | II+ |  |  |
| KC | Annotation database | Can add annotations to an annotation layer and see them? Can select annotation layers? | proposal | Utility(?) |  | II? | google earth, phyloreferencing |  |
|  | Auto-track changes to external taxonomies | Automatic tracking of NCBI, union4, etc.? (Latency <2 days) |  | Freshness, credibility |  | III? |  |  |
| MH | Automatic update of gene tree estimates | (??) Does the system watch for new gene data from GenBank, re-run analyses when changes come in, and make the results available in the system? | home page, proposal | Scaling | Must | II |  |  |
| MH | Version control | Can a non-programmer update one of the source trees? Are preceding versions of source trees kept for comparison purposes? | proposal | Utility | Really should | III? | git | need clarity around purpose |
| ? |  |  |  |  |  |  |  |  |
|  | Push to Treebase | Does an uploaded phylogeny also show up in Treebase (on request from OToL)? | summary |  | Nice to have | II? | depends on help from Treebase personnel | ? |
| RR | Synthesize trees | Can a user vary tree generation parameters, such as priorities of methods or sources, to get different trees over the same tips? | proposal |  |  | II+ |  | ? |
| RR | Tree drawing tool | SVG-based phylogenetic tree illustration tool to help systematists prepare figures for print publication | proposal |  | Independent project | II |  | ? |
| RR | Taxonomy of life | Feed phylogenies back to taxonomy, to generate improved overall hypothetical tree (taxonomy) of life | see here |  |  | II+ |  | ? |
| DS | Coordinate with PLoS Current ToL | [acceptance test for this feature?] |  |  | Nice to have | II+ |  | ? |

(JAR's thoughts on possible requirements: offsite backups (still alive after fire in any single location?); longevity (still running 5 years post grant period?))
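
As an illustration of how a "How to test" entry could be made mechanical, the check listed for "Generate a tree of life" (comprehensive, connected, and no cycles) might be sketched as below. This is only a sketch under stated assumptions: the draft tree is assumed to be available as a list of (parent, child) edges, a list of expected taxon names is assumed to exist, and the function name `check_tree` is hypothetical rather than part of any opentree software.

```python
from collections import defaultdict


def check_tree(edges, expected_taxa):
    """Evaluate the three acceptance criteria for a draft tree.

    edges: iterable of (parent, child) pairs (assumed representation).
    expected_taxa: names that must appear somewhere in the tree.
    """
    edges = list(edges)
    # Treat the edges as undirected and de-duplicated for the cycle test.
    edge_set = {frozenset(edge) for edge in edges}
    adjacency = defaultdict(set)
    for parent, child in edges:
        adjacency[parent].add(child)
        adjacency[child].add(parent)
    nodes = set(adjacency)

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    return {
        # Every expected taxon appears as a node.
        "comprehensive": set(expected_taxa) <= nodes,
        # Exactly one connected component.
        "connected": components == 1,
        # An undirected graph is a forest iff |E| == |V| - components.
        "no_cycles": len(edge_set) == len(nodes) - components,
    }


# Hypothetical toy input, for illustration only.
toy_edges = [("life", "Bacteria"), ("life", "Eukaryota"), ("Eukaryota", "Fungi")]
print(check_tree(toy_edges, ["Bacteria", "Fungi"]))
```

Reporting the three criteria separately, rather than as a single pass/fail, would let a failing run say which part of the requirement is not yet met.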