Discussion on Versioning

Definition and intended usage
A version number is used to identify a specific release of a BioPAX specification. Examples:
 * From the OWL file for BioPAX Level 1: "This is version 1.4 of the BioPAX Level 1 ontology"
 * From the OWL file for BioPAX Level 2: "This is version 1.0 of the BioPAX Level 2 ontology"

Current practice

 * BioPAX development is structured in levels, which partly overlap with the concept of a version (a level can be thought of as a "major" version number).
 * Versions within a level are supposed to be compatible with the same tools.
 * Only the latest version of each level is published online, and it is the one people usually use (in fact, all revisions remain available online in the CVS or Mercurial SCM).
 * Version information is stored as a comment in the ontology header.
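
Because the version currently lives only in a free-text header comment, a tool that wants it has to parse that comment. A minimal Python sketch, assuming the comment always follows the wording of the two examples quoted above (the names OntologyVersion and parse_header_comment are hypothetical, not part of any BioPAX tool):

```python
import re
from typing import NamedTuple, Optional


class OntologyVersion(NamedTuple):
    level: int
    version: str


# Pattern modelled on the two header comments quoted on this page;
# any other wording will simply not match.
_HEADER_RE = re.compile(
    r"This is version\s+(?P<version>\d+(?:\.\d+)*)\s+"
    r"of the BioPAX Level\s+(?P<level>\d+)\s+ontology"
)


def parse_header_comment(comment: str) -> Optional[OntologyVersion]:
    """Return the level and version found in a header comment, or None."""
    match = _HEADER_RE.search(comment)
    if match is None:
        return None
    return OntologyVersion(
        level=int(match.group("level")),
        version=match.group("version"),
    )
```

Relying on a comment like this is fragile: a small change in wording breaks the parser, which is one reason to record the version in a structured way instead.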

Observations and problems

 * Each new level changes all URIs in the ontology, which makes queries across levels difficult even when the "concept" they refer to is the same. For example, in SPARQL queries http://www.biopax.org/release/biopax-level1.owl#protein is different from http://www.biopax.org/release/biopax-level2.owl#protein. However, L2 'protein' is certainly different from L3 'Protein': 'protein' (L2) best corresponds to 'ProteinReference' (L3), and 'sequenceParticipant' to Protein, DnaRegion, Dna, etc.
 * There is no way to refer to a sub-version of the ontology: versions 1.0, 1.2 and 1.4 of BioPAX Level 1 do not have distinct URIs. This is a problem both for tools that need to import the ontology and, more generally, for tools that may be affected by changes in a sub-version.
 * Version information is not properly included in the ontology header [1], and no information on changes is recorded.
 * There are more versions than the ones "declared": each file committed in Subversion can in fact be thought of as a version. [remark: although you cannot stop people from referring to a changeset in the source code repository, they usually do not; it does not make sense and would be very bad practice. What people do is create milestone or minor "releases" and announce them often.]
 * BioPAX specifies the version of the ontology, but does not require the version of the data to be specified. [remark: should BioPAX specify it? Do OWL, RDF or XML require anything similar (I think not)? So it is up to the data providers; e.g., they may change the namespace prefix with every release.]
 * As the documentation is an integral part of the specification, policies on versions should apply to both.
 * Current URLs for levels include the .owl extension, which is not appropriate: a URL should refer to an entity rather than to a specific file, since different representations of that entity may be wanted.
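
The first observation can be illustrated with a short Python sketch. It builds the level-specific URI for a class and a SPARQL query that has to list every level explicitly through UNION. The namespace pattern is taken from the two URIs quoted above; the helper names class_uri and union_query are hypothetical, and the query is only meaningful where the concept really is the same across the listed levels (for example 'protein' in Level 1 and Level 2, not L2 'protein' versus L3 'Protein').

```python
from typing import Iterable

# Namespace pattern copied from the Level 1 and Level 2 URIs quoted above.
LEVEL_NAMESPACE = "http://www.biopax.org/release/biopax-level{level}.owl#"


def class_uri(level: int, local_name: str) -> str:
    """Build the level-specific URI of a BioPAX class."""
    return LEVEL_NAMESPACE.format(level=level) + local_name


def union_query(local_name: str, levels: Iterable[int]) -> str:
    """Build a SPARQL query selecting instances of a class in several levels.

    Each level needs its own branch because the class URI differs per level.
    """
    branches = " UNION ".join(
        "{ ?instance a <%s> }" % class_uri(level, local_name)
        for level in levels
    )
    return "SELECT ?instance WHERE { %s }" % branches
```

For instance, union_query("protein", [1, 2]) yields one query with two branches, one per level namespace. Every new level would add another branch to every such query.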

Proposals
See also: Talk:Discussion on Versioning

 * Adopt the following release URL schema (sub-releases have a persistent URL):
   * http://www.biopax.org/release/biopax-level3/ -> latest (level3)
   * http://www.biopax.org/release/biopax-level3/v1/
   * http://www.biopax.org/release/biopax-level3/v2/
 * Alternative:
   * http://www.biopax.org/biopax-level3/ -> the latest (level3)
   * http://www.biopax.org/v0.93/biopax-level3/
   * http://www.biopax.org/v2.0/biopax-level3/
 * Another alternative:
   * http://www.biopax.org/biopax/level3/
   * http://www.biopax.org/biopax/level3/v1.0
   * etc...
 * Record information such as release date and version replaced in the ontology header [1]
 * Change the version URLs to drop ".owl" (URL rewriting could be used to maintain compatibility)
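
As a rough illustration of the first proposed schema and of the compatibility rewrite, here is a Python sketch. It assumes the http://www.biopax.org/release/biopax-levelN/vM/ layout proposed above; the function names release_url and rewrite_legacy are hypothetical, and a real deployment would more likely express the rewrite as a web server rule.

```python
import re
from typing import Optional

BASE = "http://www.biopax.org/release/"


def release_url(level: int, version: Optional[str] = None) -> str:
    """URL of a level; without a version it denotes the latest release."""
    url = f"{BASE}biopax-level{level}/"
    if version is not None:
        url += f"v{version}/"
    return url


# Matches only the legacy document URL, e.g. .../biopax-level2.owl
# (fragments such as #protein are never sent to a server, so they are
# not handled here).
_LEGACY_RE = re.compile(r"^http://www\.biopax\.org/release/biopax-level(\d+)\.owl$")


def rewrite_legacy(url: str) -> str:
    """Map a legacy '.owl' URL to the extension-less form; leave others as-is."""
    match = _LEGACY_RE.match(url)
    if match is None:
        return url
    return release_url(int(match.group(1)))
```

With this layout, release_url(3) stays stable as "the latest" while release_url(3, "1") gives a sub-release its own persistent URL.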

Open questions

 * 1) Should we include in BioPAX a requirement to version the export of a database in BioPAX/RDF? There are in principle three different versions associated with a pathway file in BioPAX:
   * the version of the pathway or pathway database;
   * the version of the exporter to BioPAX;
   * the version of BioPAX.
   Should we be so precise? If we want to track "data" versions, how can we do it? Perhaps via metadata associated with an RDF graph? One option would be to use Dublin Core terms and Named Graphs (we would not be the first to use named graphs for this purpose).
 * 2) Should we keep URIs unchanged across levels if the "concept" does not change? When can we say that the "concept" does not change?
 * 3) Should we have version information at a finer granularity, e.g. for BioPAX classes?
 * 4) Should we extend the BioPAX publishing schema to track all possible sub-versions of BioPAX (perhaps via a URL rewrite from the BioPAX website to Subversion)?
 * 5) For individuals parsing version information in their tools, what is the best way to keep revisions that require coding changes separate from minor revisions that may just be typo fixes or comment additions?
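
One possible answer to the last open question is sketched below in Python. It relies on the current practice stated earlier on this page, that versions within a level are supposed to be compatible with the same tools, and therefore treats only a level change as potentially requiring code changes. This is an assumed convention, not a BioPAX rule, and the names parse_version and classify_change are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.4' into a comparable tuple (1, 4)."""
    return tuple(int(part) for part in text.split("."))


def classify_change(old_level: int, old_version: str,
                    new_level: int, new_version: str) -> str:
    """Classify the difference between two (level, version) pairs.

    Assumed convention: a new level may need tool changes, while a new
    version inside the same level is expected to stay compatible.
    """
    if new_level != old_level:
        return "level-change"
    if parse_version(new_version) != parse_version(old_version):
        return "minor-revision"
    return "unchanged"
```

A tool could then warn on "level-change" and silently accept "minor-revision". The scheme only works if releases actually honour the compatibility promise within a level.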

Notes and References
[1] http://www.w3.org/TR/2004/REC-owl-guide-20040210/#OntologyVersioning

OBO ID policy (versions and URIs): http://obofoundry.org/id-policy.shtml