Open Issue : Removing PhysicalEntityParticipant

Definition:
Any additional special characteristics of a physical entity in the context of an interaction or complex. These currently include stoichiometric coefficient and cellular location, but this list may be expanded in later levels.

Comment:
PhysicalEntityParticipants should not be used in multiple interaction or complex instances. Instead, each interaction and complex should reference its own unique set of physicalEntityParticipants. The reason for this is that a user may add new information about a physicalEntityParticipant for one interaction or complex, such as the presence of a previously unknown post-translational modification, and unwittingly invalidate the physicalEntityParticipant for the other interactions or complexes that make use of it.

Example:
In the interaction describing the transport of L-arginine into the cytoplasm in E. coli, the LEFT property in the interaction would be filled with an instance of physicalEntityParticipant that specified the location of L-arginine as periplasm and the stoichiometric coefficient as one.

Problem :
PEPs are very inconvenient. They prevent us from defining restrictions, from checking interaction identity easily, and so on. With the state proposal, location and sequence features are moving to the physical entity, where they belong. So the only thing left on PEPs is stoichiometry, which could be represented differently.

Proposed Solution(s)
Matthias already has a proposal on this at: http://www.biopaxwiki.org/cgi-bin/moin.cgi/Removing_PhysicalEntityParticipant

Although I agree on the general principle, I find the extra properties in the form LEFT-A-STOI, LEFT-B-STOI etc. to represent the n-ary relationship rather awkward.

This issue was also discussed at the CSHL F2F meeting, and there was general agreement on a similar pattern called "dangling n-ary". It simply defines a class called Stoichiometry, which keeps track of the physical entity and the stoichiometric coefficient (exactly like a PEP), and keeps a list of Stoichiometry instances in the conversion. The LEFT and RIGHT slots, however, keep a direct reference to the physical entity. This has the disadvantage of data duplication and consistency issues, but it is still a big improvement over the current state.
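A minimal sketch of the "dangling n-ary" layout described above, written in Python rather than OWL. The class and field names (`PhysicalEntity`, `Stoichiometry`, `Conversion`, `coefficient`, `stoichiometries`) are illustrative assumptions, not BioPAX definitions, and the right-hand coefficient in the example is assumed as well. It shows LEFT and RIGHT pointing straight at physical entities while the coefficients sit in a separate list, which is where the duplication comes from.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class PhysicalEntity:
    name: str


@dataclass
class Stoichiometry:
    # Pairs a physical entity with its coefficient, much like a PEP did.
    physical_entity: PhysicalEntity
    coefficient: Optional[float] = None  # None stands for "unknown" in this sketch


@dataclass
class Conversion:
    # LEFT and RIGHT reference physical entities directly.
    left: List[PhysicalEntity] = field(default_factory=list)
    right: List[PhysicalEntity] = field(default_factory=list)
    # The "dangling" part: coefficients are kept beside the participants.
    stoichiometries: List[Stoichiometry] = field(default_factory=list)


# L-arginine transport example; location is assumed to live on the entity.
arg_periplasm = PhysicalEntity("L-arginine (periplasm)")
arg_cytoplasm = PhysicalEntity("L-arginine (cytoplasm)")

transport = Conversion(
    left=[arg_periplasm],
    right=[arg_cytoplasm],
    stoichiometries=[
        Stoichiometry(arg_periplasm, 1),
        Stoichiometry(arg_cytoplasm, 1),  # assumed; the page only gives the LEFT side
    ],
)
```

Each physical entity appears twice, once as a participant and once inside a `Stoichiometry`, so the two lists have to be kept in sync by convention or by an external check.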

So far the best candidate looks like the "dangling n-ary" but I am not fully comfortable with any of the solutions and I am looking for a better one. Any suggestions ?

-

Feedback from Gary Bader
Gary: I guess the dangling n-ary (that sounds kind of creepy, BTW) works if LEFT and RIGHT are restricted to not allow duplicate entries, correct?

Emek: Right. I previously asked the list whether there is a legitimate reason to let LEFT or RIGHT have duplicate entries, and none was raised. So I believe we can make this restriction. I am not sure how we can do it OWL-wise, but I will definitely document it. ( any suggestion to rename the dangling n-ary pattern ? :) )

Gary: What other issues are there with this implementation? Also, there seem to be 2 ways of representing unknown stoichiometry with this approach. 1. don't create a Stoichiometry instance and 2. create a Stoichiometry, but leave the STOICHIOMETRY property empty. So we should probably restrict this to one case e.g. by making the STOICHIOMETRY property of Stoichiometry set to card=1 (required and singular).

Emek: I agree that we should define a single way to represent unknown stoichiometry. However, that solution would not work, as we would need to specify card(STOICHIOMETRY) = card(LEFT) + card(RIGHT), which is not possible in OWL, I guess. We can always validate it programmatically though ( Ranjani? ).
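The programmatic validation mentioned here could look roughly like the following Python sketch. The function name and the dict-based representation are assumptions made for illustration. It treats both ways of expressing "unknown" (no entry at all, or an entry with an empty value) identically and reports the participants affected.

```python
def missing_stoichiometry(left, right, stoichiometry):
    """Return the participants that have no usable coefficient.

    left, right: iterables of participant identifiers.
    stoichiometry: dict mapping participant identifier to a coefficient,
    or to None when the value was left empty.
    """
    participants = set(left) | set(right)
    # dict.get returns None both for a missing key and for an empty value,
    # so the two representations of "unknown" collapse into one.
    return sorted(p for p in participants if stoichiometry.get(p) is None)


left = ["physicalEntity1", "physicalEntity2"]
right = ["physicalEntity3", "physicalEntity4"]
stoichiometry = {"physicalEntity1": 1, "physicalEntity2": 2, "physicalEntity3": None}

unknown = missing_stoichiometry(left, right, stoichiometry)
print(unknown)  # ['physicalEntity3', 'physicalEntity4']
```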

Gary: Would the same issue exist for the associated PhysicalEntity instance? I.e. would there be any case for leaving this unknown, but specifying stoichiometry? e.g. I know the stoichiometry of an unknown participant is 3.

Emek: Now that *is* creepy. ;)

-

Feedback from Alan Ruttenberg
Alan: You can't do it any other way in OWL. Participation in the reaction is represented as a triple, e.g.:

conversion127 LEFT physicalEntity1
conversion127 LEFT physicalEntity2
conversion127 RIGHT physicalEntity3
conversion127 RIGHT physicalEntity4

You can add duplicates of triples, e.g. saying again:

conversion127 LEFT physicalEntity1

but it isn't adding anything. The model is the same as if you didn't say it.
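An RDF graph is a set of triples, so the point can be illustrated with an ordinary Python set. This is only an analogy using plain tuples, not an RDF library:

```python
# Each tuple stands for one (subject, property, object) triple.
triples = {
    ("conversion127", "LEFT", "physicalEntity1"),
    ("conversion127", "LEFT", "physicalEntity2"),
    ("conversion127", "RIGHT", "physicalEntity3"),
    ("conversion127", "RIGHT", "physicalEntity4"),
}

before = len(triples)
# Asserting a triple that is already present changes nothing.
triples.add(("conversion127", "LEFT", "physicalEntity1"))

print(before, len(triples))  # 4 4
```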

Emek: But you could define it in level 2, by creating multiple PEPs. For example instead of saying 3A -> something, you could say 2A+1A ->. Without PEPs, you can no longer express this. I believe this was the point Gary was making. But there is no legitimate reason why people would be allowed to do that ( at least none raised) and I can see many disadvantages, so it is another benefit of getting rid of PEP.
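If level 2 data containing split entries such as 2A+1A had to be carried over, one possible approach is to sum the coefficients per entity. The sketch below assumes PEPs can be read as (entity, coefficient) pairs; the function name is hypothetical, and summing is an assumed migration choice rather than something agreed in this discussion.

```python
from collections import defaultdict


def merge_peps(peps):
    """Collapse (entity, coefficient) pairs into one coefficient per entity."""
    totals = defaultdict(int)
    for entity, coefficient in peps:
        totals[entity] += coefficient
    return dict(totals)


# Level 2 style: the same entity appears twice on the left side.
level2_left = [("A", 2), ("A", 1)]
print(merge_peps(level2_left))  # {'A': 3}
```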

Alan: In response to ''( any suggestion to rename the dangling n-ary pattern ? :) )''

An association list or mapping is a common name. It is a list of associations mapping physical entity to stoichiometry.

 In response to  ''What other issues are there with this implementation? Also, there seem to be 2 ways of representing unknown stoichiometry with this approach. 1. don't create a Stoichiometry instance and 2. create a Stoichiometry, but leave the STOICHIOMETRY property empty. So we should probably restrict this to one case e.g. by making the STOICHIOMETRY property of  Stoichiometry set to card=1 (required and singular).''

This says that there is a stoichiometry value. It doesn't say that we know it. You can't currently, in OWL, express the fact that something needs to be known (an integrity constraint). I have suggested an out-of-OWL practice for this at [1]. I'm also advocating, at the upcoming OWLED workshop, that OWL provide a mechanism for this sooner rather than later.

So basically, absent other assertions (like closing axioms on the list of stoichiometries or participants), these two amount to the same thing.

 In response to  ''I agree that we should define a single way to define unknown stoichiometry. However that solution would not work, as we need to specify card(STOICHIOMETRY) = card(LEFT) + card(RIGHT) not  possible in OWL I guess. We can always validate it programmatically though. ''

card(LEFT) + card(RIGHT) = card(PARTICIPANTS) for a conversion so no addition is necessary. But you can't do this anyways, at least at the class level. However it turns out you can say, for each particular instance, that the cardinality of two properties is the same.

But as I point out it doesn't really do anything for you in this case. Basically, if you want to know the stoichiometry, you query for it. If you don't get an answer then it is unknown.

Emek : Would the same issue exist for the associated PhysicalEntity instance?

Alan : If you wanted to express that there exist 5 participants, and only assert specifically 4, implying that there is one unknown, you would add a cardinality restriction to the conversion instance.

Individual(Conversion127
  type(Restriction(Participants cardinality(5)))
  value(LEFT physicalEntity1)
  value(LEFT physicalEntity2)
  value(RIGHT physicalEntity3)
  value(RIGHT physicalEntity4)
)

Of course, as we have discussed, absent such a restriction, technically, we are not saying anything about how many participants there are in a conversion. Consequence of open-world.
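Outside OWL, the effect of such a restriction can be mimicked with a small check. The following Python sketch is an illustration only: the function name is hypothetical, and it assumes that differently named participants are distinct individuals, which OWL itself does not assume.

```python
def unasserted_participants(declared_cardinality, left, right):
    """Return how many participants are implied but not named.

    declared_cardinality: the value of a cardinality restriction on
    PARTICIPANTS, or None when no restriction was asserted.
    """
    asserted = set(left) | set(right)
    if declared_cardinality is None:
        # Open world: without a restriction nothing is said about the total.
        return None
    if len(asserted) > declared_cardinality:
        raise ValueError("more participants asserted than the restriction allows")
    return declared_cardinality - len(asserted)


left = ["physicalEntity1", "physicalEntity2"]
right = ["physicalEntity3", "physicalEntity4"]

print(unasserted_participants(5, left, right))     # 1
print(unasserted_participants(4, left, right))     # 0
print(unasserted_participants(None, left, right))  # None
```

A result of 0 corresponds to a closed equation, and `None` to the case where no claim about the number of participants has been made.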

Emek: This is my major counter argument against using open world assumption. IMO, a normal chemical equation is closed. So it would be great if we could have this closure operator.

Alan: We have it. We just have to decide to use it. (just showing participants here to make it easier to read)

[RDF/XML listing lost in extraction; only the cardinality value 4 survives.]

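It doesn't even have to be verbose. We can define an OWL class to represent the restrictions and then use that like a macro:

<owl:Class rdf:ID="has4participants">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">4</owl:cardinality>
      <owl:onProperty rdf:resource="#PARTICIPANTS"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>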

Now we only have to write:

<conversion rdf:about="#conversion127">
  <LEFT rdf:resource="#physicalEntity1"/>
  <LEFT rdf:resource="#physicalEntity2"/>
  <RIGHT rdf:resource="#physicalEntity3"/>
  <RIGHT rdf:resource="#physicalEntity4"/>
  <rdf:type rdf:resource="#has4participants"/>
</conversion>

So for 44 bytes per conversion, more or less, you have closed chemical equations.

Alan: My guess is that we should specify that asserting a stoichiometry for a physical entity that is not one of the known PARTICIPANTS (or is absent) is not allowed, and verify it outside OWL, via something like [1].

Emek: Agreed.
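The out-of-OWL check agreed on above might be sketched in Python as follows. The function name and the pair-based representation are assumptions for illustration; this is not the mechanism referred to as [1].

```python
def invalid_stoichiometries(left, right, stoichiometries):
    """Return stoichiometry entries whose physical entity is absent
    or is not one of the conversion's known participants.

    stoichiometries: list of (physical_entity_or_None, coefficient) pairs.
    """
    participants = set(left) | set(right)
    return [
        (entity, coefficient)
        for entity, coefficient in stoichiometries
        if entity is None or entity not in participants
    ]


left = ["physicalEntity1", "physicalEntity2"]
right = ["physicalEntity3", "physicalEntity4"]
stoichiometries = [
    ("physicalEntity1", 2),  # valid: a known participant
    ("physicalEntity9", 1),  # invalid: not a participant
    (None, 3),               # invalid: no physical entity at all
]

print(invalid_stoichiometries(left, right, stoichiometries))
# [('physicalEntity9', 1), (None, 3)]
```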