PaxtoolsPatternModule

= Overview =

The Pattern module in Paxtools enables searching for specific topological structures in a BioPAX model. A pattern is defined by a list of constraints. A pattern match is an array of BioPAXElement objects that satisfy those constraints.

A Quick Example
Assume we are looking for a structure in which a Conversion has a PhysicalEntity at its left and another PhysicalEntity at its right, and both belong to the same EntityReference. The pattern contains 4 elements (a Conversion, two PhysicalEntity objects, and an EntityReference), so its size is 4.

Pattern p = new Pattern(4, Conversion.class); // Conversion is the initial element

p.addConstraint(ConBox.left(), 0, 1); // The PhysicalEntity at left is at index 1
p.addConstraint(ConBox.peToER(), 1, 2); // The related EntityReference is at index 2
p.addConstraint(ConBox.right(), 0, 3); // The PhysicalEntity at right is at index 3
p.addConstraint(ConBox.peToER(), 3, 2); // Last constraint makes sure that the last PhysicalEntity has the same EntityReference (at index 2)

We can then use the Searcher class to search for this pattern in a model.

Model model = ... // Get the model
Map<BioPAXElement, List<Match>> map = Searcher.search(model, p);

In this case, the keys of the result map will all be Conversion objects, and each value will be the list of pattern matches related to that key.

= Elements of the Architecture =

Pattern
A Pattern object knows the size of the pattern and the type of the initial element. It also keeps the list of constraints together with the indexes to which each constraint is applied. The addConstraint method of Pattern takes a constraint and the array of indexes it applies to, and uses a MappedConstraint object to bundle them.

Constraint
A constraint is used for ensuring a specific property either for one object or between several objects. Each constraint has a size, i.e. the number of elements that it needs to use. Constraints can be generative or not. Generative constraints have size greater than 1, and they can generate candidate elements for the last index, using the elements in the previous indexes.

Match
A Match simply encapsulates an array of BioPAXElement objects. Its size equals the related pattern's size. If a Match matches a Pattern, it satisfies all the constraints in that Pattern.

Searcher
The Searcher has static methods for searching for occurrences of a pattern in a given model. It can search either from a given initial object or over the entire model, iterating over all qualifying objects. The searcher iterates over the list of constraints in the pattern and maintains a list of Match objects that are satisfied or generated by the constraints. At each step, if the last index that the current constraint needs is null in the Match, the constraint is used to generate that element (the constraint has to be generative at this point); otherwise the constraint is used to check whether the Match satisfies it.
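The generate-or-check loop described above can be sketched in self-contained Java. This is not the Paxtools implementation: SketchConstraint, EdgeConstraint, and SearchSketch are hypothetical names, elements are plain strings instead of BioPAXElement objects, and every constraint is assumed to have size 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a size-2 constraint; not the Paxtools interface.
interface SketchConstraint {
    boolean canGenerate();
    // Candidates for the last index, derived from the earlier element.
    Collection<String> generate(String from);
    // True if the (from, to) pair satisfies the constraint.
    boolean satisfies(String from, String to);
}

// A traversal-like constraint backed by an explicit adjacency map.
class EdgeConstraint implements SketchConstraint {
    private final Map<String, List<String>> edges;
    EdgeConstraint(Map<String, List<String>> edges) { this.edges = edges; }
    public boolean canGenerate() { return true; }
    public Collection<String> generate(String from) {
        return edges.getOrDefault(from, List.of());
    }
    public boolean satisfies(String from, String to) {
        return generate(from).contains(to);
    }
}

class SearchSketch {
    // Each row of 'mapped' is {constraintIndex, fromSlot, toSlot}.
    static List<String[]> search(String start, int size,
                                 List<SketchConstraint> constraints, int[][] mapped) {
        String[] seed = new String[size];
        seed[0] = start;
        List<String[]> matches = new ArrayList<>();
        matches.add(seed);
        for (int[] m : mapped) {
            SketchConstraint c = constraints.get(m[0]);
            List<String[]> next = new ArrayList<>();
            for (String[] match : matches) {
                if (match[m[2]] == null) {
                    // Empty slot: the constraint has to generate candidates.
                    if (!c.canGenerate()) throw new IllegalStateException("not generative");
                    for (String candidate : c.generate(match[m[1]])) {
                        String[] copy = Arrays.copyOf(match, size);
                        copy[m[2]] = candidate;
                        next.add(copy);
                    }
                } else if (c.satisfies(match[m[1]], match[m[2]])) {
                    // Filled slot: the constraint only filters.
                    next.add(match);
                }
            }
            matches = next;
        }
        return matches;
    }
}
```

With this sketch, the quick example corresponds to the rows {0, 0, 1}, {1, 1, 2}, {2, 0, 3}, {1, 3, 2}, where constraint 0 plays the role of "left", 1 of "peToER", and 2 of "right". The last row refers to a slot that is already filled, so it acts as a filter rather than a generator.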

ConBox
ConBox provides static methods to prepare frequently used constraints. These are typically single-line methods that construct a specific constraint in a specific way and return it. An example is below.

public static Constraint erToPE()
{
	return new PathConstraint("EntityReference/entityReferenceOf");
}

= Types of Constraints =

Basic Traversal
PathConstraint is probably the most frequently needed constraint. Its size is always 2 and it is generative. It simply encapsulates a PathAccessor and provides simple traversal. For instance, to traverse from a Control to a controlled TemplateReaction, the PathConstraint below can be used.

new PathConstraint("Control/controlled*:TemplateReaction");

This example starts from a Control, traverses controlled objects recursively while they are other Controls, and outputs a result if it reaches a TemplateReaction. PathConstraint makes use of the full power of PathAccessor. It is the user's responsibility to make sure that the output of the PathAccessor is a BioPAXElement. Otherwise an error will occur at runtime.

MultiPathConstraint provides a way to aggregate multiple PathAccessors. The following example reaches all generic equivalents of a PhysicalEntity.

new MultiPathConstraint("PhysicalEntity/memberPhysicalEntity*", "PhysicalEntity/memberPhysicalEntityOf*");

This example, however, reaches only the equivalent objects and excludes the seed object. If the seed should be included, the constraint should be wrapped in a SelfOrThis constraint.

new SelfOrThis(new MultiPathConstraint("PhysicalEntity/memberPhysicalEntity*", "PhysicalEntity/memberPhysicalEntityOf*"));

Constraining Over a Field
The Field constraint is used for constraining the values of an object's fields. It uses a PathAccessor to access the related field, and ensures that the given value is among the found values. The following constraint makes sure that AKT1 is among the object's names.

new Field("Named/name", "AKT1");

Optionally, Field can use another object in the pattern as the required value. This feature has to be set in the constructor. In that case it works like PathConstraint, but it cannot be used as a generative constraint.

new Field("Interaction/participant", Field.USE_SECOND_ARG);

The above constraint has size 2, and checks whether the object at the second position is a participant of the Interaction at the first position.

If an empty field is desired, this can be set in the constructor as well. The constraint below makes sure that the Interaction has no participants.

new Field("Interaction/participant", Field.EMPTY);

Constraining Over Size of Options
The Size constraint encapsulates any generative constraint and ensures that the number of candidates it generates is equal to, less than, or greater than a certain size. The constraint below makes sure that the physical entity participates in fewer than 100 conversions. This can be useful for excluding high-degree molecules from the pattern. Size is always non-generative.

new Size(new PathConstraint("PhysicalEntity/participantOf:Conversion"), 100, Size.LESS);
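As a rough illustration of the check that a size constraint performs, the sketch below counts the candidates produced by a wrapped generator and compares the count with a limit. SizeSketch and its Comparison enum are hypothetical names used only here; the generator is modelled as a plain function over strings rather than as a Paxtools constraint.

```java
import java.util.Collection;
import java.util.function.Function;

// Hypothetical sketch of a size check; names are not Paxtools identifiers.
class SizeSketch {
    enum Comparison { EQUAL, LESS, GREATER }

    // Counts the candidates the wrapped generator yields for 'seed'
    // and compares that count against 'limit'. Nothing is generated
    // for the pattern itself; the result is only true or false.
    static <T> boolean satisfies(Function<T, Collection<T>> generator,
                                 T seed, int limit, Comparison cmp) {
        int n = generator.apply(seed).size();
        switch (cmp) {
            case EQUAL: return n == limit;
            case LESS:  return n < limit;
            default:    return n > limit; // GREATER
        }
    }
}
```

The wrapped generator is only consulted for its candidate count, which is why a check of this kind filters matches but never fills a slot.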

Logical Operators
The AND, OR, and NOT constraints are used to apply logical operators to any kind of constraint. The AND and OR constraints take an array of MappedConstraint objects in the constructor. The indexes in those MappedConstraint objects do not refer to the original pattern indexes; they refer to positions in the index array that is sent to AND or OR. The example below should clarify that. Imagine we have two different Control objects in the pattern, and we want the next object to be a Conversion directly controlled by both of them.

Pattern p = ... // Assume there are two different Control objects at indexes 4 and 6. And we are generating the element at index 10.

Constraint c = new AND(new MappedConstraint(new PathConstraint("Control/controlled:Conversion"), 0, 2),
	new MappedConstraint(new PathConstraint("Control/controlled:Conversion"), 1, 2));

p.addConstraint(c, 4, 6, 10);

In this example, the indexes 0, 1, and 2 in MappedConstraint objects correspond to indexes 4, 6, and 10, respectively, on the original pattern. AND and OR constraints are generative only when all the member constraints are generative.
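The two-level index mapping can be checked with a few lines of plain Java. IndexMappingSketch and toPatternIndexes are hypothetical names for this illustration; the sketch only demonstrates the arithmetic of resolving local indexes against the index array given to the composite constraint.

```java
import java.util.Arrays;

class IndexMappingSketch {
    // Translates indexes local to a composite constraint into pattern indexes.
    static int[] toPatternIndexes(int[] outer, int... local) {
        int[] resolved = new int[local.length];
        for (int k = 0; k < local.length; k++) {
            resolved[k] = outer[local[k]];
        }
        return resolved;
    }

    public static void main(String[] args) {
        int[] outer = {4, 6, 10}; // indexes passed to addConstraint for the AND
        System.out.println(Arrays.toString(toPatternIndexes(outer, 0, 2))); // [4, 10]
        System.out.println(Arrays.toString(toPatternIndexes(outer, 1, 2))); // [6, 10]
    }
}
```

So the first member constraint relates slots 4 and 10, and the second relates slots 6 and 10.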

The NOT constraint is used for negating other constraints. NOT cannot be generative. The constraint below makes sure that the Control at the first index does not control the Conversion at the second index.

new NOT(new PathConstraint("Control/controlled*:Conversion"));

Traversing Conversions
When we reach a Conversion from a participant PhysicalEntity while constructing a pattern, we can continue towards the other side of the conversion using the OtherSide constraint. We can rewrite the example in the Quick Example section, this time starting from a PhysicalEntity instead of the Conversion.

Pattern p = new Pattern(4, PhysicalEntity.class); // PhysicalEntity is the initial element in a pattern match

p.addConstraint(new PathConstraint("PhysicalEntity/participantOf:Conversion"), 0, 1); // The Conversion is at index 1
p.addConstraint(new OtherSide(), 0, 1, 2); // The PhysicalEntity at the other side of the Conversion is at index 2
p.addConstraint(ConBox.peToER(), 0, 3); // The EntityReference of the initial PhysicalEntity is at index 3
p.addConstraint(ConBox.peToER(), 2, 3); // Last constraint makes sure that the last PhysicalEntity has the same EntityReference (at index 3)

Sometimes we need to construct the pattern using the inputs and outputs of a Conversion, but they are not explicitly defined. The left and right fields of the Conversion have to be mapped to inputs and outputs according to the direction of the Conversion. The ParticipatesInConv constraint can help with this mapping. The pattern below is similar to the one above, but makes sure that the first PhysicalEntity is an input and the second is an output.

Pattern p = new Pattern(4, PhysicalEntity.class); // PhysicalEntity is the initial element in a pattern match

p.addConstraint(new ParticipatesInConv(RelType.INPUT, true), 0, 1); // The Conversion is at index 1
p.addConstraint(new OtherSide(), 0, 1, 2); // The output PhysicalEntity is at index 2
p.addConstraint(ConBox.peToER(), 0, 3); // The EntityReference of the initial PhysicalEntity is at index 3
p.addConstraint(ConBox.peToER(), 2, 3); // Last constraint makes sure that the last PhysicalEntity has the same EntityReference (at index 3)

The second, boolean argument of ParticipatesInConv is for treating reversible conversions as if they were left_to_right. If we set it to false, all the participants of a reversible conversion will be treated as both input and output.
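The mapping from left and right to inputs can be sketched as follows. DirectionSketch, its Direction enum, and the inputs method are hypothetical names chosen for this illustration, and participants are plain strings; the sketch only encodes the rules stated above and does not cover conversions whose direction is unspecified.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DirectionSketch {
    enum Direction { LEFT_TO_RIGHT, RIGHT_TO_LEFT, REVERSIBLE }

    // Returns the participants that act as inputs of a conversion.
    static Set<String> inputs(List<String> left, List<String> right,
                              Direction dir, boolean reversibleAsLeftToRight) {
        Set<String> result = new LinkedHashSet<>();
        if (dir == Direction.REVERSIBLE && !reversibleAsLeftToRight) {
            // Both sides count as inputs (and, symmetrically, as outputs).
            result.addAll(left);
            result.addAll(right);
        } else if (dir == Direction.RIGHT_TO_LEFT) {
            result.addAll(right);
        } else {
            // LEFT_TO_RIGHT, or REVERSIBLE treated as left-to-right.
            result.addAll(left);
        }
        return result;
    }
}
```

The outputs follow by swapping the roles of left and right in the two directional branches.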

When we want to traverse a Control --> Conversion --> PhysicalEntity path and we care whether the PhysicalEntity is an input or an output of the Conversion, we can use the ParticipatingPE constraint to enforce it.

Pattern p = ... // Assume there is a Control at index 2 and a controlled Conversion at index 3. We want output PhysicalEntity at index 4.

p.addConstraint(new ParticipatingPE(RelType.OUTPUT, false), 2, 3, 4);

Traversing Recursive Complex Membership and Generic Relationships
While traversing from a PhysicalEntity to an Interaction, we often also want to traverse complex-member relations and generic relationships to reach the Interaction. Assume we are looking for the Control objects for which PhysicalEntity A is the controller. Sometimes A is a member of a Complex, and this Complex is the controller. Sometimes A has a parent generic PhysicalEntity, and this one is the controller. And at other times these relations are recursively mixed. The LinkedPE constraint is built for handling such cases. It is a generative constraint of size 2, and it traverses other related physical entities recursively, in the user-specified direction. The direction only matters for complex-member relations, and can be either TO_COMPLEX or TO_MEMBER.

Assume we want to search for Conversions that modify a PhysicalEntity of an EntityReference.

Pattern p = new Pattern(6, EntityReference.class); // Start from an EntityReference
int i = 0;
p.addConstraint(ConBox.erToPE(), i, ++i); // Get a related PhysicalEntity
p.addConstraint(new LinkedPE(LinkedPE.Type.TO_COMPLEX), i, ++i); // Include parent complexes and all equivalent generics recursively
p.addConstraint(new ParticipatesInConv(RelType.INPUT, true), i, ++i); // Get to the Conversion
p.addConstraint(new OtherSide(), i-1, i, ++i); // Get to the PhysicalEntity at the other side
p.addConstraint(new Equality(false), i-2, i); // Make sure that the PhysicalEntity objects at each side are different
p.addConstraint(new LinkedPE(LinkedPE.Type.TO_MEMBER), i, ++i); // Include complex members and all equivalent generics recursively
p.addConstraint(ConBox.peToER(), i, 0); // Make sure that the last PhysicalEntity has the same EntityReference

The LinkedPE constraint also generates the seed object, so the linked PhysicalEntity can be the same PhysicalEntity. The example above also demonstrates using a variable, i, for specifying the indexes of constraints. This helps especially when we need to insert or delete a row in the constraints, because we do not need to shift all the following indexes.
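The recursive traversal that LinkedPE performs can be pictured as a closure over two kinds of links. The sketch below is an assumption-laden illustration, not the Paxtools code: LinkedSketch is a hypothetical name, entities are strings, complex-member links are supplied only in the chosen direction, and generic links are assumed to be listed in both directions, following the description above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LinkedSketch {
    // 'complexLinks' holds complex-member edges in the chosen direction only
    // (member -> complex for an upward walk, complex -> member for a downward one).
    // 'genericLinks' holds generic relations and is expected to list both directions.
    static Set<String> linked(String seed,
                              Map<String, List<String>> complexLinks,
                              Map<String, List<String>> genericLinks) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        seen.add(seed); // the seed itself is part of the result
        todo.add(seed);
        while (!todo.isEmpty()) {
            String current = todo.poll();
            for (Map<String, List<String>> links : List.of(complexLinks, genericLinks)) {
                for (String next : links.getOrDefault(current, List.of())) {
                    // Only unseen entities are queued, so cycles terminate.
                    if (seen.add(next)) todo.add(next);
                }
            }
        }
        return seen;
    }
}
```

For example, if A is a member of complex C1, C1 is a member of C2, and A has a generic parent G that is a member of C3, an upward walk from A in this sketch collects A, C1, G, C2, and C3.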

= Labeling slots in a pattern =

As you may have already noticed, keeping track of slot indexes in a pattern becomes hard once the pattern gets a little complicated. Using an index variable in the code (like "i" above) helps to avoid hard-coded indexes, but referring to a distant slot is still awkward because we need expressions like "i-2". Labeling the slots of a pattern lets us address specific elements without using their indexes. We can add a label to a slot of a pattern using the "label" method.

p.label("text", index);

The index can then be retrieved using the "indexOf" method.

p.indexOf("text");

The more practical way to add a label is to pass it in the "addConstraint" method. This way, the label is set for the last index sent to the method. Below is the labeled version of the above pattern.

Pattern p = new Pattern(6, EntityReference.class, "ER");
int i = 0;
p.addConstraint(ConBox.erToPE(), i, ++i);
p.addConstraint(new LinkedPE(LinkedPE.Type.TO_COMPLEX), "input PE", i, ++i);
p.addConstraint(new ParticipatesInConv(RelType.INPUT, true), "Conv", i, ++i);
p.addConstraint(new OtherSide(), "output PE", p.indexOf("input PE"), p.indexOf("Conv"), ++i);
p.addConstraint(new Equality(false), p.indexOf("input PE"), p.indexOf("output PE"));
p.addConstraint(new LinkedPE(LinkedPE.Type.TO_MEMBER), i, ++i);
p.addConstraint(ConBox.peToER(), i, p.indexOf("ER"));
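Conceptually, labeling is a lookup from text to slot index. A minimal sketch of that bookkeeping is given below; LabelSketch is a hypothetical class for illustration, and the way it handles repeated labels (the later call wins) and unknown labels (an exception) is a choice of this sketch, not a statement about Paxtools.

```java
import java.util.HashMap;
import java.util.Map;

class LabelSketch {
    private final Map<String, Integer> labels = new HashMap<>();

    // Associates a label with a slot index.
    void label(String text, int index) {
        labels.put(text, index);
    }

    // Looks up the slot index that a label refers to.
    int indexOf(String text) {
        Integer index = labels.get(text);
        if (index == null) throw new IllegalArgumentException("Unknown label: " + text);
        return index;
    }
}
```

In the labeled pattern above, "input PE" would resolve to 2, "Conv" to 3, and "output PE" to 4, which are the same slots that the unlabeled version addressed with "i-2", "i-1", and "i".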

= Merging patterns =

Sometimes we can reuse a pattern while constructing another pattern. This is good for modularizing the pattern structure and for avoiding code duplication. The "addPattern" method copies the constraint list of the parameter pattern. The first slot of the parameter pattern should be mapped to a non-empty slot in the current pattern. "addPattern" first makes room for the new slots by increasing the slot size of the current pattern, then copies the constraints of the parameter pattern by mapping the slot indexes to their new locations. It also transfers the labels from the parameter pattern. If there are equivalent labels in the two patterns, these slots are mapped to each other.

Assume we have already defined the pattern below, which contains a Conversion and a PhysicalEntity at its right.

Pattern p1 = new Pattern(2, Conversion.class);
p1.addConstraint(ConBox.right(), "right PE", 0, 1);

Then we can reuse it while defining a pattern like "Controller PhysicalEntity -- Control -- Conversion -- Right PhysicalEntity", where the two PhysicalEntity objects should be different.

Pattern p2 = new Pattern(3, PhysicalEntity.class, "controller");
p2.addConstraint(ConBox.peToControl(), 0, 1);
p2.addConstraint(ConBox.controlToConv(), 1, 2);

p2.addPattern(p1, 2);

p2.addConstraint(new Equality(false), p2.indexOf("controller"), p2.indexOf("right PE"));

Note that the size of p2 is not 3 anymore, because it was increased when p1 was added.
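The slot bookkeeping during a merge can be sketched as a mapping from the slots of the added pattern to slots of the merged pattern. MergeSketch and slotMapping are hypothetical names, and the rules encoded here (first slot attached to an existing slot, equally labeled slots identified, everything else appended at the end) are our reading of the description above rather than the actual Paxtools code.

```java
import java.util.Arrays;
import java.util.Map;

class MergeSketch {
    // Returns, for every slot of the added pattern, its index in the merged pattern.
    static int[] slotMapping(int currentSize, int addedSize, int attachAt,
                             Map<String, Integer> currentLabels,
                             Map<String, Integer> addedLabels) {
        int[] mapping = new int[addedSize];
        Arrays.fill(mapping, -1);
        mapping[0] = attachAt; // first slot lands on an existing, non-empty slot
        // Slots that share a label with the current pattern are identified with it.
        for (Map.Entry<String, Integer> e : addedLabels.entrySet()) {
            Integer existing = currentLabels.get(e.getKey());
            if (existing != null && mapping[e.getValue()] == -1) {
                mapping[e.getValue()] = existing;
            }
        }
        // Every remaining slot gets a fresh index appended after the current slots.
        int next = currentSize;
        for (int k = 0; k < addedSize; k++) {
            if (mapping[k] == -1) mapping[k] = next++;
        }
        return mapping;
    }
}
```

Applied to the example, p2 has 3 slots, p1 has 2 slots, and p1 is attached at slot 2. Under the assumptions of this sketch, the slots of p1 map to 2 and 3, so the merged pattern has 4 slots and "right PE" refers to slot 3.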