EditOperations describe changes made to AbstractMethods, e.g. token insertion, token deletion, token replacement, or some combination of these.

class InsertOperation

InsertOperation(tokenIndex:int, newToken:str)

Creates an InsertOperation which defines the insertion of a token into an AbstractMethod. Applying the operation will insert token newToken at index tokenIndex.

method = AbstractMethod("public void METHOD_1 ( ) { }")
method
public void METHOD_1 ( ) { }
method.applyEditOperation(InsertOperation(1, "static"))
method
public static void METHOD_1 ( ) { }

class DeleteOperation

DeleteOperation(tokenIndex:int)

Creates a DeleteOperation which defines the deletion of a token from an AbstractMethod. Applying the operation will delete the token at index tokenIndex.

method = AbstractMethod("public static void METHOD_1 ( ) { }")
method
public static void METHOD_1 ( ) { }
method.applyEditOperation(DeleteOperation(1))
method
public void METHOD_1 ( ) { }

class ReplaceOperation

ReplaceOperation(tokenIndex:int, newToken:str)

Creates a ReplaceOperation which defines the replacement of a token in an AbstractMethod. Applying the operation will replace the token at index tokenIndex with token newToken.

method = AbstractMethod("public void METHOD_1 ( ) { }")
method
public void METHOD_1 ( ) { }
method.applyEditOperation(ReplaceOperation(0, "private"))
method
private void METHOD_1 ( ) { }
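For intuition, the three basic operations behave like the corresponding edits on a plain Python list of tokens. The following is only a minimal sketch under that assumption: it works on whitespace-split token lists rather than on an AbstractMethod, and the helper names insert_token, delete_token, and replace_token are hypothetical, not part of the library.

```python
# Illustrative model only: tokens are a plain list, not an AbstractMethod.
# The helper names are hypothetical and do not exist in the library.

def insert_token(tokens, token_index, new_token):
    """Return a copy of tokens with new_token inserted at token_index."""
    result = list(tokens)
    result.insert(token_index, new_token)
    return result


def delete_token(tokens, token_index):
    """Return a copy of tokens without the token at token_index."""
    result = list(tokens)
    del result[token_index]
    return result


def replace_token(tokens, token_index, new_token):
    """Return a copy of tokens with the token at token_index replaced."""
    result = list(tokens)
    result[token_index] = new_token
    return result


tokens = "public void METHOD_1 ( ) { }".split()
inserted = insert_token(tokens, 1, "static")
deleted = delete_token(inserted, 1)
replaced = replace_token(tokens, 0, "private")

print(" ".join(inserted))  # public static void METHOD_1 ( ) { }
print(" ".join(deleted))   # public void METHOD_1 ( ) { }
print(" ".join(replaced))  # private void METHOD_1 ( ) { }
```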

class CompoundOperation

CompoundOperation(operation:Union[InsertOperation, DeleteOperation, ReplaceOperation, ForwardRef('CompoundOperation')])

Creates a CompoundOperation, which represents a combination of EditOperations starting with the given operation. More operations can be added later. Applying the CompoundOperation to an AbstractMethod applies all successfully added operations, in order.

Principles of CompoundOperations

Type

Each CompoundOperation has a type, which is one of the following:

  • InsertOperation -- indicates that no tokens are removed and at least one token is added
  • DeleteOperation -- indicates that at least one token is removed and no tokens are added
  • ReplaceOperation -- indicates one of the following:
    • no tokens are removed or added
    • at least one token is removed and at least one token is added
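The type rules above can be read as a function of how many tokens are removed and how many are added. The sketch below takes the list literally; the function name compound_type and its integer parameters are assumptions made for illustration, and the library may derive the type differently.

```python
def compound_type(num_removed, num_added):
    """Classify a compound edit by its removed and added token counts.

    Hypothetical helper that mirrors the rules listed above.
    """
    if num_removed == 0 and num_added > 0:
        return "InsertOperation"
    if num_removed > 0 and num_added == 0:
        return "DeleteOperation"
    # Either nothing is removed or added, or tokens are both removed and added.
    return "ReplaceOperation"


print(compound_type(0, 2))  # InsertOperation
print(compound_type(2, 0))  # DeleteOperation
print(compound_type(2, 3))  # ReplaceOperation
```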

Loose Compatibility

A sequence of EditOperations is said to be loosely compatible if, when the operations are applied to an AbstractMethod directly after one another, the AbstractMethod is modified in one contiguous section. A CompoundOperation is loose when it consists of EditOperations that are loosely compatible with one another. Note that the order in which the EditOperations are applied matters. Take the following examples, which both utilize the same EditOperations:

method = AbstractMethod("A B C D")
method.applyEditOperations([
    DeleteOperation(1),
    DeleteOperation(0)
])
method
C D

The two operations that were applied are loosely compatible because they modified (deleted) a contiguous section of tokens ['A', 'B'].

method = AbstractMethod("A B C D")
method.applyEditOperations([
    DeleteOperation(0),
    DeleteOperation(1)
])
method
B D

In this case, even though the same two operations were applied, the operations are not loosely compatible. The first operation deleted the token 'A', and the second operation deleted the token 'C'. Since these tokens were not contiguous, the applied operations are not loosely compatible.
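To see why the order matters, it can help to track which original positions each deletion actually removes. The sketch below models only deletions on a plain token list; deleted_tokens and is_contiguous are hypothetical helpers written for this illustration, not library functions.

```python
def deleted_tokens(tokens, delete_indices):
    """Apply deletions in order on a copy of tokens.

    Returns the sorted original positions that were deleted and the
    tokens that remain afterwards.
    """
    remaining = list(enumerate(tokens))  # (original position, token)
    deleted_positions = []
    for index in delete_indices:
        position, _ = remaining.pop(index)
        deleted_positions.append(position)
    return sorted(deleted_positions), [token for _, token in remaining]


def is_contiguous(positions):
    """True if the sorted positions form one unbroken run."""
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


tokens = "A B C D".split()

first_positions, first_result = deleted_tokens(tokens, [1, 0])
print(first_positions, first_result, is_contiguous(first_positions))
# [0, 1] ['C', 'D'] True

second_positions, second_result = deleted_tokens(tokens, [0, 1])
print(second_positions, second_result, is_contiguous(second_positions))
# [0, 2] ['B', 'D'] False
```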

Strict Compatibility

A sequence of EditOperations is said to be strictly compatible if it is loosely compatible and all the operations are of the same type. A CompoundOperation is strict when it consists of EditOperations which are strictly compatible with one another.

These operations are strictly compatible:

DeleteOperation(1)
DeleteOperation(1)

These operations are not strictly compatible, even though they are loosely compatible:

DeleteOperation(1)
InsertOperation(1, "foo")
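One possible way to check both properties is to apply the operations to a list of original token positions and then inspect what lies between the untouched prefix and the untouched suffix. This is a sketch of one reading of the definitions, not the library's implementation: operations are modeled as plain tuples, and apply_ops, is_loose, and is_strict are hypothetical names.

```python
# Hypothetical model: each operation is a tuple, not a library object.
#   ("ins", index, token), ("del", index), ("rep", index, token)

def apply_ops(num_tokens, ops):
    """Apply ops to the original positions 0..num_tokens-1.

    Surviving original tokens stay as ints; inserted or replaced tokens
    are stored as None so they can be told apart afterwards.
    """
    seq = list(range(num_tokens))
    for op in ops:
        kind, index = op[0], op[1]
        if kind == "ins":
            seq.insert(index, None)
        elif kind == "del":
            del seq[index]
        elif kind == "rep":
            seq[index] = None
        else:
            raise ValueError(f"unknown operation kind: {kind}")
    return seq


def is_loose(num_tokens, ops):
    """True if all changes fall inside one contiguous section."""
    seq = apply_ops(num_tokens, ops)
    # Untouched prefix: original token i is still at position i.
    prefix = 0
    while prefix < len(seq) and seq[prefix] == prefix:
        prefix += 1
    # Untouched suffix, counted backwards from the end of both sequences.
    suffix = 0
    while (suffix < len(seq) - prefix
           and seq[len(seq) - 1 - suffix] == num_tokens - 1 - suffix):
        suffix += 1
    middle = seq[prefix:len(seq) - suffix]
    # A surviving original token inside the changed section means the
    # changes are split across separate sections.
    return all(item is None for item in middle)


def is_strict(num_tokens, ops):
    """True if ops are loosely compatible and all of the same kind."""
    same_kind = len({op[0] for op in ops}) <= 1
    return same_kind and is_loose(num_tokens, ops)


same_kind_ops = [("del", 1), ("del", 1)]
mixed_ops = [("del", 1), ("ins", 1, "foo")]
print(is_loose(4, same_kind_ops), is_strict(4, same_kind_ops))  # True True
print(is_loose(4, mixed_ops), is_strict(4, mixed_ops))          # True False
```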

Creating CompoundOperations

CompoundOperations are created from a sequence of EditOperations. The easiest way to do this is by using the utility functions getCondensedBasic, getCondensedLoose, and getCondensedStrict found in the CondenseEditOperations module. However, you can also create them manually by repeatedly adding EditOperations or by providing a machine string.

Adding EditOperations

CompoundOperation.addLoose

CompoundOperation.addLoose(operation:Union[InsertOperation, DeleteOperation, ReplaceOperation, ForwardRef('CompoundOperation')])

Attempts to add the given operation such that it is loosely compatible with the overall CompoundOperation. This may change the type of the CompoundOperation. Returns True if the addition was successful and False otherwise.

CompoundOperation.addStrict

CompoundOperation.addStrict(operation:Union[InsertOperation, DeleteOperation, ReplaceOperation, ForwardRef('CompoundOperation')])

Attempts to add the given operation such that it is strictly compatible with the overall CompoundOperation. Returns True if the addition was successful and False otherwise.

Using machine strings

Machine strings are tokenized representations of CompoundOperations used for training a HephaestusModel.

CompoundOperation.getMachineString

CompoundOperation.getMachineString(form:str='general')

Returns a string formatted for use in training a machine learning model, i.e. a HephaestusModel. The structure is as follows:

<X> beginIndex endIndex <sep> tokens </X>

The value of X depends on the given form parameter. The output machine string can be of general form or typed form:

  • "general": X will always be "op", regardless of the CompoundOperation's type. Thus, the type of the operation is generalized. This is the default behavior.
  • "typed": X will be one of "ins", "del", or "rep", depending on the type of the CompoundOperation.

The range beginIndex:endIndex is the Python-style slice range (beginIndex inclusive, endIndex exclusive) of tokens that the CompoundOperation deletes. Thus, if beginIndex and endIndex are equal, no tokens are deleted. tokens is the list of tokens that are added at beginIndex once the aforementioned range is deleted.

Note: this method is different from the __str__() method, which returns a more human-readable string.
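The structure described above can be sketched with plain string formatting, and the meaning of beginIndex:endIndex corresponds to a slice assignment on a token list. The helpers format_machine_string and apply_range_edit below are hypothetical and written only for illustration; in particular, how the library formats an operation with no added tokens is not covered by the description above.

```python
def format_machine_string(begin_index, end_index, tokens,
                          op_type="rep", form="general"):
    """Build '<X> beginIndex endIndex <sep> tokens </X>'.

    Hypothetical helper: op_type is one of "ins", "del", "rep" and is
    only used when form is "typed".
    """
    if form == "general":
        tag = "op"
    elif form == "typed":
        if op_type not in ("ins", "del", "rep"):
            raise ValueError(f"unknown operation type: {op_type}")
        tag = op_type
    else:
        raise ValueError(f"unknown form: {form}")
    parts = [f"<{tag}>", str(begin_index), str(end_index), "<sep>",
             *tokens, f"</{tag}>"]
    return " ".join(parts)


def apply_range_edit(tokens, begin_index, end_index, new_tokens):
    """Delete tokens[begin_index:end_index], then add new_tokens there."""
    result = list(tokens)
    result[begin_index:end_index] = new_tokens
    return result


added = ["return", "VAR_1", ";"]
print(format_machine_string(2, 4, added))
# <op> 2 4 <sep> return VAR_1 ; </op>
print(format_machine_string(2, 4, added, op_type="rep", form="typed"))
# <rep> 2 4 <sep> return VAR_1 ; </rep>
print(apply_range_edit("a b c d e".split(), 2, 4, added))
# ['a', 'b', 'return', 'VAR_1', ';', 'e']
```

When begin_index equals end_index, the slice is empty, so apply_range_edit only inserts tokens, which matches the statement that no tokens are deleted in that case.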

CompoundOperation.FromMachineString

CompoundOperation.FromMachineString(string:str)

Returns a CompoundOperation which represents the given machine string such that the following equality holds: operation == CompoundOperation.FromMachineString(operation.getMachineString()). The CompoundOperation is derived from the given machine string regardless of whether it is of general form or typed form.

compoundOp = CompoundOperation(DeleteOperation(2))
compoundOp.addLoose(ReplaceOperation(2, "return"))
compoundOp.addLoose(InsertOperation(3, "VAR_1"))
compoundOp.addLoose(InsertOperation(4, ";"))

compoundOp
COMPOUND_REPLACE 2:4 -> ['return', 'VAR_1', ';']
generalMachineString = compoundOp.getMachineString("general")
generalMachineString
'<op> 2 4 <sep> return VAR_1 ; </op>'
CompoundOperation.FromMachineString(generalMachineString)
COMPOUND_REPLACE 2:4 -> ['return', 'VAR_1', ';']
typedMachineString = compoundOp.getMachineString("typed")
typedMachineString
'<rep> 2 4 <sep> return VAR_1 ; </rep>'
CompoundOperation.FromMachineString(typedMachineString)
COMPOUND_REPLACE 2:4 -> ['return', 'VAR_1', ';']
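Reading a machine string back amounts to splitting it on whitespace and checking the tags and the separator. The parser below is a minimal sketch under the assumption that the string follows the documented structure exactly; parse_machine_string is a hypothetical helper that returns plain Python values rather than a CompoundOperation.

```python
def parse_machine_string(string):
    """Split a machine string into (tag, begin_index, end_index, tokens).

    Hypothetical helper: accepts both the general form ("op") and the
    typed forms ("ins", "del", "rep").
    """
    parts = string.split()
    if len(parts) < 5:
        raise ValueError("machine string is too short")
    open_tag, close_tag = parts[0], parts[-1]
    if not (open_tag.startswith("<") and open_tag.endswith(">")):
        raise ValueError(f"malformed opening tag: {open_tag}")
    tag = open_tag[1:-1]
    if tag not in ("op", "ins", "del", "rep"):
        raise ValueError(f"unknown tag: {tag}")
    if close_tag != f"</{tag}>":
        raise ValueError(f"closing tag does not match: {close_tag}")
    if parts[3] != "<sep>":
        raise ValueError("missing <sep> after the indices")
    begin_index, end_index = int(parts[1]), int(parts[2])
    tokens = parts[4:-1]
    return tag, begin_index, end_index, tokens


general = parse_machine_string("<op> 2 4 <sep> return VAR_1 ; </op>")
typed = parse_machine_string("<rep> 2 4 <sep> return VAR_1 ; </rep>")
print(general)  # ('op', 2, 4, ['return', 'VAR_1', ';'])
print(typed)    # ('rep', 2, 4, ['return', 'VAR_1', ';'])
```

Both forms yield the same range and tokens; only the tag differs, which is consistent with the general and typed strings above describing the same operation.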