# Term Normalization and Structural Equivalence.

In The original rho calculus paper, Structural equivalence is defined as a mutual recursion between the following 3 equivalences:

- Parallel composition is a commutative monoid with 0/Nil as the identity.
- Quote(Drop(x)) is equivalent to x. Otherwise name equivalence.
- Alpha Equivalence.

There is a straightforward algorithm using substitution and exhaustive search for comparing to see if 2 processes are structurally equivalent. This is insufficient if what we want are keys in maps.

An important distinction between Rholang and the Rho calculus is that Rholang does not use names as variables. It uses variables as variables. Whenever a name appears in binding position (new or the result of a for), that name is not to be substituted, like in Rho Calculus, but to be pattern matched. (e.g. in `for (@Nil ← y)`

`P`

`@Nil`

is a pattern, not a variable name, and P is only executed when the quoted empty process is sent on Y. In Rho calculus, P is executed when any name is sent on Y, and that new name is substituted for `@Nil`

We want to write down processes in such a form that for any member of the equivalence classes above, we pick a unique representative, this process of picking a unique representation is called **Normalization**. Nearly all terms in practice can be written this way.

When we write the processes down in this way, structural equivalences becomes a byte compare, rather than a complex algorithm.

We also plan to support the following additional equivalences:

- Maps and sets are unordered collections.
- for(x ← a; y ← b) is equivalent to for(y ← b; x ← a).
- If the sources (a, b) are different, or the patterns (x, y) are different, then there is an obvious ordering.
- If there is a bind of the same pattern as a read from the same channel, then the body is sorted, with those variables having the same sort value.
- If there are no parallel terms in the body with the same sort value, the binds can be numbered according to their use in the body.
- If there are terms with the same sort value, this is a term that can't be normalized. I have to work out in more detail, but we should be able to mark the parts of a term that lead it to be non-normalizable, and use the exhaustive algorithm just for those parts.
Here's an example showing that for some terms, we just can't normalize them, or we would solve a hard problem:

for (x0 ← a; x1 ←a; x2 ← a) { x0!(x1) | x1!(x2) | x2!(x0) } vs for (x0 ← a; x1 ←a; x2 ← a) { x0!(x2) | x1!(x0) | x2!(x1) }

It should be clear that if we could normalize terms like this, we could solve the graph isomorphism problem in polynomial time.

- Because Rholang uses variables for bound names rather than other names, Drop(Quote(P)) is also equivalent to P (I need to think about this. We may do evaluation on quote)

Let's start with the 3 equivalences above:

- A standard way to write down a unique representation of a commutative monoid is to impose a sorting.
- We can enforce as we normalize and evaluate processes that Drop/Quote pairs are zeroed out.
- A standard way to handle alpha equivalence is to use a naming discipline. In our case we have chosen DeBruijn (Closest English Pronunciation: Duh Brown) levels.

I'm working on producing a "semi-normalized" form that is straightforward to sort. Kent has been working on a total sort order for all of the terms.

Some equivalences we are **not ever** enforcing:

- Any commutativity of arithmetic expressions.

Some equivalences that we will **not currently** enforce:

- new x in { new y in { P } } is not the same as new y in { new x in { P } }
- Binding order of free variables in process patterns is not commutative.

We would need to use an approach similar to the one outlined above for join binding order. Identify if the term is one with a normalized form, and which sub-terms prevent it from being normalized.