In part 4 of our series implementing programming language constructs with actors, we extend our pattern matching behaviors to support pattern equations. These are true equations that express relationships between patterns. They form the basis for introducing `LET`

and `IF`

expressions.

The grammar for our extended language is shown below. Changes from part 3 are **highlighted**.

expr ::= <term> ',' <expr> | <term>; term ::= <const> | <ident> | <term> '(' <expr>? ')'| 'LET' <eqtn> 'IN' <expr>| 'IF' <eqtn> <expr> {'ELIF' <eqtn> <expr>}* {'ELSE' <expr>}?| 'CASE' <expr> 'OF' {<ptrn> ':' <expr>}+ 'END' | '(' <expr>? ')';eqtn ::= <ptrn> '=' <ptrn>| <ident> '(' <ptrn>? ')' '=' <expr>;ptrn ::= <pterm> ',' <ptrn> | <pterm>; pterm ::= '_' | <const> | <ident> | '$' <expr> | '(' <ptrn>? ')'; const ::= '\' <ptrn> '.' <expr> | <number> | '?' | 'NIL' | 'TRUE' | 'FALSE' | '#' <token>;

## Equality

The notion of equality can be a slippery one [1]. In many computer languages, equality comes in several “flavors” (identity, equivalences of various kinds, etc.) Equality is also confused with assignment, which interferes with referential transparency.

In our language we would like to represent true equations. These equations *establish* equality. When used in a conditional context, they do not evaluate to a Boolean `TRUE`

or `FALSE`

, but rather make a *conditional assertion* of equality.

Like true mathematical equations, our equations should be symmetric. In general, the two sides of an equation should be interchangeable without changing the meaning of the equation. Consider the equation `a=42`

. It seems reasonable that `42=a`

should be equivalent. However, the usual lambda calculus trick is to define `LET a=42 IN (a, a)`

as equivalent to `(\a.(a, a))(42)`

. If we consider the reversed equation `LET 42=a IN (a, a)`

we get the nonsensical `(\42.(a, a))(a)`

.

The problem is that the left side of the equation is treated as a *pattern* while the right side is treated as a *value*, as discussed in part 2 of this series. If we want symmetric equations, we are left with two options. Either expect values on both sides, or expect patterns on both sides. If we expect values on both sides, our equations would *test* equality (presumably with a Boolean result), since variables would be replaced by their values. Instead we expect patterns on both sides, so our equations *assert* equality, possibly binding variables in the process.

## LET and IF

A `LET`

expression establishes an equality (usually binding one or more variables in the process) and then evaluates the `IN`

expression in the environment extended by the new bindings.

# first draft LET let_expr_beh(eqtn, expr) = \(cust, #eval, env).[ SEND (k_env, #unify, env) TO eqtn CREATE k_env WITH \env'.[ CASE env' OF ? : [ SEND ? TO cust ] _ : [ SEND (cust, #eval, env') TO expr ] END ] ]

When `let_expr_beh` receives an `#eval`

request, it sends a `#unify`

message to the equation `eqtn`. The customer in that message `k_env` receives the extended environment resulting from the unification of the equation. If the extended environment `env’` is `?`

(the undefined value) then the undefined result value is sent to the customer `cust`. Otherwise, an `#eval`

message (with the extended environment) is sent to the expression `expr`. The original customer `cust` eventually receives the result of evaluating the expression in the environment extended by unifying the equation.

An `IF`

expression makes a conditional attempt to establish an equality. If the equation can be successfully unified, the expression is evaluated in the extended environment (just like `LET`

). If unification fails, then the `ELSE`

expression is evaluated in the original environment.

LET if_expr_beh(eqtn, expr, else) = \(cust, #eval, env).[ SEND (k_env, #unify, env) TO eqtn CREATE k_env WITH \env'.[ CASE env' OF ? : [ SEND (cust, #eval, env) TO else ] _ : [ SEND (cust, #eval, env') TO expr ] END ] ]

The `if_expr_beh` is nearly identical to `let_expr_beh`. The only difference is the handling of the `ELSE`

clause. If the equation `eqtn` can’t be unified, an `#eval`

message is sent to the alternative expression `else`, but the environment used is the **original** environment `env`. The alternative expression can be another `if_expr_beh` (representing an `ELIF`

clause), or it can be any arbitrary expression (representing an `ELSE`

clause).

Given the similarity between `if_expr_beh` and `let_expr_beh` we can refactor to remove duplication by observing that the a `LET`

is equivalent to an `IF`

where the `ELSE`

clause is `?`

the undefined value.

# refactored CREATE undefined WITH \(cust, req).[ SEND ? TO cust ] LET let_expr_beh(eqtn, expr) = \(cust, #eval, env).[ BECOME if_expr_beh(eqtn, expr, undefined) SEND (cust, #eval, env) TO SELF ]

The singleton actor `undefined` responds to all requests by sending `?`

the undefined value to the customer `cust`. With that definition in place, the `let_expr_beh` is refactored to simply transform itself into the equivalent `if_expr_beh` and reprocess the original message.

## Unification

Equation expressions respond to `#unify`

messages by sending the customer one of two things. Either an environment extended by any bindings introduced while matching patterns. Or `?`

the undefined value, if unification fails. Each pattern contains private data needed to perform the unification. We can’t reach inside the patterns, so we delegate the process to the patterns themselves.

LET eqtn_beh(left_ptrn, right_ptrn) = \(cust, #unify, env).[ SEND (cust, #eq, right_ptrn, env) TO left_ptrn ]

When `eqtn_beh` receives a `#unify`

message it sends an `#eq`

message to the left pattern, including the right pattern as part of the message.

### Pattern #eq

Unification of equations requires extending the protocol of messages understood by patterns to include an `#eq`

message. For constant patterns, this extension is easy to handle. A constant pattern simply matches its value against the other pattern.

LET const_ptrn_beh(value) = \(cust, req).[ CASE req OF (#match, $value, env) : [ SEND env TO cust ](#eq, right, env) : [SEND (cust, #match, value, env) TO right]_ : [ SEND ? TO cust ] END ]

When `const_ptrn_beh` receives an `#eq`

message is sends its `value` in a `#match`

message to the `right` pattern.

A value pattern behaves in a similar way. It first evaluates its expression in the current environment, producing a value. That value is then handled as if it was the value of a constant pattern.

LET value_ptrn_beh(expr) = \(cust, cmd, arg, env).[ SEND (k_val, #eval, env) TO expr CREATE k_val WITH \value.[ SEND (cust, cmd, arg, env) TO NEW const_ptrn_beh(value) ] ]

When `value_ptrn_beh` receives a command with an argument and an environment (both `#match,value,env`

and `#eq,right,env`

qualify), it sends an `#eval`

message to the expression `expr`. The customer in that message `k_val` receives the `value` of the expression and creates a new constant pattern actor with the value. The original message is then sent to the new actor.

An identifier pattern has a little more work to do. An identifier usually introduces a binding into the environment, but it needs a value to bind to. So we need to extend the pattern protocol further to account for bindings.

LET ident_ptrn_beh(ident) = \(cust, req).[ CASE req OF (#match, value, env) : [ CREATE env' WITH env_beh(ident, value, env) SEND env' TO cust ](#eq, right, env) : [SEND (cust, #bind, SELF, env) TO right]_ : [ SEND ? TO cust ] END ]

When `ident_ptrn_beh` receives an `#eq`

message it sends a `#bind`

message to the `right` pattern.

The wildcard pattern behaves like an identifier pattern at this point in the process.

CREATE any_ptrn WITH \(cust, req).[ CASE req OF (#match, _, env) : [ SEND env TO cust ](#eq, right, env) : [SEND (cust, #bind, SELF, env) TO right]_ : [ SEND ? TO cust ] END ]

### Pattern #bind

A `#bind`

message tells a pattern that an identifier is available to bind, if the pattern has a value to bind to. A constant pattern has a value, so it can simply turn around and match its value to the identifier.

LET const_ptrn_beh(value) = \(cust, req).[ CASE req OF (#match, $value, env) : [ SEND env TO cust ] (#eq, right, env) : [ SEND (cust, #match, value, env) TO right ](#bind, left, env) : [SEND (cust, #match, value, env) TO left]_ : [ SEND ? TO cust ] END ]

When `const_ptrn_beh` receives a `#bind`

message it sends its `value` in a `#match`

message to the `left` (identifier) pattern. As we’ve already seen, the identifier pattern handles `#match`

messages by extending the environment with a new binding, which is exactly what we want.

Value patterns, as shown above, create constant patterns to do the work, (and `#bind,left,env`

matches the expected protocol) so there is nothing new to add for `#bind`

.

Identifier patterns and the wildcard pattern are an interesting case. If there are identifiers on both sides of an equation, it might be reasonable to interpret that as aliasing. However, to keep things simple we will discourage that practice by making the result “undefined”. Again, this requires no further changes, since any “unknown” request results in sending `?`

the undefined value to the customer.

## Pair Pattern

The previous sections show how to handle all of the patterns from part 2. Now we will extend the pair pattern from part 3. The first step is to add support for the `#eq`

message. We use a strategy similar to the identifier pattern, extending the pattern protocol to account for pairs.

LET pair_ptrn_beh(head_ptrn, tail_ptrn) = \(cust, req).[ CASE req OF (#match, (h, t), env) : [ SEND (k_head, #match, h, env) TO head_ptrn CREATE k_head WITH \env'.[ CASE env' OF ? : [ SEND ? TO cust ] _ : [ SEND (cust, #match, t, env') TO tail_ptrn ] END ] ](#eq, right, env) : [SEND (cust, #pair, SELF, env) TO right]_ : [ SEND ? TO cust ] END ]

When `pair_ptrn_beh` receives an `#eq`

message it sends a `#pair`

message to the `right` pattern.

### Pattern #pair

A `#pair`

message tells a pattern that the other pattern is a pair pattern. The receiving pattern must decide how to handle matching the pair pattern. Once again, a constant pattern can simply turn the tables and provide a value back to the pair pattern.

LET const_ptrn_beh(value) = \(cust, req).[ CASE req OF (#match, $value, env) : [ SEND env TO cust ] (#eq, right, env) : [ SEND (cust, #match, value, env) TO right ] (#bind, left, env) : [ SEND (cust, #match, value, env) TO left ](#pair, pair, env) : [SEND (cust, #match, value, env) TO pair]_ : [ SEND ? TO cust ] END ]

An identifier or wildcard pattern provides an opportunity for binding, so it also delegates back to the pair pattern, but with a `#bind`

message.

LET ident_ptrn_beh(ident) = \(cust, req).[ CASE req OF (#match, value, env) : [ CREATE env' WITH env_beh(ident, value, env) SEND env' TO cust ] (#eq, right, env) : [ SEND (cust, #bind, SELF, env) TO right ](#pair, pair, env) : [SEND (cust, #bind, SELF, env) TO pair]_ : [ SEND ? TO cust ] END ] CREATE any_ptrn WITH \(cust, req).[ CASE req OF (#match, _, env) : [ SEND env TO cust ] (#eq, right, env) : [ SEND (cust, #bind, SELF, env) TO right ](#pair, pair, env) : [SEND (cust, #bind, SELF, env) TO pair]_ : [ SEND ? TO cust ] END ]

This brings us right back to the pair pattern, to handle the `#bind`

message. The pair pattern must produce a value to bind to the identifier, so it asks each sub-pattern for a value and combines them into a pair.

LET pair_ptrn_beh(head_ptrn, tail_ptrn) = \(cust, req).[ CASE req OF (#match, (h, t), env) : [ SEND (k_head, #match, h, env) TO head_ptrn CREATE k_head WITH \env'.[ CASE env' OF ? : [ SEND ? TO cust ] _ : [ SEND (cust, #match, t, env') TO tail_ptrn ] END ] ] (#eq, right, env) : [ SEND (cust, #pair, SELF, env) TO right ](#bind, left, env) : [CREATE call WITH call_pair_beh(head_ptrn, tail_ptrn)SEND (k_pair, #value, NIL, env) TO callCREATE k_pair WITH pair_bind_beh(cust, left, env)]_ : [ SEND ? TO cust ] END ]LET pair_bind_beh(cust, left, env) = \msg.[CASE msg OF(?, _) : [ SEND ? TO cust ](_, ?) : [ SEND ? TO cust ](h, t) : [ SEND (cust, #match, (h, t), env) TO left ]_ : [ SEND ? TO cust ]END]

We will re-use the “call” behavior from part 3 to send a `#value`

request to each sub-pattern. The resulting pair of values is received by a dynamically created actor `k_pair`, whose behavior is defined by `code_bind_beh`. If either component of the pair-value is undefined, that means one of the sub-patterns could not produce a value, so the result is also undefined. If both sub-patterns produced values, the pair-value is sent to the `left` pattern to complete the binding (by `#match`

ing the pair-value).

### Pattern #value

A `#value`

message requests that a pattern produce a value. Only a constant pattern can produce a value. But, remember that a value pattern will `BECOME`

a constant pattern with the value of its expression. So a value pattern produces a value indirectly.

LET const_ptrn_beh(value) = \(cust, req).[ CASE req OF (#match, $value, env) : [ SEND env TO cust ] (#eq, right, env) : [ SEND (cust, #match, value, env) TO right ] (#bind, left, env) : [ SEND (cust, #match, value, env) TO left ] (#pair, pair, env) : [ SEND (cust, #match, value, env) TO pair ](#value, _, env) : [SEND value TO cust]_ : [ SEND ? TO cust ] END ]

When `const_ptrn_beh` receives a `#value`

message, it simply sends its `value` to the customer `cust`. Note that the middle argument in the `#value`

message ignored. It is a place-holder used to keep the message structure consistent with the general form expected by `value_ptrn_beh`.

### Pattern #both

We must deal with one final case. When there is a pair pattern on both sides of an equation, we want to recursively match the sub-patterns of each pair pattern, combining any bindings created by the sub-pattern matches. Once again, this requires extending the protocol of our pattern behaviors. This time we introduce a `#both`

message, which indicates that the other pattern is a pair, and provides access directly to the sub-patterns.

LET pair_ptrn_beh(head_ptrn, tail_ptrn) = \(cust, req).[ CASE req OF (#match, (h, t), env) : [ SEND (k_head, #match, h, env) TO head_ptrn CREATE k_head WITH \env'.[ CASE env' OF ? : [ SEND ? TO cust ] _ : [ SEND (cust, #match, t, env') TO tail_ptrn ] END ] ] (#eq, right, env) : [ SEND (cust, #pair, SELF, env) TO right ] (#bind, left, env) : [ CREATE call WITH call_pair_beh(head_ptrn, tail_ptrn) SEND (k_pair, #value, NIL, env) TO call CREATE k_pair WITH pair_bind_beh(cust, left, env) ](#pair, pair, env) : [SEND (cust, #both, (head_ptrn, tail_ptrn), env)TO pair](#both, (h_ptrn, t_ptrn), env) : [SEND (k_head, #eq, h_ptrn, env) TO head_ptrnCREATE k_head WITH \env'.[CASE env' OF? : [ SEND ? TO cust ]_ : [SEND (cust, #eq, t_ptrn, env')TO tail_ptrn]END]]_ : [ SEND ? TO cust ] END ]

When `pair_ptrn_beh` receives a `#pair`

message, it sends a `#both`

message to the other pair pattern. The pair of patterns held by this pattern is sent as the argument.

When `pair_ptrn_beh` receives a `#both`

message, it sends a new `#eq`

message to unify `head_ptrn` with `h_ptrn`, the head of the other pattern. The customer of this message `k_head` receives the extended environment `env’`. If the extended environment is undefined, the head match fails, to undefined is send to the customer. Otherwise, an `#eq`

message is sent to unify `tail_ptrn` with `t_ptrn`, the tail of the other pattern. The further extended environment is sent to the original customer `cust`.

As you can see, handling pair patterns add significant complexity to the matching algorithm. The combinations of cases greatly increases, leading to several extensions to the protocol of patterns. However, it should be noted that pairs allow the construction of arbitrarily complex structures. All of the other patterns are simply leaves on the tree. Pairs are the branches that give it shape, so some additional complexity is to be expected.

## Application Equation

So far, we have considered the unification of pattern equations. Our extended grammar adds one more equation form, the application equation. We will often use this kind of equation to define function abstractions. Consider the equation in this `LET`

expression:

LET dec(x) = sub(x, 1) IN ...

This equation asserts an equality between two applications. It says that the application `dec(x)`

is equal to the application `sub(x, 1)`

. Normally when we see this kind of equation, we have some context that tells us `sub`

is already known, thus `dec`

is the thing being defined. Without this contextual knowledge, we are back in the variable aliasing situation we explicitly avoided before. As a practical compromise, we will adopt the convention of putting the application being defined on the left side of the equation and the defining expression on the right. This breaks the symmetry of being able to reverse the two sides of an equation, but it gives us a way to resolve the ambiguity in a sensible way.

We implement this kind of equation by a source-to-source transformation, rewriting the equation as follows:

<ident> '(' <ptrn> ')' '=' <expr>

is rewritten as

<ident> '=' '\' <ptrn> '.' <expr>

This creates a constant pattern on the right side of the equation, since a lambda abstraction is a constant. After the transformation we have a regular pattern equation which is handled by the behaviors we have already seen. In the case of our example equation, the rewritten code is:

LET dec = \x.(sub(x, 1)) IN ...

## Summary

We have explored the implementation of pattern equations using actors. These are true equations which establish, rather than test, equality. These equations form the basis for `IF`

and `LET`

expressions, representing conditional and unconditional assertions of equality. Unification of equations required several extensions to the protocol of pattern-matching actors. The resulting message flow is an excellent example of distributing responsibilities among a cooperating group of actors. Each pattern performs matching operations based on it own private data. The message contents and the state of dynamically-created continuations accumulate the results. Finally, we added a bit of syntactic sugar in the form of a source-to-source transform rewriting equations that define applications.

This completes our pure functional expression language. We’ve taken a few non-traditional branches in exploring this solution space. We used pattern matching to generalize parameter substitution in part 2. We used pairs (instead of Currying) to generalize multi-valued arguments in part 3. And we used pattern equations to generalize conditional definitions in part 4. In part 5 we will explore the implications of supporting recursive references.

## References

- [1]
- H. Baker. Equal Rights for Functional Objects or, The More Things Change, The More They Are the Same. http://home.pipeline.com/~hbaker1/ObjectIdentity.html, 1992.

## 3 Comments

thanks for the post

Thanks for good stuff

Helpful blog, bookmarked the website with hopes to read more!

## 3 Trackbacks

[…] building block introducing pattern matching. We will have more to say about pattern matching in part 4. This simple case evaluates an expression and then attempts to match the resulting value against […]

[…] part 4 we introduced the LET/IN expression. This allowed us to extend our lexical environment with new […]

[…] part 4 we will extend our pattern matching behaviors to support pattern equations. These are true […]