## Solving “Same Fringe” with Stream Generators

A classic problem in concurrent programming is known as the “same fringe” problem [1]. What is the same fringe problem? As described by Richard Gabriel [2]:

The samefringe problem is this: two binary trees have the same fringe if they have exactly the same leaves reading from left to right.

There are many different approaches toÂ solving this problem in a variety of languages. Our approach is to incrementally generate a stream of leaves from a tree (the fringe). We will use a comparison process to read from a pair of streams (one for each tree) and report if they are the same or different. Comparison stops as soon as we detect a difference, avoiding the need to generate the rest of the fringe.

## Stream Generators

When we generate the fringe of a tree, we follow a stream-generator protocol. Before we get into the fringe itself, let’s examine a protocol for streams in general.

A stream-generator receives requests with only a customer. It sends a pair to this customer consisting of a value and an actor to ask for the next pair. You can think of the actor as representing a position in the stream.

A simple stateful stream-generator is one that generates an infinite sequence of values.

```LET sequence_gen_beh(value, fn) = \cust.[
BECOME sequence_gen_beh(fn(value), fn)
SEND (value, SELF) TO cust
]```

The sequence begins with an initial value. When a request is received designating a customer cust, the mutable state of the stream is updated by applying a function fn to the value. The customer is sent a pair consisting of the original value and a reference to itself. Since its state has been updated with the new value, the actor to ask for the next value is the same actor.

Figure 1 - Mutable Stream-Generator

We can also generate the same sequence of values using immutable actors.

```LET sequence_gen_beh(value, fn) = \cust.[
CREATE next WITH sequence_gen_beh(fn(value), fn)
SEND (value, next) TO cust
]```

In this implementation, rather than altering the state of the current actor, we create a new actor to represent the next value in the sequence. This allows us to retain references to actors representing any particular position in the stream and ask them to replay the stream from that position.

Figure 2 - Immutable Stream-Generator

The immutable implementation can be made more efficient by avoiding recalculation of successive values when the sequence is replayed.

```LET sequence_gen_beh(value, fn) = \cust.[
CREATE next WITH sequence_gen_beh(fn(value), fn)
BECOME \cust.[
SEND (value, next) TO cust
]
SEND cust TO SELF
]```

In this implementation, we only calculate theÂ next value once. We change the actor’s subsequent behavior to always return the sameÂ value andÂ next. We re-issue the request (the customerÂ cust) to the updated actor. This is a form of memoization. Once initialized, each actor is immutable.

Figure 3 - Memoizing Stream-Generator

Notice that all of these actor behaviors support the same protocol. From the perspective of a customer reading the stream only once, their observable behavior is identical. The mutable and immutable versions differ in their handling of repeated requested to the same actor. Specifically, the mutable version is not re-usable since itÂ becomes the representation of the next position on each request.

## Fringe Generation

The fringe generator produces a stream of values, pairing each value with an actor representing the generator of the next value. Each fringe generator actor maintains a location in theÂ tree, and the actor that will generate the next result. By convention, whenÂ next isÂ `NIL`, we’ve reached the end of the stream. The generator has mutable state. It is intended to be used only once.

```LET fringe_gen_beh(tree, next) = \cust.[
IF \$tree = (left, right) [
SEND cust TO NEW fringe_gen_beh(left, SELF)
BECOME fringe_gen_beh(right, next)
] ELSE [
SEND (tree, next) TO cust
]
]```

When a customerÂ cust requests a value, we check ifÂ tree is a branch or a leaf. If we have a leaf, we send the customer the leaf valueÂ tree and theÂ next actor. If we have a branch, we create an actor to represent theÂ left part of the tree, designating the current actor asÂ next. We send the customerÂ cust to the new actor in order to process the left sub-tree first. The current actor becomes the representation of theÂ right part of the tree, using the original value ofÂ next. In this way, once the left sub-tree has been processed, the right sub-tree is processed asÂ next. Once the right sub-tree has been processed, we continue with the originalÂ next.

Figure 4 - Fringe Stream-Generator

Figure 4 illustrates the fringeÂ `1, 2, 3, 4` generated from the treeÂ `(1, ((2, 3), 4))`. Note how the fringe is generated incrementally. Each request only does enough work to locate the next element of the fringe. This property is critical to the efficient comparison of trees with large fringe, especially if they differ in the early part of the fringe.

## Stream Comparison

Checking two trees for the same fringe is accomplished by comparing corresponding values from two fringe-generator streams. We create an actor behavior for incrementally comparing two streams regardless of how they are generated.

```LET cmp_stream_beh(cust) = \(value, next).[
BECOME \(value', next').[
IF \$value = \$value' [
BECOME cmp_stream_beh(cust)
IF \$next = \$next' [
SEND TRUE TO cust
] ELIF \$next = NIL [
SEND FALSE TO cust
] ELIF \$next' = NIL [
SEND FALSE TO cust
] ELSE [
SEND SELF TO next
SEND SELF TO next'
]
] ELSE [
SEND FALSE TO cust
]
]
]```

The stream comparison actor is designed to be used as the customer for two stream-generators. It maintains its own customerÂ cust to report aÂ `TRUE` or`FALSE` comparison result. When it receives aÂ value andÂ next from one of the streams, it transitions to a state waiting for the other stream. It doesn’t matter which stream generates a value first. When aÂ value’ andÂ next’ is received from the other stream, we now can compare corresponding values. If the values don’t match, we sendÂ `FALSE` to the original customerÂ cust. We only read enough of each stream to locate their first difference, or keep reading (as long as they match) until they both end.

If the values match, we prepare to receive (and compare) another set of corresponding values. However, before we can ask each generator to generate their next values, we have to check for termination conditions. IfÂ next andÂ next’ match (usually because both areÂ `NIL`), we sendÂ `TRUE` to the original customer cust, since we’ve successfully matched everything up through the end of both streams. If eitherÂ next orÂ next’ areÂ `NIL`, we sendÂ `FALSE` to the original customerÂ cust, since one stream has ended before the other. Otherwise, we send the comparison actor (ourÂ `SELF`) to each stream, requesting the next value from each.

Using these building blocks, we can create a service to compare pairs of trees.

```CREATE same_fringe_svc WITH \(cust, a, b).[
CREATE match WITH cmp_stream_beh(cust)
SEND match TO NEW fringe_gen_beh(a, NIL)
SEND match TO NEW fringe_gen_beh(b, NIL)
]```

The service creates a new stream comparison processÂ match that reports its results to the customerÂ cust. A fringe generator is created for each tree. The comparatorÂ match is sent (as a customer) to both streams, initiating the process of reading and comparing corresponding values from each stream.

The following test case generates aÂ `TRUE` result:

```LET a = (1, ((2, 3), 4))
LET b = ((1, (2, 3)), 4)
SEND (println, a, b) TO same_fringe_svc```

The following test case generates aÂ `FALSE` result:

```LET a = (1, ((2, 3), 4))
LET z = (1, 0, 3, 4)
SEND (println, a, z) TO same_fringe_svc```

An important property of the stream comparison process is that it avoids reading beyond the point of difference between the streams. This is especially important if one of the streams is infinite. Consider the result of executing the following:

```LET a = (1, ((2, 3), 4))
CREATE match WITH cmp_stream_beh(println)
SEND match TO NEW fringe_gen_beh(a, NIL)
SEND match TO NEW sequence_gen_beh(1, inc)```

The sequence generator will produce an infinite sequence starting withÂ `1, 2, 3, 4...` (assuming thatÂ inc is an integer increment function). The fringe generator will produce the same initial values, but ends afterÂ `4`. The stream comparison process will reportÂ `FALSE` after the fourth value and stop reading the streams.

## Conclusion

The “same fringe” problem is considered one of the simplest problems that requires concurrency to implement efficiently. It seems to require a non-deterministic merge between two incrementally-generated streams of information. Asynchronous actor messaging provides the means for handling the non-deterministic merge (comparison). In addition, this problem provides an excellent illustration of the use of incremental stream generators. We will have much more to say about stream-based processing in future articles.

## References

[1]
C. Hewitt, et. al. Behavioral Semantics of Non-recursive Control Structures, Proc. Colloque sur la Programmation, B. Robinet ed., in Lecture Notes in Computer Science, No. 19, Springer Verlag, 1974.
[2]
R. Gabriel. The Design of Parallel Programming Languages.Â http://www.dreamsongs.com/10ideas.html, 1991.

This entry was posted on Friday, June 4th, 2010 at 7:00 am and is filed under Uncategorized. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

### 8 Responses to “Solving “Same Fringe” with Stream Generators”

1. chrmills

So what I don’t quite get here, is that the message passing is all synchronous, and indeed to ensure that you break off as soon as you find a failure to match it needs to be (i.e cust is waiting for one leaf before requesting the next, and as the call graph shows the messages propagate through the actors in a synchronous fashion).
So why I can see that this is a succinct way of expressing a solution to the problem I can’t see how concurrency is required for an efficient solution (as stated in the conclusion) – surely its synchronous nature means that you could just as easily implement it with coroutines with the same level of efficiency? Did I miss something?

2. chrmills

Sorry also forgot to ask, is there a reference implementation of humus available or is it just at the thought experiment stage at the moment? I like the simple grammar, it kind of reminds me of the likes of lisp or smalltalk in it’s simple syntax on top of a very powerful underlying concept (and it doesn’t have as many brackets as lisp, which I like :)).

3. chrmills

Ok after trying to implement this iteratively I now understand. The problem is moving to a node on both trees, then passing control back to allow comparing the nodes and then knowing where you got to and hence where you need to continue on your tree traversal to get to the next node. It’s easy enough to do with continuations, but then of course I fell in to the classic trap of conflating concurrent with parallel, continuations are concurrent so I just succeeded in proving your point without even noticing it :)
I’m guessing to do this iteratively I would need to maintain a list showing where in the tree I have ‘visited’ in the past so I can backtrack, but that will not be space efficient. Cool.

@chrmills It’s amazing what we learn by implementing something ourselves :) Glad you went to the effort. You got it exactly right.

As for a Humus implementation, I have a running interpreter but it’s not very friendly from a usability standpoint (most errors terminate the process). I’m working on a Humus simulator/debugger environment written in Javascript that will be a little easier to play with.

5. chrmills

I have roughed up an implementation in Lua, using coroutines as I fancied having a play to see what can be done with the underlying concept. It doesn’t follow exactly the same grammar as I have effectively implemented it as an embedded DSL, so it sticks to lua style grammar.
I have effectively combined new/create in to one primitive (becomes a create if you assign it to a variable), and the let is implemented using lambda’s instead of it’s own keyword. Most of the rest is the same though, happy to send it over to you if it would be of interest.
The implementation was able to correctly run the factorial and same fringe examples and a couple of my own tests to check the basics were down pat. Using metalua it would probably be possible to get the correct grammar too.

I am thinking of roughing up a c version based around thread pools (c programming is my day job and I think the actor model may be appropriate for implementing one of my ‘10%’ projects).