Composing Actors

A significant challenge in developing concurrent systems is the problem of composability. We create a solution that works properly in isolation, but composing it with other solutions leads to interference. The actor model ensures that individual actors may be composed without changing their behavior; interference is prevented by definition, keeping the system consistent. To build larger actor-based systems, we must be able to treat groups of collaborating actors as if they were a single actor, with the same guarantee of consistency that exists for individual actors. This consistency is provided by serializers [1].

Service Serializers

We use the term “service” to describe an actor with request-response behavior. Services can be implemented by a group of collaborating actors. A single actor always processes messages one-at-a-time. For a group of actors to behave like a single actor, requests (from outside the group) to the group entry-point must be handled one-at-a-time. The following behaviors implement a general-purpose serializer for actor-based services.

Figure 1 – Serializer State Machine.

An actor implementing a service expects a message consisting of a customer paired with a request. Eventually, a response is sent to the customer. A serializer provides a wrapper for a service that ensures the service will receive requests one-at-a-time. If the service is busy handling a request, the serializer buffers new requests until the service responds to the current request (see Figure 1). Customers of the service do not issue requests directly to the service. Instead they issue requests to the serializer acting as gate-keeper for the service. From the perspective of a single request-response cycle, there is no difference between the serializer and the actual service.

LET serializer_beh(svc) = \(cust, req).[
	CREATE ret WITH return_beh(SELF)
	BECOME serializer_busy_beh(svc, cust, ret, NIL)
	SEND (ret, req) TO svc
]

An “idle” serializer receives a message consisting of a customer cust and a request req. It creates a return-proxy ret, transitions to “busy” state, and sends the request on to the actual service svc. The customer for the request to the actual service is the return-proxy ret rather than the original customer.

LET return_beh(caller) = \res.[
	SEND (SELF, res) TO caller
]

The return proxy eventually receives a response res from the actual service. When it does, it tags the response with its own identity, and sends the pair back to caller (the serializer, in “busy” state). The identity of the return proxy actor is used as an authentication token. Only the serializer that created the return proxy (and the proxy itself) knows the identity of the proxy. Therefore, no other actor could construct a message containing the identity of the proxy and send it to the serializer. This allows the serializer to distinguish between a return response and a new request.

LET serializer_busy_beh(svc, cust, ret, pending) = \msg.[
	CASE msg OF
	($ret, res) : [
		SEND res TO cust
		IF $pending = ((cust', req'), rest) [
			CREATE ret' WITH return_beh(SELF)
			BECOME serializer_busy_beh(svc, cust', ret', rest)
			SEND (ret', req') TO svc
		] ELSE [
			BECOME serializer_beh(svc)
		]
	]
	(cust', req') : [
		LET pending' = $((cust', req'), pending)
		BECOME serializer_busy_beh(svc, cust, ret, pending')
	]
	END
]

A “busy” serializer retains much more state than an “idle” serializer. In addition to the actual service svc, it captures the original customer cust for the current request, the identity of the return proxy ret, and a (possibly empty) list of pending requests that arrive during processing of the current request. A “busy” serializer receives two kinds of messages, return responses and new requests.

A return response is a pair consisting of the identity of the return proxy ret and the response value res. This is our indication that the service has completed processing the current request and produced a response. We immediately forward the response to the original customer cust. If the pending list is empty, we simply transition back to “idle” state. Otherwise, we extract a new customer cust' and request req' from the pending list. A new return proxy ret' is created, and we remain in “busy” state with the rest of the buffered messages. Finally, the new request is sent on to the actual service svc. This is the same as the behavior of the serializer in “idle” state, except that we have to keep track of any additional buffered messages.

Any message that is not a return response must be a new concurrent outside request. Since the service is still busy processing the current request, this new request must be buffered for future delivery. We accomplish this buffering by adding the entire message (customer and request) to the pending list.
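
To make the wiring concrete, here is a minimal usage sketch (not part of the serializer itself). It assumes the simulator’s println actor as a customer, and a trivial “echo” service, which simply sends each request back to its customer, stands in for any request-response service.

	LET echo_svc_beh = \(cust, req).[ SEND req TO cust ]
	CREATE gate WITH serializer_beh(NEW echo_svc_beh)
	SEND (println, #first) TO gate
	SEND (println, #second) TO gate

Both requests reach the echo service one-at-a-time. Whichever request arrives at the serializer second is buffered on the pending list until the response to the first has been forwarded to println.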

Implementing Queues

This queue implementation uses a serializer to guard internal state transitions among a group of collaborating actors. Customers of the queue issue requests to the serializer as if they were sending to the queue directly. Responses are forwarded to the original customer through a return-proxy created by the serializer.

We start with a behavior that could be considered a “constructor”. It contains all the information needed to initialize a new configuration of actors. It does not process any messages itself; instead, it installs its real initial behavior and then re-sends the message to itself. This is a form of lazy initialization, performed only once, if and when the first message is delivered to this actor.

LET serialized_queue_beh = \msg.[
	BECOME serializer_beh(NEW empty_queue_beh)
	SEND msg TO SELF
]

In this case, we initialize the actor as an empty queue wrapped in a serializer. The creator of an actor with this behavior need not be aware that a serializer is involved. In fact, we could choose to define empty_queue_beh within the local scope of this behavior, thus preventing direct use of the implementation. The (initially empty) queue service is created anonymously, so only the serializer has a reference to the service, thus only the serializer can send messages to the service.
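
The lazy-initialization idiom used here can also be captured as a reusable behavior. Here is a sketch of one possible generic form; the names lazy_init_beh and make_beh are hypothetical, not part of the implementation in this article.

	LET lazy_init_beh(make_beh) = \msg.[
		BECOME make_beh(NIL)
		SEND msg TO SELF
	]

With this, serialized_queue_beh could be written as lazy_init_beh(\_.serializer_beh(NEW empty_queue_beh)), delaying creation of the queue service until the first message arrives.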

Figure 2 – Serialized Queue Message Flow.

The protocol of a queue includes #put and #take requests. #take removes an item from the head of the queue and sends it to the customer. #put appends an item to the tail of the queue and sends the item to the customer as a synchronization signal. Figure 2 illustrates the interaction between the lazy-initialized serializer and the underlying queue service implementation in handling #put and #take requests from concurrent customers.
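
Before walking through the implementation, consider a hypothetical usage sketch, again assuming the simulator’s println actor as the customer.

	CREATE q WITH serialized_queue_beh
	SEND (println, (#put, #foo)) TO q
	SEND (println, #take) TO q

Note that the serializer guarantees one-at-a-time processing, not arrival order. The actor model makes no promises about message delivery order, so the #take may arrive first and respond with NIL; in either ordering, the #put responds with #foo as its synchronization signal, and if the #take arrives second it responds with #foo as well.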

LET empty_queue_beh = \(cust, req).[
	CASE req OF
	#take : [ SEND NIL TO cust ]
	(#put, item) : [
		CREATE e WITH queue_entry_beh(item, NIL)
		BECOME queue_beh(e, e)
		SEND item TO cust
	]
	END
]

When the queue state is “empty”, it responds to #take requests by returning NIL. If an “empty” queue receives a #put request, it creates a new queue entry e to hold the item, transitions to “non-empty” state, and sends the item back to the customer cust.

A “non-empty” queue retains references to the first and last entries in the queue. Of course, initially these are the same entry. But as the queue grows, this allows constant-time access to both the head and tail of the queue.

LET queue_beh(first, last) = \(cust, req).[
	CASE req OF
	#take : [
		BECOME queue_first_beh(cust, first, last)
		SEND (SELF, #remove) TO first
	]
	(#put, item) : [
		BECOME queue_last_beh(cust, first, last)
		SEND (SELF, #append, item) TO last
	]
	END
]

Removing or appending an item on a “non-empty” queue involves coordination between the queue and the queue entries. To handle a #take request, the queue transitions to “waiting-for-first” state, and sends a #remove request to the first queue entry. To handle a #put request, the queue transitions to “waiting-for-last” state, and sends an #append request to the last queue entry. While in these intermediate states we are assured that no outside requests will arrive because we have not sent a response to the serializer (our customer).

Each queue entry stores an item, and a reference to the next entry (or NIL if this entry is the last).

LET queue_entry_beh(item, next) = \(cust, req).[
	CASE req OF
	#remove : [
		SEND (SELF, item, next) TO cust
	]
	(#append, item') : [
		CREATE next' WITH queue_entry_beh(item', next)
		BECOME queue_entry_beh(item, next')
		SEND (SELF, item', next') TO cust
	]
	END
]

When a queue entry receives a #remove request, it sends its own identity, the item it holds, and its next link to the customer cust. The customer is always the queue itself, since no other actor sends messages to a queue entry.

When an #append request is received, a new queue entry next' is created to hold the new item'. The next links are adjusted to attach the new entry after the current one. Finally, this entry’s identity, the new item', and the new entry next' are sent to the customer (the queue itself).

While a #take request is in progress, the queue is waiting for the first queue entry to respond to a #remove request. In this state the queue retains the original customer cust and references to the current first and last queue entries.

LET queue_first_beh(cust, first, last) = \msg.[
	IF $msg = ($first, item, next) [
		IF $next = NIL [
			BECOME empty_queue_beh
		] ELSE [
			BECOME queue_beh(next, last)
		]
		SEND item TO cust
	]
]

When the first queue entry completes processing the #remove request, the queue receives a message that matches the first identity and provides the item removed and the next entry to become the new first. If there is no next entry (it is NIL), then the queue transitions to “empty” state. Otherwise the queue updates first with the value of next, returning to “non-empty” state. In either case, the item is sent back to the customer cust. Since this customer is the serializer (or more precisely, the return-proxy), this indicates completion of the serialized transaction and allows delivery of new requests from actors outside this collaborative group.

While a #put request is in progress, the queue is waiting for the last queue entry to respond to an #append request. In this state the queue retains the original customer cust and references to the current first and last queue entries.

LET queue_last_beh(cust, first, last) = \msg.[
	IF $msg = ($last, item, next) [
		BECOME queue_beh(first, next)
		SEND item TO cust
	]
]

When the last queue entry completes processing the #append request, the queue receives a message that matches the last identity, provides the item appended, and the next entry to become the new last. The queue updates last with the value of next, returning to “non-empty” state, and sends the item back to the customer cust. Since this customer is the serializer (or more precisely, the return-proxy) this indicates completion of the serialized transaction and allows delivery of new requests from actors outside this collaborative group.

Figure 3 – Queue Service State Machine.

Figure 3 summarizes the state transitions within the queue service. Requests from outside customers (via the serializer) can arrive only when the queue is in “empty” or “non-empty” states. The “waiting-for-first” and “waiting-for-last” states are internal states used to coordinate the actor representing the queue with the actors representing the queue entries.

Conclusion

Groups of collaborating actors can be treated as a single actor with the help of a serializer. We describe a generic, transparent serializer for request-response services, and use a queue implementation to illustrate it. Correct concurrent behavior of the queue requires coordinating updates across multiple collaborating actors. The serializer provides protection from overlapping requests, allowing the queue to operate safely.

References

[1] C. Hewitt and R. Atkinson. Specification and Proof Techniques for Serializers. IEEE Transactions on Software Engineering, Vol. SE-5, No. 1, January 1979.



16 Responses to “Composing Actors”

  1. Message Passing, part 2 – Object-Oriented Method Invocation

    […] for specific variables by matching their names as part of the lookup policy. Method calls may be serialized to prevent concurrent access to shared state. Every part of the meta-object protocol is explicitly […]

  2. Roy Oliver

    Actors are indeed powerful, yet are there any concrete simple (if possible) web project examples?

  3. Dale Schumacher

    I suggest you take a look at Akka and node.js as possible starting-points. Akka is explicitly actor-based, and node.js is built on asynchronous messaging, which can lead to larger-scale actor-like designs.

  4. Delucia

    I’m interested in the actor paradigm. However, the queue implementation is far more complicated than the OO approach and I don’t see the added value. Basically the serializer is doing locking to allow processing of one message at a time. Creating an equivalent in an imperative language like Python is trivial, with the same performance characteristics and far less code. Please correct me if I’m wrong.

  5. Delucia

    I realize now the difference, which is that in the case of locking the client will be blocked, while in the actor model it won’t. This is a big difference, because achieving the same effect in OO (not blocking) would require more code than with the actor model.

    I have a question regarding the functional nature of the solutions. Is this part of the actor model, is it specific to Humus, or is it just the way you prefer to do it?

    For example, is it ok to implement a queue using one actor that keeps an internal list to hold items, since the actor processes one message at a time?

  6. Dale Schumacher

    The point of this article is to illustrate how arbitrary collections of actors can be treated as a single actor. This allows scaling, by composition, to arbitrarily large systems, while maintaining the guarantee of correct concurrent behavior. I chose the queue as a simple-to-understand example, but this applies to services as complex as a full database. As you noted, the actor approach avoids blocking the whole system while delaying responses to customers contending for the same resource.

    The implementation details of the queue, like any actor, are conveniently encapsulated behind the asynchronous messaging interface. You are free to use any representation you like within the actor’s behavior, as long as you support the expected message protocol. The implementation I chose here has O(1) performance.

  7. Java Plain Old Concurrent Object « Josef's Blog

    […] · Schumacher, Dale, “Composing Actors”, “blog It’s Actors All The Way Down”, http://www.dalnefre.com/wp/2010/05/composing-actors/ […]

  8. senderista

    Storing pending messages in a linked list has O(1) push/pop, but also implies LIFO semantics, which means the serializer is liable to starve earlier clients once throughput reaches steady state. (FIFO semantics would require yet another queue implementation.)

  9. Dale Schumacher

    @senderista Good observation. I wanted to keep the description simple, so I used the easy-to-implement push/pop operations to store deferred messages. This is somewhat justified on the basis that there are no assurances about message delivery order in the actor model.

    Also, services are expected to generally keep up with demand, so the deferred message buffer will generally be short and often empty. This is true of any service that falls behind: if it’s protected by a serializer, messages will pile up in the deferred message buffer; if not, messages can still pile up in the message delivery system of the actor runtime. I expect to be posting a few articles that address this issue by providing back-pressure and flow-control among producers and consumers.

    [EDIT: See http://www.dalnefre.com/wp/2012/05/producer-consumer-rate-matching/ for much more sophisticated flow control]

  10. High Availability for Mutable Shared State

    […] previous articles, when we needed to defer customers, we often used a simple stack built by pairing new […]

  11. Dale Schumacher

    See http://www.dalnefre.com/wp/2011/11/high-availability-for-mutable-shared-state/#queue for an O(1) Banker’s Queue implementation that could be used to implement the “pending” list here.

  12. Debugging Actor Systems

    […] the interactions inherent in their connecting protocol, separately with no loss of generality. The Serializer is a simple example of this […]

  13. Java Plain Old Concurrent Object | T. C. Mits 108

    […] · Schumacher, Dale, “Composing Actors”, “blog It’s Actors All The Way Down”, http://www.dalnefre.com/wp/2010/05/composing-actors/ […]

  14. It's Actors All The Way Down » Serializers Revisited

    […] of the earliest concepts explored on this blog was the Serializer, which is a mechanism for providing exclusive access to a group of actors [1]. More recently, Carl […]

  15. David McClain

    I just tried constructing some serializers in my Lisp. Turns out, in an SMP multi-core environment, the BECOME primitive is a real problem…

    There is SMP contention on the behavior slot.

    If you make access to the behavior SMP-safe, it is still too late to redirect parallel threads that were already dispatched using the old behavior.

    So you have to serialize all executions of any Actors that want to call BECOME. And that serialization has to occur ahead of dispatch for other threads.

    And now that I think more about it, it isn’t just unsafe in SMP, but even in preemptive single-core systems. Any threads that have already passed dispatch become un-directable by a call to BECOME.

  16. Dale Schumacher

    I suspect that “become” is not the only problem. Actor behaviors must be single-threaded to get the right semantics. The most common model for doing this is to schedule actors across multiple threads/cores such that only the behavior from one event can execute on a given actor at one time. Pending events for other actors may be freely scheduled to other threads/cores.

    This leads to a traditional idle/ready/busy scheduling state for each actor. Idle if there are no message-events pending for that actor. Ready if there are. Busy if the behavior handling a message-event is currently running. If you have a shared ready-queue, each thread/core can pull (an actor) from the queue, dispatch a single message-event, and return the actor to the ready-queue, or idle if no more messages are pending for that actor.

    FWIW, that is definitely not the only scheduling policy that works. In fact, I typically use something more sophisticated. However, it is probably the most conceptually simple, and is similar to basic thread-scheduling in a multi-core system.
