Hypermedia Workflow

Purpose

Here, I'm going to take a look at how "workflow" might integrate into a hypermedia system built around REST's uniform interface constraint, and in particular around "hypermedia as the engine of application state".

There's been much excitement recently in Web services about choreography, orchestration, and the like. I've chosen to avoid those terms, because I don't think they're very well defined. So I'm using "workflow", which I think is mostly well understood. FWIW, it seems to jibe well with the term "process execution" as seen in BPEL4WS.

I'm actually going to focus on BPEL here, in explaining how workflow could be done on the Web.

BPEL4WS

Let's get right to some examples. I'll start at Chapter 10, "Structured Activities", since it defines most of the workflow capabilities of the specification.

Sequences

A BPEL Sequence activity contains a list of activities which are to be executed in order. The intent is to declare and expose an "order of operations", so to speak, for a series of WSDL-described Web services and their operations.

On the Web, in a solution respecting all of REST's constraints, a sequential dependency such as this would be communicated, at runtime, through the successive publication of URIs identifying possible state changes (as new resources). As Roy Fielding writes in his dissertation:

The model application is therefore an engine that moves from one state to the next by examining and choosing from among the alternative state transitions in the current set of representations. Not surprisingly, this exactly matches the user interface of a hypermedia browser. However, the style does not assume that all applications are browsers. [...]

So if the sequenced activity were to first put on your shirt, and then your coat, this would be reflected on the Web with a scenario such as:

GET on a URI to reveal a representation of a person with no shirt

representation includes a POST form declaring acceptance of representations of shirts

POST a representation of a shirt to that URI
POST response redirects back to initial URI, whose resolution now reflects that the person is wearing a shirt

new POST form reflects that it now expects representations of coats

POST representation of coat
etc.
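
The steps above can be sketched in a few lines of Python. This is only an in-memory illustration, not a real HTTP service: the PersonResource class, the SEQUENCE list, the dictionary-shaped "form", and the choice of 303 and 409 as status codes are all assumptions made for the example. The point it tries to show is that the client never learns the order of operations ahead of time; it simply follows whatever form the current representation offers.

```python
from dataclasses import dataclass, field

# Hypothetical order in which garments must be put on (illustration only).
SEQUENCE = ["shirt", "coat"]


@dataclass
class PersonResource:
    """In-memory stand-in for the resource behind the person's URI."""
    wearing: list = field(default_factory=list)

    def get(self):
        """Return a representation: current state plus the one transition on offer."""
        rep = {"wearing": list(self.wearing)}
        if len(self.wearing) < len(SEQUENCE):
            # The "form" advertises what the next POST is expected to carry.
            rep["form"] = {"method": "POST", "accepts": SEQUENCE[len(self.wearing)]}
        return rep

    def post(self, garment):
        """Apply a state change if it matches the advertised form; return a status code."""
        form = self.get().get("form")
        if form is None or garment != form["accepts"]:
            return 409  # assumed status: this transition was never offered
        self.wearing.append(garment)
        return 303  # assumed status: redirect back to the person's URI


def dress(person):
    """A client with no built-in knowledge of the order; it just follows the forms."""
    rep = person.get()
    while "form" in rep:
        person.post(rep["form"]["accepts"])
        rep = person.get()
    return rep
```

Calling dress(PersonResource()) walks the whole sequence without the client ever hard-coding "shirt before coat"; changing SEQUENCE on the server side would require no change to the client.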

Also note that Paul Prescod has described some similar examples that involve simple navigation of URIs via GET, without state changes.

Switch

A BPEL "Switch" is a construct for introducing conditions based on the evaluation of a simple expression. So one might say "If stock on hand is less than 5% of capacity, follow the 'low inventory' process". This is indeed a useful tool, but the condition itself need not be communicated to the other party in order to automate such a process. It can be viewed as an implementation detail, with only its effects exposed through an interface based on URIs and representational state.

For example, if a client or its machine agent were filling out an order for some number of widgets, then upon submission of this form, the code that processes it may make the "low inventory" determination itself and follow that process behind the scenes, returning different information, such as a document offering a 10% discount if the purchaser will accept a 5 day delay. In this case, the purchaser needs to know nothing about the internal details of the supplier's business processes, only the hypermedia interface and the data formats returned through it.
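
As a rough sketch of that idea, here is what the supplier's side might look like in Python. Everything named here is hypothetical: the CAPACITY figure, the process_order function, the dictionary layout of the representations, and the link URIs are assumptions chosen only to make the example concrete. The branch inside process_order plays the role of the Switch, and the purchaser sees only the resulting document and the links in it.

```python
# Hypothetical figures, for illustration only.
CAPACITY = 1000
LOW_INVENTORY_FRACTION = 0.05  # "less than 5% of capacity"


def process_order(quantity, stock_on_hand):
    """Server-side handler for a submitted order form.

    The condition below is the "Switch". It is never sent to the client;
    only the representation it selects is returned.
    """
    if stock_on_hand < LOW_INVENTORY_FRACTION * CAPACITY:
        # "Low inventory" process: offer a discount in exchange for a delay.
        return {
            "type": "offer",
            "quantity": quantity,
            "discount_percent": 10,
            "delay_days": 5,
            "links": {
                "accept": "/orders/1/accept-delay",  # hypothetical URI
                "decline": "/orders/1/cancel",       # hypothetical URI
            },
        }
    # Normal process: plain confirmation.
    return {
        "type": "confirmation",
        "quantity": quantity,
        "links": {"status": "/orders/1"},  # hypothetical URI
    }


def next_links(representation):
    """What a client actually works with: the transitions on offer."""
    return sorted(representation["links"])
```

The supplier could change the 5% threshold, or replace the rule entirely, without the purchaser's software needing to know.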

While

While declares the repeated execution of an activity for as long as some expression evaluates to true.

Again, this is an implementation detail that need not be communicated to a party using the service. It can be well hidden behind the hypermedia interface.

Pick

"Pick" provides a conditional construct similar to a Switch, except that rather than the decision being based on an expression, it's based on the arrival of a message or, alternatively, the expiration of a timer (which provides an upper bound on wait times for messages). The first message to arrive that is identified in the pick completes the pick.

The exposure of this construct through the uniform interface would exactly resemble the exposure of the Switch construct. That is, the fact that some particular condition was met (be it an event or alarm occurring, or an expression evaluating to true) and triggered a change in application state would be an implementation detail, whose effect would be reflected through the presentation of a URI at some point in the document flow.

The example given in the BPEL4WS spec is:

<pick>
  <onMessage partner="buyer"
             portType="orderEntry"
             operation="inputLineItem"
             container="lineItem">
    <!-- activity to add line item to order -->
  </onMessage>
  <onMessage partner="buyer"
             portType="orderEntry"
             operation="orderComplete"
             container="completionDetail">
    <!-- activity to perform order completion -->
  </onMessage>

  <!-- set an alarm to go after 3 days and 10 hours -->
  <onAlarm for="P3DT10H">
    <!-- handle timeout for order completion -->
  </onAlarm>
</pick>

The purpose of this example is to trigger an activity depending upon the occurrence of one of three events: the arrival of either of two types of message, or the expiration of a timer.

This could quite easily be hidden behind any interface, be it REST's uniform interface or WSDL's service-specific interfaces. For example, using the WSDL operations named in the example above, if the timer expires then some action may be taken, and the state of the service modified such that invoking the "inputLineItem" or "orderComplete" operations results in a "Too bad, timer has passed" fault. Similarly, this could be hidden behind REST's uniform interface by exposing two resources: one which identifies an order to which line items can be added (POSTed), and the other a processing container to which representations of orders can be submitted (POSTed).
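
A minimal Python sketch of the two-resource version follows. It is an in-memory model rather than a real HTTP server, and the class names, method names, and status codes (201, 202, and 410 for the "Too bad, timer has passed" case) are assumptions for illustration. The only value taken from the BPEL example is the timeout of 3 days and 10 hours. The current time is passed in explicitly so the behaviour can be exercised without waiting.

```python
from datetime import datetime, timedelta

# P3DT10H from the BPEL example: 3 days and 10 hours.
TIMEOUT = timedelta(days=3, hours=10)


class OrderResource:
    """The order: line items can be POSTed to it while it is still open."""

    def __init__(self, opened_at):
        self.deadline = opened_at + TIMEOUT
        self.line_items = []
        self.submitted = False

    def is_open(self, now):
        # Closed once the timer has expired or the order has been submitted.
        return not self.submitted and now < self.deadline

    def post_line_item(self, item, now):
        if not self.is_open(now):
            return 410  # assumed status for "too late"
        self.line_items.append(item)
        return 201


class OrderProcessor:
    """The processing container: completed orders are POSTed to it."""

    def __init__(self):
        self.received = []

    def post_order(self, order, now):
        if not order.is_open(now):
            return 410  # assumed status for "too late"
        order.submitted = True
        self.received.append(list(order.line_items))
        return 202
```

Which of the three events happened first is never stated to the client; it shows up only in how the two resources respond afterwards.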

Flow

A Flow is a construct that adds concurrency and synchronization capabilities to activities. Most simply, any activities nested in a flow element are declared to be able to execute concurrently. In workflow terms, this enables "Splits" and "Joins"/"Rendezvous".

From a hypermedia point of view, this is perhaps the most interesting of constructs, as one wouldn't automatically appreciate that the hypermedia application model was rich enough to encapsulate this functionality. I believe it is.

Take the simple example from the BPEL spec:

<sequence>
  <flow>
     <invoke partner="Seller" .../>
     <invoke partner="Shipper" .../>
  </flow>
  <invoke partner="Bank" .../>
</sequence>

This declares that the Seller and Shipper activities can occur concurrently, and that the Bank activity can occur only once both are complete. This identifies a "Split" at the beginning of the flow, and a "Join" at the end of it.

A hypermedia version of this might proceed as follows:

GET on a URI identifying the composite process

response is a 300, whose body provides URIs to the next steps in each of the Seller and Shipper processes. The semantics of "300" are such that each provided URI is a valid state transition, inducing the desired "Split" semantic (actually, even a 200/2xx may be fine - I'll have to mull that one over).

client proceeds to follow both processes
assuming the Seller process is completed first

client invokes MONITOR on the final URI provided by the Seller process (the URI would identify the Rendezvous with the Shipper process)
at some point, the Shipper process concludes and the service modifies the state of the Rendezvous to indicate a new URI which declares the next state transition in the Sequence
client is notified of updated Rendezvous due to established MONITOR, refreshes it, sees the new URI, and invokes GET on it

client follows next process to conclude Sequence

There's a lot more complexity to Flow that I'm going to skip over for now. If somebody wants to pay me, I'd be happy to finish it. 8-) Suffice it to say that the little-used 300 response code is sufficient for enabling "Splits" in a hypermedia application model, and that the to-be-standardized MONITOR HTTP extension (or Waka built-in) would provide a scalable way (compared with constant refreshing) to enable Rendezvous.
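
To make the Split and Rendezvous steps a little more concrete, here is a small in-memory Python sketch. It does not use a real MONITOR method (none is standardized); the monitor() callback registration is only a stand-in for it, and the Rendezvous class, the get_composite function, the dictionary layouts, and the URIs are all hypothetical. As a simplification, the client here registers its monitor right away rather than after finishing the Seller process.

```python
class Rendezvous:
    """Join point: reveals the next URI only once every branch has finished."""

    def __init__(self, branches, next_uri):
        self.pending = set(branches)
        self.next_uri = next_uri
        self.monitors = []

    def get(self):
        """Representation of the Rendezvous in its current state."""
        if self.pending:
            return {"status": 200, "waiting_on": sorted(self.pending)}
        return {"status": 200, "next": self.next_uri}

    def monitor(self, callback):
        """Stand-in for the proposed MONITOR method: register for change notices."""
        self.monitors.append(callback)

    def branch_done(self, branch):
        """Called by the service when one branch concludes."""
        self.pending.discard(branch)
        if not self.pending:
            for notify in self.monitors:
                notify()


def get_composite():
    """GET on the composite process: a 300 listing both branches (the Split)."""
    return {"status": 300, "alternatives": ["/seller/start", "/shipper/start"]}


def run_client(rendezvous):
    """Follow both branches, then wait to be told that the Join has happened."""
    visited = list(get_composite()["alternatives"])

    def on_change():
        # Refresh the Rendezvous and follow the newly published URI.
        visited.append(rendezvous.get()["next"])

    rendezvous.monitor(on_change)
    return visited
```

With a Rendezvous created for the "seller" and "shipper" branches, the list returned by run_client gains the Bank URI only after branch_done has been called for both, which is the Join.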

Conclusion

Web services have been lacking an application model since they arrived on the scene a few years ago, and "choreography" solutions are a response to that. For whatever reason, the hypermedia application model used by the Web was rejected, despite, as this paper attempts to lay out, its suitability for the task.

If the issue were as simple as two possible solutions to the same problem, this paper would be nothing more than an interesting comparison. However, a critical point is that the choreography application model requires that implementation details (the business process rules) be exposed as part of the interface. This makes such interfaces very brittle should the rules need to change, and drastically increases the coordination costs of implementing a multi-party process (since the more parties there are in the process, the more agreement has to be reached). By keeping process implementation details hidden behind the interface, neither of those problems arises.