Werner responds. I don’t think we’re that out of synch, but I maintain that from what I’ve read of the techniques he’s talking about, they are not suited for Internet scale use. And by that, I mean a few orders of magnitude larger than the 10K/100K numbers he quotes. More like 10^8-10^11.

I know that Werner is anti-transparency, as am I. It was really interesting to watch the evolution of GCS research and tools in this regard. Sometime during this transition, “group communications” stopped being referred to as such, perhaps due to the reduced degree of coupling between members of a group; a result of the movement away from transparency (or perhaps because of the bad rep that it got due to the early highly-transparent commercial toolkits being quite brittle 8-). I guess I never bought into that terminology switchover though, as I always considered “group communications” to refer to any multi-party state-alignment based approach to consensus problems, which I’d say that even Werner’s group’s more recent work falls under. Hopefully that explains my seemingly outdated view of the work of his group.

Anyhow, to see what a RESTful “GCS” might look like, I point to KnowNow and their mod_pubsub project. The principal means of managing reliability is via the stateless approach that’s part of REST, that KnowNow reused. That is, the client maintains application state, and so is responsible for dealing with partial failures and getting up-to-state (using GET, of course).

Werner Vogels comments on my argument against reliable messaging. I’m not sure he read it in its entirety though, as he leads off by saying;

I was surprised to read Mark Baker’s statement that he feels there is no need for reliable communication provisions in web-services runtimes.

Which isn’t the case, because I said that HTTP could do with some reliability help. What I’m against is the specific solution of an application-independent reliable messaging layer. There are other ways of achieving the same goals, though at the expense of application protocol neutrality (see below).

I fully understand that some of Werner’s work, and pretty much the whole group communications style of distributed computation, builds upon the reliability-as-a-layer approach. I studied his work, and the work of his group at Cornell under Birman, and even developed code with the Isis toolkit. But GCS doesn’t scale up to the size of system I’m interested in, or that Web services are struggling to be. Perhaps it has a role in the LAN, or in other small group environments though. It’s definitely cool tech.

I’m for “doing reliability” in the application layer, by coordination, with a coordination language in the guise of an application protocol. As I mentioned above, that means disposing of the notion of doing it in a protocol-neutral manner, in so far as application protocols define the application. So basically that means ensuring that when you design an application protocol, it’s able to give you the reliability semantics you need (within the realm of possibility, as we seem to agree on 8-), or can be extended to do so. HTTP is such a protocol for the hypermedia application model. The interesting question is, how general is the hypermedia application model, and is it relevant to your problem? I say, yes, there’s a good chance it is relevant, and it’s certainly relevant for Web services.

Jorgen responds to my comments on a presentation he gave last week.

Re my comment that architectural styles are pattern languages, not patterns, I can only point to Roy’s dissertation on this, where he suggests the association between an Alexander “pattern language” and an “architectural style”, by suggesting indirectly that both are a system of patterns that constrain the form of the resultant system. “Stateless”, “Uniform interface”, “Layered Client Server”, etc. are constraints, and when coordinated together, form the REST architectural style.

Re the stateful/stateless point (both parts), I don’t see how whether the targeted endpoint does additional dispatch or not, matters to this issue, unless of course that dispatch operation uses some state (which is not required of an OO style, AFAICT). You suggest that all object oriented styles are stateful, yet REST is stateless, and it’s object oriented in that messages are targeted at identifiable objects.

Re SQL, and to add to my last blog, it’s true that a SQL row may be a resource, but it’s not the case that all resources (as defined in 2396) can be manipulated via SQL.

Re POST, perhaps we miscommunicated, but by “partial update” I thought you meant that the meaning of POST allowed a partial state change to be communicated to the client, which it doesn’t. It is true that the effect of a POST may be a “partial update” of the state of the resource, but the issue is that a client will not know that. All a successful response to a POST can mean to a client is “thanks, I accept this data”. So I’d say that your comparison to SQL UPDATE is inaccurate, because after a successful UPDATE, the client knows the state of some part of some table. UPDATE is more like PUT for this reason, whose successful invocation informs the client that the state of the resource is what they asked for it to be.

Jorgen writes;

And there I was thinking this statement would be an olive branch for you, Mark ;-)

Heh, yah, I appreciated the attempt, but I felt it was pretty early to propose a synergy existed before you really understood REST. Oh, and please don’t take that the wrong way 8-); understanding REST isn’t a matter of smarts, it’s just a matter of recognizing what it is (or more specifically, what an application protocol is, at least in my experience as an ex-CORBA lover). I’m confident that you’ll understand soon enough because of your broad experience, and your eagerness to learn.

I still don’t see any real conflict – you can still have requests returning XML data, the only question would be whether the request data must be in XML format or whether it can be encoded into a URL/URI.

Depends what you mean by “request data”. A typical Web services centric solution, because it’s normally early bound, requires that the request data include a method name. REST, because it’s late bound, requires only that you provide an identifier. From a cost-of-coordination perspective, the latter is vastly superior.
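A minimal sketch of that contrast, assuming purely hypothetical names (`QuoteService`, `get_last_trade_price`, the example.org URI) and in-memory stand-ins rather than any real toolkit. The early-bound request has to name a service-specific method; the late-bound request is nothing more than an identifier handed to a uniform `GET`.

```python
# Hypothetical in-memory stand-ins; no real network or library API is implied.

class QuoteService:
    """Early bound: the client must know this service-specific method name."""

    def get_last_trade_price(self, symbol):
        return {"IBM": 81.5}[symbol]


# Late bound: a uniform interface over identified resources.
RESOURCES = {"http://example.org/quotes/IBM": 81.5}


def GET(uri):
    """The one retrieval method that every resource shares."""
    return RESOURCES[uri]


# Early-bound request data carries a method name plus its arguments.
early_request = {"method": "get_last_trade_price", "args": ["IBM"]}
service = QuoteService()
early_result = getattr(service, early_request["method"])(*early_request["args"])

# Late-bound request data is just an identifier.
late_result = GET("http://example.org/quotes/IBM")
```

Both calls return the same value here; the difference is in what the two parties had to agree on beforehand.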

You can use Web Services standards and do pure REST. Equally you can use Web Services standards and _not_ do REST.

Of course. As I said before, I consider this a bug, not a feature.

Roy defined the null architectural style which could be said to be another style in which you can do REST or not. The way it does this is by being entirely devoid of constraints, which has the “disadvantage” (cough 8-) of also being entirely devoid of any desirable properties.

I’m not suggesting that you believe that the null style is a useful thing, but I’ve heard from a lot of Web services folks who feel that architectural constraints are a bad thing. From a software architecture point of view, this is madness. Have they never heard of entropy? 8-/

Fundamentally, cacheability IS a big factor as the client implicitly caches a local copy of the resource data at least for a time.

Sure, it is important, but as a side effect of the client maintaining the state (i.e. the interaction being stateless). If the style were said to “revolve around” anything, this (statelessness) could be one of the big things, sure.

I am not sure Roy Fielding’s dissertation would agree with your assertions here, Mark – see bottom of page 14:

Roy’s comment about combining styles doesn’t suggest how styles are combined. As I see it, if you’re using an architectural style with constraints A, B, and C which yield desirable properties X and Y, and then you want to add property Z which is known to be obtained via constraints D and E, then in order to get Z without giving up X and Y, your new style needs to have constraints A, B, C, D, and E.
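The A/B/C and X/Y/Z bookkeeping can be sketched as plain sets. This is only an illustration: which constraints induce which property (the `PROPERTY_REQUIRES` table) is an assumption made up for the example, not something taken from Roy's dissertation.

```python
# Hypothetical model: each desirable property is induced by a set of constraints.
PROPERTY_REQUIRES = {
    "X": {"A", "B"},
    "Y": {"B", "C"},
    "Z": {"D", "E"},
}


def properties_of(constraints):
    """Return the properties whose required constraints are all present."""
    return {
        prop
        for prop, needed in PROPERTY_REQUIRES.items()
        if needed <= constraints
    }


base_style = {"A", "B", "C"}
extended_style = base_style | {"D", "E"}  # add constraints, keep the old ones
null_style = set()                        # no constraints at all

base_properties = properties_of(base_style)
extended_properties = properties_of(extended_style)
null_properties = properties_of(null_style)
```

Under these assumptions the extended style keeps X and Y and gains Z, while the style with no constraints yields no properties at all.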

FWIW, this model of constraints and properties has been around for some time, since at least Perry and Wolf’s “Foundations for the study of software architecture” (using that terminology, anyhow). Some of the folks you quote, like Shaw, Garlan, Kazman, etc.. have accepted this model. I don’t see how what I’m saying is controversial in this regard.

I’m glad to hear you’re giving the presentation again, and I look forward to following up on this with you.

Jorgen asks Is REST the SQL of the Internet?.

There are definitely some similarities between REST’s uniform interface and the SQL language, most importantly that they are both coordination languages, a priori deployed application interfaces that defer component binding (i.e. late binding), which are ideal for deployment on a network between untrusted parties (hence the use of the word “coordination”). But “Resource oriented”, a term that Jorgen defines, doesn’t apply to SQL since its coordination semantics are not specific to “resources” as defined in RFC 2396 (i.e. they’re not uniform), just as it doesn’t apply to other coordination languages like Linda, SMTP, or IMAP. If I knew more about what he was trying to achieve with such a categorization, I might be able to recommend a better name.

Apparently OASIS has decided to tackle reliable messaging, with help from the usual non-IBM/MS Web services suspects.

I think “reliable messaging” is a huge waste of time. It’s akin to saying that the network is unreliable, so let’s just make a reliable network on top (which is different than “reliable data stream” ala TCP). Sorry, it just doesn’t work that way. “reliable network” is an oxymoron, for any number of reliability layers you might try to build on top.

As with most problems over the Internet, reliability is a coordination problem. That is, how do two or more independent pieces of software distributed on an unreliable network, coordinate to achieve some goal in a reliable manner (such that both know that the goal has been achieved or failed, etc.)? Unfortunately, you can’t coordinate “reliability” in a vacuum, like the typical reliable messaging approach of store/forward/timeout; you have to look at what task is being coordinated in the first place, and then figure out how to augment your coordination semantics such that the necessary degree of reliability can be achieved. In the context of the Web, that means using the uniform coordination semantics that are made available through HTTP.

Simple example. I want to turn on a lightbulb, and do it reliably such that I know if my attempt succeeded or not. I would use PUT. If I got back a 2xx, I would know the lightbulb was on. If I didn’t get back a response at all – say if the connection died after the request was sent – then I don’t know. But if I needed to know, I could do a GET. Perfectly reliable, no reliable messaging solution in sight.
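The lightbulb exchange can be sketched roughly as follows. The `Lightbulb` class is a hypothetical in-memory stand-in for the HTTP resource, and `ResponseLost` simulates the connection dying after the request was sent; neither is a real library API.

```python
class ResponseLost(Exception):
    """Raised when the request was sent but no response came back."""


class Lightbulb:
    """Hypothetical in-memory stand-in for an HTTP resource."""

    def __init__(self):
        self.state = "off"
        self.drop_next_response = False

    def put(self, new_state):
        self.state = new_state           # the server applies the request
        if self.drop_next_response:      # ...but the reply never arrives
            self.drop_next_response = False
            raise ResponseLost()
        return 200

    def get(self):
        return 200, self.state


def turn_on_reliably(bulb):
    """PUT the desired state; if the outcome is unknown, GET to find out."""
    try:
        status = bulb.put("on")
        return 200 <= status < 300
    except ResponseLost:
        status, state = bulb.get()
        return status == 200 and state == "on"


bulb = Lightbulb()
bulb.drop_next_response = True   # simulate the connection dying mid-exchange
succeeded = turn_on_reliably(bulb)
```

In this sketch the client learns the outcome from the resource itself, with no separate delivery layer underneath. A real client would also have to handle the GET failing, which it can simply retry.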

That example doesn’t work for everything of course, because PUT is idempotent and not all operations you might want to perform are idempotent. POST is different, but the requirements on it are different too, since if you use POST, you accept that you won’t know the state of the resource after a successful submission (getting deeper into that is a topic too large for a blog, sorry).
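A small sketch of why the distinction matters when a client retries blindly. The `Resource` class and the order text are hypothetical, chosen only to show the difference in effect.

```python
class Resource:
    """Hypothetical in-memory resource: a state plus accepted submissions."""

    def __init__(self):
        self.state = None
        self.submissions = []

    def put(self, representation):
        # Idempotent: the state becomes exactly what the client sent.
        self.state = representation
        return 200

    def post(self, data):
        # Not idempotent: each accepted submission has its own effect.
        self.submissions.append(data)
        return 200


bulb = Resource()
bulb.put("on")
bulb.put("on")                  # a blind retry is harmless

orders = Resource()
orders.post("one lightbulb")
orders.post("one lightbulb")    # a blind retry places a second order
```

After the repeated PUT the bulb is simply on; after the repeated POST there are two orders, and a 2xx alone does not tell the client what the resulting state is.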

Anyhow, I acknowledge that some work needs to be done to HTTP to help with reliability (as Paul describes). But that is in no way “reliable messaging”.

Bob DuCharme gets back to basics about RDF, and in doing so clearly highlights the value of partial understanding. Notice how the integration effort he undertakes stays roughly constant as the number of documents grows, rather than growing proportionally (O(N)) as it would if his code had to have full knowledge of all those schemas. By using RDF’s data model (note that each file uses a different serialization of the same basic RDF triples), this scaling problem is averted.
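A rough sketch of partial understanding, assuming the documents have already been parsed into (subject, predicate, object) triples. The URIs and the `a:`, `b:` and `c:` vocabularies are invented for illustration, and blank nodes are ignored. The query code knows one predicate and does not change as documents with unfamiliar vocabularies are added.

```python
# Hypothetical triples, as if parsed from three files in different serializations.
DOCUMENTS = [
    [("urn:ex:doc1", "dc:title", "First"), ("urn:ex:doc1", "a:shelf", "12")],
    [("urn:ex:doc2", "dc:title", "Second"), ("urn:ex:doc2", "b:rating", "5")],
    [("urn:ex:doc3", "c:colour", "red"), ("urn:ex:doc3", "dc:title", "Third")],
]


def merge(documents):
    """Merge graphs by taking the set union of their triples."""
    graph = set()
    for triples in documents:
        graph.update(triples)
    return graph


def titles(graph):
    """Use the one predicate this code understands; ignore everything else."""
    return sorted(obj for (_subj, pred, obj) in graph if pred == "dc:title")


all_titles = titles(merge(DOCUMENTS))
```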

It’s good to see Bill Gurley has restarted his “Above The Crowd” newsletter. His latest is “Software in a box”, where he talks about a topic that I’m quite familiar with; we shipped the first version of our software at Idokorro in a Cobalt Qube for some of the reasons he mentioned; ease of installation, manageability, etc..

One important reason that he didn’t list though, is integration. Typically, integration with third party software is done via proprietary APIs. If you put your software into its own (black) box, then the only interfaces you have for integration purposes are commodity application protocols.

Jorgen points to a presentation he’s giving at an OMG Workshop comparing Service Oriented, Resource Oriented, and Object Oriented architectural styles. This is somewhat similar to my REST Compared presentation, though mine was done at a lower level.

I’ll jot down some comments here as I read it.

Slide 10, “In other words, architecture styles are design patterns for the structure and interconnection within and between software systems.”. Sort of. Architectural styles are pattern *languages*, not just patterns.

Slide 12, “Two main types of distributed software systems”. This distinction seems quite arbitrary to me, and a superficial distinction at best.

Slide 13, Request/Response systems. Here, an association is asserted between request/response and RPC. I consider those entirely orthogonal issues. I know RPC systems which aren’t request/response (e.g. Orbix+Isis), and I know of request/response systems which aren’t RPC (HTTP). Perhaps it’s just a terminology issue, not sure.

Slide 16, Object Oriented 1. “Communications are implicitly stateful”; I don’t think that’s the case. Some are, like EJB/CORBA, but some aren’t, like MTS/COM (or whatever it’s called now; Biztalk?)

Slide 19, Resource Oriented 1. I wouldn’t include SQL here, since it’s not resource centric. If you wanted to create a “data oriented” super category, then you might include it and resource-oriented, and file-oriented (FTP), etc.. in there. It also says that resources have state and identity, as if to suggest that they don’t have behaviour like objects. That’s not the case, they do have behaviour.

Slide 21, Resource Oriented 3. POST isn’t for “partial updates”, it’s for triggering the “behaviour” part of identity/state/behaviour.

Slide 22, Resource Oriented 4. “Resource Oriented” is not an architectural style. REST is the only resource oriented architectural style I know about, though any REST extension would also be so. I suspect that any resource oriented architectural style will be either REST, or a REST extension like the Semantic Web, or ALIN. See also slide 24.

Slide 23, Service Oriented 1. “Communications are implicitly stateless (all requests are sent to the same service endpoint)”. That reason doesn’t have anything to do with stateful/statelessness, AFAIK. If it was meant that the message is sent to the same in-memory object, then that would be implicitly *stateful*, as knowing what “the same” is requires a memory between invocations on the server, and that memory is state. From the discussions I’ve had with Web services proponents, they say that service oriented architectures are state neutral; they can be stateful or stateless.

Slide 28, Web Services vs. REST 1. “There is no real conflict between the general idea of Web services and the REST approach”; excuse me?! 8-O There is a fundamental and inescapable conflict between the two. They cannot peacefully co-exist. That Web service definition you quoted is completely incompatible with the REST architectural style. End of story.

Slide 29, Web Services vs. REST 2. “The total set of Web services specifications provide a superset of the REST approach”. Absolutely true. Yet by doing this – creating a superset – most of REST’s important architectural constraints are obliterated. A superset (in this sense of the word) of REST is not REST. Architectural styles combine via constraint intersection, not union.

Slide 32, “Choosing Between Architectural Styles”. Resource oriented styles don’t revolve around cacheability. That’s a small part of the advantages of the style. There is nothing that is read-only/mostly/idempotent specific about its style. It’s simply optimized for coarse grained data transfer, like document based service oriented styles, only more loosely coupled via late binding. Anyhow, even if you disagree, you should at the very least try to defend that position; there’s nothing in there that does that.

Slide 33, “Combining Architectural Styles”. See slide 29 comment above. You cannot combine styles in this way, and hope to preserve the constraints that are giving you your desired properties … which brings me to a general point about this presentation; despite the very a propos snippets up front from the likes of Fowler, Shaw, Kazman, etc.., very little of that experience appeared to be reused in the rest of the presentation. For example, I would have expected the presentation to talk about the architectural properties of each of the styles presented, and extract from that the domain of applicability.

Slide 34, “How to avoid the Choice of Style”. Yikes. As Neil Peart wrote, “If you choose not to decide, you still have made a choice”. This is a recipe for failure. Even SCOUT2 assumes an architectural style, though a mostly constraint-less one (which is a bug, not a feature).

In general, I had high hopes for this, but with all due respect to Jorgen, it needs a lot more work. Not only does it misrepresent the value of the approaches represented without backing up its claims, but it evaluates them in a context very different from the one accurately(!) described in the first few pages by the gurus of the field of software architecture.

Here’s a telling statement for you, from a BEA press release;

Recognizing that every integration project requires development, and every development project requires integration

This is absolutely wrong; every integration project does not require development.

This afternoon I subscribed to (i.e. integrated into my existing aggregate feed) a few more weblogs, and did it without any new coding. This is because I already use some software which can snarf RSS feeds for me; I just had to identify the feeds for it.

Of course, if you assume, as BEA does, that each new service has a different interface than other services, then obviously new development is going to be required, just as if each RSS feed had its own interface. But that need not be the case, as I’ve demonstrated.
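A minimal sketch of the point, with a hypothetical in-memory `WEB` table standing in for feeds fetched over HTTP GET. Subscribing to another feed changes the data the aggregator is given, not its code.

```python
# Hypothetical feeds keyed by URI; a real aggregator would fetch these with GET.
WEB = {
    "http://example.org/a.rss": ["A1", "A2"],
    "http://example.org/b.rss": ["B1"],
    "http://example.org/c.rss": ["C1"],
}


def fetch(uri):
    """Stand-in for an HTTP GET that returns parsed item titles."""
    return WEB[uri]


def aggregate(subscriptions):
    """Same code path for every feed: one interface, one data format."""
    return [item for uri in subscriptions for item in fetch(uri)]


subscriptions = ["http://example.org/a.rss"]
before = aggregate(subscriptions)

subscriptions.append("http://example.org/c.rss")   # subscribing is data, not code
after = aggregate(subscriptions)
```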

I know, I’m not being entirely fair; even if some new Web resource used the HTTP interface, but used a data format the client wasn’t familiar with, then more coding would need to be done. Which is why I’m so keen on RDF.

Jorgen Thelin references Bill de hÓra’s blog on session state.

Unsurprisingly, it appears there are some serious misconceptions about state going on here. As Ken Arnold said, “State is Hell”, which jibes with my experiences.

Bill writes;

One issue with the REST hypertext model is its view on managing state. REST constrains that state reside on the client. However the real web works precisely backwards to this: all interesting state is kept on the server, as sessions.

That second sentence is incorrect. Even cookie based Web apps still keep most application state on the client. If they didn’t, you wouldn’t be downloading web pages. The only architectural style I know of that keeps all state server-side is the Remote Session style, ala VNC.

And when experienced enterprise practitioners like Martin Fowler say that state belongs on the server, it makes you wonder whether REST has anything to offer here.

I’d have to see a reference. I’d be very surprised if Martin said this in the way you’re implying he meant.

Then Jorgen writes;

[…] my hunch is that REST can’t provide a clean solution to the general category of “conversation” style applications (involving server-controlled request state transitions) which is the more usual meaning of the concept of “session state”.

I’m a bit fuzzy on what Jorgen means by “server-controlled request state transitions”, but I know that REST can handle what’s commonly called “session state” quite well. Moreover, even HTTP can handle it, without any extensions or use of cookies. All that’s needed is a client side implementation which manages state better. For example, rather than implementing “shopping carts” as a server-side container resource for shopping items, implement it as a generic client-side container, where the shopping/checking behaviour is a result of POSTing a representation of that client-side container to a checkout processor on the server. This generic client-side container would be useful for holding any application state, and therefore anything you’d want to do as a session.
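A rough sketch of the client-side container idea. Everything here is hypothetical: the `ClientContainer` class, the JSON representation, the item URIs and the price table are assumptions for illustration, and `checkout_processor` is a plain function standing in for the server resource that would receive the POST.

```python
import json


class ClientContainer:
    """Generic client-side holder of application state (hypothetical)."""

    def __init__(self):
        self.items = []

    def add(self, uri, quantity=1):
        self.items.append({"item": uri, "quantity": quantity})

    def representation(self):
        return json.dumps({"items": self.items})


# Assumed server-side price list, keyed by item URI.
PRICES = {
    "http://example.org/items/bulb": 2.5,
    "http://example.org/items/lamp": 20.0,
}


def checkout_processor(body):
    """Stand-in for the server resource the container is POSTed to.

    It keeps no per-client session: everything it needs is in the request.
    """
    order = json.loads(body)
    total = sum(PRICES[e["item"]] * e["quantity"] for e in order["items"])
    return 200, {"total": total}


cart = ClientContainer()
cart.add("http://example.org/items/bulb", quantity=4)
cart.add("http://example.org/items/lamp")
status, receipt = checkout_processor(cart.representation())
```

The server sees the cart exactly once, as the body of the checkout request, so there is no server-side session to keep alive or time out.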

There are all kinds of pros and cons to this of course, as there are with any solution developed within the bounds of any architectural style, but REST can do it without batting an eyelash.