Monthly Archives: August 2003

More on Web services and distributed objects

I wanted to respond to some of the detail in Werner’s article, in addition to ranting about how document exchange was state transfer. So here goes …

The first statement that really gave me pause was this;

The goal of web services is to enable machine-to-machine communication at the same scale and using the same style of protocols as human interface centered World Wide Web.

I don’t believe that’s the case, and it’s certainly not been accomplished IMO. I think that if you asked anybody involved since the early days, they’d say the goal is just the first part; to enable machine-to-machine communication over the Web (or perhaps Internet). “Using the same style of protocols” has never been a requirement or goal of this community that I’ve seen.

Consider what can be done with a Web services identifier versus a Web identifier. Both are URIs, but because Web architecture uses late binding, I know what methods I can invoke when I see a URI (the URI scheme, specifically) and what they mean (because there’s a path back to RFC 2616 from the URI scheme). With an identifier for a Web service, I don’t have sufficient information to know what the interface is and what it means, because Web services are early/statically bound (creating centralization dependencies, à la UDDI). I don’t consider changing the architecture from late/dynamic binding to early/static binding to be “using the same style of protocols”.
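To make the contrast concrete, here is a minimal sketch of what "knowing the methods from the URI scheme" might look like in Python. The `METHODS_BY_SCHEME` table and the `methods_known_from` function are hypothetical names made up for illustration; the method list is the one defined by RFC 2616.

```python
from urllib.parse import urlsplit

# Methods defined by RFC 2616 for HTTP/1.1. The table itself is a
# hypothetical illustration of "what the scheme tells me", not a registry.
METHODS_BY_SCHEME = {
    "http": ("OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"),
}


def methods_known_from(uri):
    """Return the methods a client can assume from the URI scheme alone."""
    scheme = urlsplit(uri).scheme.lower()
    # An unknown scheme tells the client nothing, so return an empty tuple.
    return METHODS_BY_SCHEME.get(scheme, ())
```

A Web service endpoint with an `http` URI gets the same answer from this function, but its service-specific operations (a hypothetical `getQuote`, say) are not in that list and cannot be discovered from the identifier; they have to be bound ahead of time from some other description.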

I suppose I also take issue with the implicit definition of “distributed objects” as part of Misconception #1, when it says;

An important aspect at the core of distributed object technology is the notion of the object life cycle: objects are instantiated by a factory upon request, a number of operations are performed on the object instance, and sometime later the instance will be released or garbage collected.

I’ll present my definition first; distributed objects are identifiable things encapsulating state and behaviour, which present an interface upon which operations can be invoked remotely. Obviously software objects, like all software, do have a lifecycle. But it’s primarily an implementation detail, and not exposed through the object’s interface. Some systems chose to tie the identifier to some particular in-memory instantiation of an object (rather than to an abstraction that an instantiated object could, in effect, act as a proxy for), which created a real mess, but I don’t consider that key to the definition.

Misconception #2 also seems prima facie incorrect to me, at least by my definition of “RPC”; an architectural style where the developer of each component is provided the latitude to define the interface for that component. More concretely, I believe the statement “there are no predefined semantics associated with the content of the XML document sent to the service” to be incorrect because, as I mentioned in my last post, if there is a method name in the document, then that is an explicit request for “predefined semantics”.

I agree with the titles of Misconceptions #3 and #4; Web services don’t need HTTP or Web servers. But I disagree with the explanation provided. That Web services are transport agnostic is fine, but that does not at all imply that they should be application protocol agnostic, although most people use them this way. The core of my disagreement is with this statement in the last paragraph of #4;

The REST principles are relevant for the HTTP binding, and for the web server parsing of resource names, but are useless in the context of TCP or message queue bindings where the HTTP verbs do not apply.

That is incorrect. REST is protocol independent, and has applicability when used with other protocols, be they application or transport. REST is an architectural style, and following its constraints guides the style of use of all technologies you might choose to work with. For example, if you were using FTP RESTfully, then you would do things like identify files with URIs (rather than server/directory/file-name tuples), and interact with them via a single “retrieve” semantic (an abstract “GET”), rather than the chatty and stateful user/pass/binary/chdir/retr process (not coincidentally, this is how browsers deal with FTP). In essence (though drastically oversimplifying), what REST says is “exchange documents in self-descriptive messages, and interpret them as representations of the state of the resource identified by the associated URI”. That philosophy can be applied to pretty much any protocol you might want to name, especially other transfer protocols (as the Web constitutes an “uber architecture”, of sorts, for data transfer).
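As a rough illustration of the FTP example, the sketch below takes an `ftp` URI and lists the stateful command sequence that a single abstract "retrieve" hides from the caller. It does no networking. The function name `ftp_commands_for`, the `anonymous`/`guest` defaults, and the example host are assumptions of mine; `USER`, `PASS`, `TYPE`, `CWD` and `RETR` are the standard FTP commands.

```python
from urllib.parse import urlsplit, unquote


def ftp_commands_for(uri):
    """List the FTP commands that one 'retrieve' on an ftp URI stands for."""
    parts = urlsplit(uri)
    if parts.scheme != "ftp":
        raise ValueError(f"not an ftp URI: {uri!r}")

    # Assumed defaults for illustration when the URI carries no credentials.
    user = parts.username or "anonymous"
    password = parts.password or "guest"

    # Each path segment but the last becomes a directory change.
    segments = [unquote(s) for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("URI does not name a file")
    *directories, filename = segments

    commands = [f"USER {user}", f"PASS {password}", "TYPE I"]
    commands += [f"CWD {d}" for d in directories]
    commands.append(f"RETR {filename}")
    return commands
```

The caller only ever holds the URI, e.g. `ftp_commands_for("ftp://ftp.example.org/pub/docs/readme.txt")`; the user/pass/binary/chdir/retr chatter becomes an implementation detail behind the one "retrieve" semantic.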

That’s most of the big issues as I see them. Anyhow, that was an enjoyable and informative read. Thanks!

Document misconceptions

Werner posted an article he wrote for IEEE Internet Computing titled Web Services are not Distributed Objects: Common Misconceptions about Service Oriented Architectures.

That article is very well written, and Werner makes his point loud and clear as always … but ultimately, it holds some of the same misconceptions as so many others have before it. In this case, I think I’ve boiled it down to one main misconception that I’ve talked about recently;

Web services are based on XML documents and document exchange […]

No, they are not. Just open your wallet and grab a cheque, or a credit card receipt, or your driver’s license. These things are what I know a “document” to be; state. If a cop asks me for my driver’s license and I hand it to her, I have performed “state transfer”; I haven’t asked her to do anything in particular by transferring this document to her. In contrast, the Web services view of a document includes a “method” which affects the semantics of the movement (aka transport) of that document. So if I had a Web services document which I handed to somebody, I’m not merely submitting that document to them, I’m asking that they perform some explicit action for me as specified by the contained method. This is a very very different thing than what “document exchange” is normally understood to mean.
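A toy sketch of the distinction, with documents modelled as Python dicts. Everything here is hypothetical (the field names, the `renewLicense` method, the `receive` function); the point is only that the second document carries an instruction and the first carries nothing but state.

```python
def receive(document):
    """Describe what the receiver is being asked to do with a document."""
    if "method" in document:
        # The document names an action, so it is a request, not just state.
        return f"perform {document['method']}"
    return "accept state"


# Pure state: handing this over asks for nothing in particular.
license_doc = {"holder": "A. Driver", "class": "G", "expires": "2005-08-31"}

# The same state wrapped with a method name: now an explicit request.
request_doc = {"method": "renewLicense", "body": license_doc}
```

`receive(license_doc)` is plain state transfer, while `receive(request_doc)` only makes sense if the receiver already knows what `renewLicense` means.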

I suggest that if you made the simple tweak to the big picture Web services vision to require that documents only contain state, then you’d have the Web, or at least a substantial part of it. I consider the Web to be the epitome of large scale document-centric distributed computing architectures.

Rendezvous vs. LLMNR

I’ve recently been coming up to speed on the whole Zeroconf space. Boy, what a mess.

Earlier this summer, it seems, the WG decided to go with a Microsoft-led approach to multicast name resolution, called LLMNR. This was in contrast to Apple’s similar and existing work on Rendezvous, which they published in both spec and code form.

So rather than start from a solution that works, with multiple independent open source implementations available, they’re starting from scratch with something new and unproven? Brilliant!

Oh, and there’s also the issue that the applications area seems to be sitting on their duffs over the kind of transparency that LLMNR is forcing upon them by hiding the fact that the name resolution was performed via local multicast rather than via DNS-proper. Keith Moore seems to be the only well-known “apps” person raising any objections.

Update; Stuart Cheshire, main Apple guy on this stuff, just posted his review of the last call working draft of LLMNR last night. Read it for yourself.

SOAPAction revisited

Every day, I get somewhere around 20 hits for the SOAP media type registration draft, referred from an old O’Reilly weblog entry of mine on SOAPAction. It turns out that this article is the first result returned when Googling for “SOAPAction”.

I feel a bit bad about this, because I only recently realized that the behaviour I described in that blog isn’t per any of the specs (obviously I don’t use SOAP at all 8-). I was extrapolating about its semantics based on some investigations into self-description and previous attempts at SOAP-like technologies such as RFC 2774 and PEP (specifically, this part, i.e. the 420 response code).

If SOAPAction/action were to be as I described there – and IMO, this would make it vastly more useful (i.e. make it useful at all 8-) – then the behaviour would have to have been specified to fault if the intent indicated by the value of the SOAPAction field were not understood. Obviously that isn’t the case today.
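Here is a minimal sketch of the behaviour I had in mind, which, to be clear, is not what any of the SOAP specs say. The `ActionNotUnderstood` exception, the `dispatch` function and the example action URIs are all hypothetical.

```python
class ActionNotUnderstood(Exception):
    """Raised when the declared intent of a message is not recognised."""


def dispatch(headers, known_actions):
    """Return the declared action, faulting if it is present but unknown."""
    # SOAP 1.1 sends the value as a quoted string, so strip the quotes.
    action = headers.get("SOAPAction", "").strip('"')
    if action and action not in known_actions:
        # Mandatory semantics: never silently process a message whose
        # declared intent we do not understand.
        raise ActionNotUnderstood(action)
    return action or None
```

With `known_actions = {"urn:example:getQuote"}`, a message declaring `"urn:example:getQuote"` is dispatched, and one declaring `"urn:example:deleteEverything"` produces a fault rather than being processed anyway.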

Sorry for the confusion.

Media type registration, decentralization, and RDF

Mark Nottingham suggests the W3C should take it upon themselves to clean up the media type registration process. I sort of concur, in that the official registration procedure doesn’t explain in sufficient detail that the burden of managing the timeline falls entirely on the registrant. This caused lots of delay during the registration of RFC 3236.

But on the other hand, I like it when centralized registries are difficult to use. If there’s really a need for a bazillion different data formats, then a centralized registry is the wrong approach, and the difficulty of using it – multiplied by the number of people experiencing it – should provide sufficient impetus for somebody to suggest a change to a decentralized process.

Of course, I don’t believe we need a bazillion different data formats. I think we have a perfectly good 80% solution, which is why I’m not spearheading any efforts in this direction – though I think it would still be useful (just not required) to decentralize media types.

P.S. here’s an amusing data point, where Roy takes the W3C to task over its inability to properly register media types.

Ted Leung responds

Ted Leung – whose weblog I just subscribed to a couple of weeks ago and I enjoy reading immensely – just commented on my blog about Adam Bosworth.

First off, I want to be clear that I wasn’t “taking Adam to task”. I was just honestly excited to see that he appeared to be closing in on understanding the Web via the seemingly identical path that I took. I think you have to have had the “Web epiphany” before you can appreciate why this excites me so much. 8-)

Ted writes;

Cross off CORBA and replace it with either REST or web services. The Web is already there. The missing piece is OpenDoc or something like it.

I don’t dispute that the browser provides a relatively weak form of compound document framework when compared to OpenDoc and CommonPoint, but my emphasis at the time was in studying the architecture of the system to see if it prevented richer frameworks from being built by extension. And I discovered that no, it didn’t prevent this from happening, and in addition already had some of the architectural features that I felt were required (XML namespaces (well, they came later), serialization-centric (GET), binding of state to behaviour (Content-Type), etc.). And sure enough, we’re finally beginning to see some of these systems being developed now. So I wouldn’t say that we’re missing OpenDoc, I’d just say that we’re working with a primordial-but-extensible version of it.

BTW, I also discovered that just by historical accident, an important part of what I expected to see – client side containers – wasn’t there. Cookies really threw me for a loop for many months, and it wasn’t until I read Roy’s dissertation that I realized that he didn’t like cookies, and that the RESTful solution to the problem they addressed was also a perfectly compound-document-framework friendly solution.

Self-description, redux

Kudos to Kendall Clark for stating XML is Not Self-Describing, as I did last month.

He writes;

Well, I’ve read too much Wittgenstein (not to mention too much Aquinas, Meister Eckhart, and Julian of Norwich) to think that a name is necessarily a self-description

I haven’t read them at all (8-), but I think I have a pretty good understanding of self-description that I developed “bottom up” during my study of Web architecture over the past few years. As Kendall brought this up again, I thought I’d write a few more words about it.

As I see it, description is always with respect to some context. For example, “The sky is blue” is not a self-descriptive statement unless you know;

  • English
  • Which sky I mean
  • Which colour blue I mean

For any bag-o-bits, it seems to me that there exists a finite amount of contextual knowledge which is necessary in order to be able to understand it. “Self-describing” then, should mean that the bag itself contains sufficient information to identify the required contextual knowledge.
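One way to picture this, as a sketch only: a message that carries a `Content-Type` header is pointing at the contextual knowledge needed to read its body. The `SPECS` lookup table and the `identified_context` function are hypothetical; the header parsing uses Python's standard `email.message.Message`.

```python
from email.message import Message

# Hypothetical lookup from media type to the spec that registers or defines it.
SPECS = {
    "application/xhtml+xml": "RFC 3236",
    "text/plain": "RFC 2046",
}


def identified_context(content_type_header):
    """Report which contextual knowledge a Content-Type header points to."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()
    return {
        "media_type": media_type,
        "charset": msg.get_content_charset(),
        # None here means the bag names a context we cannot resolve.
        "spec": SPECS.get(media_type),
    }
```

For `"application/xhtml+xml; charset=utf-8"` the message itself has told us which spec and which character encoding to apply; a bare element name in an XML document, with nothing to resolve it against, has not.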

Tim Berners-Lee likes to talk a lot about this. Last year in Honolulu at WWW2002, his keynote was Specs Count, and much of it was about the value of being able to successively apply public specifications in order to understand a message. That’s contextual knowledge, and as you can see in his talk, it doesn’t begin with the HTTP message, it goes all the way down to the IP datagram and Ethernet frame; even those bits must be considered (see an example of where this issue can show up in practice).

Where the Web fits in here, is with its contribution of an enormously valuable piece of contextual knowledge; RFC 2396 aka URIs. With respect to the example above, I can use URIs instead of strings, where those URIs can be used to provide the specifics of which blue I meant, by relating it to other colours.

There’s lots to be said about XML, RDF, and why SOA-based Web services can never be self-descriptive (hint; too many methods). But I’ll leave it at that for now.

Adam Bosworth is about to grok the Web

Adam Bosworth sort of lays out some requirements for a “Web services browser”. It’s really funny for me to read this, because I was struggling with exactly these same questions back in 1996 or so, coming from some seriously hard-core CORBA work, but while also being a big fan of what we called the “Universal Front End”; a chunk of software deployed everywhere which could conceivably make use of every service out there. I spent a good amount of time trying to figure out how to integrate CORBA, OpenDoc, and the Web, in an attempt to yield what Adam’s asking for. A couple of years later, I figured it out; the key was the Web’s uniform interface, that you didn’t need to give services service-specific interfaces, you could accomplish the same tasks by exposing the “data objects” (aka resources) of that service via a common set of data-object-centric methods (what Adam refers to as “Add, Delete, Modify”, but which might as well just be GET, POST, PUT, DELETE).
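Here's a small in-memory sketch of that idea: no service-specific interface, just data objects behind a common set of methods. The `ResourceStore` class, its URI layout, and the `/orders` example are hypothetical and only loosely mirror GET, POST, PUT and DELETE.

```python
class ResourceStore:
    """Toy uniform interface: every data object is reached the same way."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def get(self, uri):
        # Retrieve the current representation; raises KeyError if absent.
        return self._resources[uri]

    def put(self, uri, representation):
        # Create or replace the state of the resource at this URI.
        self._resources[uri] = representation

    def post(self, collection_uri, representation):
        # Add a new member to a collection and return the URI minted for it.
        uri = f"{collection_uri.rstrip('/')}/{self._next_id}"
        self._next_id += 1
        self._resources[uri] = representation
        return uri

    def delete(self, uri):
        # Remove the resource; deleting a missing resource is a no-op.
        self._resources.pop(uri, None)
```

An order service and a stock quote service would both be driven through these same four methods; only the URIs and the representations differ, which is what lets a single "Universal Front End" talk to all of them.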

I can even clearly recall holding some of the same misconceptions he had. For example;

Remember, in this dream, a web services browser is primarily interacting with information, not user interface.

This suggests that today’s Web is about user interfaces. It isn’t, it’s precisely about “interacting with information”, where each information source is provided a URI, and the information is returned on a GET. It pains me to think back to when I didn’t even understand that simple point, because it’s so darned obvious now. But I’m comforted by the fact that gurus like Adam don’t get it, yet. 8-)

Adam earlier mentioned that he was going to be talking about REST. I’m very eager to hear what he has to say about that, given how RESTful his description of a “Web services browser” is. I think he’s almost there.

And BTW, with respect to mobility, I do believe that an additional constraint on top of REST could be useful. I actually wrote up a design at Sun back in 1999 about doing Servlets as Applets, permitting application code to be run in the browser. But Applet integration with the browser was just awful at the time (I think it’s gotten worse since 8-), making this basically infeasible. I should have investigated JavaScript, but didn’t; Mod-pubsub does some of the things with JavaScript that I couldn’t do with Java, in particular intercepting submission of POST data.