Steve’s next article for his “Towards Integration” column is titled “Web Services Notifications”. It’s the usual high quality stuff you can count on from Steve, but I’d like to respond to a couple of comments made regarding my forte, Web architecture. Something tells me Steve probably anticipated this response. 8-)

A URL specifies only a single protocol, but a service could be reachable via multiple protocols. For example, a service might accept messages over both HTTP and SMTP, but any URL for the service can specify only one of those access methods.

It’s very commonly believed that URI scheme == protocol, but that really isn’t the case. A URI scheme defines a few things, but the most important are the properties of the namespace it forms, in particular the scope and persistence of uniqueness, whether it’s hierarchical, and probably some other things I’m forgetting. Defining an algorithm that maps an identifier in that namespace to a connection to a remote server someplace on the Internet is independent of those properties. Consider that:

  • interactions with mailto URIs don’t necessarily begin or end with SMTP
  • an http URI can be minted and used successfully before a Web server is installed, or even while the Internet connection is down
  • RFC 2817 describes how to interact using HTTPS with a resource identified by a http URI using HTTP only as a bootstrap mechanism via Upgrade. Upgrade isn’t specific to HTTPS either.

There is certainly a relationship – as defined by the aforementioned algorithm – between a URI scheme and a protocol in the context of dereferencing and sending messages, but as the last two points above describe, it’s not quite as clear-cut as “URI scheme == protocol”.
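To make the separation concrete, here is a minimal Python sketch. It treats a URI purely as an identifier, pulling apart its namespace structure with no network in sight, and keeps the "which protocol might I try" question in a separate table. The CANDIDATE_PROTOCOLS table, the describe function, and the example.org URIs are all made up for illustration; no spec defines such a table.

```python
from urllib.parse import urlsplit

# Hypothetical lookup of protocols a client *might* try for a scheme.
# Purely illustrative: the point is that this mapping lives outside
# the identifier itself.
CANDIDATE_PROTOCOLS = {
    "http": ["HTTP", "HTTPS via Upgrade (RFC 2817)"],
    "mailto": ["SMTP", "hand-off to a local mail client"],
}

def describe(uri):
    """Split a URI into its parts without opening any connection."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "candidates": CANDIDATE_PROTOCOLS.get(parts.scheme, []),
    }

# Works with no Web server installed and no Internet connection.
info = describe("http://example.org/queues/orders")
mail = describe("mailto:mark@example.org")
```

Nothing here touches a socket, which is the point: the identifier is usable as a name well before anyone decides how, or whether, to dereference it.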

Steve also adds;

URLs can’t adequately describe some transport mechanism types. For example, message queues typically are defined by a series of parameters that describe the queue name, queue manager, get and put options, message-expiration settings, and message characteristics. It isn’t practical to describe all of this information in some form of message-queue URL.

I’ve had to tackle this exact problem recently, and I figure there are two ways to approach it. One is to suck up the ugly URI and embed all that information in it; I’m confident that could be done in general, because I’ve done something very similar. I would highly recommend this solution if you can do it, because it’s efficient. But if you can’t, you can always use a URI which doesn’t include that information, but which is minted at runtime as a result of POSTing the endpoint’s descriptive data to a resource which hands out URIs for that purpose. That requires an additional coordination step, but you get nice, short, crisp-looking URIs.
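Here is a rough Python sketch of both approaches, under some loud assumptions: the parameter names (queue, manager, expiry), the example.org URIs, and the embed, extract, and Minter names are all invented for illustration, and Minter is an in-memory stand-in for a real resource you would POST to.

```python
from itertools import count
from urllib.parse import parse_qsl, urlencode, urlsplit

def embed(base, params):
    """Approach 1: carry the whole endpoint description in the URI."""
    return base + "?" + urlencode(sorted(params.items()))

def extract(uri):
    """Recover the endpoint description from an approach-1 URI."""
    return dict(parse_qsl(urlsplit(uri).query))

class Minter:
    """Approach 2: stand-in for a resource that accepts a POSTed
    description and hands back a short URI that refers to it."""

    def __init__(self, base):
        self.base = base
        self.descriptions = {}
        self._ids = count(1)

    def post(self, params):
        uri = f"{self.base}/{next(self._ids)}"
        self.descriptions[uri] = dict(params)
        return uri

# Hypothetical queue description.
params = {"queue": "ORDERS", "manager": "QM1", "expiry": "3600"}

long_uri = embed("http://example.org/mq", params)   # ugly but self-contained
minter = Minter("http://example.org/q")
short_uri = minter.post(params)                      # crisp, needs the extra step
```

The trade-off shows up directly: the long URI can be decoded by anyone holding it, while the short one only means something to whoever can ask the minting resource what it stands for.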

Moreover, not only do I believe that URIs are great for identifying message queues, I believe (surprise!) that http URIs are. Consider what it means to invoke GET on a message queue; what’s the state of a queue (that will be returned on the GET)? Why, the queued messages of course. This, plus POSTing into the same queue, is the fundamental innovation of mod-pubsub, IMO.
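As a toy illustration of that idea, here is an in-memory Python sketch of a queue treated as a resource. QueueResource and its get and post methods are names I’m making up here; this is not mod-pubsub’s API or implementation, just the shape of the interaction.

```python
class QueueResource:
    """A message queue modelled as a resource with GET and POST."""

    def __init__(self):
        self._messages = []

    def post(self, message):
        """POST: append a message to the queue."""
        self._messages.append(message)

    def get(self):
        """GET: return the state of the queue, i.e. the queued
        messages. Safe: calling it does not change the queue."""
        return list(self._messages)

orders = QueueResource()
orders.post("order 1: bananas")
orders.post("order 2: more bananas")
state = orders.get()
```

Note that get returns a copy and removes nothing, in keeping with GET being safe; consuming messages would be a separate operation.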

Next up…

URLs do not necessarily convey interface information. This is especially true of HTTP URLs, because Web services generally tunnel SOAP over HTTP.

Wait a sec, you’re blaming http URIs for the problems caused by tunneling?! 8-O 8-) Most (good) URIs do convey interface information, at least in the context of dereferencing (which is the only place that an interface is needed). So if I see a http URI, I can try to invoke HTTP GET on it (keeping in mind some of the considerations mentioned above).

Savas writes;

Mark Baker talks about the WSDL WG’s decision not to require the name of an operation in the body of a SOAP message

Just to be clear, the issue wasn’t about placing the operation name anyplace in particular. It was just that I wanted a self-descriptive path to find it, no matter where it’s located. That could be in the body, the headers, the underlying transfer protocol, in a spec, or in a WSDL document someplace.

Of course, I think having the method name in the SOAP message is harmful. I’d much rather it were inherited from the underlying transfer (not transport!) protocol, at least when used with a transfer protocol.
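A small Python sketch of the difference, with hypothetical URIs and a made-up getQuote operation. Nothing is sent over the network; the code only builds the two requests so you can see where the operation lives in each.

```python
from urllib.request import Request

# Tunnelled: the HTTP method is always POST, and the real operation
# name (getQuote, hypothetical) is buried in the SOAP body.
soap_body = (
    b'<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">'
    b'<soap:Body><getQuote xmlns="http://example.org/stock">'
    b"<symbol>ACME</symbol></getQuote></soap:Body></soap:Envelope>"
)
tunnelled = Request(
    "http://example.org/soap-endpoint",
    data=soap_body,
    headers={"Content-Type": "application/soap+xml"},
    method="POST",
)

# Inherited: the operation *is* the transfer protocol's method, and
# the URI identifies the thing being operated on.
inherited = Request("http://example.org/stock/ACME", method="GET")

operations = (tunnelled.get_method(), inherited.get_method())
```

An intermediary looking at the first request sees only "POST to some endpoint"; with the second, it can tell what is being asked without knowing anything application-specific.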

And to respond to this comment of his;

Web Services are all about exchanging information and not identifying methods, operations, functions, procedures that must be called. What services do with the information they receive, through message exchanges, it’s up to those services.

I’d just say that, well, at some layer you have to worry about operations. If Web services aren’t that layer, then whatever goes on top of them will have to worry about it. And my understanding was that Web services wanted to tackle this layer. FWIW, I think Jim and I agreed on this in a recent private exchange.

I’ve talked about my nose for self-description problems before in the context of RDF and media types. Now, with the publication of -04 of the RDF/XML media type registration draft, there’s another one.

Comment submitted.

The best summation of the issue, as I wrote in the ensuing thread, is probably;

If somebody on the Web can’t distinguish between an RDF message which says “Mark hates bananas” versus one that says “Mark hates bananas (but not really)” (aka unasserted), then there is a failure to communicate. The “but not really” part must be part of the message. It can either be done through mechanisms in the RDF specs themselves (e.g. parseType=”literal”), or it can be done in an encapsulating spec or registry, such as the media type registration.