I’d heard about XML Catalogs before, but never in a context that piqued my interest enough to go learn what they were. Thanks to Norm Walsh’s description of them today in his weblog, I now know.

The idea, it seems, is that you need different identifiers in different contexts. So, for example, an http URL for some document won’t be usable when you’re offline, so you need a way to package that identifier with the local one on the file system.

My view is that while I agree this is a problem, I don’t think new standards are required to fix it. I suggest that better technology is what is required.

“http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd” is an identifier for a DocBook DTD, and independent of the online status of your notebook, it remains an identifier for that DocBook DTD. What’s needed are operating systems, browsers, and network libraries that, when offline and asked for a representation of the resource identified by that URI, return a cached representation.
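To make that concrete, here is a minimal sketch of the behaviour I have in mind. The `CachingResolver` class, the `OfflineError` exception, and the injected `fetch` function are all hypothetical names invented for illustration; they stand in for whatever the operating system or network library would actually provide.

```python
from typing import Callable, Dict


class OfflineError(Exception):
    """Raised when we are offline and hold no cached representation."""


class CachingResolver:
    """Hypothetical resolver: the URI is the only identifier, online or off."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch              # assumed network fetch, injected for the sketch
        self._cache: Dict[str, bytes] = {}
        self.online = True

    def get(self, uri: str) -> bytes:
        if self.online:
            body = self._fetch(uri)
            self._cache[uri] = body      # remember the latest representation
            return body
        if uri in self._cache:
            return self._cache[uri]      # offline: serve the cached representation
        raise OfflineError(uri)


DTD_URI = "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"

# Stand-in fetch; a real one would issue an HTTP GET.
resolver = CachingResolver(
    lambda uri: b"<!-- pretend DTD for " + uri.encode() + b" -->"
)
online_copy = resolver.get(DTD_URI)
resolver.online = False
offline_copy = resolver.get(DTD_URI)     # same URI, no local file name involved
```

The caller never learns where the cached bytes live on disk, which is the point: no second identifier has to be invented or mapped.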

Another consequence of this is that “File->Save As” in a browser should be de-emphasized. I’d prefer it be just “Save” or “Store” or something like that where the user isn’t prompted for a file name. The implication being that the file already has an identifier, so why does it need a different name on my computer? Obviously you’d still want access to “File->Save As” in some cases, but I don’t believe it’s what most people need most of the time.

Simon St. Laurent reports on Norm Walsh’s XML is not Object Oriented essay.

Simon writes;

The only thing I can think to add is that XML is pretty explicitly a rejection of an aspect of OO practice that Norm touches on only briefly: encapsulation. Everything accessible all the time is pretty clearly a hallmark of XML work. You can hide things if you want to, but it takes a lot more effort.

I’m pretty sure that Simon meant to say “data hiding” instead of encapsulation there, as the last sentence suggests. Encapsulation refers to the binding of associated data and behaviour into an identifiable whole. Data hiding refers to, well, hiding that data by not exposing it via the interface. There are many OO fanatics, myself included, who believe that you don’t need data hiding to be OO.
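A small sketch of the distinction, using two made-up classes (`Account` and `HiddenAccount` are illustrative names only). Both are encapsulated, since each binds its data to its behaviour; only the second also hides its data, and in Python only by convention.

```python
class Account:
    """Encapsulation without data hiding: the balance is part of the interface."""

    def __init__(self, balance: int) -> None:
        self.balance = balance           # public, anyone may read it

    def deposit(self, amount: int) -> None:
        self.balance += amount


class HiddenAccount:
    """Encapsulation plus data hiding (by Python's underscore convention)."""

    def __init__(self, balance: int) -> None:
        self._balance = balance          # leading underscore: not part of the interface

    def deposit(self, amount: int) -> None:
        self._balance += amount

    def statement(self) -> str:
        # The only sanctioned way to learn anything about the balance.
        return f"balance: {self._balance}"


open_account = Account(10)
open_account.deposit(5)

hidden_account = HiddenAccount(10)
hidden_account.deposit(5)
```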

FWIW, I consider the Web to be the epitome of the anti-data-hiding view; resources as objects, URIs as object identifiers, GET as “give me your data”, POST as “process this data”, etc.

While noting that Roguewave has terminated its XML/tuple-space project, Ruple, Don Park wrote;

I am getting a dangerous itch to apply tuplespaces to web services workflow problems. TupleSpaces are extremely powerful as coordination infrastures so tuplespaces and web services go very well together IMHO.

Don, do you realize that REST’s uniform interface (GET/POST, etc.) defines a coordination language very similar to a tuple space?

And for enabling workflow, there’s the additional REST constraint of using hypermedia as the engine of application state.

Jeremy Allaire posts a transcript of a “conversation”(?) with Tim Berners-Lee on the Semantic Web at PC Forum.

Here’s a snippet which includes some of Tim’s words, plus Jeremy’s commentary;

TBL: business model for semantic web is the biz model of the web. it’s how apps interoperate, it’s how apps talk. short answer: dramatically reduce cost of enterprise app integration.

(My side conversation with Adam Bosworth, BEA chief architect and ex-Microsoft, Adam helped shape many of the XML standards. We both agree that this RDF thing is a big joke and TBL is on another planet. Adam helped drive the creation of XML Schema and XML Namespaces, as well as Web Services standards that uses these, and these are the things that are actually driving the semantic web. Virtualy no one uses RDF, but nearly everyone is moving to these other standards).

I’m a big believer in the technology behind the Semantic Web, but am skeptical that it will see wide-scale deployment anytime soon, due mainly to the (current) lack of a killer app. But that doesn’t reduce its value for application integration by very much. As we’ve seen, any form of exposure of a system in a machine-processable manner is an improvement over the alternative of having no access. It sounds to me like Jeremy and maybe Adam don’t even see the Semantic Web as a solution to the same problem that they’re tackling in their Web services work. Well, it is, and it’s worth investigating further before so easily dismissing it.

I’d recommend reading an earlier blog entry about the value of the Semantic Web for integration.

I was just reading over an article that gives a very high level overview of GXA, which, like so many others, makes the fundamental mistake of talking about HTTP as a “transport protocol”. Of course, one only need look at the HTTP spec to see that it’s not; it’s a transfer protocol, which is a very different beast: a coordination language dealing in state.

But after writing about coordination languages last night, something occurred to me; that the more general a transfer protocol, the easier it is to mistake it for a transport protocol. HTTP, effectively, has only two methods; GET and POST, and they are commonly confused with the semantics of send() and recv(). Other transfer protocols, like SMTP, or IM protocols, are also often confused with transport protocols, for the same reason.

Coordination languages come in very general and very specific flavours. WS-Transaction is very specific, for example, as it deals only in “transactions”; any task which can be modelled as a transaction can be accomplished with WS-Transaction.

An example of a very general coordination language would be tuple space based systems such as Linda or JavaSpaces. These are general because they both deal in an abstraction known as a “space”. A space is a more general abstraction than a “transaction”, because more things can be modelled as spaces. But not every thing. For example, modelling a lightbulb as a space leaves one no way, without additional agreement, to check if the bulb is on or off.
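For anyone who hasn’t met a tuple space, here is a toy, single-threaded sketch of one. The `TupleSpace` class is my own illustration, not the Linda or JavaSpaces API; in particular, real Linda `rd` and `in` block until a match appears, whereas these versions return `None` straight away. `None` inside a pattern acts as a wildcard.

```python
from typing import Any, List, Optional, Tuple

Tup = Tuple[Any, ...]


class TupleSpace:
    """Toy tuple space: write, read, and take tuples by pattern."""

    def __init__(self) -> None:
        self._tuples: List[Tup] = []

    def out(self, tup: Tup) -> None:
        """Write a tuple into the space."""
        self._tuples.append(tup)

    @staticmethod
    def _matches(pattern: Tup, tup: Tup) -> bool:
        # Same length, and every non-wildcard field must be equal.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def rd(self, pattern: Tup) -> Optional[Tup]:
        """Read a matching tuple without removing it (non-blocking here)."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def in_(self, pattern: Tup) -> Optional[Tup]:
        """Take a matching tuple out of the space (non-blocking here)."""
        tup = self.rd(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
# The ("bulb", <name>, <state>) layout is exactly the "additional agreement"
# the lightbulb example needs; the space itself knows nothing about bulbs.
space.out(("bulb", "kitchen", "on"))
state = space.rd(("bulb", "kitchen", None))
```

The three operations are the whole interface, which is what makes it general; the cost is that the meaning of any given tuple has to be agreed somewhere else.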

Of all the coordination languages that exist though, I suggest that there is one that is the most general, and it is REST’s “uniform interface”; it deals in “resources”, which includes every thing.

Edwin points to some BPEL presentations. I read Satish’s (PPT), and saw the term “coordination language” mentioned in the context of Web services for the very first time. It’s about time, as the Internet’s application layer is populated by a variety of coordination languages, otherwise known as application protocols.

Werner Vogels writes;

The keynote was by Tim Berners-Lee who did a truly awful job in trying to make a connection between web services and the semantic web.

Yup, absolutely agreed. I read over his slides last night, and, like Werner says, was waiting for some new connection to be presented that I also hadn’t considered. But nope. Very disappointing.

I’m pretty sure I know what Tim was trying to do though. He was trying to “embrace and extend” Web services by convincing people that they needed the Semantic Web for data integration, with the expectation that, once it is in use, the need for application-specific interfaces would melt away.

I’m confident that won’t work. For one, if there’s anything that’s held in more contempt by Web services proponents than the Web for app-to-app integration, it’s the Semantic Web. They see it as an academic exercise, and irrelevant to the “needs of business”. Of course that’s wrong, but the belief that it’s true is pervasive in this community. It seems that “the needs of business” has become synonymous with architectural styles that resemble CORBA and DCOM. It is that which needs to be refuted.

It’ll be interesting to see if Tim will keep up this tack when facing the architectural problems with Web services in response to my issue to the TAG.

Yesterday, I proposed a new issue to the TAG regarding what I consider factual errors in the recently published draft of the Web Services Architecture document concerning the architectural property of visibility. You can read the text for yourself, but I thought I’d explain why I raised this issue and not the more general and abundantly obvious “Web Services are incompatible with Web architecture” one.

My principal motivation was that I wanted a very well defined, bite-size topic for the TAG to chew on; one that could be resolved in short order, but that still had “architectural impact” and relevance to Web architecture. In addition, I wanted to pick a topic that the WSA WG had already agreed was important, but believed they had covered. My end goal is that having this issue resolved in my favour will be enough of a boot-to-the-head to the WG that they may see where they’ve erred, especially as the resolution will presumably drive home the critical point that application protocols define application interfaces.

It’s a bit of a risk; perhaps the WG members won’t see it that way and I’ll have wasted an opportunity. But I figure that all I need to do is to convince one staunch Web services proponent, and then the house of cards will topple. Plus, it’s unlikely that discussion on this topic will be able to avoid the more general problems with Web services, so there’s an opportunity for additional issues to be raised by others.

Smart move, or missed opportunity? Time will tell.

My inbox collected a few, erm, “colorful” emails this past weekend. It seems that more than a few people have had enough of my pro-Web-architecture position.

I’m confident of my earlier predictions, but perhaps I underestimated how nasty this is going to get. If these messages are any indication, this will be indistinguishable from a war.

Place your bets!

A very nice piece from Tim on how Web services should look; RESTful.

Dave, Sam, and Don have all responded.

First, to Tim, right on man. It’s about time too. For quite a while, he’s seemed to be on the fence, but this seems to make it quite clear where he stands, and hopefully where he’ll be voting on Decision Day. Perhaps with a name as well respected as his firmly in the REST camp, more people will take notice.

To Sam, a couple of points. First, safety != idempotency. Safety means messages don’t change state (roughly). Idempotency means multiple messages have the same effect as one. All safe operations are idempotent, but not all idempotent operations are safe. For example, PUT is idempotent but not safe. In addition, having the response change for a GET, even in a very short period of time, is perfectly RESTful. What isn’t RESTful is if the result changes because GET was invoked (and the owner cares about the change).
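Here’s a minimal sketch of the difference, using a made-up in-memory `Store` class whose method names merely echo HTTP’s. It is an illustration under my own assumptions, not a model of any real server.

```python
from typing import Dict, List, Optional, Tuple


class Store:
    """Hypothetical in-memory resource store with HTTP-like methods."""

    def __init__(self) -> None:
        self.resources: Dict[str, str] = {}
        self.log: List[str] = []

    def get(self, uri: str) -> Optional[str]:
        # Safe: reading changes nothing, so it is trivially idempotent too.
        return self.resources.get(uri)

    def put(self, uri: str, body: str) -> None:
        # Idempotent but not safe: state changes, yet repeats add nothing.
        self.resources[uri] = body

    def post(self, body: str) -> None:
        # Neither safe nor idempotent: every call appends another entry.
        self.log.append(body)


def snapshot(store: Store) -> Tuple[Dict[str, str], List[str]]:
    """Copy the store's state so two moments in time can be compared."""
    return dict(store.resources), list(store.log)


store = Store()
empty = snapshot(store)

store.put("/a", "1")
after_one_put = snapshot(store)
store.put("/a", "1")
after_two_puts = snapshot(store)

store.get("/a")
after_get = snapshot(store)

store.post("x")
store.post("x")
```

Comparing the snapshots shows PUT changing state once and then no further, GET changing nothing, and two identical POSTs leaving two log entries.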

Also, the “take three parameters” analogue misses a major point, I believe. In the POST/SOAP/XML-RPC case, sure you get the same info back, but you have no way to refer to that info, or pass references around to other parties. When you marshal data into an http URI, you are creating a token which has associated with it a publicly specified method for dereferencing. That is a vast improvement over the one-time use-and-consume approach of POST for retrievals.
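As a sketch of what marshalling three parameters into a URI looks like, using Python’s standard `urllib.parse`: the `example.org` endpoint, the parameter names, and the `to_uri` helper are all invented for the example.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode, urlsplit


def to_uri(base: str, params: Dict[str, str]) -> str:
    """Marshal parameters into the query part of an http URI.

    Sorting the items gives the same URI for the same parameters,
    whatever order the caller supplied them in.
    """
    return base + "?" + urlencode(sorted(params.items()))


# Hypothetical service and parameter names.
uri = to_uri(
    "http://example.org/quote",
    {"symbol": "XYZ", "currency": "USD", "date": "2003-04-01"},
)

# Anyone handed the URI can recover the parameters, or simply GET it.
recovered = {k: v[0] for k, v in parse_qs(urlsplit(uri).query).items()}
```

The resulting string can be mailed, bookmarked, or linked to, and whoever receives it can dereference it with a plain GET; the body of a POST offers no such handle.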