Ed Dumbill writes;

I’m wondering how long it will be before everybody’s completely reinvented RDF in the search for what it had all along.

Yup. Any day now, I expect. In fact, I bet BEA is working on something as we speak given Adam’s comments, plus the recent flurry of work by Dave Orchard.

Anybody who’s spent any time following RDF would know that there’s a whole bunch of things you could seriously mess up if you didn’t know better. Given that Dave and Adam have previously demonstrated some extreme ignorance regarding RDF, I’m not hopeful that what they produce will be anything very interesting.

WS-DataExtensibility anyone?

I also wanted to add, in response to Chris Ferris’ comment about “partial understanding” (a key benefit of RDF) being unnecessary, that partial understanding is little more than “MustIgnore”. How DaveO can go on-and-on (see links above) about the value of MustIgnore, yet not see the enormous value-add that RDF/XML provides over plain-old XML, totally boggles my mind.

Oh my, WS-Discovery is a Web service spec I might actually use! Horror! 8-)

When I heard what it was, and that it was written by BEA, I was sure that Yaron Goland would be involved, after all his related work on UPnP. He wasn’t, nor was he even acknowledged. Odd.

But there’s not really too much to say about it (at least until I do a detailed review). Link local discovery is a pretty well understood domain, and the authors of this spec seem to grok it at least as well as I do. The use of SOAP/XML is unfortunate, I’d say, because of its bloat; you really need to keep things lean for multicast discovery so as to fit everything in a single datagram. Some kind of binary-encoded SOAP would be useful here.
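To put a rough number on the bloat concern, here’s a sketch that measures a made-up SOAP-style probe message (the `d:` namespace and element names are hypothetical, not the actual WS-Discovery schema) against the payload budget of a single unfragmented Ethernet UDP datagram:

```python
# Rough illustration of why XML verbosity matters for multicast
# discovery, where a probe should fit in one UDP datagram.
# The envelope below is invented for illustration only.

soap_probe = b"""<?xml version="1.0" encoding="utf-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope"
            xmlns:d="http://example.org/hypothetical/discovery">
  <s:Header>
    <d:MessageID>urn:uuid:00000000-0000-0000-0000-000000000000</d:MessageID>
  </s:Header>
  <s:Body>
    <d:Probe><d:Types>d:PrintService</d:Types></d:Probe>
  </s:Body>
</s:Envelope>"""

# Typical Ethernet MTU (1500) minus IP (20) and UDP (8) headers.
ETHERNET_UDP_PAYLOAD = 1472

print(len(soap_probe), "bytes;",
      "fits" if len(soap_probe) <= ETHERNET_UDP_PAYLOAD else "does not fit",
      "in one unfragmented datagram")
```

Even this near-empty probe burns a few hundred bytes on envelope scaffolding before any real metadata appears, which is where a binary encoding would help.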

I sort of wonder why Rendezvous or LLMNR weren’t adopted; the former has a whole lot of support and running code behind it, while the latter has MS behind it and should be published as an RFC shortly. But I suppose that nothing’s really close to critical mass in this space, so I can’t blame them for starting from scratch.

There’s also mention of a “SOAP/UDP” spec, which is “To be published”. That’ll be interesting to see, especially if there’s a compact (but still extensible) binary encoding. What the spec suggests, re “UNICAST_UDP_REPEAT” and “APP_MAX_DELAY”, and comments such as “waiting for timers”, is that it might be more a case of trying to reinvent parts of TCP than of embracing the message-per-datagram model which seems to work so well. But my experience there is rather limited, so I’d be happy to be proven wrong.

Chris still isn’t seeing it;

And you either recognize this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>

or you don’t. In all cases, you need to read the specs.

Did you notice that in my example document there’s no RDF namespace? I was trying for an apples-to-apples comparison.

If an XML document is also an RDF/XML document, then an RDF processor can extract fine grained pieces of information (triples) from that document.

I’ll stop there for the moment, before we go on to why that’s valuable. Are we in agreement?

As long time readers of mine know, I’ve talked a lot about the value of visibility, but had little success convincing Web services proponents that WS/SOA has significantly less of it than do Internet scale architectural styles.

With that in mind, I thought I’d talk a bit about a couple of related properties: reusability (sometimes called “substitutability” when relating to components), and configurability. Combined, these properties refer to the ability to swap components in and out at runtime, as you can with the Web (your browser can request data from any Web server), or with pipe and filter.
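A minimal pipe-and-filter sketch makes the substitutability point concrete: because every filter exposes the same constrained interface (iterable in, iterable out), any filter can be swapped for another at runtime without touching the rest of the pipeline. The filters here are my own toy examples:

```python
# Each filter has the same constrained interface: lines in, lines out.

def upcase(lines):
    for line in lines:
        yield line.upper()

def strip_blank(lines):
    for line in lines:
        if line.strip():
            yield line

def pipeline(source, *filters):
    # Chain any combination of filters over the source.
    for f in filters:
        source = f(source)
    return source

data = ["hello", "", "world"]
print(list(pipeline(data, strip_blank, upcase)))   # ['HELLO', 'WORLD']
# Swap components freely: same source, different filters.
print(list(pipeline(data, upcase)))                # ['HELLO', '', 'WORLD']
```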

I wonder, are there any Web services proponents who’d claim that this isn’t much more loosely coupled than with the unconstrained-interface SOA approach?

Chris Ferris writes in response to my suggestion that processing an XML document is an all-or-nothing proposition;

I don’t see it that way. Understanding an XML document is not an all-or-nothing proposition by any stretch of the imagination. For instance, I can have a generic SOAP processor that understands the SOAP namespace but is oblivious to the content of the soap:Body element (amongst other things such as certain SOAP headers).[…]

I see the disconnect. I’m referring to any/all XML document(s). No fair saying that some specific kinds of XML documents are partially understandable, because clearly you can design one to be, and SOAP, as an envelope, is one as you correctly point out.

So, consider this XML document;

<iwoejaf xmlns="http://example.org/oijerwer">
  <ijrwer>inm4jvxc</ijrwer>
</iwoejaf>

That’s the kind of document I’m talking about. Wouldn’t you say that understanding that document is all or nothing? You either recognize the namespace or you don’t, right? Well, that’s not the case with RDF/XML since it gives you “partial understanding”; if that document above were known to be RDF/XML (and it is valid RDF/XML), then an RDF/XML processor can extract information from it piece-meal (in triples). Now, maybe none of the terms in any of the triples will be recognizable, but perhaps if you dereference the URI for each of the terms in those triples, you’ll find that the terms you don’t know are related to ones you do.
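To make the piece-meal extraction concrete, here’s a toy extractor, a sketch for just the simple typed-node pattern above, not a real RDF/XML parser; the blank-node label `_:b0` is my own invention, since the document names no subject URI:

```python
import xml.etree.ElementTree as ET

DOC = """<iwoejaf xmlns="http://example.org/oijerwer">
  <ijrwer>inm4jvxc</ijrwer>
</iwoejaf>"""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def split_qname(tag):
    # ElementTree encodes namespaced names as "{ns}local";
    # RDF/XML maps that pair to the URI ns + local.
    ns, local = tag[1:].split("}")
    return ns + local

def extract_triples(xml_text):
    """Toy extractor for the 'typed node' RDF/XML pattern: a node
    element whose children are property elements with literal content."""
    root = ET.fromstring(xml_text)
    subject = "_:b0"  # blank node: the document names no subject URI
    triples = [(subject, RDF_TYPE, split_qname(root.tag))]
    for prop in root:
        triples.append((subject, split_qname(prop.tag), prop.text))
    return triples

for t in extract_triples(DOC):
    print(t)
```

Even though none of the terms are recognizable, the processor still gets two well-formed triples it can store, query, or chase URIs from, which is the “partial understanding” being argued for.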

Now can you see why TimBL is so keen to see folks use RDF/XML? It’s the answer to the schema evolution problem.

HTTP is a great application protocol, for the application for which it was designed… the Web.

Finally, something we can agree on! 8-) Now, if only you understood what the Web was, and what it was capable of, we’d be all set.

More good insight from Savas on REST.

He writes;

The human factor is involved. If a resource (e.g., a web page) has moved, applications don’t break. It’s just that there is nothing to see. We are frustrated that it’s not there. If an application depends on that resource being there, that application breaks.

Yep. But how is that any different than a service which you depend on not being there? At least HTTP response codes are well-defined, and handle a lot of common cases, including redirection, retirement, retry-later, etc.. I don’t see how this is human-centric at all; it’s just dealing with inevitable errors in distribution across trust boundaries.
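A sketch of what those well-defined codes buy you: a generic client can react sensibly to a moved, retired, or temporarily unavailable resource without any application-specific knowledge. The dispatch table and action names here are my own:

```python
# Generic reactions to well-defined HTTP status codes; no human,
# and no application-specific knowledge, required.

def react(status, headers):
    if status in (301, 302, 307):
        return ("follow-redirect", headers.get("Location"))
    if status == 410:          # Gone: the resource has been retired
        return ("give-up", None)
    if status == 503:          # Service Unavailable: try again later
        return ("retry-after", headers.get("Retry-After"))
    if status == 404:
        return ("not-found", None)
    return ("ok", None) if 200 <= status < 300 else ("error", None)

print(react(301, {"Location": "http://example.org/new"}))
print(react(503, {"Retry-After": "120"}))
```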

I’m not sure what he means by using HTML for “interfaces”, but he then later speaks my language again when he describes HTML as a format for describing resource state;

If a resource’s representation is described in HTML, all is fine. Everyone knows how to read HTML. How about an arbitrary XML document though? Did we have a way of specifying to the recipient of the resource’s representation about the structure of the document? Perhaps they wouldn’t have requested it if they knew about it.

XML is fine and dandy, and I use it whenever I can, but it’s just a syntax. As such, it doesn’t do anything to alleviate the issue that understanding an XML document is an all-or-nothing proposition. That’s why when I use XML, I almost always use RDF. It enables a machine to extract triples from an arbitrary RDF/XML document, and triples are much finer grained pieces of information than a whole document. It allows me to process the triples I understand, and ignore the ones I don’t, which is another way of saying that it provides a self-descriptive extensibility model. See this example.
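A sketch of that “MustIgnore at the triple level” behaviour: the consumer keeps the triples whose predicates it knows and skips the rest, instead of rejecting the whole document. The vocabulary URIs below are made up for illustration:

```python
# Process the triples you understand; ignore, don't reject, the rest.

KNOWN_PREDICATES = {
    "http://example.org/vocab/title",
    "http://example.org/vocab/author",
}

triples = [
    ("_:b0", "http://example.org/vocab/title", "REST explained"),
    ("_:b0", "http://example.org/vocab/newfangled", "???"),  # unknown term
    ("_:b0", "http://example.org/vocab/author", "Mark"),
]

understood = [t for t in triples if t[1] in KNOWN_PREDICATES]
for t in understood:
    print(t)   # the unknown triple is skipped, not a fatal error
```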

If we are going to glue applications/organisations together when building large scale applications, we need to make sure that contracts for the interactions are in place. We need to define message formats. That’s what WSDL is all about.

Agreed, but that’s also an important part of HTTP. It just defines message formats in a more self-descriptive way (i.e. that doesn’t require a separate description document to understand what the message means).

Also, we talk about exchanging messages between applications and/or organisations. Do we care how these are transferred? Do we care about the underlying infrastructure? I say that we don’t; at least, not at the application level.

I’m not sure we’ll get past this nomenclature problem, but in my world, documents are transferred while messages are transported. I do agree that how message transport occurs doesn’t matter, but I don’t agree that how document transfer occurs doesn’t matter. As an example, consider a document transferred with HTTP PUT, versus that same document transferred with HTTP POST. Both messages mean entirely different things (more below).
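A toy resource store can illustrate the difference in meaning (this is a sketch of the semantics, not an HTTP implementation): PUT places the document at the caller-chosen request URI and is idempotent, while POST hands the document to the target resource for processing, which here means appending it under a server-chosen child URI.

```python
# Toy resource store contrasting PUT and POST semantics.
store = {}

def put(uri, doc):
    created = uri not in store
    store[uri] = doc            # replace state at a caller-chosen URI
    return 201 if created else 200

def post(uri, doc):
    child = f"{uri}/{len(store)}"   # server mints a new child URI
    store[child] = doc
    return 201, child

put("/report", "v1")
put("/report", "v2")            # same URI, state replaced: idempotent
post("/inbox", "a")
post("/inbox", "b")             # each POST creates a new resource
print(sorted(store))
```

Same document, two entirely different messages: repeating the PUT changes nothing, while repeating the POST creates a second resource.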

If there is a suggestion that constrained interfaces are necessary for loose-coupling and internet-scale computing, then here’s a suggestion… What if the only assumption we made was that we had only one operation available, called "SEND"? Here are some examples:

TCP/IP:SEND, CORBA:SEND, HTTP:POST, EMAIL:SEND, FTP:PUT, SNAIL:POST (for letters), etc.

Ah, this one again. 8-)

You can’t compare TCP/IP “SEND” with HTTP POST or SMTP DATA. TCP/IP is a transport protocol and therefore defines no operations. You can put operations in the TCP/IP envelope yourself (e.g. by sending “buyBook isbn:123412341234”), or you can have them be implicit in the port number by registering your “Book Buying” protocol with IANA, only ever using that one operation (“buyBook”), and sending just “isbn:123412341234”. On the other hand, HTTP, SMTP, and FTP, all do define their own very generic operations.
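The distinction can be shown in a few lines (the “buyBook” operation is the made-up example from above): over raw TCP the operation lives inside an opaque application payload, while in HTTP the (generic) operation is part of the protocol itself, visible to any HTTP-aware component.

```python
# Application-defined operation buried in an opaque TCP payload;
# only the book-buying application can interpret this.
tcp_payload = b"buyBook isbn:123412341234\n"

# The same request as HTTP: the method sits in the protocol's own
# request line, parseable by any intermediary, cache, or firewall,
# even one that knows nothing about books.
http_request = (b"POST /books HTTP/1.1\r\n"
                b"Host: example.org\r\n"
                b"Content-Type: text/plain\r\n"
                b"\r\n"
                b"isbn:123412341234")

method = http_request.split(b" ", 1)[0]
print(method)   # b'POST'
```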

Sean McGrath writes;

If you want to look at a cheap, solid, scalable way to do distributed computing, look no further than the combination of HTTP and asynchronous messaging using business level XML documents. The beauty of this, is that it is both Intranet and Internet class at the same time. Work with the web – not against it. Its resources + names for resources + an application protocol (not a transport protocol) that make it work.

+1. The Internet, not the Intranet, is the general case.

Lots of distributed object, SOA, Web, Web services talk going on recently …

Dare Obasanjo on Web/SOA;

What Don and the folks on the Indigo team are trying to do is apply the lessons learned from the Web solving problems traditionally tackled by distributed object systems.

I know they’re trying to do that, but what they (and nearly everybody else) have missed, is that the Web is already a distributed object system; it has its own way of addressing most of the problems that previous attempts at distributed object infrastructures attempted to solve. For the things it doesn’t address, Web extensions like the Semantic Web and ARREST cover them … and then some.

James Robertson on HTTP, documents and coupling;

Here’s my point though. It’s magic thinking to say that you have looser coupling simply because you use Http transport and XML documents. It’s a fantasy. Why do I say that? Well, let’s posit a blog server that accepts XmlRpc formatted posts. There you go – http transport, xml documents.

HTTP isn’t a transport protocol. It’s not intended to send RPC messages, it’s intended to send real documents like images and resumes and letters and purchase orders and … anything that is serialized state. If you use it that way, then there is magic there, because it gets data into the hands of somebody else’s application code, rather than into the hands of some infrastructure code.

It’s actually no different than CORBA – except that maybe it’s slower. Either way, I have a server listening on a port, expecting data in a given form, and able to perform a constrained set of actions if I send it the right requests – and ready to send back errors if I don’t.

CORBA only tells you that objects have interfaces. HTTP tells you what that interface is.

More later…

Norm’s playing around with media types. This one is served with application/xml.

Let’s have a look at what should happen in a perfect world if XHTML is served with that media type. RFC 3023 says;

An XML document labeled as text/xml or application/xml might contain namespace declarations, stylesheet-linking processing instructions (PIs), schema information, or other declarations that might be used to suggest how the document is to be processed. For example, a document might have the XHTML namespace and a reference to a CSS stylesheet. Such a document might be handled by applications that would use this information to dispatch the document for appropriate processing.

What that means is that one cannot assume that namespace dispatching will occur, and therefore the semantics of application/xml are ambiguous; it is reasonable that the recipient see it as XHTML/HTML, but also reasonable that they see it as “XML” (such as in the IE XML tree view).

In the real world of course, reality can trump specification; consensus (in the form of running code) may very well be that namespace dispatching is assumed, and in that case at least the ambiguity vanishes. But then we’ve lost the ability to send plain-old XML. For example, if somebody asks me for a sample XML document, I’d like to be able to send them some XHTML without it being interpreted by the other end as XHTML, just XML. I think it would be great if application/xml could be used for this purpose, but it’s not a huge deal; text/plain would also be appropriate in many cases.
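Namespace dispatching itself is a one-liner; the ambiguity is entirely in whether a recipient of application/xml performs it. A sketch of what a dispatching recipient does (the action names are my own labels):

```python
# Sniff the root element's namespace and dispatch, the processing
# that RFC 3023 says "might" happen for application/xml.
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def dispatch(xml_text):
    root = ET.fromstring(xml_text)
    # ElementTree encodes namespaced names as "{ns}local".
    ns = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    return "render-as-xhtml" if ns == XHTML_NS else "generic-xml-view"

print(dispatch('<html xmlns="http://www.w3.org/1999/xhtml"/>'))
print(dispatch('<doc/>'))
```

A non-dispatching recipient simply skips the sniff and returns the generic view for everything, and both behaviours are defensible readings of the spec.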

So I set up a little test. Let me know what you see and I’ll record it. It could be useful in the soon-to-come revision to 3023, enabling us to be a bit more specific than “might”.

I gave a presentation tonight at the XML Users Group of Ottawa, titled REST, Self-description, and XML.

Not unexpectedly, the slides don’t capture a lot of what was presented (and nothing of what was discussed), but there’s a story in there that should be easy to follow. It also has a surprise ending that caught at least one person off guard. That was my objective.