While noticing one of the scaling problems with Web services (a topic worthy of its own blog entry, but I can’t muster the energy), Dave Orchard suggests that XML’s “self-describing nature” will save the day.

Producing self-descriptive documents is hard. XML has some tools that help, in particular XML namespaces and the fact that it’s markup. But I see those as akin to having a hammer and nail in your hand as you attempt to build a house. XML’s “self-descriptive nature”, even if you accept that “XML” includes namespaces, is only slightly better than ASCII.

I’ve been doing a lot of self-describing data investigation over the past few weeks, and the last thing I consider important is which syntax is used. What I’ve found to be most important is that identifiers be URIs, and not to use an identifier where what it resolves to is what is really needed.

I unsubscribed from ws-arch today. There comes a point where you have to throw your hands up, and realize that sometimes very smart people can be very stupid, and aren’t interested in hearing that they’re wrong. Besides, my new job is building RESTful services on an extremely large scale (international), so I no longer have a personal stake in seeing Web services succeed. To Mike; thanks, you made it bearable. I know you tried, and you’re a good guy and a fine chair for doing so. To working group members; it would serve you really well to understand software architecture better than you do, and to understand why constraints aren’t necessarily constraints on function, only form. The sun will set on Web services, as it has on every other attempt to deploy object-specific interfaces on the Internet. I was hoping to be the guy who’d help people understand why this would happen, and in the process, save the industry from wasting a whole lot of time and money. But I suppose there’s no substitution for learning some lessons the hard way.

I’ve been meaning to be more involved in this, but it’s been moving too fast, and I’ve been too busy with other things (like a job).

A comment by Tim Bray in today’s TAG meeting minutes, however, prompted me to check back;

over in son-of-RSS-land, we’re converging on using POST for entry creation/update/deletion for reasons that seem good

My concern with that is that the resulting API would not be RESTful, and I thought that the objective was that it should be (as the name RestEchoApi suggests!). It isn’t RESTful, because in the context of HTTP, tunneling methods over POST – even uniform ones – reduces visibility.
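To make the visibility point concrete, here's a minimal sketch of what a generic intermediary can conclude from the request line alone. The `Request` type, the `/entries/42` URI, and the `<action>delete</action>` body are all made up for illustration; they aren't taken from any actual RestEchoApi draft.

```python
from dataclasses import dataclass

# A subset of the methods HTTP defines as safe; an intermediary can rely
# on this from the request line alone, without looking at the body.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}
# Methods HTTP defines as idempotent but not safe.
IDEMPOTENT_METHODS = {"PUT", "DELETE"}


@dataclass
class Request:
    method: str
    uri: str
    body: str = ""


def intermediary_view(req: Request) -> str:
    """What a generic proxy can conclude without parsing the body."""
    if req.method in SAFE_METHODS:
        return "safe"
    if req.method in IDEMPOTENT_METHODS:
        return "idempotent"
    # POST: whatever the sender intended lives in the body, which a
    # generic intermediary does not understand.
    return "opaque"


# Hypothetical entry URI and a hypothetical tunneling convention.
direct = Request("DELETE", "/entries/42")
tunneled = Request("POST", "/entries/42", body="<action>delete</action>")

print(intermediary_view(direct))    # idempotent
print(intermediary_view(tunneled))  # opaque
```

Both requests ask for the same thing, but only the first one says so in a place every HTTP component already knows how to read.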

I added my two cents.

Lancelot: He says they've already *got* one!
Arthur: (confused) Are you *sure* he's got one?
Soldier: Oh yes, it's ver' naahs.
  -- Monty Python and the Holy Grail

Kendall writes;

In principle I support WS-Choreography, even without understanding exactly what it is aiming at, if only because it is likely to be very RDF and REST friendly, and those are, all other things being equal, among my preferred ways of describing information and building information interfaces.

That's good to hear, but I really don't see choreography solutions being anywhere near REST friendly. REST has already got a ver' naahs choreography solution built-in; hypermedia. It's how a REST agent changes state. As Roy wrote;

The model application is therefore an engine that moves from one state to the next by examining and choosing from among the alternative state transitions in the current set of representations.
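A rough sketch of that engine, under my own assumptions: the order resources, the link relation names (`pay`, `cancel`, `receipt`), and the dictionary standing in for the network are all hypothetical. The point is only that the client learns its next possible steps from the representation it is holding, not from a separate choreography description.

```python
# Hypothetical representations keyed by URI; each one carries the links
# (state transitions) that are available from it.
REPRESENTATIONS = {
    "/order/1": {
        "state": "created",
        "links": {"pay": "/order/1/payment", "cancel": "/order/1/cancel"},
    },
    "/order/1/payment": {
        "state": "paid",
        "links": {"receipt": "/order/1/receipt"},
    },
    "/order/1/receipt": {"state": "done", "links": {}},
}


def get(uri):
    """Stand-in for an HTTP GET; returns the current representation."""
    return REPRESENTATIONS[uri]


def run(start_uri, preferences):
    """Follow links until none of the preferred transitions is offered.

    `preferences` is the agent's own ordering of link relations it is
    willing to follow; the server decides which ones are on offer.
    """
    uri = start_uri
    visited = [get(uri)["state"]]
    while True:
        links = get(uri)["links"]
        choice = next((rel for rel in preferences if rel in links), None)
        if choice is None:
            return visited
        uri = links[choice]
        visited.append(get(uri)["state"])


print(run("/order/1", ["pay", "receipt"]))
```

Nothing in `run` knows the order workflow ahead of time; change the links in the representations and the path through the application changes with them.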

A pretty interesting example of thinking outside the EJB box. A client side container for enabling statelessness is a nice piece of architectural work.

FWIW, I’ve talked about the RESTful equivalent of this before, more than once.

Norm responds to a post of mine about why I felt that better technology, and not necessarily new standards, was what was required to solve the problems that XML Catalogs were trying to solve.

He offers three things that he believes can’t be done with caches, but can be done with XML Catalogs;

Populate the cache. “Caching proxies rely on the fact that you can access the resource at least once from the web.” wwwoffle does this, but a better caching system need not. When I talked about the need for operating systems to be in on caching (and later with the Save-As idea), what I had in mind was treating the computer’s storage as a structured store (remember Bento?), such that any content would hit the disk “named” with its URI. This would permit the software that Norm installs to include with it a representation of this resource (schema or whatever), named with its one true URI, and available to any app on that machine. Again, no new standards required.

In that same section, Norm says that sometimes the URI may never be directly resolvable. That is definitely a possibility, but again, this same mechanism of tightly associating the URI-as-name with the data, makes that mostly moot; it doesn’t matter where the data comes from (modulo trust) if it’s self-describing.
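Here's a minimal sketch of what I mean by content hitting the disk “named” with its URI. The `UriStore` class and its method names are hypothetical, and percent-encoding the whole URI into a single file name is just one simple mapping I picked (it would break on URIs longer than the file system's name limit). An installer could call `put` to pre-populate the store without the resource ever having been fetched from the network.

```python
from pathlib import Path
from urllib.parse import quote, unquote


class UriStore:
    """A directory where every file is named with the URI it represents."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, uri):
        # safe="" percent-encodes "/" and ":" too, so the whole URI
        # becomes one legal file name.
        return self.root / quote(uri, safe="")

    def put(self, uri, data: bytes):
        """Store a representation under its URI (e.g. at install time)."""
        self._path(uri).write_bytes(data)

    def get(self, uri):
        """Return the stored representation, or None if there isn't one."""
        path = self._path(uri)
        return path.read_bytes() if path.exists() else None

    def uris(self):
        """List the URIs that currently have a stored representation."""
        return sorted(unquote(p.name) for p in self.root.iterdir())
```

Any application on the machine that knows the URI can then ask the store for it, regardless of where the bytes originally came from.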

Access Development Resources. Yah, what Mark said.

Devise Your Own Resolution Policies. I think your comment about public identifiers is relevant here; if they were used, this wouldn’t be an issue, and caching would be useful.

But while I maintain that better technology can do what Norm needs, I’m not saying that no standardization was necessary. Due to the fact that the technology is not there to do what is needed, plus the extent to which that technology needs to pretty much be pervasively integrated into OSs, standardizing on XML Catalogs may very well have been the best option. But something tells me that the decision to standardize was made without knowledge that a technical solution existed. No biggie, just pointing that out 8-).

Woot! Congratulations to everybody involved in its development (including me!).

Yesterday, I got a whack load of Altavista “goo” in my referer log. It all came from a single machine, buildrack91.sv.av.com. The requests were for valid URIs on my web site, and some of the referer URIs were valid, but they had nothing to do with this site or the URIs being requested. I’ve seen referer spam before, but this didn’t appear to be it (hence “Goo”), as the referer sites weren’t obviously commercially oriented or associated in any way.

Here are some of the log entries;

http://r.voila.fr/r/G14 -> /2002/09/Blog/2003/02/06
http://www.themacobserver.com/stockwatch/2001/07/http -> /2002/09/SemanticWebHumour
http://fen.com/aboutfen/link -> /2001/03/James/2002/OneYearOld/112-1276_img.html
http://www.lcdata.se/drv/drivrutiner/skrivare/canon/bj-200 -> /2001/08/Tremblant/105-0518_IMG.html
http://www.lexis-nexis.com/universe/form/academic/s_twoweeks.html -> /2001/11/37Charles/2002/Renovation/113-1365_img.html
http://planning.aiesec.org/cl/MC -> /2002/09/Blog/2002/12
http://www.adelphi.edu/ci/ARTICLES/TALAR97.HTM -> /2002/09/Blog/2003/06
http://rides.simcomcity.com/seat.php -> /2001/03/James/2002/OneYearOld/112-1249_img.html
http://www.lexis-nexis.com/more/CNN/4124 -> /2001/08/Tremblant/dirindex.html
http://www.lexis-nexis.com/more/cahners-chicago/11407/6460826/9 -> /2001/03/James/AfterWeek6
http://support.snart.com/forum/Skydiving_Disciplines_C3/Canopy_Relative_Work_F12/Panama_City_Beach_2003_photos_P470291/gforum.cgi -> /2001/03/James/2002/OneYearOld/112-1231_img.html
http://xserver.step1supply.com/Georg010/RK800X -> /2001/03/James/2003-Two
http://www.recycledpartsmall.com/archives -> /2001/03/James/2002/OneYearOld/113-1331_img.html
http://support.snart.com/forum/Skydiving_C1/Gear_and_Rigging_F6/PC_Replacement_P5687/gforum.cgi -> /2002/09/Blog/2002/10/25
http://support.snart.com/forum/Related_Sports_C5/The_BASE_Zone_F22/Since_the_begging_of_this_forum_P413509/gforum.cgi -> /2001/11/37Charles/2002/Renovation/114-1417_img.html

Has anybody else seen this?

I’ve heard about XML Catalogs before, but never in a context that piqued my interest enough that I’d want to go learn what they were. Thanks to Norm Walsh’s description of them today in his weblog, I now know.

The idea, it seems, is that you need different identifiers in different contexts. So, for example, an http URL for some document won’t be usable when you’re offline, so you need a way to package that identifier with the local one on the file system.

My view is that while I agree this is a problem, I don’t think new standards are required to fix it. I suggest that better technology is what is required.

“http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd” is an identifier for a DocBook DTD, and independent of the online status of your notebook, it remains an identifier for that DocBook DTD. What’s needed are operating systems, browsers, and network libraries that, when offline and asked for a representation of the resource identified by that URI, return a cached representation.
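A small sketch of that behaviour, with my own assumptions spelled out: `fetch` is whatever function the library uses to go to the network, `Offline` is a hypothetical exception it raises when it can't, and `cache` is a plain dictionary standing in for whatever store the OS or library maintains. None of these names come from a real library.

```python
class Offline(Exception):
    """Raised by the fetch function when the network is unreachable."""


def resolve(uri, fetch, cache):
    """Return a representation of `uri`, using the cache when offline.

    The caller always asks by the same URI; whether the bytes come from
    the network or the cache is not its concern.
    """
    try:
        body = fetch(uri)
    except Offline:
        if uri in cache:
            return cache[uri]
        raise  # offline and never seen: nothing we can return
    cache[uri] = body  # remember it for the next time we're offline
    return body
```

The application code never switches identifiers; only the resolver knows, or cares, that the notebook is unplugged.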

Another consequence of this is that “File->Save As” in a browser should be de-emphasized. I’d prefer it be just “Save” or “Store” or something like that where the user isn’t prompted for a file name. The implication being that the file already has an identifier, so why does it need a different name on my computer? Obviously you’d still want access to “File->Save As” in some cases, but I don’t believe it’s what most people need most of the time.

Simon St. Laurent reports on Norm Walsh’s XML is not Object Oriented essay.

Simon writes;

The only thing I can think to add is that XML is pretty explicitly a rejection of an aspect of OO practice that Norm touches on only briefly: encapsulation. Everything accessible all the time is pretty clearly a hallmark of XML work. You can hide things if you want to, but it takes a lot more effort.

I’m pretty sure that Simon meant to say “data hiding” instead of encapsulation there, as the last sentence suggests. Encapsulation refers to the binding of associated data and behaviour into an identifiable whole. Data hiding refers to, well, hiding that data by not exposing it via the interface. There are many OO fanatics, myself included, who believe that you don’t need data hiding to be OO.
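To illustrate the distinction, here is a small sketch with two hypothetical classes. Both are encapsulated, in that data and behaviour are bound together; only the second attempts data hiding. (Python's double-underscore name mangling is a convention rather than real access control, so treat it as a stand-in for `private` in other languages.)

```python
class Account:
    """Encapsulation without data hiding: the balance is openly readable."""

    def __init__(self, balance):
        self.balance = balance  # part of the public interface

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


class HiddenAccount:
    """Encapsulation plus data hiding: the balance is not exposed."""

    def __init__(self, balance):
        self.__balance = balance  # name-mangled, not part of the interface

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.__balance += amount

    def statement(self):
        # The only view of the data the interface chooses to expose.
        return f"balance: {self.__balance}"
```

The first class is the one that looks like XML, and like the Web: the data is right there for anyone holding a reference to it.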

FWIW, I consider the Web to be the epitome of the anti-data-hiding view; resources as objects, URIs as object identifiers, GET as “give me your data”, POST as “process this data”, etc.