It’s about time. I wonder if they’re using Nutch? RDF Gateway could be integrated into Nutch as a content type handler, I would expect.
(link) [Mark Baker’s Bookmarks]

I’m so glad that we synchronized on the similarities between our two approaches. I’m just still not sure I understand his “coupling complaint” against REST.

He writes;

However, I believe that there is a danger by adopting the resource-oriented view that Mark’s approach suggests. Identifying a resource through a URI and applying the semantics of an HTTP verb on it (e.g., GET URI) is good for retrieving state, as the web demonstrates. My worry about this approach is that there is a coupling of an identifier (the URI), an interface (HTTP verbs), and the state on which these verbs operate. The approach may work for the web but as we attempt to build large-scale, distributed applications that do not involve the human factor, we may get complex networks of references between the states of resources containing references to other resources which, in turn, may lead to brittle applications. We use the same argument against the resource-oriented WS-RF approach, which of course introduces lifetime management, renewable references, etc. behaviours into the picture (borrowed from the distributed-objects world).

I’m trying hard to understand the coupling Savas speaks of there, and its purported problems. I’d say that there exists a weak form of coupling between the identifier and the interface via the URI scheme, and I consider that a feature; I don’t think you can late bind identifiers to representations – which provides a whole lot of simplicity and scalability – without it.

I’m not sure what he means by the coupling between the identifier, the interface, and “the state on which these verbs operate”. As I see it, even with a “processThis” approach, there’s that same form of coupling; some document is processed, and though its semantics are independent of previous documents (unlike WS-RF), the resulting state change in the server component is a function of the just-processed document, all previously processed documents, and any initial state that existed before that point. For example, if I were firing a bunch of documents at a processor which adds integers, then although the messages have identical semantics – “process this 1” – the state of the processor changes as a result of each one.
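
To make the adder example concrete, here’s a minimal sketch in Python. The class and method names (AddingProcessor, process_this) are hypothetical, made up for illustration and not taken from any real toolkit; the only point is that three identical “process this 1” messages each leave the processor in a different state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AddingProcessor:
    """Hypothetical server component with a single processThis-style operation."""

    total: int = 0  # initial state that existed before any document arrived
    history: List[int] = field(default_factory=list)  # every document processed so far

    def process_this(self, document: int) -> int:
        # The message semantics never change ("process this integer"),
        # but the resulting state depends on everything processed before it.
        self.history.append(document)
        self.total += document
        return self.total


processor = AddingProcessor()
# Three messages with identical semantics: "process this 1".
totals = [processor.process_this(1) for _ in range(3)]
print(totals)  # [1, 2, 3]
```

Same message, same interface, different state after each one, which is the coupling I’m describing.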

Savas, if that doesn’t describe what you’re talking about, can you elaborate please? Perhaps an example would help.

Document exchange is an application. Discuss.

Apparently, I’ve got a lot to learn about rules engines
(link) [Mark Baker’s Bookmarks]

Ok, fess up, who pissed in Chris’ coffee this morning?

I think the operative word from my blog that Chris missed was “need”; that, IMO, we need a WS-* RSS feed because new specs are appearing at a crazy rate. You can’t compare that with the W3C’s TR page and corresponding RSS feed because it represents deltas while the Wiki represents a sigma. If the W3C published a list of recommendations via RSS, that would make for a fairer comparison. So how many Recs have they published? Let’s see, in almost 10 years, they’ve got about 80 (90ish if you include the IETF specs), while there are 40ish Web services specs listed on the Wiki, the bulk of which have been produced in the past two or three years. Not exactly apples-to-apples, but not too far from it.

He concludes;

Please don’t misunderstand my intent, I like HTTP. Unlike Mark, neither do I think it is the last protocol we’ll ever need (it is not), nor do I spend every waking moment trying to tear it down or to poke fun at things that it simply doesn’t handle effectively. That would be pointless.

Please don’t misunderstand my intent, I like SOAP. I just don’t like how it’s being used. It would be best used for document exchange, not RPC (Web services circa 1999-2002), or RPC dressed to look like document exchange (present day Web services). I also don’t “poke fun” at Web services very often, but I do take pride in being able to point out their many architectural flaws in a variety of different ways, which I do frequently. And I don’t think HTTP is “the last protocol we’ll ever need”, though I do believe that if it suddenly became impossible to create any more, it wouldn’t be such a big deal, at least for those of us building document exchange based Internet scale integration solutions. As for what things HTTP “simply doesn’t handle effectively”, I believe you grossly overestimate the number of items in that list, though clearly it’s non-empty.

So do me a favour and drop the strawmen, ok? You’ve been pulling that crap for years.

… this gem from Roy;
If this thing is going to be called Web Services, then I insist that it actually have something to do with the Web. If not, I’d rather have the WS-I group responsible for abusing the marketplace with yet another CORBA/DCOM than have the W3C waste its effort pandering to the whims of marketing consultants. I am not here to accommodate the requirements of mass hysteria.
And the hysteria continues… Hint; anytime you need an RSS feed to track new specs, something is, prima facie, horribly, horribly, wrong.
Kinda dated, but I hadn’t heard about this. Cool. CORBA gets state-transfer. Now to hunt down the spec …
(link) [Mark Baker’s Bookmarks]
Hmm, I wouldn’t expect “INSERT INTO” to be the command you’d execute to unsubscribe me from a mailing list. No wonder I can’t unsubscribe. Yikes.
(link) [Mark Baker’s Bookmarks]
What should you put between and ?
(link) [Mark Baker’s Bookmarks]

I’ve been using the CRM114 Discriminator for spam filtering for the past few weeks, and have been thoroughly impressed by its appetite for spam (and only spam).

Training took a lot longer than I expected, seven weeks, but over the past week I’ve finally reached a happy point (YMMV) where there are essentially no false positives, and only a trickle of uncaught spam makes it through. Here’s what I’ve observed over the past three days;

  • 1305 messages filtered
  • 916 spam messages caught
  • 21 spam messages missed
  • 2 false positives, both of which were mailing list acknowledgements (which I don’t believe I had previously trained)
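
For the curious, the rates implied by those counts can be worked out with a few lines of Python. This is a rough sketch resting on two assumptions of mine: that the 916 caught messages don’t include the 2 false positives, and that every filtered message that wasn’t spam was legitimate mail.

```python
# Counts reported above for the three-day window.
filtered = 1305
caught = 916
missed = 21
false_positives = 2

# Assumption: "caught" counts only genuine spam, so total spam is caught + missed.
total_spam = caught + missed            # 937
# Assumption: everything else that was filtered was legitimate mail.
legitimate = filtered - total_spam      # 368

catch_rate = caught / total_spam
false_positive_rate = false_positives / legitimate

print(f"spam caught: {catch_rate:.1%}")                     # spam caught: 97.8%
print(f"legit mail misfiled: {false_positive_rate:.1%}")    # legit mail misfiled: 0.5%
```

If the false positives were actually counted among the 916, the numbers shift slightly, but not enough to change the picture.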

Spam, what spam?