|
 |
Saturday, January 08, 2005 |
Predicting Tsunamis |
Prediction: No word will gain popularity in 2005 as much as the word "tsunami".
We're headed towards a tsunami of tsunamis, so to speak.
|
|
|
 |
Saturday, March 06, 2004 |
Social Software Lock-in |
Dare Obasanjo writes a follow-up to Adam Bosworth's what is the platform? post. Adam's point (and Dare agrees) is that we're seeing a shift from software platforms to platforms that are about community access, collaboration, and content. The implication appears to be that web-based service providers such as Amazon, Google, and Yahoo will rise in importance, as the actual software platforms of old (Linux, Mac, and Windows) lose their relevance and become mere access points to the web.
Dare makes the following interesting point:
The interesting thing about the rise of social software is that this data lock-in is migrating from local machines to various servers on the World Wide Web.
One should not underestimate the consequences of this. Over the last few years, people have complained about application data lock-in, meaning that they're locked into an application and can't migrate to another because they can't port their data to the other application's format. In the next few years, people will realize that the web has the potential for much greater lock-in.
In application lock-in, you own the data bits. You can do whatever you want with them, including porting them to another format. If you can't do it yourself, you can probably find someone who can. In fact, this has already happened for the most successful applications: all word processors, for example, come equipped with an import facility that allows them to read and use the competition's data formats.
On the other hand, if all your data lives on the web, you're at the mercy of your service providers. If Yahoo goes down tomorrow, all your mail messages that they keep for you on their servers will be lost. Same for your IM contact list, the stock portfolio you track, etc.
Consider this: many people use web services because they're free. It's nice that you can get a 1GB mailbox from Google without paying anything. Some providers (such as MSN) will let you back up some of this data locally (for example, by using IMAP4), but only if you pay them a fee. What happens if providers decide that advertising doesn't make enough money for them, and start charging twice as much for allowing you to take your data elsewhere? Will you still be glad that you made a "deal" with the provider in which you paid it nothing, so it is under no obligation to give you back your data?
|
|
|
Innovation: less important than it's made out to be |
If you haven't already, go read Clemens Vasters' excellent article Free as in Freedom. He makes lots of good points, which I won't repeat here.
One point he makes, however, is something I disagree with:
If someone is really interested in stopping them [Microsoft] from legitimately dominating every aspect of the software market (market as in money) in the long run, they need to compete with them on the innovation front.
Innovation means squat. It's the execution that counts. This is why in Microsoft, people are often judged by the number of times they've shipped a product. Whether the actual product proved successful or not matters less (though it still counts, of course). What's important is that you've made it to the target line. Shipping a product is what separates the men from the boys.
|
|
|
 |
Saturday, February 28, 2004 |
Rationalize Away, Brother! |
I saw this on Green Hat Journal:
Eric Raymond lambasted open-source hackers for their pathetic user-interfaces: "This kind of fecklessness [in UI design] is endemic in open-source land. And it's what's keeping Microsoft in business — because by Goddess, they may write crappy insecure overpriced shoddy software, but on this one issue their half-assed semi-competent best is an order of magnitude better than we usually manage."
One thing we (Microsoft) have on them (Eric Raymond and friends) is that we write software people actually buy. Hell, in their case they find it hard to give it away.
I wonder how the Mac fits into Eric's rationalization. It can't be the UI...
|
|
|
 |
Tuesday, August 19, 2003 |
Sleepless in Seattle |
The next two weeks I'll be paying a visit to the mother ship in Redmond. People who want to meet can drop me a note through this weblog, or mail my alias at Microsoft (ZivC).
UPDATE: (Radio, for some unknown reason, has decided I wrote this in August.) Just to make it clear, these will be the weeks of Christmas and New Year's Eve.
|
|
|
A Comment on XML Namespaces and RDF/XML |
In response to my previous post, Danny writes:
Looking at your ugliness criticism, it all seems to be directed at the use of XML namespaces rather than RDF per se. It has pretty well been decided that Atom will support these, so it's hardly RDF's fault. Note too that this is a maximal example - practically all the elements used in practice would be those in the Atom namespace, rather than their counterparts in DC etc. Any equivalence would be stated at a schema level. I'm not entirely sure I understand your X:Y:Z namespacing, but it does sound rather like architectural forms, an alternative to XML namespaces that crops up on xml-dev periodically.
Yes, I think the XML Namespaces spec leaves a lot to be desired. It only goes halfway toward making it easy for authors to manually create XML documents composed of elements from multiple namespaces. If you have a long document to author, and there are elements from different namespaces you constantly use, your only reasonable option is to declare all the namespaces at the top, invent prefixes for each, and then constantly juggle these in your mind (and in your document) as you write.
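To make the juggling concrete, here's a minimal sketch of such a mixed-vocabulary document (the namespace URIs, apart from Dublin Core's, are made up for illustration):

  <entry xmlns="http://example.org/atom"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rev="http://example.org/review">
    <title>A post</title>
    <dc:date>2003-08-19</dc:date>
    <dc:creator>Ziv</dc:creator>
    <rev:rating>5</rev:rating>
  </entry>

Only the default namespace gets a free ride; every other element drags its prefix along for the entire length of the document.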
As a designer, when I'm faced with the choice of whether to create my own vocabulary or use a vocabulary made from a mix of several pre-existing ones, Create-My-Own suddenly becomes the simpler option. This is bad. It should have been easier to go with what I already have, but it isn't. Not if the goal is KISS.
This being an XML issue, why am I picking on RDF/XML? Because I can easily create an Atom vocabulary that has everything in its own namespace (nice and simple for users), but I have to abandon that if I'm to go RDF/XML.
Bringing in yet-another-spec (architectural forms) doesn't help -- you see, it's the RDF people who have to tell us if there's a way to keep the syntax simple, and still manage to feed our documents to RDF parsers. I'm just a lowly XML guy.
(To anyone contending that once you have an Atom-RDF/XML template, everything is easy: yes, but Atom should support people whose main business is generating notifications from toasters, and XML is difficult enough for these guys.)
|
|
|
 |
Thursday, August 14, 2003 |
A Useless Comment on Atom 0.2, RDF Style |
Some people have suggested that Atom could be made more RDF-friendly; others object. Most helpfully, we now have a suggested RDF version of the Atom 0.2 example. After looking at it, all I can say is how ugly.
It's not that I don't like RDF. I actually do.
RDF/XML, OTOH, ITD [*].
Two things are apparent:
- Syntax-wise, XML namespaces are seriously flawed. In every reasonable language (programming language, but the principle works the same for languages people speak) one has a way of pulling names from multiple namespaces into another namespace. In C++, I can say "using A::B::C; using D::E::F;" and later refer to A::B::C and D::E::F as C and F, respectively. This simplifies handling such names considerably. XML Namespaces doesn't let you do this -- it insists you write fully qualified names every time (except for one default namespace, which obviously is not enough in our case).
- Although RDF by its very nature is meant to weave together different namespaces into a single model, RDF/XML does nothing to alleviate the problem. It doesn't provide a way to locally create a namespace X and then map X::C to A::B::C and X::F to D::E::F. If you wonder why the Atom <id> element had to be replaced with the rdf:about attribute, this is the reason: while both mean the same thing, RDF demands using its own namespace, for no good reason.
Ugly.
BTW -- Looking at the sample feed, I believe there needs to be an rdf:about attribute on <foaf:Person>. Otherwise, if Mark Pilgrim were to write two entries, an RDF parser would not be able to tell they are the same person.
Update: Morten Frederiksen comments:
Re 1: "Fully qualified" would seem to imply that only complete namespace URIs with local names attached would work. That is of course not true, as qnames are used extensively. Also, you don't have to keep the same default namespace throughout a document.
I wrote "fully qualified" without regards to the meaning it has in XML. Sorry. What I meant was that you have to qualify names with their namespaces except for names in the default namespace. The fact that the default can change helps a little, but doesn't solve the problem of allowing you to create a document in which the presence of namespaces is reserved to just the "header" of the document. I don't want to juggle namespace constantly in my documents.
Re 2: Why define an atom-id in the first place? The rdf:about is part of the syntax. In other cases subPropertyOf etc. could be used.
Atom-id is defined because (1) when it was defined nobody considered reusing names from RDF a good thing, and (2) people wanted (and some of us still do) the document structure to be simple. As a result, we have our own "union" namespace with all the interesting stuff we think Atom needs.
Re BTW: You should look at the FoaF spec to see that this is also not true. FoaF makes entensive use of owl:InverseFunctionalProperty to be able to identify persons accross mentions.
I must admit OWL is something I have avoided learning for quite some time... Thanks. (BTW -- if it can do that, can it also make an out-of-band association of the element atom:id with the attribute rdf:about?)
Update 2: Samuel writes:
However, in Java (and as I recall, C++, Perl, and Ruby), if you have X::Foo and Y::Foo, and you want to use them both in the Z namespace, you still have to call them X::Foo and Y::Foo regardless of what you use, import, or require.
That's the whole point of namespaces: avoid collisions. With code it's much easier to do because if there's a collision, it breaks and you fix it. But, XML is *not* code. It is data. You have absolutely *no* control over where your data might end up, and so it is imperative that the namespace delimiter stay with it to prevent possible collisions.
I think this is missing the issue:
- Local-name collision happens when the namespaces you want to unify have colliding names. Since what we're trying to create is a new namespace built from elements from known namespaces, we already can tell, at design time, whether their local names will collide or not. In our case, they do not, so this is not an issue.
- Even if it were, it should have been trivial to map names during the unification process. In C++, for example, if you have both A::x and B::x, you can pull both names into a single namespace by typedef-ing one as Ax and the other as Bx.
- Our intention in Atom is to create a core spec that provides all the essentials, as well as some extension mechanism. The core itself is completely "static" in this view -- it identifies the vocabulary we intend to use. Colliding names from other namespaces come by way of extensions, and they can still use XML Namespaces. It's about the core that we're talking about now, and how to make it as simple as we can.
[*] I truly dislike.
|
|
|
 |
Saturday, August 09, 2003 |
The Economics of Application Installation |
Sean McGrath writes in ITWorld:
In my mind's eye, I see an installation system based on Unix's chroot concept (for establishing virtual hierarchies for applications) and Unix's symbolic link concept (for managed duplication). I see a world in which every Java application has its own JVM, its own JDK, its own copy of *everything* all in a nice tidy directory - a truly self contained world.
Why not? It would waste a few gigabytes? In the time it has taken you to read this article you have probably been paid the equivalent of many gigabytes of disk space.
The sad reality is that as CPUs are getting faster, main memory and disks lag behind. By a long shot. So, if each application you have installed duplicates all the libraries it depends on, it will take longer to install, longer to load, and (because modern CPUs totally rely on their cache to keep their maximum pace) longer to execute. The assumption that we should stop optimizing for size, popular as it is among dynamic languages supporters, is plain wrong. Actually, it's getting to be farther from the truth as CPUs keep getting faster, but memories and disks don't.
|
|
|
URI != URL |
In a comment to my post on Atom 0.2, Sam notes that the <link> element is a URI, not a URL.
Thanks, Sam, I didn't notice it. Now that I have, it looks wrong to me.
There's a tendency in our industry to treat URIs and URLs as if they're the same thing, or at least very similar. There's also a common thinking that "a URI is everything a URL is, only a bit more general, so let's use that instead". I agree with neither.
On the Difference of URIs and URLs
To explain why, here are the two important differences between URIs and URLs:
- A URI represents identity. As such, a resource's URI doesn't change. A URL, on the other hand, is a name. Just like people can change their name, the name(s) of a resource may change.
- A resource's URI tells you nothing about the resource. Its URL gives you a closure of actions you can apply to it. (For example, if you have an http: URL, you can do GET, PUT, POST, etc.; if you have a mailto: URL, you can send mail to the resource, etc.)
The point is that while these differences make little difference to us humans, tools that process them (should) behave differently. If U is declared to be a URI, a tool that processes it cannot in general apply any action to U. (There was a long discussion in XML circles about what you would find at the "end" of namespace URIs when they are URLs; as far as I know, the proposal I liked most -- RDDL -- never got anywhere.)
When U is declared to be a URL, the tool can in general rely on its semantics beforehand, for example try to retrieve it for offline reading. (This applies to URLs whose protocol has some "GET" action, like http:; this isn't the case for mailto:, for example.) Even if the tool does not understand the semantics of the URL, it can still offer a hyperlink to it (punting the work to the OS, if you're working in Windows).
<link> Should be a URL
Coming back to the original issue, a <link> element serves human readers, because tools cannot rely on the resource it represents to mean anything. As such, making <link> a URI makes little sense to me: what is the utility of allowing a <link> to be "bla:1234.0987"?
|
|
|
 |
Friday, August 08, 2003 |
Atom 0.2 |
Atom 0.2 is out. Although not an official spec, it looks like it's now solid enough to comment on for people who are, shall we say, wiki-shy. So here's mine.
Choice of Top-Level Element
Atom 0.2, unlike RSS 2.0, has <feed> as its top element. While this is fine if all you want to use Atom for is blog notifications, it's too restricting for future growth.
For example, it doesn't handle cases in which several feeds are held in a single container. (For example, one can think of an Atom replacement for the ugly and restrictive OPML format, in which you have a single XML tree that only holds feed elements, with no content.)
I think we need a top-level element that would hold the <feed> element, as well as any @version information we'd care to put.
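A sketch of what I mean (the container element name is hypothetical, not something any draft proposes):

  <atom-container version="0.2">
    <feed>...</feed>
    <feed>...</feed>
  </atom-container>

A single feed stays exactly as it is today; it just gains a place to hang version information and siblings.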
The <link> Element
The spec mandates one <link> element per <feed> and <entry> elements. It says this about the element:
[T]he link to the website described by this feed
and:
[p]ermanent link to a representation of this entry
Here's what it doesn't say, but is implied: the definition of a <link> element in this manner means that it is useful mostly to humans. Of course, one may build tools that would make good use of this element, but in general, such links cannot be relied upon.
Why am I saying that? After all, this is what RSS 2.0 does today, so it is the proven way of doing things, right?
Well, I don't think so. Let's consider the use of the permalink in weblogs. Most weblogs today fall into one of two camps: "Take That" weblogs provide the entire content of each entry in the feed. "E.T. Phone Home" weblogs provide only a teaser, and readers who want to actually read the entry are forced to do so with their browsers.
Now, consumers who prefer the first type of weblog (TTW) rarely click on the <link> element, because they have no reason to. There are some weblogs that I couldn't recognize on sight, simply because I read them entirely using Aggie, without ever navigating to the site itself. For them, the element is mostly useless.
Other consumers rather like that they get only a teaser, and they get to decide if they want to navigate to the site or not. They also like getting to the site, to see everything in original colors, etc. For them, the link is everything.
Note, however, that in both cases the element is not used by the aggregator itself. To the aggregator, there's little difference between the <link> element and the <tagline> element -- they're both for consumption by humans.
So what? I have no problems with having a <link> element. I do have a problem with (1) having this element mandatory, and (2) the apparent thinking this mechanism is enough.
Don't Make <link> Mandatory
<link> should not be mandatory. It assumes a web presence, which is not always there. Suppose your printer delivers "job-done" notifications back to you in an Atom feed. What should it put in the <link> element? The URL of HP?
<link-to-atom>
Why is today's <link> element not enough? Suppose a producer of the feed does not want to provide full content (for example, because it takes too much bandwidth), but he has some readers (me!) who would like information to come to them rather than the other way around. With Atom 0.2, something has to give.
IMHO, this is exactly the type of limitation Atom was born to solve. The solution would be to provide another type of link element, <link-to-atom>, whose content is a URL to a resource in Atom format itself. When attached to a feed, it provides the URL to the feed in Atom format, thus making the feed declare where it is located. (This has the pleasing property of allowing aggregators to track changes in feed location naturally, and of making producers who can't generate redirect pages happier.) When attached to an entry, it provides the URL to the Atom representation of the entry itself, only this time with the whole content.
(Bandwidth-aware producers might also note that such a mechanism reduces their bandwidth consumption even more, because clients don't need to download all the "fluff" that entries usually contain in their web page.)
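Here's a sketch of how an entry might carry such a link (the URLs are made up, and <link-to-atom> is my proposal, not part of any spec):

  <entry>
    <title>Atom 0.2</title>
    <link>http://example.org/2003/08/atom-02.html</link>
    <link-to-atom>http://example.org/2003/08/atom-02.atom</link-to-atom>
    <summary>A teaser only...</summary>
  </entry>

An aggregator that wants the full text follows <link-to-atom> and gets the same entry back, this time with its content included.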
<generator>
This is a cosmetic remark. If we want the <generator> element to provide both a URL and a display name, then the URL should be an attribute and the element's content should be the display name, not the other way around. This is how anchor elements work in HTML, and I see no reason not to keep this format.
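That is, following the anchor convention, something along these lines (the attribute name is illustrative):

  <generator url="http://radio.userland.com/">Radio UserLand</generator>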
<author/url>
There must be a good reason why this is called <url> and not <link>, but I just don't see it.
<entry/id>
Yes, YES, YES
I can't stress this enough: a mandatory <id> is the most important feature in Atom, and it alone is sufficient to justify the whole effort.
Here's a simple use model that is not currently supported: Suppose I write a short article about (say) Atom 0.2. I want people to read this article, so I post it on my weblog. I want more than two people to read it, so I also post it to the Atom mailing list, the Wiki, as a remark on Sam Ruby's weblog (I'm sure he won't mind), and any other place I can spam. Now how can someone who reads several of these "water fountain" sources tell that he's already seen the entry? How can he comment and make sure the comment propagates everywhere the original went?
Today, we have poor connection between distribution channels: People who leave comments in other people's weblogs don't post them to their own weblogs. Remarks I make in a mailing list remain confined there, unless I repeat them on my weblogs, and there's no way a smart universal client could weave them all together.
<id> will allow us to change all that. Technically, it's as old as SMTP and NNTP, which put it to good use. As a concept, it's as old as Adam naming all the animals. 3000+ years later, we still use these names. [*]
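Here's a sketch of the idea (the id value is made up): the same entry, syndicated through two different channels, carries the same <id>, so a smart client can collapse the copies and thread the comments:

  <!-- in my weblog's feed -->
  <entry>
    <id>tag:example.org,2003:zivcaspi/atom-02-comment</id>
    <title>A comment on Atom 0.2</title>
  </entry>

  <!-- in the mailing-list's feed -->
  <entry>
    <id>tag:example.org,2003:zivcaspi/atom-02-comment</id>
    <title>A comment on Atom 0.2</title>
  </entry>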
content/mode
If this attribute is optional, the spec should call out what the default mode is.
[*] If you speak Hebrew, that is.
|
|
|
 |
Monday, July 28, 2003 |
Ziv |
Google is losing the war against weblogs. Consider this:
- On the first results page for Ziv Caspi, all links are about me
- According to Google, I am the most important Caspi on the net
- Not only that, yours truly is the number 2 Ziv out there (a trait I share with Sam)
This is just insane.
(Yes, the title of this post is designed to improve my position on that last point, because titles matter...)
|
|
|
 |
Friday, July 11, 2003 |
Quick 'n Dirty |
A) Time to read Simon Fell's announcement: <1min.
B) Time to download resource to cache for testing: <1min.
C) Time to get Aggie to read his new Necho feed: <10min.
D) Time to write this in Radio (twice, because things didn't work on the first try): >(A+B+C)
|
|
|
 |
Saturday, June 14, 2003 |
Antibiotic Days |
Tim Bray:
[...] there was a time when being a Web Guy was like being Gandalf the wizard and James Herriot the country vet all rolled into one.
|
|
|
Full Content RSS Fragments |
In the RSS world, opinions differ on the important issue of whether or not to provide full content RSS feeds (that is, whether at least one of item/description, item/dc:content, or item/xhtml:body has the full "information" content of the item).
Ignoring for the moment conceptual aspects (for example, is the RSS feed a notification channel to draw people to the web site or a "first-class" content distribution means), there are significant "down-to-earth" aspects to this issue: both producers and consumers would like to cut down their bandwidth costs.
Downloading a 15-item full-content feed when only one or two items change per download is wasteful. This has driven many producers to offer only "lightweight" feeds, in which the content is some "lossy compression" of the full content, which is not provided as RSS. Sometimes the "compression" is done by providing just the first N words (or sentences) of the content; in other feeds the author provides an abstract, or uses a "teaser" catch-phrase. In all cases, consumers have to manually go back to the original publisher's site to get the full monty.
Problem is, this makes reading RSS feeds unpleasant for the large group of people who read them offline. Imagine this: you're on the train, happily reading all the RSS feeds you've collected in your aggregator, when you read something interesting on Sam Ruby's weblog. Sam, however, just switched to short-form feeds, so you're out of luck; you have to wait until you get home to get it all.
It doesn't have to be this way.
Here's the idea: In every RSS <item>, provide a link to a resource that holds the item's full content in RSS form. For example:
http://bla.bla.bla.com/blog/12345.rss
(Hopefully, Joe would allow me to use his well-formed web namespace.) The resource indicated by link-to-rss is a valid RSS feed, probably (but not necessarily) including only a single <item>, with full content. Aggregators are already quite good at detecting when items in an RSS feed change (by hashing title/link/description like Aggie does, by looking at dc:date/pubDate, or via similar means), so all an aggregator has to do is detect that an RSS item has been added or modified, and then download the item itself, this time with the full content included.
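For illustration, an item carrying such a link might look like this (the element name and its namespace are made up here; the real thing would presumably live in Joe's namespace):

  <item xmlns:ltr="http://example.org/link-to-rss">
    <title>Full Content RSS Fragments</title>
    <link>http://bla.bla.bla.com/blog/12345.html</link>
    <description>A short teaser...</description>
    <ltr:linkToRss>http://bla.bla.bla.com/blog/12345.rss</ltr:linkToRss>
  </item>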
Comments?
|
|
|
 |
Friday, June 06, 2003 |
Lisp Syntax Matters |
This is bound to come up every now and then. Graham Glass writes:
LISP was an incredible work of art. so simple and so reflexive. but an absolutely crap syntax that doomed it.
In response, he got some of the expected "stupid, the syntax is what makes Lisp so powerful" comments (which I'll not link to), and some interesting ones (see his comments page). This debate comes up every so often, with Lisp gurus telling everybody else not to worry about the syntax and that they'll get used to it, and everybody else saying how they like Lisp, except for its syntax.
Come to think of it, not unlike the RDF scene (see what Joe, Tim, and yours truly had to say about that one).
|
|
|
 |
Monday, June 02, 2003 |
IE and the OS |
Joe Gregorio writes:
Microsoft, trying to squeeze more revenue from operating system sales, looks to leverage it's monopoly in the browser market to force people to upgrade to the latest version of Windows.
Why do you say that? As you (I think correctly) point out, the current monopoly Microsoft enjoys in the browser arena [*] essentially provides no leverage.
IMHO, a decision to only add value to IE on future operating systems is simply a good business move: there's no point in putting high-paid developers on a product that makes no income, right?
The Microsoft revenue stream is built almost entirely on selling bits and paper [**]: you buy bits from us (in the form of a CD, a DVD, an Internet download, or an activation number), and you buy software licenses. To keep the employees paid, we need to either increase market penetration, or improve our products so that people will want to upgrade. This mechanism is well-known.
So improving products is how Microsoft pays the bills (and Bill). Nobody forced people to install IE 6 (as so many people have pointed out, it offers few features beyond IE 4!), yet they did. Similarly, nobody will force Joe to upgrade his Windows 98. Perhaps if he sees enough value in it, he will [***]. The fact that he continues running Windows 98 (IMHO the worst OS release Microsoft made in the last 10 years) clearly shows that despite Microsoft's being a monopoly in the desktop OS market (and the browser arena, and probably office suites), it still can't force customers to do what it wants.
If you look at all the products/technologies that started life on their own and were later incorporated into the OS, I believe you'll see a giant jump in customer value, both in quality and in features. I don't doubt that the same could be said of IE.
* It certainly isn't a market, because most of us don't pay to get a browser, at least not directly.
** This is how Steve Wasserman explained it to me a month after my company (Peach Networks) was bought by Microsoft in early 2000.
*** Joe, here's an offer you can't refuse: I'll buy you a Windows XP Professional as a birthday present if you only dump that junk they call 98.
Disclaimer: I work for Microsoft. I own (very few) Microsoft shares. My opinions do not reflect those of my employer. I am not privy to any internal discussions or decisions made by Microsoft on the future of IE (if I were, I wouldn't be posting). Everything I say here is based on stuff that has been publicly available on the Internet, and my own speculations.
|
|
|
 |
Friday, March 21, 2003 |
Capacity of Ad Hoc Wireless Networks |
Ad-hoc wireless networks (AHWN) are digital communication networks built from a set of small, independent, wireless devices. Unlike more "traditional" wireless networks, in which end devices communicate with some type of a fixed server, in AHWNs end devices talk to each other directly. As devices move from one place to another, or communication patterns change, connections between devices are changed accordingly, thus the "ad-hoc" aspect.
By their nature, AHWNs do not require a centralized architecture, and all devices on the network can be regarded on equal footing. In principle, this means that the entire population of a large geographical region can be equipped with such devices, and never pay their local service provider to communicate. If device A wants to talk with device B, it can do so directly provided they are close enough.
What if the two devices are too far apart? Here comes the "neat" part: the devices talk to other devices which are close enough, thus establishing a multi-hop route between them. In terms of routability, it's just like how the Internet works (a message starting from end device A travels through multiple routers until it reaches end device B), except that each end device can also act as a router, the routes are ad-hoc, and no one needs to pay any bills.
If it's so good, why don't we all dump our Internet providers tomorrow? Well, some people think this is exactly what the future looks like. They lobby quite regularly for their position these days. Personally, I have quite a few reservations about this sort of "free-lunch" architecture.
An interesting paper I read today provides some practical evidence (as opposed to theoretical arguments) that such networks -- while they might work perfectly for local regions -- do not scale to Internet sizes. In their paper Capacity of Ad Hoc Wireless Networks (which is recommended reading to anyone interested in the subject), the authors conclude:
[...] We find that, in general, 802.11 does a reasonable job of scheduling packet transmissions in ad hoc networks. 802.11 is more efficient for orderly local traffic patterns, such as a lattice network with only horizontal flows. 802.11 is also able to approach the theoretical maximum capacity of O(1/sqrt(n)) per node in a large random network of n nodes with random traffic.
We argue that the key factor deciding whether large ad hoc networks are feasible is the locality of traffic. We present specific criteria to distinguish traffic patterns that allow scalable capacity from those that do not.
[Li, Blake, De Couto, Lee, and Morris; Capacity of Ad Hoc Wireless Networks]
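For context, the scaling result the authors refer to comes from Gupta and Kumar: in a network of n nodes sharing a channel of rate W, the per-node throughput is bounded roughly by

  \lambda(n) = O\left( W / \sqrt{n} \right)

so aggregate capacity grows only like the square root of n while the number of sources grows like n. That is exactly why traffic locality decides whether such networks can scale.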
|
|
|
 |
Saturday, February 15, 2003 |
The Amateurs |
I remember a conversation we had a year or so before his death, walking in the hills above Pasadena. We were exploring an unfamiliar trail and Richard, recovering from a major operation for the cancer, was walking more slowly than usual. He was telling a long and funny story about how he had been reading up on his disease and surprising his doctors by predicting their diagnosis and his chances of survival. I was hearing for the first time how far his cancer had progressed, so the jokes did not seem so funny. He must have noticed my mood, because he suddenly stopped the story and asked, "Hey, what's the matter?"
I hesitated. "I'm sad because you're going to die."
"Yeah," he sighed, "that bugs me sometimes too. But not so much as you think." And after a few more steps, "When you get as old as I am, you start to realize that you've told most of the good stuff you know to other people anyway."
From Richard Feynman and The Connection Machine, a fascinating read by Danny Hillis.
|
|
|
 |
Tuesday, January 28, 2003 |
Deleting a Post in Radio |
Is there a way to delete a post in Radio?
(Other than editing the old post to read like a question about deleting the post, that is.)
|
|
|
 |
Saturday, January 25, 2003 |
Parsing RSS At All Costs |
Mark Pilgrim writes about liberally parsing non-well-formed RSS feeds. We've discussed the issue before (see, for example, my own shaming them into submission article, as well as Mark's own comments on his liberal parser on his site). Reading the comments people made is quite interesting, BTW. As I've said in that article, an aggregator cannot simply ignore broken feeds. Once it does, it stops answering users' needs, and is destined to be ditched.
It helps to review what happened during those five months. The RSS parser I wrote for Aggie RC5 -- soon to be released -- has indeed been complaining for quite some time about broken feeds. I've personally sent quite a few emails to RSS authors whose feeds Aggie complained about, and in all cases I got excellent results. I'm certain others have as well. The quality of RSS feeds has been improving over the last few months; largely, I think, because people who noticed complained.
To sum it all up, I think the approach we preached certainly paid off. On one hand, we tolerated broken feeds and gave our users tools that satisfied their needs -- reading RSS feeds. On the other hand, we raised a warning flag that helped people improve as they went along. Unlike HTML, the RSS compliance scene is improving over time, all because we took the middle ground.
|
|
|
 |
Friday, January 17, 2003 |
The State of the .NET Aggregators Union |
Must be something in the air. In the last few weeks we've seen an explosion in the number of .NET RSS aggregators. Can someone explain the reason for this phenomenon?
The first .NET aggregator was, as far as I know, Aggie. Created by Joe Gregorio, Aggie enjoyed having the playground to itself for many months. It has been considerably improved since its early days, and now has what I think is the best RSS parser available for any platform. Unfortunately, Joe's interests seem to have shifted somewhat. The last Aggie release Joe shipped was RC4. The current code base (which someday we hope to ship as RC5) is very different from RC4, so if you plan to use Aggie I suggest you download the latest code and use that. (If you're too lazy to do that, send me a note and I'll send you the binaries you need to run Aggie.) Aggie produces output as a single large HTML file, or as a series of mail messages it sends to your account. Disclaimer: Apparently, I am the last Aggie developer left working on the code, so my view is obviously biased.
In no particular order, some of the other .NET RSS aggregators are:
Update: I received two more pointers (thanks guys!):
Update II: Let's not forget David Peckham's NewsDesk!
Again, if you know of another, please let me know.
|
|
|
 |
Saturday, January 11, 2003 |
Aggie Update |
Earlier today I posted the following to the Aggie development list:
As some of you may have noticed, Aggie has been updated twice during this weekend. This mail describes the changes.
Extensible RSS components
The most significant change is difficult to observe: Aggie's RSS parser has been modified so that it becomes extensible. The parser regards RSS feeds as a container ("channel") that holds a sequence of information pieces ("item"). While both the container and the items it contains have three mandatory (perhaps empty) properties -- title, link, and description -- each may also have any number of additional properties that the parser can now hold.
Moreover, the parser now supports defining new channel and item properties at run-time. (The necessary code to read external configuration files into the parser is not yet done, but all the necessary pieces are in place.) This means that adding support for new elements has become quite simple. You only need to define the namespace the elements appear in, what variants of RSS they are relevant to, and how they are to be taken from the RSS document into the aggregated output that the parser spits out.
(For those with source access, check out the static RssDocument constructor, which I intend to replace with a configuration file shortly.)
Support for output of arbitrary elements is already partially implemented. If you're using the HTML output of Aggie, simply modify the skin template to include any property that is found in the parser's aggregated output.
(If you have not updated your skin template in a long while, I advise you to do so now -- I've checked-in a version that displays all the comments that Aggie has to say about non-well-formed feeds, as well as added support for dc:date elements.)
(Also, if you are a template writer yourself, please have a look at what elements Aggie may put into the aggregated XML document and decide if you want the new elements to appear in your output.)
Email and short-notes channels
During the last few weeks we've seen a growth in the use of RSS feeds to publish comments, pingbacks, and trackbacks to a particular feed. If you read Aggie's email output (like me), this means you now get lots of "short-notes" messages which are difficult to fathom because they hold no context information.
To fix this, I've finally caved and added support in the mailer for aggregating several items in a single mail message. (Yes, the ultimate solution would actually be to provide an external XSLT transform, like what we do with HTML, but we're not there yet.)
The mailer can now be instructed to do one of the following:
- Put each item in its own mail message
- Put all items from a single channel in one mail message
- Group all items of a channel with the same title in one mail message
Additionally, the mailer can reverse the order of items (useful for upside-down feeds).
In principle, we would like this to be a per-channel configuration option. However, Aggie currently has no way (other than by tweaking the URL, an obvious hack) to maintain per-channel configuration information. We'll need to work out an extension model for the OPML with AmphetaDesk (and perhaps UserLand?). Until we do, however, controlling this aspect of the mailer is a global setting.
The little things that count
These modifications provided a good opportunity to make several bug fixes that mainly affect the "polish" of Aggie. For example, we now have better UX when a user attempts to add a channel to Aggie and something goes wrong.
A plea for testing
There's no chance all these changes took place without breaking something. While I've tried everything on my test feeds, and am constantly dog-fooding any change I make, I'm sure there are problems I've not caught. Please install the latest build and let me know if you see something wrong.
(For those who have no source access, or who do not want to build Aggie, I'd be happy to provide the necessary binaries and configuration files. Note that all changes are completely backward-compatible with respect to your configuration files, so you won't lose anything if you decide to switch back later.)
Thanks
|
|
|
 |
Saturday, January 04, 2003 |
Spinning Wheels |
Grotto11 writes:
In Windows, to play MP3s, you navigate through folders, find the files you want to listen to (however you've chosen to organize them), and double-click to open them in your MP3 player, which immediately plays them. On the Mac, you first have to find and open iTunes; thereafter, you work with the music on the music's own terms, using the music's own intrinsic attributes, which are intuitively obvious within minutes of a user seeing the program for the first time.
Hilarious! I still remember the time Apple trumpeted its document-centric UI as opposed to Microsoft's "dull" application-centric way of doing things.
|
|
|
 |
Sunday, December 15, 2002 |
The Product of Talent and Knowledge |
Joel Spolsky writes an excellent article about the importance of experience when programming.
Joel, a one-time Microsoft employee, doesn't give examples from the largest software company in the world. This is surprising, because traditionally Microsoft had a very strict recruiting policy: get the best people, regardless of whether they know anything about the problem or not. The underlying assumption is that knowledge can be taught, but talent cannot.
The consequences of this policy are evident in many Microsoft products. When a new product which is unrelated to previous Microsoft products is being designed, the people who work on the project are brilliant, but inexperienced in the problem domain. Often, developers bring their own mindset from previous projects they've worked on to newer projects, whether this mindset suits the problem or not. Therefore, it takes them a few trial-and-error phases to get things done right.
For example, look at Windows CE. Windows CE has a desktop mindset enforced on a realtime environment. It was really too big for the little devices available at the time. Despite Microsoft claims at the time, it was not a realtime OS when it came out. (Microsoft used a weird definition of realtime invented by the automobile industry, at a time when the latter didn't actually understand realtime.) As a result, it took several releases of Windows CE to become moderately successful.
(It's interesting to note that Windows CE, while too big for the little devices they tried to run it on, was exactly right for the embedded systems market, in particular the industrial and military markets; these markets really wanted to use Windows CE, but were almost completely ignored by Microsoft at the time, and a great opportunity was lost.)
Apparently, Microsoft has shifted its opinions over the last few years. This is good, because you really can't beat the combination of talent and domain knowledge. With the job market low as it currently is, there's an opportunity to hire good people, experienced in areas the company isn't.
(Note: I realize Joel is mainly speaking of tools and technologies experience, while I'm talking mainly of problem domain knowledge; you really need both, I claim, to be successful.)
|
|
|
 |
Saturday, December 14, 2002 |
Open Spectrum: A Global Pervasive Network |
Aaron Swartz wrote an article on LogicError titled Open Spectrum: A Global Pervasive Network. I'd like to address a few inaccuracies in his article, and then tackle the bigger issue.
Aaron writes:
How much information can we send over a radio station? In other words, what's the capacity of the spectrum? This would seem to be a very important question for anyone interested in radio, but the fact is we just don't know.
On the contrary: we know this quite well. Shannon proved his channel capacity theorem quite a long time ago. What we don't have, to this day, is the answer to an arguably more important question regarding the aggregate capacity of a network of N nodes.
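For reference, the single-link result is simple to state: for a channel of bandwidth B and signal-to-noise ratio SNR, the capacity is

  C = B \log_2\left(1 + \mathrm{SNR}\right)  bits per second.

Every individual radio link is bounded by this; the open question is what happens when many such links must share the air and relay for one another.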
The early models assumed that capacity was the same as the number of stations used. In other words, information rate was proportional to bandwidth.
Capacity, as defined by Shannon, has a very strict mathematical meaning. It would be more accurate to say that "the early models talked only about networks in which communication is end-to-end, with no intervening nodes in the middle".
(This misconception explains why most people call the speed of their Internet connection their "bandwidth".)
The reason people call their Internet connection "bandwidth" is that they don't understand what bandwidth means, and "information rate" would not sound as good. They are not far from wrong, BTW, because it *is* the phone network's bandwidth (restricted by filters intentionally put in by the phone company) that is the limiting factor. If you dial up via a modem to your ISP, the capacity theorem applies to you. Modems today are so good they have virtually reached the limit set by the capacity theorem; that's why, while all other computer equipment gets faster, 56K modems have been with us for a few years, and won't go away until the phone network is changed (and it won't be).
[...]
More research has found some other interesting results: What if we spread our communications across the spectrum? Capacity goes up. What if we spread them out across time? Capacity goes up. What happens if we have multiple paths to transmit? Capacity goes up. What happens if the transmitters move around? Capacity goes up. Every place research has looked, they've found that if they do that capacity goes up. And the research is far from done (sadly, because few people are doing it -- more on that later).
As much as I like this description of "everyone thought X, then somebody showed that Y", reality is not that simple. The article uses a few well-defined terms (capacity is chief among them), but treats them as if they mean something other than their definition. Such language, probably chosen to make the article understandable to the general public, is bound to be inaccurate, and to lead some readers to false conclusions.
Spread spectrum (CDMA) techniques do *not* increase capacity. Their power lies in the ability, unlike other multiple-access techniques (FDMA and TDMA), to dynamically achieve better quality when transmitters suddenly stop. While the older FDMA and TDMA techniques achieve efficiency by using central coordination, CDMA needs to do nothing to adapt.
Similarly, spreading across time gives you no extra capacity, pure and simple. In fact, one of the most striking results in information theory is that everything is tied to the ratio Eb/N0, where Eb is the energy you put into transmitting a bit and N0 is the noise level. It doesn't matter if you send a strong signal with a 50% duty cycle, or a 50% weaker signal with a 100% duty cycle. What counts is the energy you put into sending that bit.
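To make the duty-cycle point concrete: for binary signaling over a Gaussian channel, the error rate depends only on Eb/N0. For BPSK, for example,

  P_b = Q\left(\sqrt{2 E_b / N_0}\right),  with  E_b = P \cdot T_b

so halving the transmit power P while doubling the time T_b you spend on each bit leaves Eb, and therefore the error rate, unchanged.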
Spreading has other desirable attributes, such as its inherent "resistance" to duplicated echoes reaching the receiver. That's why multipath, a major source of headaches, can actually be used to advantage by smart CDMA receivers.
Capacity also does not go up when you have multiple paths. If it did, makers of microwave antenna dishes (which are built to have narrow beams) would go out of business.
It is, however, true to say that when the transmitter and the receiver have no directional communication means (that is, they are forced to use omni antennas, like your cellphone), multipath can be used to *realize* the capacity. Again, note that capacity is the absolute achievable maximum, and all the hardware and brainpower we throw at it can only help us achieve it, not increase it.
(As a sidenote, in an optimal network, you always know where the "other side" is and what spatial emission pattern is best, and you have a phased-array antenna to actually generate that signal. This makes multipath just another channel characteristic like, say, different noise levels at different frequencies, and makes the solution "simply" another type of coding, this time spatial coding.)
[...]
The case went to the Supreme Court who decided, based on the flawed but intuitive model above, that the FCC was necessary. Spectrum is limited, they were told, if everyone tries to speak, then no one will be able to. The FCC is required simply by the way that radio waves work.
But as we have seen, that's simply not true. What if the Supreme Court had known this? Would they have declared the FCC unconstitutional?
Actually, it was absolutely necessary, or chaos would have ensued. You see, collaborative networks, spread spectrum, and all the wonderful achievements of modern digital communications were simply not possible at the time. And even if they had been, radio stations (and the like) strongly depend on the economics of "thin client" models: you have sophisticated, expensive radio equipment at the broadcaster's location, and everyone can afford to buy cheap radio receivers. Building an AM radio receiver is a piece of cake. It costs almost nothing. Symmetric network communication equipment, however, is an altogether different thing. Only now are we reaching the price-performance point to even think of collaborative networks.
Don't get the impression I'm shooting down the idea. I'm not. What I don't like is the way the idea is presented. Having discussed the envelope, let us discuss the contents, and the idea is certainly worth discussion.
Aaron is entirely correct in saying that we can achieve better capacity, and the way to do it (as he correctly points out) is to use the "network effects". Node cooperation can improve aggregate capacity, somewhat analogous to partitioning a large LAN into smaller LANs to improve aggregate "bandwidth" (but only somewhat).
There are other technical issues with collaborative networks. For example, we don't have today the routing technology to make this happen. Another example would be the latency issues with such a network. I think, however, that these issues are big enough to punt to another time ;-)
I'm not at all opposed to having more bandwidth opened up for collaborative networks. I think it's a good idea with lots of potential. I also think that, when all's said and done, it won't be the basis of a global network.
The reason is social. For such a network to work, you need everyone's cooperation. Everyone must limit their transmission power. Your neighbors must agree to let you pass information through them. There must be no "bad guys" out there to jam your signal (make no mistake: it *is* possible to jam spread spectrum signals).
I don't see all this good will spontaneously happening. Is your web server being attacked today? Now suppose your internet connection is exposed to any bozo driving along with a strong-enough jammer. Who do you blame when you don't get the level of service you need? Collaboration looks like a nirvana: pay one-time for the equipment, and get Internet access for free. I don't believe in free lunches.
|
|
|
 |
Tuesday, November 26, 2002 |
My Pick for Best Invention of the Year |
SpeechView, a small Israeli company, has come up with what appears to be one of the coolest inventions over the last few years. It's a device that connects your phone with a computer, "listens" to what the other side says, and translates speech into facial expressions on the monitor that lip-readers can actually read.
An Israeli cellular company has started selling this solution today. If this works as advertised, it should be a revolution.
|
|
|
 |
Friday, November 08, 2002 |
A Nice Surprise |
For the first time in a month or so, Radio didn't crash when I shut it down. Oh, the joy...
|
|
|
 |
Saturday, November 02, 2002 |
Comments on Microsoft Trial Aftershocks |
As can be expected, there are a lot of funny aftershocks to the ruling at the Microsoft trial. Here are a few comments by a completely biased observer (buy more MSFT shares NOW!)
AP (through Yahoo Finance) says:
Sun Microsystems Inc., long one of Microsoft's harshest critics, urged the nine states that objected to the settlement to appeal the ruling by U.S. District Judge Colleen Kollar-Kotelly.
One of Microsoft's harshest critics? That's bad journalism. They are a rival (as the title correctly makes out).
"Choice, innovation and competition form the foundation of the technology industry," said Sun Microsystems attorney Michael Morris. "(Friday's) ruling does little to advance these principles or to protect the millions of developers and businesses that want an open marketplace."
That's funny coming from an attorney. How is that a legal argument? It isn't. We're talking politics here, folks.
Microsoft nearly drove Netscape out of business in the late 1990s when Microsoft melded its own Internet Explorer browser into the Windows operating system that controls more than 90 percent of all personal computers.
What drove Netscape out of business was their own bad decisions. While Microsoft's anticompetitive actions played, at least in part, a role in driving customers to use IE instead of NN, Netscape was not making money on NN. Its business model was to sell server software. This model failed, but not because of Microsoft. Truth is, Apache probably had more of a hand in choking Netscape than Microsoft did.
The chief executive of another Web browser maker, Opera Software in Oslo, Norway, also expressed disappointment with Kollar-Kotelly's decision.
"It isn't very much of a settlement at all," Jon von Tetzchner said. "Microsoft was found guilty. There were no real remedies, no actual punishment."
Opera is in a peculiar position, because it tries to make money selling a product everyone else is giving away for free. While NN fans (I was one!) might argue that IE won because of "illegal bundling", Opera cannot. The cry about the punishment is particularly strange -- Opera was not a party to this battle, so it can't expect to get something out of it. Say the ruling had been for Microsoft to pay a fine of 5bn dollars -- how would that have helped Opera?
Morris said Palo Alto-based Sun will continue to pursue its civil lawsuit against Microsoft so the company "does not continue to use its monopoly position to become the gatekeeper of the Internet."
Sun's position is also somewhat delicate here. Microsoft has been declared a monopoly in the desktop area. Sun has two major products it tries to push in this area -- its Office lookalike and Java. Let's focus on Java/J2EE. Can Sun really claim "illegal bundling" here? Sun was perfectly happy when Microsoft bundled a JVM with Windows, so it can't in good faith say that bundling a JVM is okay, but bundling .NET is not.
Reuters (again, via Yahoo Finance) says:
Santa Clara, California-based Sun filed suit in March seeking more than $1 billion in damages and claiming its business was damaged by Microsoft's abusive monopoly, which impeded the use of Sun's Java software platform.
Sun has no shame. Microsoft licensed Java, wrote a JVM (which many considered the best in the business), and bundled it with its browser, which it pushed as much as it could. How was that damaging to Sun? Sun argues that the non-portable extensions Microsoft put in Java impeded its use. That's nonsense. All of these were extensions, so one could simply ignore them if one wanted to. Can Sun show a single developer who didn't use Java because Microsoft's Java had a few extensions not endorsed by Sun (and conflicting with the license Microsoft had)? Of course they can't.
If there was anybody damaging Java, it was Sun's poor handling of this matter. Forcing Microsoft to use an ancient Java engine in Windows XP is a case in point.
Update (1): Den Beste covers some common legal misunderstandings related to the trial.
Update (2): Thanks, Chris.
|
|
|
 |
Thursday, October 17, 2002 |
Standard of Truth |
Disenchanted has an interesting article on falsehoods we're being taught are truths.
I always wondered about the belief that binocular vision is what allows us to have good depth perception. This looks like nothing more than a myth to me. Our brain has a lot of other inputs to rely on, far better than "measuring" the parallax between what each eye sees. For example, the strain on the eye muscles required to focus on an item is an indication of its distance when it's close; the item's motion against the background of farther items helps; the item's apparent size is a good indication when we can estimate its actual size, which explains why you can quite easily estimate the distance to a tree on a mountain when other methods are completely useless.
Here's a simple experiment you can try if you have two working eyes: close one of them now, make a ball of paper, and try to throw it into your paper basket. Was it harder to hit than when doing the same thing with both eyes open? It wasn't for me.
|
|
|
Kramnik vs Fritz |
If you're a chess fan, you surely cannot miss the battle between world champion Kramnik and his machine opponent Fritz.
|
|
|
 |
Friday, October 11, 2002 |
Who Put That NBSP In Here |
Don Park applauds getting two spaces (rather than one) in his posts, just like he wants:
Above two paragraphs were handcoded in HTML to show how bad it would have been. Great job, Dave!
Huh. What you applaud is a feature of the HTML editor in IE. When it sees you typed two spaces in a row, it encodes the first one as &nbsp; (a nonbreaking space).
|
|
|
 |
Friday, October 04, 2002 |
Outing |
Dare Obasanjo on Microsofties with weblogs:
This brings the number of Microsoft folks with blogs whom I've met or know personaly over five which I always thought would never happen. Interestingly enough, most of them are Web Services folks.
|
|
|
© Copyright 2005 Ziv Caspi.