It's Like Déjà Vu All Over Again
"You could probably waste an entire day on the preceding links alone. But why take chances? We also give you Paul Snively..." — John Wiseman, lemonodor
snively on lisp and xslt. I'll take a chance and publish Paul Snively's email without asking.
I can take it down if I guessed wrong. But I think we're in sync.
After all, it's in response to my comments about Zooko's email.
And this looks suitable for public discussion. Okay, then let's go.
Note how it fits Raph Levien's "messaging without email" remark.
By all means! Unless explicitly disclaimed, anything I write is for public consumption. I only e-mailed until I discovered Sjoerd's scrape of your blog. Now we can engage in public correspondence like civilized geeks.
Aha! I checked Paul's weblog. He says someone scrapes my site.
That might explain the sudden jump in hits without any referers.
I was kinda avoiding the noisy hits from RSS on purpose. Oh well.
Dude, noisy hits were never more than a regex away!
And he did interview at Pivia. It pissed me off when Pivia said no.
I never gave him the full skinny on reasons. They seemed irrational.
Doubtful that they were any more irrational than I must have seemed to some of them.
It was two or three different things that didn't add up in my view.
I'd say hire him because he's damn smart, and gets system stuff.
I think any touch of an AI perspective frightened some interviewers.
Geez, and I was even careful not to say "AI." I just kind of expected, given the problem domain, that a self-correcting, adaptive architecture was not only obviously called for, but quite literally the only way to reach the goal. Still, yes, my inability to come up with the algorithm for Hopfield back-propagation or the Group Method of Data Handling or Analog Complexing probably hurt me. But that's why I keep lots of books close at hand: so that when I've identified a strategy, I can refresh my memory as to the tactics.
And thank you so much for the kind words.
I tend to hit folks with solid nuts-and-bolts details in my interviews.
Whatever big picture vision I reveal here gets little interview time.
Paul Snively:
I'm sure I've asked you this before, but given your stated want, how do
you like languages that are statically typed but where types are
inferred, such as Standard ML, Objective Caml, Haskell, Erlang...
basically the panoply of functional languages other than the Lisp
family? Since they also have Algol-inspired syntax vs. Lisp's parens, it
would seem they would tend to meet your redundancy goals in syntax as
well.
The only one I directly examined was Standard ML years ago.
I was trying to learn enough to follow Andrew Appel's stuff.
But ML proved painfully hard for me to grasp for one reason.
I hate pattern-based method typing. It's too damned implicit.
(Directly related to this, I also hate any pattern-based macros.)
I inferred that folks with a mathematical bent preferred such types.
It's a belief in magic. "More typing will protect you from harm."
This feels more like abdication of responsibility for correctness.
I don't think it's good enough to let the best fitting method fire.
"Best-fitting" makes the process sound non-deterministic, which it's not. But I suppose your point is that the process is sufficiently opaque that, in the hands of a careless ML programmer, it might as well be.
I think coders must easily, consistently know exactly what runs.
This is also one of the things I don't like about generic functions.
I like very loose runtime structure, but very predictable results.
Using types to bind methods implicitly seems dangerous to me.
And personally I find debugging hard without a total global view.
(When I know everything about a system, I debug very rapidly.)
While I can sympathize with this view, I have to say that I find the pattern-matching of ML, Haskell, et al. a huge help in dealing with recursively-defined data structures. So perhaps there's room for this degenerate case, where a compound data structure maps in some totally trivial way to some disjoint set of functions that handle the components of the data structure.
I like high level and higher order functional mapping semantics.
But I like to invoke them imperatively exactly according to plan.
Dang, I still haven't answered the basic question on static typing.
Yes, I like type inference, and I want to put this in Mithril later.
I haven't done that before, so it sounds non-boring at this time.
I suspected that this was the case. It's fascinating to me that so much of your writing about programming languages sounds to me like it recapitulates easily 95%+ of the functional paradigm, yet you seem more wary of functional languages than of, say, object-oriented languages, which I would strongly argue carry much worse implicit behavior characteristics than functional languages.
But thus I also haven't enough experience to know what's needed.
It would seem some static typing is needed to seed that process.
One can't infer much if no types are given, and all is dynamic.
I would very much prefer advisory rather than totalitarian types.
I suspect half measures might generate noise like C++ warnings.
Yes, it's important to get fundamental types, and the relationships among them, right in order for inferencing to be sound.
Paul Snively:
XSLT is quite well-defined by way of comparison to C, C++, et al., but
hasn't had the thirty years of shaking down that the Lisp family has, so
there are some definite rough edges that I expect to see mostly sanded
off by XSLT 2.0. It's interesting to note that the functional language
community is all over XSLT. Of particular interest to you might be the
material at <http://okmij.org/ftp/Scheme/xml.html>.
I've been a bit interested in writing XML parsers myself at times.
I've written some tree-style Lisp parsers, and they're a lot of fun.
But XML has gotten creepy with Unicode and namespace ambiguity.
Everything that obfuscates encoding seems like such a stupid move.
And the performance! Ugh. Pivia uses an awfully slow SAX parser.
It might be worth investigating alternatives. The whole point behind SAX is that there are scads of alternative implementations to choose from, if one particular one isn't suitable.
I haven't studied XSLT, but I'm put off by the kitchen sink smell.
I have a deep-seated fear XSLT's a committee design abomination.
If that's true, I wonder what feature subset might be least horrid.
It seems to me that the worst aspects of XSLT are dictated by the mere fact of it being sufficiently general to operate on document models based on structure as well as content. The upshot is that XSLT cannot operate on what some would call a "stream" and others would call a "forward iterator"; it can only operate on a random-access iterator. This can have dramatic consequences for memory consumption and, hence, performance. There are several moves afoot to define a workable subset of XPath and XSLT that is amenable to streaming operation.
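The stream-versus-random-access distinction can be illustrated outside XSLT. The sketch below is an analogy only, using Python's standard `xml.etree.ElementTree` module rather than any XSLT processor, and the `<log>`/`<hit>` document is made up. One function builds the whole tree before querying it. The other sees each element once, in document order, and discards it.

```python
import io
import xml.etree.ElementTree as ET

# A small made-up document: <log><hit n='0'/>...<hit n='4'/></log>
DOC = b"<log>" + b"".join(b"<hit n='%d'/>" % i for i in range(5)) + b"</log>"


def count_hits_tree(data: bytes) -> int:
    """Random-access style: materialize the whole tree, then query it."""
    root = ET.fromstring(data)
    return len(root.findall("hit"))


def count_hits_stream(data: bytes) -> int:
    """Forward-only style: handle each element as it closes, then drop it."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "hit":
            count += 1
        elem.clear()  # discard the element's contents once handled
    return count
```

Both functions give the same count here, but only the first needs the entire document in memory at once. A query that looks backward or sideways in the tree has no equivalent in the second style, which is roughly why a streaming subset has to restrict what expressions are allowed.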
I don't trust technology that's a big pile of cobbled baling wire.
So I'd only use someone else's XSLT tools if I could audit it all.
By definition, I can't audit a big pile of crap made by many folks.
And that, essentially, is exactly what's wrong with such things.
I gather XSLT has been characterized as functional programming.
I'm not surprised the functional programming folks are all over it.
And like most such things, there are good and bad aspects to this. The good tend to involve using XSLT to automatically generate a specialized XSLT stylesheet for use farther down the transformation chain; the bad are more verbose versions of the typically-recondite functional programs for such pursuits as quining themselves—great academic CS fun, but very nearly as torturous as programming in Unlambda.
Are there any short introductions? Do I need to read everything?
To what extent does bad definition in XML make XSLT also suck?
Parts of XML seem defined empirically as whatever current tools do.
Are XML namespaces really broken? And does this ruin XSLT code?
Nothing seems better for rotten code than symbol binding ambiguity.
Is it close enough to right that we can just dredge out the worst crap?
In theory, virtually all of XSLT's suckage, such as it is, is inherited from XML. In practice, I haven't encountered much suckage, and what I have is on the table for XSLT 2.0 and generally already dealt with in real-world XSLT implementations. In particular, XSLT prior to 2.0 had too many instances in which operations such as variable definition would result in the variable containing a tree-fragment rather than a node-set, tree-fragments not being susceptible to much further processing. Practically all XSLT processors have a non-standard function to coerce a tree-fragment to a node-set; current SAXON releases go so far as to do so implicitly.
Paul Snively:
I would go so far as to say that thanks to XML and XSLT, Lisp already
has come back around; the only question is how explicit you are about
it, by which I mean that you can either code XML processing in a
combination of continuation-passing and trampolined style in C, C++,
Java... or you can just write in Lisp, Haskell, OCaml, Oz... and get it
over with, already.
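For readers who haven't met the terms, here is a minimal sketch of continuation-passing plus trampolined style, in Python for brevity. The nested-list tree and the `count_nodes` and `trampoline` names are assumptions for illustration, not anything from an actual XML library. Every step returns a thunk instead of making a nested call, and a small loop (the trampoline) runs the thunks, so stack depth stays constant however deep the document is.

```python
def trampoline(step):
    """Keep calling thunks until something that is not callable comes back."""
    while callable(step):
        step = step()
    return step


def count_nodes(tree, k):
    """Count the nodes of a nested-list tree in continuation-passing style.

    `tree` is a list of child trees (a stand-in for an element and its
    children); `k` is the continuation that receives the final count.
    Every path returns a thunk rather than recursing directly.
    """
    def visit(children, acc):
        if not children:
            return lambda: k(acc)
        first, rest = children[0], children[1:]
        # Count the first child, then continue with its siblings.
        return lambda: count_nodes(first, lambda n: visit(rest, acc + n))

    return visit(tree, 1)  # start at 1 to count this node itself


def count(tree):
    """Convenience wrapper: run the CPS computation on the trampoline."""
    return trampoline(count_nodes(tree, lambda n: n))
```

The pending work lives in heap-allocated closures instead of stack frames, which is the bookkeeping that a language with first-class continuations or guaranteed tail calls does for you.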
Sadly I haven't gotten my trampolined-style C++ into Mithril yet.
I don't know whether I can trust any existing low level C/C++ stuff.
So if I was dumped cold in XSLT, I'd want a Lisp family life saver.
But I'm pretty sure Pivia prefers my usual blazing fast C++ miracles.
However, it's too big a bite for me to chew. I need to scope it all out.
I need to start paying more attention to your code, as I keep thinking of more and more things that could almost certainly benefit even from what you might think of as your "pedagogical, naïve, not production-quality" work.
7:52:28 PM
Adam Bosworth: "RPC suggests that it is okay to automatically map the parameters or return type into or from XML messages. It isn't. That is a private implementation detail. Everyone's implementations will vary and all implementations will vary over time. RPC also implies that the caller knows the signature and classes of the receiver. In fact, it is a miracle if the one application's classes and parameter order happen to match another's. In the real world, every implementation will have its own classes." Finally it's clear what we've been debating and why we disagree. In my model of loosely-coupled apps, there is no variability allowed in the places Bosworth says he must allow it. If you want to implement the Google API, you must implement the same method names, and they must take the same parameters and return equivalent results (the search databases are different in different search engines). We went through this with the Blogger API, and it worked fine. I don't see the value in allowing variability, because you trade that off against complexity, too high a cost, too little gain. I think the world of Adam, but I think he's advocating the wrong approach. And it's good to get the issues aired and clear. [Scripting News]
Hmmm. Where to begin? The most important thing to understand here is that Adam and Dave are talking about two different things that are often used to accomplish the same result. In this instance, the two things are Remote Procedure Call (RPC) and Message Passing, or Message-Oriented Middleware (MOM). The context for the discussion is the creation of loosely-coupled systems.
I want to emphasize in this discussion that there are degrees of coupling. I believe the principal reason Dave disagrees with Adam is that each implicitly draws a line in the sand about where in an architecture something we'd all call an "API" is needed, and about whether that line sits too far in one direction or the other for the desired degree of coupling.
Let me try to be more concrete. Google wishes to make their searching capability available programmatically. Somewhere in their software there is already a function that, given a string, returns a list of URLs sorted in descending order of relevance. One way—arguably the one most familiar to most programmers, both inside and outside of Google—is to make this exact function available by a process that involves serializing its name, its parameters, its return type, essentially all of the language-level stuff that's necessary to distinguish that function from all other functions in their software, allowing external sources to connect to some port, waiting for a bytestream that contains exactly this information, calling the function thus named, and shipping the result back over the connection. This is RPC in a nutshell. It's nice because it so strongly resembles what we all do when we call a function locally.
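A rough sketch of that round trip, in Python with JSON as the serialization. Everything in it is assumed for illustration: the `search` stand-in, the tiny corpus, the `PROCEDURES` table, and the wire format are invented, and the "connection" is just a direct function call rather than a socket. It resembles neither Google's API nor XML-RPC in detail.

```python
import json


def search(query: str) -> list:
    """Stand-in for the server's existing search function (hypothetical)."""
    corpus = {
        "lisp": ["http://example.com/lisp"],
        "xslt": ["http://example.com/xslt"],
    }
    return corpus.get(query, [])


# The server exposes functions by their exact names.
PROCEDURES = {"search": search}


def encode_call(name: str, *params) -> bytes:
    """Client side: serialize the function name and its parameters."""
    return json.dumps({"method": name, "params": list(params)}).encode("utf-8")


def handle_request(raw: bytes) -> bytes:
    """Server side: decode the bytestream, call the named function, ship the result."""
    request = json.loads(raw.decode("utf-8"))
    func = PROCEDURES[request["method"]]
    result = func(*request["params"])
    return json.dumps({"result": result}).encode("utf-8")


def call_remote(name: str, *params):
    """Client stub: looks like a local call, hides the serialization."""
    reply = handle_request(encode_call(name, *params))
    return json.loads(reply.decode("utf-8"))["result"]
```

Note what the caller has to know: the method name `"search"`, the number and order of parameters, and the shape of the return value. That shared knowledge is the coupling under discussion.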
The alternative is to create a message that indicates a desire for search results for a certain string, and send the message. The message could be sent to a particular server, or sent to a "cloud" for any interested party to pick up. Whoever does pick it up can interpret the message however it likes, and respond however it likes. In this case, the message should probably contain something to the effect of "please leave the result <here>" and the responder can then send the result in a message to "<here>." Note that in this example, neither the sender nor the responder has said one word about function names or parameters or return types. With this architecture, it's possible for the sender not to even know who's responding! Note also that the sender and responder are not coupled in time: the sender can send their message and go on their merry way, reacting to the reply message whenever it arrives.
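The message-passing version can be sketched the same way, again in Python and again with invented names: the in-memory `Broker`, the `"search-requests"` and `"inbox-42"` destinations, and the message fields are all assumptions, standing in for what a real product such as a JMS provider would supply.

```python
import queue


class Broker:
    """In-memory stand-in for the message "cloud": just named queues."""

    def __init__(self):
        self.queues = {}

    def send(self, destination: str, message: dict) -> None:
        self.queues.setdefault(destination, queue.Queue()).put(message)

    def receive(self, destination: str):
        """Return the next message at a destination, or None if there is none."""
        try:
            return self.queues.setdefault(destination, queue.Queue()).get_nowait()
        except queue.Empty:
            return None


def responder(broker: Broker) -> None:
    """Any interested party: picks up a request and interprets it as it likes."""
    message = broker.receive("search-requests")
    if message is not None:
        reply = {
            "about": message["want"],
            "urls": ["http://example.com/" + message["want"]],
        }
        broker.send(message["reply_to"], reply)  # leave the result "<here>"


def demo():
    broker = Broker()
    # The sender states what it wants and where to leave the answer.
    broker.send("search-requests", {"want": "lisp", "reply_to": "inbox-42"})
    before = broker.receive("inbox-42")  # nobody has answered yet
    responder(broker)                    # some responder runs, whenever
    after = broker.receive("inbox-42")
    return before, after
```

The sender never names a function or a server, and it can check its inbox whenever it likes. The cost is that both sides still have to agree on what a message means, so the coupling moves into the message vocabulary rather than disappearing.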
Remember what I said about degrees of coupling? My descriptions so far may make it seem like MOM is loosely-coupled and RPC is not, but that would be another example of a falsely-binary distinction: the tool I'm using to write this, Radio Userland, is an excellent example of an RPC-based system that is plenty loosely-coupled enough for its (and my!) purposes, given that it works with Frontier/Manila, Blogger, and Google. It's important to acknowledge and respect this.
The MOM approach does exhibit more dimensions of looseness than RPC, and I'm satisfied that that's a primary reason why MOM architectures, built around tools such as MQSeries, Tibco, and the like, have been successful in enterprises where RPC-based technology such as CORBA has failed. At my employer, we are in the process of moving from a procedure-call/shared-database architecture to a MOM architecture using JMS. It will be interesting to see how the new architecture compares to the old.
Finally, I note that Adam works for BEA, a large-scale enterprise app server vendor. I think it's important, in these discussions, to know your audience. Dave is addressing writers on the web; Adam is addressing CTOs and developers for airline reservation systems, trading floors... we can reasonably expect the requirements of these audiences to differ, and hence the relative prioritization of inherently fuzzy goals, such as "looseness" and "familiarity," to differ as well.
Peace to both sides of the non-debate, I say.
6:57:23 PM
Hal Plotkin: "Targeting a handful of specific lawmakers for defeat makes a lot more sense than putting a bunch of geeks on planes." [Scripting News]
Amen, brother. Since most of us geeks who stand to be affected by this legislative bullshit live and work in California, why not band together to ensure that it's Senator Feinstein's ass on the pavement? That would be so deeply satisfying.
6:22:02 PM
One disadvantage of the reverse-chronological-order structure of these weblogs is that it's not clear what "camp" is being referred to here, so please continue on to the post about Adam Bosworth and web services, and I'll just note here that I feel that Steve and Dave, both of whom I respect tremendously, have missed Adam's point and are talking past him. Having said that, if I must use the taxonomy as it has been defined thus far, I have to agree with Adam and disagree with Steve and Dave. More about this in a later post.
6:13:25 PM
Wired: "Joining Hollings as co-sponsors of the CBDTPA are one Republican and four Democrats: Ted Stevens (R-Alaska), Daniel Inouye (D-Hawaii), John Breaux (D-Louisiana) and Dianne Feinstein (D-California)." [Scripting News]
It's good to be reminded periodically of whom we need to remove from office at all costs, as rather than defending our future, they are selling it out from beneath us.
6:05:10 PM
If what you need is an XSLT processor that isn't part of some larger system, then SAXON is the way to go for either command-line use or Java API use.
6:00:12 PM
Wow, there's so much to write about. Let me just spit stuff out in no particular order, as I don't have time to do much more than recapitulate:
Speaking of David McCusker, I guess I can say now, some months after the fact, that I interviewed at Pivia, where he works, and they turned me down. They were probably right to do so, but I do want to get back to Silicon Valley some day and David's one of the major reasons for that.
I registered the coolest Mac OS X shareware tool recently, LaunchBar. Briefly, there's a hotkey that activates it, at which point you type two or three characters to identify something to launch: an app, a document, an e-mail address, a URL... and in best Peter-Norvig-adaptive-software form, you only have to correct ambiguities once or twice before the adjustment takes place automagically. Sound simple, even simplistic? It is—which is one of the major reasons that it's also brilliant.
Apparently there was a new Tk snapshot for Mac OS X released on January 31st. Unfortunately, it breaks the binary package for Oz for Mac OS X, and I haven't had the temerity, given my recent travails, to attempt to rebuild Oz from source.
As noted earlier, OpenCyc shipped. Given that it's only for Linux at this point, I'm not as excited as I would be otherwise. Besides, Cyc's inferencing sounds too scruffy to me these days, after exposure to John Pollock and OSCAR.
Once my Windows gag reflex abates, I realize that the OQO is probably the best device so far to tackle the hardware side of The Digital Path.
Looking around for a decent shared calendar/to-do list for the Mac really reveals only Chronos Group Organizer and Now Up-To-Date and Contact. Everything else is corporate overkill. So naturally I find myself wondering how long developing a good shared calendar/to-do list program in Cocoa would take, especially since vCard and iCalendar are out there, and there are great launching-off infrastructure points like the C++ Internet Server Framework and e4Graph to build on.
As I think about calendars, to-do lists, and excellent software like LaunchBar, I begin to wonder if the desktop isn't now an underutilized resource. I think about Dave Winer asking why Google can't index his desktop, Jon Udell saying that there will be a semantic web, and tools like e4Graph, and I wonder if there isn't an opportunity there. Then I realize that I've probably just reinvented SixDegrees, and badly.
ICS-FORTH's RDFSuite is rockin', but their RQL interpreter needs significant optimization. The C++ source is virtually impenetrable. Thankfully their EBNF grammar, type system, and semantics are pretty clear, so I'm thinking that Spirit and Phoenix would make an excellent launching-off point.
Man, I wish I didn't feel Seppuku had to be cross-platform! Coin3D, an Open Inventor 2.1 clone, recently released a sample showing how to integrate Coin3D into Cocoa. But I already looked, and there isn't a good Constructive Solid Geometry action/operation extension for Coin3D, so I'd need to develop that anyway. So I guess it's back to Whisper 2 and Quesa for Seppuku. No regrets; Quesa is shaping up very nicely, and as soon as I get JadeTeX built, I'll help document Whisper.
I still can't build JadeTeX, and I'm not even sure where to post to ask about it. That bothers me.
Sometimes this referers thing is good. I'm extremely flattered to be blogrolled from Daniel Ericsson's WebTransmission, both in personal and Seppuku form, and wow: on SaladWithSteve I'm in a list of "elpoep" that includes John Wiseman, Joel Spolsky, David McCusker, John Carmack, Tom Tomorrow, Dave Winer, and Justin Hall. I'd best get off my ass and do something to earn such stature!
And I still have to make time to learn Oz.
And I still have to make time to do Python examples for the 2nd ed. of AIMA. But it's already becoming clear to me that my heart's not in Python despite the recommendations of several folks I respect greatly.