Paul Holbrook's Radio Weblog : Worth $40 a year? You decide ..
Updated: 4/8/2003; 8:54:41 PM.

 


 
 

Monday, April 15, 2002
XML/SGML tools for Windows 2000/XP

Here's a useful page: SGML for NT describes how to get XML/SGML tools running under Windows NT/2k/XP.  It looks like a very thorough tutorial; I'll try it later.

 


8:19:44 PM
TextArc - showing a text visually

The New York Times has an article about a site called TextArc.org.  From the site:

A TextArc is a visual representation of a text - the entire text (twice!) on a single page. Some funny combination of an index, concordance, and summary, it uses the viewer's eye to help uncover meaning. A more detailed overview is available.

From the Times article:

The texts, which range from Lewis Carroll's "Alice's Adventures in Wonderland" to Balzac's "Z. Marcas," are too tiny to read around the perimeter. Behind the computer glass, though, Mr. Paley's online software is counting each word and noting its location every time it is used. The oval's black center soon fills with legibly larger versions of every word from the source text. Different stories look different. As a result, Mr. Paley's software effectively turns any prose into concrete poetry in which a word's size and location are as important to its meaning as how it is used.

Once TextArc slices and dices a story, the most frequently used words are the brightest. So in the Carroll work, "Alice" glows at the center. And each word's location in this linguistic constellation is determined by its exact locations in the story text. "Cheshire," for instance, is near the bottom, close to the two middle chapters in which the cat materializes. Roll the cursor over a word, and lines pop up that connect it to all the points in the outer circle where the word is used.

I thought the site would be overloaded - and perhaps it will be - but it was up when I tried it.  The effect is fascinating: I tried Edwin Abbott's Flatland, the late 19th-century fable about people who live in a land of two dimensions.  Words like flatland, circle, and women sit closer to the center; the word sphere sits off to one side, indicating that it's used mostly in one section of the book.

The web site gives no details of how this is implemented, but the actual application runs in Java on your own machine.
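Since the site gives no implementation details, what follows is only a guess at the core idea described in the Times article, written as a minimal Python sketch. It assumes each occurrence of a word sits at a point on an ellipse according to how far through the text it appears, and that a word is drawn at the average of its occurrence points, with its count standing in for brightness. The function name `textarc_layout` and the radii `rx` and `ry` are made up for illustration and are not TextArc's own code.

```python
import math
import re
from collections import defaultdict


def textarc_layout(text, rx=1.0, ry=0.6):
    """Guess at a TextArc-style layout (hypothetical, not TextArc's code).

    Returns {word: {"count": n, "x": ..., "y": ...}} where (x, y) is the
    average of the points on the ellipse at which the word occurs.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    points = defaultdict(list)
    for i, word in enumerate(words):
        # Position in the text maps to an angle; the text starts at the
        # top of the ellipse and runs clockwise, so the middle of the
        # text lands at the bottom.
        angle = 2 * math.pi * i / total
        points[word].append((rx * math.sin(angle), ry * math.cos(angle)))

    layout = {}
    for word, pts in points.items():
        x = sum(p[0] for p in pts) / len(pts)
        y = sum(p[1] for p in pts) / len(pts)
        layout[word] = {"count": len(pts), "x": x, "y": y}
    return layout
```

Under this assumption, a word used evenly throughout a book averages out near the center, while a word concentrated in one section is pulled toward that part of the rim, which matches what I saw with Flatland.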

When you launch TextArc on a text, it starts drawing it as a concentric spiral around the screen, and I'm guessing that during that time the application is actually downloading the full text of whatever you're looking at.  (The texts are taken from Project Gutenberg.)  Looking at the memory usage of Internet Explorer, it grew to 76 MB when I loaded Jane Austen's Pride and Prejudice.

TextArc claims that their application runs quicker in Netscape 6.2, and that may be true, but it crashed in NS 6.2, and didn't crash Internet Explorer.  I'll try Mozilla 0.99 next.


10:20:35 AM
Books referenced in WebLogs

The OnFocus weblog points out the Weblog Bookwatch Top 10:

I thought it would be interesting to see which books are being mentioned most frequently on weblogs. Weblog BookWatch keeps track of weblogs that flow through the recently changed list at weblogs.com and searches for links to Amazon.com. Then it looks at the ISBN in the link's URL, and counts the link as a mention of that book. The most frequently mentioned books show up on the Top 10 list, with references to the weblogs that mentioned them. It's only looking for books right now (not CDs or other products), and only looking for links to Amazon.com.

Thanks to weblogs.com for the great service (and for offering their list in XML). And to Amazon.com for the book info.
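The counting step described in that quote can be sketched in a few lines of Python. This is not BookWatch's actual code, just an illustration under stated assumptions: that a book link carries a 10-character ISBN right after `/ASIN/`, `/dp/`, or `/gp/product/` in an Amazon.com URL, and that anything else (including non-book product codes that start with a letter) is ignored. The names `ISBN_IN_URL` and `count_book_mentions` are hypothetical.

```python
import re
from collections import Counter

# Assumed URL shapes only: a 10-character ISBN (nine digits plus a digit
# or X) following /ASIN/, /dp/ or /gp/product/ on amazon.com.
ISBN_IN_URL = re.compile(
    r"amazon\.com/.*?(?:ASIN|dp|gp/product)/(\d{9}[\dXx])(?![\dA-Za-z])"
)


def count_book_mentions(links):
    """Count how many links point at each ISBN; non-matching links are skipped."""
    counts = Counter()
    for url in links:
        match = ISBN_IN_URL.search(url)
        if match:
            counts[match.group(1).upper()] += 1
    return counts
```

Given the links harvested from recently changed weblogs, `count_book_mentions(links).most_common(10)` would produce a Top 10 list of (ISBN, mentions) pairs.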


12:19:01 AM
Filemaker returns XML

This URL lets me get at a Filemaker DB and output it in XML:

http://127.0.0.1:8100/simple/fmpro?-db=books.fp5&-format=-dso_xml&-max=5&-find=&

Returns:

<?xml version="1.0" encoding="UTF-8" ?>
<FMPDSORESULT>
  <ERRORCODE>0</ERRORCODE>
  <DATABASE>books.fp5</DATABASE>
  <LAYOUT />
  <ROW MODID="0" RECORDID="1">
    <title>Physics of Baseball</title>
    <isbn>0-155-2558-225</isbn>
    <comment>This is a good book</comment>
  </ROW>
  <ROW MODID="0" RECORDID="2">
    <title>XTML: The best</title>
    <isbn>05-88554-48525-255</isbn>
    <comment>This is also very good</comment>
  </ROW>
  <ROW MODID="0" RECORDID="3">
    <title>The Stand</title>
    <isbn>78-4185-11</isbn>
    <comment>My books</comment>
  </ROW>
</FMPDSORESULT>

So I can use Filemaker to maintain my list of books, query it to get an XML file, and then use XSL or something else to format a nice-looking page that I can drop into this weblog.  I could also drop the XML itself into the Radio WWW directory and have it stream up to the server, for whatever that's worth.
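As one possible "something else", here is a minimal Python sketch that turns the Filemaker result into an HTML list using only the standard library. It is an illustration, not a tested workflow: `SAMPLE` is a trimmed copy of the output above with an assumed `<FMPDSORESULT>` root element, the function name `rows_to_html` is made up, and the real output may carry an XML namespace on the root that this sketch does not handle.

```python
import xml.etree.ElementTree as ET
from html import escape

# Trimmed copy of the Filemaker result; the root element is assumed.
SAMPLE = """<FMPDSORESULT>
  <ERRORCODE>0</ERRORCODE>
  <DATABASE>books.fp5</DATABASE>
  <LAYOUT />
  <ROW MODID="0" RECORDID="1">
    <title>Physics of Baseball</title>
    <isbn>0-155-2558-225</isbn>
    <comment>This is a good book</comment>
  </ROW>
</FMPDSORESULT>"""


def rows_to_html(xml_text):
    """Render each ROW as one <li>, escaping text so it is safe in HTML."""
    root = ET.fromstring(xml_text)
    items = []
    for row in root.iter("ROW"):
        title = escape(row.findtext("title", default=""))
        isbn = escape(row.findtext("isbn", default=""))
        comment = escape(row.findtext("comment", default=""))
        items.append(f"<li><b>{title}</b> ({isbn}): {comment}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(rows_to_html(SAMPLE))
```

The resulting fragment could be pasted into a weblog post or written to a file in the Radio WWW directory.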

And if there were a real DTD that I wanted to stay current with, I guess I could also use XSL to convert my Filemaker XML to that form.
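The same conversion could also be done without XSL. The Python sketch below maps each Filemaker ROW onto a different vocabulary; the target element names (`library`, `book`, `note`) are invented purely for illustration and do not come from any real DTD, and the function name `to_library_xml` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def to_library_xml(fm_xml):
    """Convert Filemaker ROW elements into a made-up <library> format."""
    source = ET.fromstring(fm_xml)
    library = ET.Element("library")
    for row in source.iter("ROW"):
        # The ISBN becomes an attribute; title and comment become children.
        book = ET.SubElement(
            library, "book", isbn=row.findtext("isbn", default="")
        )
        ET.SubElement(book, "title").text = row.findtext("title", default="")
        ET.SubElement(book, "note").text = row.findtext("comment", default="")
    return ET.tostring(library, encoding="unicode")
```

Swapping in a real DTD would mostly mean changing the element names and adding whatever fields that DTD requires.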

It all sounds great, but it means I have some work to do; I know most of these technologies by name only.


12:16:19 AM
XML DTD for lists for books?

I've been sitting here starting to try to build a page for books - books I'm reading, books I want to read, mini-reviews of books I've read.  I've spent the last 30 minutes struggling with the HTML to build a decent looking page, when it struck me: I shouldn't be putting my list in HTML, but in XML.  So: is there a DTD (and perhaps tools) for keeping a home library? I don't really need a full-up DTD of the type that might be useful for a real library. 

I have a feeling this kind of project would be a morass.  Ideally, I'd like to use some simple-to-use graphical database, such as Filemaker, and use it to generate XML.  There's a page about XML database products, and a related page about XML and databases, but it's not clear this would be an easy thing to do.

I'm feeling deja vu about this idea.  A couple of years back I wanted to be able to keep a list of projects in a simple database - Filemaker again, because it's very simple to work with - and then periodically publish that database to a web view.  At the time, I was stymied by something stupid: the inability of Filemaker to take a URL and render it on an HTML page as a clickable item.  I spent quite a bit of time messing with Filemaker and even MS Access before giving up on it.  I have a bad feeling that the state of XML tools might be such that the best tool would be something like Emacs.  I love Emacs, but it's not the best tool for managing lots of data.


12:12:49 AM

© Copyright 2003 Paul Holbrook.


