Friday, August 29, 2003
Small World Dept. -- SCO and MS Have a Mutual Friend
There's more than one way to fund a company, it seems. An alert reader noticed that the Integral Capital Management companies just filed a 13G with the SEC regarding their shares in SCO. Here's their SEC filing. Here's the SEC Search page for Integral. Here's a list of institutional owners of SCO, and you can see that as of June, Integral was already number one on the list with 4.0%. Now it's presumably 5% or more, hence the 13G.
Who are these people? I thought I'd just check and see who thinks SCO is worth buying right along about now. I was particularly interested because last Friday I noticed a surge in buying, and all week there was a spike of institutional sell messages, and then this Friday, it suddenly stopped. No sell or buy messages, as you can see on this chart. If you switch the chart to the entire month, you can easily see that after this huge and atypical increase, suddenly nothing. So I went digging.
At first, there didn't seem much to find. Yahoo lists Integral as one of the "Top Institutional Holders" of Drugstore.com, the top holder, as a matter of fact. And you can read the stock purchase agreement the parties signed on Findlaw. Here's how they describe themselves in the SEC filing:
"This statement is being filed by Integral Capital Management V, LLC, a Delaware limited liability company ('ICM5'), ICP Management V, LLC, a Delaware limited liability company ('ICP Management 5') and Integral Capital Management VI, LLC, a Delaware limited liability company ('ICM6'). The principal business address of ICM5, ICP Management 5 and ICM6 is 3000 Sand Hill Road, Building 3, Suite 240, Menlo Park, California 94025.
"ICM5 is the general partner of Integral Capital Partners V, L.P., a Delaware limited partnership ('ICP5'). ICP Management 5 is the general partner of Integral Capital Partners V Side Fund, L.P. ("Side Fund") and the Manager of Integral Capital Partners V SLP Side Fund, LLC ('SLP Side Fund'). ICM6 is the general partner of Integral Capital Partners VI, L.P., a Delaware limited partnership ('ICP6'). With respect to ICM5, ICP Management 5 and ICM6, this statement relates only to ICM5's, ICP Management 5's and ICM6's indirect, beneficial ownership of shares of Common Stock of the Issuer (the 'Shares'). The Shares have been purchased by ICP5, Side Fund, SLP Side Fund and ICP6, and none of ICM5, ICP Management 5 or ICM6 directly or otherwise hold any Shares.
"Management of the business affairs of ICM5, ICP Management 5 and ICM6, including decisions respecting disposition and/or voting of the Shares, resides in a majority of the managers of ICM5, ICP Management 5 and ICM6, respectively, such that no single manager of ICM5, ICP Management 5, or ICM6 has voting and/or dispositive power of the Shares."
Intricate, no? Why, I asked myself, would they be buying now? Google didn't have much to show.
Just when I thought I'd hit a dead end, I started to read Drugstore.com's 10Q, their most recent one, just filed this month. Because this is another company Integral has invested in heavily, maybe I'd find something there, I thought.
And lo and behold, guess who was just elected a director of Drugstore.com? Melinda French Gates. Yes, that Mrs. Gates. Well, that got my attention. I switched to some other search engines and really started digging, and I went to the SEC to see what I could find.
As of last December, Integral Capital Management V didn't own any SCO stock, according to this SEC filing. They did own Microsoft stock back in November. But they didn't in May of 2002. So the investment timeline appears to go like this: First, they invested in Drugstore.com, then in Microsoft, and then in SCO.
Small world, isn't it? But why? A venture capital firm is investing in Microsoft? Doesn't it seem like it should be the other way around?
Here's a description of Integral on VC Directory:
"20 portfolio companies brought to successful IPO 1997 - 99. This venture capitalist was incubated within Kleiner Perkins Caufield & Byers in 1991. Integral's partnerships are five years in duration. Integral IV began in March of 1998 with over $300 million of capital and the company currently has over $1.2 billion in capital. In 1999, Integral Capital Partners co-founded and now operates as a managing principal of Silver Lake Partners, a $2.2 billion buyout fund focused on technology and related growth businesses.
"Focus: Integral invests in technology businesses at all stages of company development beyond the start-up phase. . . .
"Integral typically invests between 20% and 50% of the original capital of each fund in private companies and buyout opportunities. Integral is an active investor, both in the venture and public stages. Their greatest contribution typically relates to corporate strategy, business development opportunities, and maximizing market value (either through an IPO or merger). Integral does not take board seats. Integral Capital Partners 2750 Sand Hill Rd., Menlo Park, CA 94025 . . ."
Here's their home page.
Think this is just a coincidence? Could be, but take a look at this article from June 6, 2003, entitled "Best friends -- VC buddies do many of their deals with pals", which explains that in a tough economic market, VC firms like Integral do business primarily with their friends, as Integral's Roger McNamee explains:
"'People's time is limited,' said McNamee. Quoting a favorite saying in the VC biz, he added: 'When you have a choice, always do business with your friends. When you don't have a choice, your friends are the only ones who will do business with you.'"
There goes the coincidence theory. Well, they appear to know each other, according to this Fast Company article linked to on Integral's web site:
"As early as 1997, he says, 'I realized that we had escaped the earth's gravitational pull -- that we were in the midst of a true mania. The next question was, What do you do after you crash and burn? You need a strategy for investing in a long-term bear market.' So, along with a set of superstar partners, McNamee assembled the first large-scale private-equity fund focused on tech. Silver Lake Partners, which launched in May 1999, raised $2.3 billion in a matter of months, attracting a who's who of Silicon Valley and Wall Street, including Bill Gates, Michael Dell, Larry Ellison, major investment banks, and big institutions such as CalPERS and the World Bank. In the four years since its launch, Silver Lake has invested approximately $1.6 billion of the fund in nine megatransactions -- which included involvement in a landmark $20 billion leveraged buyout of legendary disk-drive maker Seagate in late 2000 and that company's IPO two years later."
So there you have it, folks. And what's the plan? Here's their investment strategy page:
"The Internet drives all three of Integral's major information technology investment themes. Cheaper computing power, more bandwidth at lower prices, and server-centric software are leading to a world of information-rich Internet services that empower decision-makers with accurate, real-time information. By utilizing the exponential forces of Moore's law and Metcalfe's law, the Internet encourages rapid development of low-cost business relationships. The Internet also liberates emerging competitors by allowing them to undermine traditional information control, price discrimination, customer relationships, and distribution channels. The Internet is delivering both buyers and suppliers better information and lower cost transactions."
This seems to dovetail with SCO's new web services model, and it also ties in with my long-held belief that the plan was to dump the GPL off a cliff, write a new kernel for UNIX, which will also do Windows, and then steal all the open source software applications they can find and let you run them as binaries on top of their kernel. In a word, yuck. Brand X Linux, for which you will pay a pretty penny, my friend. It seems there is a plan, and it looks to me now like Microsoft may really be part of it.
Data Mining, Spectral Analysis, and All that Jazz
I was not born knowing what spectral analysis is. So when SCO said that spectral analysis is
one of the methods they used to find "infringing" code, I had no idea
what they were talking about. When Sontag compared it to finding a
needle in a pile of needles, I figured it wasn't much use. And it turns
out, on further investigation, that my intuitive conclusion may be about
right, at least when it comes to using it for software code data mining
for infringing code in this case.
An alert reader noticed something
interesting. One of Canopy Group's companies is called DataCrystal.
Could that be at least one of the three groups SCO hired to try to sort
through the code of UNIX SystemV and the Linux kernel? DataCrystal
does "advanced pattern recognition" and "AI systems". They actually claim to do a
great deal more besides. One of the things listed on their what-we-do
page is data mining. Presumably, that's what SCO wanted to do. And a
look at their About page indicates that if
you are the RIAA, you probably would want to have a company like DataCrystal to hunt down pirates for you. Another reader noticed this page
about a DataCrystal, and he
wondered if it might be the same company.
It isn't, because this DataCrystal
is the name of a project at USC, not a company. While I don't know if
the Canopy Group company DataCrystal was hired by SCO, or whether there is any connection between the company and the project, it did make me start to wonder about the field in general. If you really wanted to know whether two piles of code had identical or similar code, could data mining find out? And would matches
be reliable for use in the way SCO apparently is using them? Judging by
the SCOForum demo, we might think no. And we might be right.
I asked a
Groklaw resource person, a man who worked for over a decade doing basic and exploratory
research for the US DoD and the Canadian Ministry of Defence on topics related
to secure communications and signals intelligence, including cryptology,
statistical processing of natural language, signal processing, and
computational learning, if he'd be willing to explain it in general and understandable
terms, so we can follow along. Very likely this subject is going to be
a very significant part of the case when it goes to trial. Here is what
he explained to me:
"Data mining is looking for
patterns or similarities in large quantities of information. Google is a
good example of data mining-on-demand where the pattern is supplied by
the user and the large quantity of information is the entire set of
webpages on the internet. But data mining in general is potentially much
broader. For example, a typical data-mining task might be to take a
sample document and look for other documents in a database that might be
similar to it. But even beyond that, data mining can be applied to other
kinds of data -- pictures, for example, or sound
"There are lots of different ways to approach
problems like this. Beyond the most elementary, what all the techniques
have in common is that they rely on mathematical models and
transformations of the data. Part of the reason is efficiency, since
turning the problem into math usually means there's a computationally
clever way to do it. Another part of the reason is that, by transforming
the problem into math, you make it possible to find and grade a
continuum of approximate matches -- in short, to find ranges of
similarities rather than just identities. Note very well that
'similarity' here is completely dependent on the particular flavor of
math you've chosen as your technique. This is extremely important.
"OK, so you've taken your document or picture or
whatever, and you've mined your database for similar items. Those items
will be graded for similarity to your original, just as some search
engines will rate their returned items in terms of probable pertinence.
The most sophisticated and respectable data-mining systems will be using
grades based on probabilities. This is because the underlying math will
be using probability models. Many times the grade will reflect not
merely the strength of a match in terms of probability, but also the
likelihood that such a match would be found at random searching any old
data at all. This also is extremely important, since 'any old data at
all' can be subject to a wide range of interpretations. (This could
be pertinent in the SCO case, since, if data-mining techniques are used,
it's a reasonable question whether any contamination discovered this way
is real, or whether it's spurious, i.e., capable of being found to the same
degree in other, unrelated data.)
"Now the DataCrystal webpage
consists mostly of a laundry list of any and all of the subjects ever
associated with data mining, artificial intelligence, knowledge
discovery, or machine learning. But the .pdf white papers all focus on
using data-mining techniques for indexing and retrieving digital video
and audio. What's more, they're offering not just indexing and retrieval
services, but also housing, protecting, and distributing the data
"It outlines an enhanced technique for expanding
data-mining coverage. It's a technique for building patterns out of
patterns and data mining on the derived metapatterns in
Not being a rocket scientist, I wanted to be sure I'd understood, so I wrote back
and asked these followup questions, and got this reply:
I have two questions to follow up:
" 1. . .the results would depend on
how you programmed the software? In other words...it can look for
similarities, but it can't evaluate them?"
"Q:..there might be in actuality no common code at
"You know how Google sometimes matches all the words in
your query, but not necessarily conjointly or in the same order?
"In the case of computer code, especially code written in C expressing
similar or common algorithms, it would be astounding if there weren't
pattern similarities at some level. If nothing else such things are
enforced by the design of the language and commonly-held notions about
good coding style."
"Q: ...it simply would have to be the case
that some of the code is close enough that they might have a
"ANS: Just the contrary. As with the 1st slide example,
the ancestry of that memory-management code is known to virtually
anybody who's studied C from Kernighan and Ritchie's book. A similarity
like that would stand out like Devil's Tower, but what it indicates is
exactly the opposite of what they contend: it shows that everybody knows
"Q: And can they program the math to increase
"matches"? Pls. explain a bit more this part."
"ANS: Here's an
example. Suppose you came up with a hitherto-unknown page of blank
verse. The question is, was it written by Shakespeare or not?
"Data mining your way through that problem, you'd get one level of
certainty if your database contained the Bible, Goethe, Racine, Pushkin,
and the New York Times. You'd get a different level of certainty if your
database were confined to Elizabethan dramatists. The scores for
putative Shakespeare against the mixed database would probably be huge
just for matching any English. The scores against Elizabethan dramatists
would probably be quite a bit weaker, but clearly more
conclusive. The mixed-database test -- the one with the Bible, Goethe,
etc. -- will probably say 'Shakespeare indeed!' but it's
expressing the idea that 'if it's English it's Shakespeare.'
On the other hand, the Elizabethan dramatist test might say
yes, might say no, but the answer will be based on such
things as a small number of very subtle differences between,
say, Shakespeare's and Marlowe's vocabulary. It expresses
perhaps the idea that 'in any 1000-line chunk of Shakespeare
and any 1000-line chunk of Marlowe, Shakespeare is likely to
use the word 'ope' once and Marlowe not at all. This example
doesn't use 'ope' at all, therefore it's probably Marlowe.'
You can see it's still a matter of interpretation and
probability, but the second test is simply more credible on
grounds that are external to the data-mining method itself.
Here's another point of view. How does a data-mining
search for SVr4 code look if you run it against all C programs? In all
likelihood you're going to find some matches. Are the matches
against Linux actually any stronger than matches against an arbitrary
body of C code? Against other Unix-like kernels? etc.
These are interpretive issues, but there are statistical grounds for
deciding them, and speaking strictly for myself, I seriously doubt
they've been fielded satisfactorily. For my money you couldn't even
start taking the matter seriously unless exactly the same tests were run
against every other body of kernel code, like all the BSDs, and a
chunk of the SVr4 kernel against the rest of that same kernel. And
even then, you've only generated the raw information to start the
business of verifying and refining the procedure."
"Q: And what is spectral analysis? Is that what this is?"
"ANS: In general, spectral analysis refers to breaking things down into component
frequencies -- sort of like how a prism breaks white light into colors,
and so on.
"In this case it refers to using the periodicities of
the individual characters of program text as frequencies to look for a
very specific set of 'colors' associated with a particular swatch of
program code. It's not determinative either. It may also refer to a kind
of computational trick using spectral-based techniques to look for
certain kinds of approximate matches very quickly."
So there you have it. At least now we know in general what they are
talking about. As the case goes forward, and more is revealed, no doubt
it'll be interesting to be able to follow along meaningfully.
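For readers who like to see things in code, here is a rough Python sketch of the first idea my friend describes, treating the positions of one character in a text as a signal and looking for its strongest period. It is only my assumption about the general approach, not SCO's actual method, which has not been disclosed. The names `char_spectrum` and `strongest_period` are hypothetical, and the test string is contrived so that the answer is obvious.

```python
import cmath

def char_spectrum(text, ch):
    """Magnitudes of the discrete Fourier transform of a 0/1 signal
    that marks where the character `ch` occurs in `text`."""
    signal = [1.0 if c == ch else 0.0 for c in text]
    n = len(signal)
    mags = []
    # Only frequencies 0 .. n/2 are needed for a real-valued signal.
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(total))
    return mags

def strongest_period(text, ch):
    """Period, in characters, of the strongest non-constant frequency."""
    mags = char_spectrum(text, ch)
    # Skip k = 0, which only counts how often the character occurs.
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return len(text) / k

# Contrived input: the pattern "aabb" repeats every 4 characters.
text = "aabb" * 8
print(strongest_period(text, "a"))
```

Real program text is nowhere near this regular, which is part of why a spectral "match" still needs a human to decide what, if anything, it means.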
His analogy to Google made it all come clearer to me. On top of all that he
wrote, I know with Google, input affects output. And input means humans,
imperfect humans. I certainly know that I get different results from
Google if I plug in the identical search terms, but in a different
order, for example. So I totally get how results could be skewed by
what you tell the software to do. For example, I get different results
if I search for "Dave Farber" and IP than I do if I search for IP and
"Dave Farber", and it's different still if I search for just IP or just
"Dave Farber" or just Farber or just Farber and IP. And that's using
the same pile of data. Input affects output.
Obviously they would argue
that their methods are so refined, blah blah. But that human element
can't be removed, because humans write the software, no matter how
sophisticated. So how reliable are the matches? You use Google. What do
you think? Doesn't a human at some point have to interpret the value of
"A continuum of approximate matches" does not
infringement prove, on its face. As he says, it's an interpretive
issue. And data mining seems to be a better match with something like
matching amino acid strings than figuring out if someone stole
somebody's code, which requires knowing who has or doesn't have a valid
copyright, which way matching code travelled, who had the code first, etc.
If I've understood
what my friend has written, it means that if SCO swapped out Linux and
searched Windows 2000 code instead, it'd likely find instances that
looked like "infringing code" also. That's the same as saying that so far,
they are holding maybe nothing. It all reinforces in my
mind that, once again, nothing has been proven to date by their claims
of similarity, derivative or obfuscated code matches, and nothing can be
proven using data mining techniques, until this case goes to trial and
the experts speak, followed by a decision by a judge.
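To make the point about control tests concrete, here is a minimal Python sketch of the kind of comparison my friend describes. It is an illustration under my own assumptions, not anything SCO or its consultants are known to have used: the three code fragments are made up, and `shingles` and `similarity` are hypothetical names for a simple comparison of overlapping token windows (a Jaccard score).

```python
import re

def shingles(code, k=3):
    """Return the set of k-token windows found in a piece of source text."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def similarity(a, b, k=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

# Three made-up C fragments (hypothetical, not taken from any real kernel).
sample = "for (i = 0; i < n; i++) { total += buf[i]; }"
suspect = "for (i = 0; i < len; i++) { sum += data[i]; }"
control = "for (i = 0; i < count; i++) { acc += vals[i]; }"

score_suspect = similarity(sample, suspect)
score_control = similarity(sample, control)
print(f"sample vs suspect: {score_suspect:.2f}")
print(f"sample vs control: {score_control:.2f}")
```

The "suspect" and the "control" differ from the sample only in their variable names, so they score the same. A similarity score tells you something only when it clearly beats what unrelated code written in the same language scores.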
If you are interested,
here is a white paper, "Text Mining -- Automated Analysis of Natural
Language Texts" that explains the process of searching just for simple
text, and while it does the explaining, it also shows just how much
human input goes into structuring your search before you begin the
search and why the results still may not be what you want. It is hard to see how such techniques
could answer the question: "Is this infringing code?" At best, it could show you where to begin
to investigate. And here are the DataCrystal project's white papers.
One other thing I found out in my investigation. Guess where most of the
cutting-edge brains working on such data-mining techniques work? . . .
No, really. Guess. . .
That's right: at IBM.