Wednesday, December 31, 2003
So beware, corpus fetishists! The possibility of corpus research is a great asset to linguistics, and no one should try to work without corpus material; but there are major pitfalls for those who take the corpus to be the object of study. It is not the object of study. The language is the object of study. A corpus is just an assemblage of material through which we can study the language, and virtually any corpus is going to have errors in it. Possibly numbering in the millions, even outnumbering the correct forms. [Language Log]

If the "errors" outnumber the "correct forms", isn't that a question worth investigating? What linguistic, psychological, and social processes are involved in making "error" dominant?

10:54:07 PM
Browsing in the statistics section of the Penn bookstore, I noticed a presumably misplaced copy of Working Minimalism by Epstein and Hornstein. It's entertaining to imagine what a statistician would make of that volume. A frequentist might take it for a predictably opaque application of improper priors; a Bayesian might relate it to the fog of frequentist model selection. How did the volume get there? It's more fun to imagine that it wasn't a simple shelving mistake. Maybe a scholar, having picked up the volume earlier in the linguistics section, had a change of heart while reading about hypothesis testing in the statistics section. One wonders whether the scholar bought a statistics volume instead. Good choices would include The Algebra of Probable Inference and The Nature of Statistical Learning Theory.

2:34:18 PM