Thursday, March 9, 2006
I'm becoming increasingly disappointed with the inefficiencies and subtle biases of traditional means of scientific communication in computer science. There are many problems, including the gigantism of successful conferences and the growing expense of commercial journals, but I believe the main problem is that the traditional peer-review process cannot scale when a field grows rapidly: the number of experienced reviewers lags behind the number of new researchers entering the field, so either reviewing delays grow unacceptably or less experienced reviewers are recruited, leading to poorer reviewing decisions. One solution is the constant creation of new, more specialized venues (journals and conferences) to free emerging areas from the overheads imposed by size and tradition. But if we are going to put the effort into creating new venues, why not rethink the whole process? What do we need?
The start-up editorial board would set up a method by which the most effective contributors to the ongoing discussions and public reviewing would be rotated onto the editorial board, providing an incentive for constructive participation. In fact, several such organizations with different purposes could coexist as overlays on the underlying archive, which could be an existing one like arXiv. I'm sure there are social, cultural, and technical issues that I did not consider sufficiently or at all. But the current conference and journal arrangements are less and less capable of promoting quality and efficient communication. 7:44:51 PM
The December 1 DWIM effect: According to Barry R. Zeeberg, Joseph Riss, David W. Kane, Kimberly J. Bussey, Edward Uchio, W. Marston Linehan, J. Carl Barrett, and John N. Weinstein, "Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics", BMC Bioinformatics 2004, 5:80:

> When we were beta-testing [two new bioinformatics programs] on microarray data, a frustrating problem occurred repeatedly: Some gene names kept bouncing back as "unknown." A little detective work revealed the reason: ... A default date conversion feature in Excel ... was altering gene names that it considered to look like dates. For example, the tumor suppressor DEC1 [Deleted in Esophageal Cancer 1] was being converted to '1-DEC.' Figure 1 lists 30 gene names that suffer an analogous fate.

(Via Language Log.) One wonders how many published papers on microarray data analysis have been affected by this. Scientific data analysis should probably be done with software specifically designed for the purpose, which is less likely to employ shortcuts that are useful in other domains but untested in scientific applications. More generally, so much science these days relies on unpublished software that reproducibility is often impossible. Even in experimental computer science, quantitative results often depend on custom software that is not easily available, for a variety of reasons: commonly, the software or scripts are so fragile and so dependent on the local computing context that the authors can't get around to releasing them. We need really reproducible research more than ever. 7:01:41 PM
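The failure mode in the DEC1 story above can be screened for before gene lists ever touch a spreadsheet. The sketch below is a minimal illustration, not the authors' tooling: it flags symbols matching the month-abbreviation-plus-day-number pattern that Excel's default date parser coerces, as happened to DEC1. The month list and the helper name `excel_date_risk` are my own assumptions; the exact set of strings a given Excel version and locale converts may differ.

```python
import re

# Month strings assumed to trigger Excel's default date coercion when
# followed by a plausible day number (e.g. "DEC1" becomes "1-Dec").
# This list is an illustrative assumption, not an exhaustive inventory.
MONTHS = {"JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC",
          "SEPT", "MARCH"}

DATE_LIKE = re.compile(r"^([A-Z]+)(\d{1,2})$")

def excel_date_risk(symbol):
    """Return True if a gene symbol looks like something Excel's default
    cell formatting would reinterpret as a date."""
    m = DATE_LIKE.match(symbol.upper())
    return bool(m) and m.group(1) in MONTHS and 1 <= int(m.group(2)) <= 31

genes = ["DEC1", "SEPT2", "MARCH1", "TP53", "BRCA1"]
flagged = [g for g in genes if excel_date_risk(g)]
print(flagged)  # ['DEC1', 'SEPT2', 'MARCH1']
```

A check like this belongs in the analysis pipeline itself, where identifiers are handled as plain strings; the safer fix, of course, is purpose-built analysis software that never applies such conversions in the first place.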