
The Problem with Wikipedia

이강기 2016. 5. 27. 11:32

The all-conquering Wikipedia?


TLS
MAY 25 2016

 

In the Lower Reading Room of Oxford’s Bodleian Library, there is an entire bookshelf – quite a big bookshelf – filled with concordances to Classical Greek and Latin authors. When I was an undergraduate, I worked at the desk next to it most days. The books are still there, in more or less the same place they always were: the four fat sober grey volumes of A Concordance to Livy; Isabella Gualandri’s Index nominum propriorum quae in scholiis Tzetzianis ad Lycophronem laudantur; A Lexicon to Herodotus, compiled by none other than the young Enoch Powell.


It is little short of a miracle that the concordances, lists of all the words appearing in a given text, have so long survived the librarians’ regular purges of the open shelves stock in the Lower Reading Room. Frankly, the entire C.Index shelfmark could be hauled off and shredded and no one would be any worse off. All 400 lb of them are utterly obsolete, obsolete as few books in history have ever been obsolete. Even out-of-date telephone directories retain some historical interest. Print concordances have no intrinsic interest at all: they are simply bad paper search engines designed for a world which had not yet invented good online search engines. These massive reference works, the products of years of human drudgery, have been entirely superseded by two or three online databases of Greek and Latin literature. If I want to find all instances of the word haimasiē, “low fencing-wall”, in Herodotus, I can do so online in a matter of seconds, thanks to the digital Thesaurus Linguae Graecae (eight hits).


Of course, things are not quite as simple as that. The online databases have no guiding intelligence behind them: they search for strings of symbols, not for words as such. If I happen to be looking for, say, early Greek references to the domestic cat (ailouros), the Thesaurus Linguae Graecae insists firmly that the word does not appear in Herodotus. Powell’s Lexicon saves the day, with the laconic entry “ailouros: see aielouros”. Herodotus in fact uses the word ai(e)louros half a dozen times, in the course of a fascinating account of cats in ancient Egypt (2.66–7, highly recommended). Because Herodotus happens to use a dialect variant of the word, the TLG database is helpless. So Enoch is reprieved for the time being. But his days are surely numbered: it can only be a matter of time before the TLG is programmed to pick up dialect variants, and then the print concordances will be pulped and turned into apple-juice cartons.
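The gap between searching for strings and searching for words can be made concrete with a toy sketch. Everything in it is an assumption for illustration: the word list is invented rather than drawn from the TLG, the names `VARIANTS`, `string_search` and `headword_search` are hypothetical, and the single variant entry simply mirrors Powell’s cross-reference. It says nothing about how the TLG is actually built.

```python
# Hypothetical variant table: maps a dialect spelling to its headword.
# The single entry mirrors Powell's cross-reference "ailouros: see aielouros".
VARIANTS = {"aielouros": "ailouros"}

# Invented, transliterated word list standing in for a text; not real TLG data.
TEXT = ["aielouros", "haimasie", "aielouros", "potamos"]


def string_search(words, query):
    """Count exact string matches, as a naive symbol-by-symbol search would."""
    return sum(1 for word in words if word == query)


def headword_search(words, query, variants=VARIANTS):
    """Count matches after mapping each word to its headword, if it has one."""
    return sum(1 for word in words if variants.get(word, word) == query)
```

On this toy list, `string_search(TEXT, "ailouros")` finds nothing, while `headword_search(TEXT, "ailouros")` finds both occurrences of the dialect form. The lookup table does the work that Powell’s entry does on paper.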


Concordances are among the simplest life forms in the rich and complex phylum of reference works – dictionaries, encyclopedias, atlases and so forth. In his delightful new history, subtitled The reference shelf from ancient Babylon to Wikipedia, Jack Lynch neatly defines the “reference work” as a text designed for users rather than readers: plenty of people read Herodotus straight through (and so should you), but no one has ever read Powell’s Lexicon from cover to cover. It is all too easy to underestimate the role played by the humble index, and its more elaborate variants, in the history of human knowledge. There is a terrific book to be written on the history of alphabetical order, for example, which is sketched out here by Lynch in an all too tantalizing three pages.


You Could Look It Up falls squarely into the flourishing genre of the Book as Listicle: A History of Ancient Greece in 50 Lives, A History of the World in Twelve Maps, A History of Sweets in 50 Wrappers (I did not make that one up, alas). Each of Lynch’s twenty-five chapters selects two more or less contemporary reference works, and briskly summarizes their vital statistics, the circumstances of their compilation, and the personal idiosyncrasies of their authors. We scamper along, roughly in chronological sequence, from the Code of Hammurabi (c.1754 BC) through to Schott’s Original Miscellany (2002), with a cursory leap forward to Wikipedia in the epilogue. Each chapter is followed by a three- or four-page mini-essay (these are waggishly designated Chapters 1½, 2½ etc) on “stories that would otherwise go untold in a strictly linear history” – inter-dictionary plagiarism, why there are so few women lexicographers, the layout of the reference shelves in Lynch’s own office, and so forth.


This book will give many readers enormous pleasure, and rightly so. Lynch has a wonderful topic to play with, and “play” is the right word. The closest point of comparison – and I intend this as the warmest of compliments – is Bill Bryson, both for the deep erudition lightly worn, and for the infectious delight in anecdote and human eccentricity. The book’s own quasi-encyclopedic structure neatly avoids placing any disagreeable demands on the reader’s attention span. Not wildly interested in Cruden’s Complete Concordance to the Holy Scriptures (1738)? Don’t worry – in a page or two’s time, Lynch will be hustling you on to the joys of Harris’s List of Covent-Garden Ladies (1757), via a quick anecdote about the Wikipedia entry for the South African butchery Mzoli’s. Perhaps the only constituency who will take serious offence are musicologists. Lynch chooses to pair Grove’s Dictionary of Music and Musicians with Mrs Emily Post’s primer of etiquette for aspirant social climbers; the insult is compounded by hauling in the Larousse gastronomique and Jancis Robinson’s Oxford Companion to Wine as “more examples of the reference genre being deployed to make life a little sweeter”.


The listicle format does have some unfortunate side effects. Most obviously, it is all but impossible to sustain a continuous narrative or argument from one chapter to the next. Lynch’s book essentially ends up consisting of seventy-five miniature self-contained essays on the history of reference works, none much longer than a meaty blog post. This format is not well suited to drawing out major historical trends. Did the Enlightenment bring about fundamental changes in the character and purpose of reference works? Why have some countries been so much more prone to producing mammoth dictionaries and encyclopedias than others? How closely is the history of reference works connected to the history of universities? (Not very, I suspect: it is worth recalling that none of the major British reference works – the OED, the DNB, Grove, the Encyclopaedia Britannica – began life within the academy, though some have ended up there.)


Equally seriously, a listicle with entries of near-uniform length (five or six pages) has, by definition, no sense of proportion. I am delighted that Lynch has found space for Wisden, Liddell and Scott’s Greek–English Lexicon, and the CRC Handbook of Chemistry and Physics, estimable works all (or at least the first two). My concern is that by granting them the same amount of space – half a chapter each – as Samuel Johnson or the Oxford English Dictionary, Lynch ends up engaging in a kind of historical reverse discrimination. In the grand scheme of things, as the most sectarian Classicist would acknowledge, Liddell and Scott’s Lexicon is just nowhere near as interesting or important as the OED. To take an even more extreme example, is it not perhaps a shade melancholy to find Edmond Hoyle’s Short Treatise on the Game of Whist receiving more attention (six pages) than d’Alembert and Diderot’s Encyclopédie (four-and-a-half)?



One of Lynch’s highest terms of praise is “quirky”, variously applied to Thomas Browne, Bayle’s Dictionnaire (“an intensely quirky book”), Hobson-Jobson, Brewer’s Dictionary of Phrase and Fable, and Schott’s Original Miscellany (“the reason the word quirky was invented”). But relentless quirkiness, either at first or second hand, does pall after a while, and You Could Look It Up reaches its low point with a four-and-a-half-page list of “some unlikely reference books” – complete, I fear, with humorous glosses on their contents. But perhaps some readers will be set chortling at the thought of James A. Yannes’s Collectible Spoons of the Third Reich (“By the author of the more wide-ranging Encyclopedia of Third Reich Tableware”) or James and Philip Parker’s Rectal Bleeding: A medical dictionary.


Three-and-three-quarter millennia separate Hammurabi’s Code from the New Grove Dictionary of Music and Musicians (second edition, 2001); a mere fifteen years have passed since the launch of Wikipedia on January 15, 2001. Nonetheless, there can be no doubt as to which period has seen the more radical changes to the world of reference works. Lynch describes his book as “something of a eulogy”, and he is surely right to wonder whether “we may be approaching the end of the era of the reference book”. The digital hurricane has already swept away much more than just the humble concordance, and the storm is still blowing as hard as ever.


The all-conquering encyclopedia of the twenty-first century is, famously, the first such work to have been compiled entirely by uncredentialled volunteers. It is also the first reference work ever produced as a way of killing time during coffee breaks. Not the least of Wikipedia’s wonders is to have done away with the drudgery that used to be synonymous with the writing of reference works. An army of anonymous, tech-savvy people – mostly young, mostly men – have effortlessly assembled and organized a body of knowledge unparalleled in human history. “Effortlessly” in the literal sense of without significant effort: when you have 27,842,261 registered editors (not all of them active, it is true), plus an unknown number of anonymous contributors, the odd half-hour here and there soon adds up to a pretty big encyclopedia.


One of the most common gripes about Wikipedia is that it pays far more attention to Pokémon and Game of Thrones than it does to, say, sub-Saharan Africa or female novelists. Well, perhaps; the most widely repeated variants of “Wikipedia has more information on x than y” are in fact largely fictitious (https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_has_more…). Given the manner of its compilation, the accursed thing really is a whole lot more reliable than it has any right to be. Like many university lecturers, I used to warn my own students off using Wikipedia (as pointless an injunction as telling them not to use Google, or not to leave their essay to the last minute). I finally gave up doing so about three years ago, after reading a paper by an expert on South Asian coinage in which the author described the Wikipedia entry on the Indo-Greek Kingdom (c.200 BC–AD 10) as the most reliable overview of Indo-Greek history to be found anywhere – quite true, though not necessarily as much of a compliment to Wikipedia as you might think.


As Lynch rightly notes, the problem with Wikipedia is not so much its reliability – which is, for most purposes, perfectly OK – as its increasing ubiquity as a source of information. “Wikipedia, despite being noncommercial, still poses many of the dangers of a traditional monopoly, and we run the risk of living in an information monoculture.” Large parts of the media demonstrably use Wikipedia as their major or sole source of factual data; as a result, false or half-true claims (such as are found in any encyclopedia) can spread and take root with extraordinary speed.


Here is a minor but telling example. Which English-language novel has sold the most copies? The short answer is that nobody knows: we have no remotely reliable sales figures for books published more than a couple of decades ago, and books that are out of copyright might exist in literally hundreds of different editions and translations. Nonetheless, between April 24, 2008 and January 30, 2016, Wikipedia had the answer: it was Dickens’s A Tale of Two Cities, with an estimated 200 million copies sold, a third as many again as the next bestselling book, Tolkien’s Lord of the Rings trilogy.


This figure of 200 million is – to state the obvious – pure fiction. Its ultimate source is unknown: perhaps a hyperbolic 2005 press release for a Broadway musical adaptation of Dickens’s novel. But the presence of this canard on Wikipedia had, and continues to have, a startling influence. Since 2008, the claim has been recycled repeatedly on the BBC as well as in the Daily Telegraph, the Daily Mail, the Guardian and the Independent, none of which have cited Wikipedia as a source. It has even made its way into popular history books. In The Great British Dream Factory (2015), Dominic Sandbrook insists that British culture can be empirically shown to be the world’s most successful: “The facts and figures make interesting reading . . . The Lord of the Rings is the second best-selling novel ever written, behind only A Tale of Two Cities”. His footnoted source is an Independent article of July 2014, itself presumably lifted from Wikipedia.


Getting the genie back in the bottle has not been easy. The figure of 200 million was first queried on the Wikipedia talk pages in May 2009, and was deleted from the site on December 4, 2014 by Richard Farmbrough, one of the most prolific British Wikipedians. (He also provided much of the factual data in this paragraph; Wiki-editors are, in my experience, an exceptionally friendly and helpful bunch.) On December 5, the claim was reinserted, re-removed, and reinserted again. Farmbrough took it down again on February 4, 2015; on March 1, it was reinstated and promptly re-removed; it appeared again on April 23, and survived for another nine months before the indefatigable Farmbrough deleted it yet again on January 30, 2016. Why has the claim proved so difficult to kill? No doubt part of the reason is that it has now accumulated a lengthy and, by Wiki-standards, respectable paper trail: a long article on historical fiction by the novelist David Mitchell in the Telegraph; Stephen Clarke’s 1,000 Years of Annoying the French; and so forth. (Wikipedians have their own word for malignant and self-sustaining cycles of this kind: citogenesis.)


No doubt this particular case does not matter all that much. But it does illustrate some of the benefits, and many of the perils, of the brave new world of crowd-sourced online reference. Wikipedia did, eventually, get to the right answer, but it took its time about it. One of the main worries about Wikipedia is not that its content does not improve over time (it clearly does), but that it gets better so much more slowly than anyone would have predicted back in 2006 or 2007. It is here – sneers the academic – that the project really feels the lack of expert editors. Wikipedia does just fine at uncontroversial factual information, but as soon as a topic demands critical discrimination or a bit of intelligent digging, its quality control goes completely haywire.


In short, don’t throw away your Oxford Companion to English Literature just yet. But much though Lynch and I might wish it otherwise, there is no way back from the Wikipedia revolution. Instead of grumbling, perhaps we ought to spend a bit more time editing Wikipedia ourselves. No doubt a few of us will cling to our print concordances and encyclopedias, like comfort blankets, for a few more years, as their essential absurdity becomes ever more apparent. But to the aspirant Samuel Johnson or James Murray of the 2030s, the way forward is obvious. Get down to work on your JSTOR-to-Wiki automated edit bot: that way immortality lies.