New online dictionary redefines 'look it up'
Lexicographer Erin McKean’s interactive ‘Wordnik’ is projected to be the largest online dictionary ever.
Infinite dictionary: Erin McKean, former editor in chief of US Dictionaries for Oxford University Press, has created Wordnik, expanding the meaning and uses of a dictionary.
Jasmine Scott/Special to 海角大神
Chicago
Erin McKean doesn’t look much like a revolutionary. She speaks softly. She sews her own skirts and writes a daily blog entry about vintage patterns. She does work out of a basement, but it’s got carpeting and good lighting and roughly 1,500 books, many of whose titles involve the word “words.” Her suburban Chicago home is not exactly the picture of subversion.
This week, though, she is slated to launch what may be the biggest revolution in the printed word since, well, printed words.
Ms. McKean’s brainchild is called Wordnik, and it combines the best practices of the old-fashioned desk reference with Internet innovations. Words can be tagged like a blog entry, their pronunciation recorded and replayed like streaming radio, their related words cataloged like a list of books customers also bought at an online book depot. When the paper page gives way to the Web page, everything about the way we think of words will change, McKean says. “This project,” she predicts in a quiet voice devoid of bravado, “is going to completely revolutionize all of dictionarymaking forever.”
Granted, a dictionary is closer to a database than a mystery thriller, its authors nothing like, say, John Grisham. But to McKean, nothing has ever seemed more fascinating than collecting and organizing American words.
McKean was 8 years old when she decided that when she grew up, she wanted to be a lexicographer, the technical term for a writer or editor of dictionaries. She first came across the word in her daily scouring of The Wall Street Journal. Her father was a Journal devotee, and McKean liked the human interest stories (but, she jokes, “even then, I knew enough not to read the editorial page”). A feature article celebrated Oxford University Press’s 1980 Word of the Year, ayatollah, and talked about preparing the newest edition of its most famous title, the Oxford English Dictionary.
“I think I was really attracted by the fact that it was taking 21 years to make the second edition of the [OED],” she recalls. “I was 8. Twenty-one years was forever.”
The lexicography bug stuck, in part because McKean loved language. She was a voracious reader, plowing through her local libraries’ stacks and devouring anything she found at home. “If it was lying around, I read it. If my parents didn’t want me to read it,” she says, “they had to hide it.”
As her classmates abandoned childhood dreams of firefighting or Broadway stardom for teaching or nursing, McKean stuck with words. “Nobody ever tried to talk me out of it. Nobody knew enough about it to know if it was easy or difficult,” she recalls. “Nobody had a brother who was a lexicographer the way they might have a brother who was a firefighter or an English teacher or a doctor or a lawyer. Nobody had ever met one.”
For good reason, she found out as she pursued joint bachelor’s and master’s degrees in linguistics at the University of Chicago: There aren’t a whole lot of jobs for lexicographers. McKean estimates there may be 200 working lexicographers in America today, and that the field sees about two full-time openings a year.
McKean got her start through a combination of luck and ingenuity: She called up the only dictionary publisher based in Chicago and asked for an internship. After graduation, the internship turned into a job, which eventually turned into a career at Oxford University Press, a move she likens to “being called up by the Yankees.” At age 29, McKean was the chief editor of the American dictionaries group. “If it had Oxford and American in the title,” she says, “it was my fault.”
She could dream up bestsellers, like the but among her favorite books is the first one she acquired at her new home, a publishing house with a reputation for erudition. “It was called ....[It] is a treatment of the slang of Buffy the Vampire Slayer,” the title character in a hit television drama from the late 1990s.
The purchase revealed as much about McKean’s sensibility as it did about her business sense. And when it comes to dictionaries, McKean says, sensibility is key. “People have this idea of the Platonic ideal of the dictionary. That’s why they call it ‘the dictionary’.... They think that all dictionaries are pretty much the same.” Not so, she says. There are five print dictionary publishers in the US, each choosing which of the billions of words they’ve collected will make it into print.
What gets left out depends on the personality of the publishing house. On the other hand, how to evaluate what gets in is a task beyond most people. “Most consumers don’t have a good metric for deciding on whether the dictionary they want to use is a good one, so they flip the book over, then go to the back, and it says, ‘over 250,000 entries.’ And they go, ‘Great, this dictionary must be awesome!’ ” she says. “Because if you don’t know a word, how do you judge the quality of the definition?”
Enter Wordnik, McKean’s newest project. In the infinite space of the Internet, she can define as many words as she wants.
“There are hundreds of thousands of words that aren’t in any print dictionary today ... because there’s no space for all of them.”
Wordnik has space for many of them, and for their bells and whistles. Her team of seven has analyzed what print and online dictionaries do and don’t do well. They’ve built a user-friendly resource that should be the best, and biggest, of both worlds. Wordnik generates its content from a database of 4 billion words, twice as many as that of her last employer. “Four billion words,” she says with a shrug, “is what you can pick up lying around on the floor of the Internet.”
Want to evaluate a definition of a word you’ve never met? No problem; other users can tell you if they favor that definition. Want to know what other words often appear in the same sentence as what you’ve just looked up? There’s a section called “related” for words used in the same context as yours. Need to know what a farthingale, for instance, looks like? Images are imported to the page from photo-depot giant Flickr. Unsure if you really understood the definition? Every word has several example sentences, culled at random from that Internet floor and then sorted so the best rise to the top of your search page.
These, McKean says, are critical. They’ve been vanishing from print dictionaries as publishers try to cram them with more words, but contextual sentences are what make people pick up reference books in the first place. “We think people go to a dictionary to find out what a word means,” she says. Not so. “Most people go to the dictionary because they don’t want to look stupid.”
They don’t want to sound stupid, either, which is why every word has an audio file of its pronunciation. Users can record their own pronunciations, too.
Print dictionaries do have one clear advantage, though: They show more than one word at a time. That makes skimming the print page fun, and McKean has tried to mimic that feeling with a “serendipity” feature, which generates words at random.
Perhaps the most surprising element of McKean’s new dictionary is a frequency graph, which shows how often the word you’ve looked up was used, as a written word, in a year. That can tell you more about history than just the etymological: Take “chad,” for instance. The word’s frequency in 2000 is high thanks, of course, to that year’s presidential election controversy. But there are signs of heavy usage much earlier. [Editor's Note: The original version of this story incorrectly used the word "entymological" instead of "etymological."]
“We have one text from 1870 that has the word ‘chad’ a lot, because it’s about Jacquard [weaving] looms, which used to be run on punch cards,” McKean explains. “They had the same chad problems as the Florida ballots.”
Ultimately, McKean’s goal is rather humble, when judged against the volume of words that have accumulated in the 400-year history of modern English.
“Ideally my goal is, before I die, to have some information about every word that’s ever been used in print.”
That may be the real revolution: digitizing a bit of data about every word we English speakers have ever put on the old-fashioned page. Byte by byte, the soft-spoken lexicographer will see her revolution through.