Adam Kilgarriff's report on eLEX2009

How to monetise a web presence (and hoover a moose)

A report on the e-lexicography conference at Louvain-la-Neuve, Belgium, 22-24 October 2009
Adam Kilgarriff
Lexical Computing Ltd

This conference was timely.  As Michael Rundell put it in his opening plenary, "two years ago, if you asked me whether paper dictionaries had a future, I responded confidently yes, for a good few years yet.  But now I'm not at all sure."  E-lexicography can mean a number of things: using technology for making dictionaries; using dictionaries (and other lexical resources) for high-tech applications; and making (and publishing) dictionaries in electronic form.  While the first and second were well represented at the conference (as at past Euralexes, LRECs etc.), it is the third which looms over the field like Vesuvius over Pompeii, and which was explored in Louvain-la-Neuve as never before.  The blunt question from the floor to one of the more theoretical talks (Uli Heid on sophisticated models for lexical databases) was "how is that going to help the team at Chambers-Harrap who have just lost their jobs?"

Rupert Murdoch recently declared war on Google.  The battle is the same one: publishers who pay people (journalists, or lexicographers) to create content are losing out to others who recycle and re-use content, and make it available for free, undermining the publishers' income stream.  As a Murdoch aide put it recently, “There is real tension surrounding the free versus pay debate. We believe that the value of high quality content is not recognised online [by giving it away for free] so something needs to happen. I don’t believe the media industry can continue to exist in this way.”  Murdoch is planning to block Google from accessing, or indexing, his pages. So, no Murdoch pages in Google search hits: a very bold move indeed.

The 'baddies' in the simple version of the narrative as applied to dictionaries are the purveyors of free dictionaries on the web, at sites like dictionary.com, mydictionary.com and yourdictionary.com.  In Germany, leo.org might or might not be in the baddies' camp: its origins are in academia and its quality and market share are both too high to allow it to be written off too quickly.  Other interesting cases include Wiktionary - not a tearaway success like its big brother Wikipedia, but a significant player nonetheless; the streetwise UrbanDictionary.com; and Erin McKean's new site wordnik (.com).

If there were any clear baddies at the conference I didn't come across them (perhaps they were in disguise).  'Computing' as distinct from 'editorial' companies there certainly were, but those I talked to have a deep engagement with the content.  All four main producers of specialist software for writing dictionaries were well represented.  ABBYY, the Russian company also well-known for OCR software, is an interesting player.  The ABBYY team gave me a cuddly-toy beaver. Three years ago it was the company's promotional mascot: as a purely electronic dictionary company, they didn't take trees away from the beavers.  But now they have developed a wide set of collaborations with dictionary companies and are selling paper as well as electronic dictionaries - so the beaver has retired.  ABBYY also wowed us with their soon-to-be-released cellphone app: you use the phone's camera to take a picture of a word (which could be in Chinese, Japanese or Korean as well as European languages), which is then OCR'd and translated so you can read or hear what it means, in your language!

IDM, the French company, is market leader in dictionary-editing software in Western Europe, so stands to gain or lose with the fate of the dictionary publishers.  They have looked hard at the prospects for online dictionaries (and have developed an online dictionary platform which is being used by both Longman and Macmillan).  Conversations with IDM's Philippe Climent and Holger Hvelplund were maybe the best of the conference.  First, we have to think about advertising.  Can online advertising (probably mediated through Google) sustain a lexicographic team?  At first glance, anecdotal evidence suggests not - but it all depends on how successful you are in attracting and retaining punters.  In the 2008 edition of the KDictionaries newsletter, John Morse outlined how Merriam-Webster were doing very well from advertising on their free dictionary website.  The trick is, possibly, that they started early, have been doing it well, and are in the US - the best market for making money out of the web.  They also had an exceptionally strong brand to start with.  Of the UK publishers, CUP started earliest and are doing best.  Developing advertising revenue from a dictionary website is not a quick fix: assume several years for building up a customer base.

Do you have 'search engine optimisation' (SEO) in your dictionary? It is one of the forces that shapes the world we live in.  When I first heard the term I thought it must mean 'making a search engine work better' and was bewildered as to why so many people were interested in it.  But then I realised it meant optimising your website for the search engine, so that you come top of the Google hits when someone searches for what you sell, and it made much more sense.

Getting people to your site is only half the issue.  The other is keeping them there.  As Philippe Climent put it "cold calling doesn't work so well.  Yes, of course we want to sell to the website visitors but if we stick a 'sign up here' page under their nose as soon as they arrive, they'll soon move on.  We need lots of attractive, interesting content, so they feel welcome, are drawn in, enjoy the experience and visit again: then, when they are our friends, they are much more inclined to buy things."  IDM's research shows that average length-of-stay at Oxford's, Cambridge’s, Longman's and Macmillan's sites is six or seven page-views, whereas for the new players it is little over one.  Web customers do respond to quality.

The same is true for news, and is part of the reason that Murdoch risks the ire of Google. As Murdoch's people put it, “the traffic which comes in from Google brings a consumer who more often than not reads one article and then leaves the site. That is the least valuable of traffic to us.”

A venue in which the free-dictionaries issue has been explored at length is Ilan Kernerman's 'Dictionaries' newsletter. It was good to have a new newsletter at the conference - and for it to contain an article from the other side, "TheFreeDictionary.com story" which paints a picture of the company in bright and sunny colours.  Traditional dictionary publishers have no monopoly on educational and altruistic motives: they, no less than the new online-dictionary players, are occasionally led astray by the profit motive.  The contrast between the two is not good versus bad, or academic vs. populist: perhaps it lies more in the areas of expertise, in the editorial aspects of dictionary-making versus website design and engineering.  That suggests grounds for collaboration.

The conference was a thoroughly European affair, with very few delegates from outside Europe, yet there was one session with two very interesting papers on Japanese (one by an Australian and the other by a Slovene).  The former, by Jim Breen, explored neologism-finding in a very large Japanese web corpus: Japanese, with no spaces between words and its four alphabets, makes finding new words a challenge, with intriguing contrasts to the same task for European languages.  The latter, by Kristina Hjelmak, showed how a large Japanese corpus could be 'tamed' to support Japanese learners by methods for finding easy-reading sentences (using strategies similar to those that our team have been developing: we are now planning to add her tool into the Sketch Engine, for Japanese).

It felt particularly European when, on the last morning, Thierry Fontenelle, former chair of Euralex, put in a brief appearance.  He has just moved back to Europe from Microsoft to manage the research, technology and databases at the European Union translation centre in Luxembourg.  He had many friends at the conference, all delighted to see him.  Lovely to have you back, Thierry!

In Europe the main media for dictionaries are online and paper.  In much of Asia, the handheld is bigger than either, and Hilary Nesi's plenary looked at how they are used, in Thailand and Korea.  In short, teachers don't like them but students do.  Students use them under their desks where teachers can't see them, and if asked, deny they use them at all unless pressed: whereas paper dictionary use is model student behaviour, performed in public in full view of the teacher, handheld use is clandestine, like texting your friends in class, and verges on the subversive.  Researching handheld use was like researching sexual preferences, requiring great delicacy, empathy and tenacity.

One reason teachers gave for not liking them was the quality of the dictionaries, and that the teachers did not know which dictionaries the handhelds included.  (They typically include dozens.)  Many did include leading titles from leading publishers, though sometimes the students did not know which.  Also, the students usually just used the default dictionary, which was typically a low-prestige, locally produced bilingual (e.g. in Thailand or Korea).  Sometimes it was not even possible to get full details of which titles a handheld contained. Vague use of 'Oxford' and 'Webster' was commonplace.

My favourite corpus line from the conference was from Patrick Hanks, riotously entertaining (in the pursuit of an ever-deeper understanding of how language works) as ever: "Vacuum your moose from the snout up".  A perfectly normal thing to say in a taxidermy context.

When I was eighteen I spent a year as a volunteer teacher in rural Kenya.  Last year, for the first time, I went back.  How had it changed? Then, as now, it was a poor country with poverty visible at every turn. Perhaps now there were even more people at every turn. But the difference that really struck me was that every village or settlement or crossroads had its tea shop and its women selling their produce as it always had, but they were now complemented by a stall where you can re-charge your cellphone.  And, where before there were no billboards, there were now vast ones, shrieking, in orange or purple, ‘safaricom’ or ‘zain’, the two competing cellphone networks.

Are you a digital native?  I suspect most readers of this piece, like me, are digital immigrants.  The terms were coined by Marc Prensky, whom I had the good fortune to hear at IATEFL, the English Language Teaching conference, in April.  Most people closer to my children’s ages than mine are digital natives – in rich countries and poor. They have grown up playing with computers, Xboxes, Nintendo DS’s and cellphones.  Perhaps the central reason that Hilary Nesi’s teachers didn’t like handhelds as much as paper dictionaries was simply that, while the students were at ease with the technology, the teachers were not.  Already over half the world’s population are digital natives, and we digital immigrants (assuming you, dear reader, are closer to my age than my children’s) are doomed.  In twenty or thirty years, people who remember typewriters will only be found in old people’s homes.

Yesterday I looked up ‘dictionary’ on my daughter’s iPod touch.  There were plenty available, some free, others for prices up to around £20.  The screen was big enough, and of high enough quality, to read a dictionary entry without difficulty (especially with graceful iPod scrolling).  All the dictionaries had user ratings.  Some were by people and companies I knew: it was nice to find Jack Halpern’s name in the descriptive material for a five-star-rated Japanese dictionary.  There was a correlation between price and ratings – people found the more expensive worth paying for.  The cellphone is the new technology with the highest takeup of all; it hosts dictionaries (which may be local or on the web), and it uses web 2.0 technology for users to give feedback on how good the dictionaries are: this sounds like good news for high-quality dictionary publishers.

Many thanks to Sylviane Granger, Magali Paquot and team for an immaculately organised and deeply interesting conference.