Library Link
Finance Viewpoints

the online discussion and information forum for Librarianship and Information Management


17th January 2000

SPELLING STILL COUNTS

James H. Sweetland, North American Convenor, Library Link

For most of human history, spelling was not much of an issue: even writers like Shakespeare used variants, even of their own names, apparently as whim dictated. However, with the rise of more or less authoritative dictionaries, spelling gradually became fixed. For example, in the United States Noah Webster consciously used his dictionary in part to create an Americanized English (thus plow instead of plough, or color rather than colour). By the time computers came on the scene in the 1940s, English at least, among educated people at least, had standardized spelling. However, beginning in the 1960s, emphasis on spelling and grammar all but disappeared in American schools. Teachers consciously avoided correcting students' work, in hopes of encouraging more creative work. At about the same time, early word processors, and later computers, came equipped with spelling checkers, apparently making it less important to know how to spell.

By the year 2000, we are now in the second generation of people who have grown up under this decline in concern about spelling. And, as more and more people join discussion groups, listservers and the like on the Internet, we seem to see more and more spelling errors, such as the interchangeable use of "site", "cite", and "sight" to refer to a homepage on the Web. In fact, in one electronic discussion, I received a note from a respected "distance educator" arguing that even university instructors should not make comments on spelling, in order to encourage all students to make comments in electronic discussions.

I submit that spelling, at least as the electronic information universe is presently designed, is utterly critical to good information retrieval. This is true because to date nearly all the search engines we have are dependent upon pattern matching. That is, they do not actually search for the "word", let alone the concept; rather, they look for a match between the pattern entered and the patterns found in a set of documents, pages or the like. Spoken, "cite", "site" and "sight" sound identical; given the context, whether oral or written, one can figure out which word is meant, even if the wrong one was used. A search engine cannot. Nor, with few exceptions, will the typical spellchecker alert the writer to any potential problem, since each of these words is, in fact, a correct spelling for one meaning. So if the wrong word is used, a searcher who attempts to find material, say, on locations, and types in the string s-i-g-h-t, will not retrieve a reference which spells it as s-i-t-e.
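To make the pattern-matching point concrete, here is a minimal sketch in Python of a whole-word search over a tiny collection. The three documents and the function name exact_match_search are invented for illustration; real search engines are far more elaborate, but the failure mode is the same.

```python
# Hypothetical three-document collection; the texts are invented for illustration.
documents = {
    "doc1": "visit our web site for the library catalogue",
    "doc2": "please cite the web page you consulted",
    "doc3": "the new building site is near the river",
}


def exact_match_search(query, docs):
    """Return the sorted ids of documents that contain the query as a whole word."""
    term = query.lower()
    return sorted(
        doc_id for doc_id, text in docs.items() if term in text.lower().split()
    )


print(exact_match_search("site", documents))   # ['doc1', 'doc3']
print(exact_match_search("sight", documents))  # []
print(exact_match_search("cite", documents))   # ['doc2']
```

The search compares strings of letters and nothing else: "sight" finds nothing, even though a human reader would recognize at once which documents were wanted.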

Now, software developers have been at least moderately aware of this problem, and apparently argue that, with indexing/searching of full text, the odds are that the word in question, whatever it is, will be spelled correctly at least part of the time. In fact, the redundancy of language is one of the basic principles behind current relevance ranking systems (documents in which words appear many times are ranked as more relevant than those in which the words appear few times).
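The ranking principle just described can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the score is a raw count of exact occurrences, and the sample documents and the name rank_by_term_frequency are hypothetical, not taken from any real system.

```python
from collections import Counter

# Hypothetical sample texts, invented for illustration.
sample_docs = {
    "a": "schizophrenia research on schizophrenia and schizophrenia care",
    "b": "notes on schizophrenia with one schizaphrenia typo",
    "c": "schiziphrenia symptoms and schiziphrenia treatment",
}


def rank_by_term_frequency(query, docs):
    """Rank documents by how many times the exact query term occurs in them."""
    term = query.lower()
    scores = {}
    for doc_id, text in docs.items():
        count = Counter(text.lower().split())[term]
        if count > 0:
            scores[doc_id] = count
    # Highest count first; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(rank_by_term_frequency("schizophrenia", sample_docs))  # [('a', 3), ('b', 1)]
```

Document "b" is still retrieved because the word is spelled correctly once. Document "c", which never spells the word correctly, receives no score at all.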

However, such approaches make an important assumption: that the spelling error is, in effect, a typographical one, in which the author knows the correct word but just didn't enter it correctly. I submit that a growing number of authors are unaware of the correct spelling, assume that they have the correct spelling, or don't care. The last is especially likely if, all the way from the first grade through the university, no one has ever bothered to correct spelling.

Some examples:

One student recently found almost nothing on the South American group known as the "Shining Path", even though he spent several hours online on several systems. Unfortunately, he spelled the first word as "ShiNNing".

Another student was looking for information on Schizophrenia, but a search of the Web and of several medical databases generated fewer than twenty records. Her problem: she spelled the word as SchizIphrenia. In this case, although she has supposedly had some exposure to the term on the job (she is a Certified Nursing Assistant), and to correct spellings in the 20 documents she unearthed, she never noticed the error.

A third person was interested in information about a specific post-secondary degree program. However, he spelled the word "college" as "collAge". He found a large number of records about art programs in colleges and universities, but not much on the desired material. Again, he missed all the correct spellings of the word in the documents retrieved. Even if the system had had a spell checker, it would not have pointed out his error, since collage is a legitimate word.
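The last example can be illustrated with a minimal sketch of a word-list spell checker in Python. The tiny word list and the name flag_unknown_words are assumptions made for illustration; a real checker uses a far larger dictionary, but the principle of looking each word up is the same.

```python
# Hypothetical, deliberately tiny word list standing in for a full dictionary.
WORD_LIST = {"college", "collage", "degree", "program", "schizophrenia"}


def flag_unknown_words(query, word_list=WORD_LIST):
    """Return the words in the query that do not appear in the word list."""
    return [word for word in query.lower().split() if word not in word_list]


print(flag_unknown_words("collage degree program"))  # []
print(flag_unknown_words("schiziphrenia"))           # ['schiziphrenia']
```

A checker of this kind catches "schiziphrenia" because no such word exists, but it passes "collage" without comment: the word is spelled correctly, just not for the meaning intended.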

Of course, if we attain real artificial intelligence soon, the above will cease to be much of a problem, since computers will be able to make contextual judgments, and guesses as to variant word strings. However, I think such a level of AI is unlikely for a very long time. At present, all those who train people to search or help them to search should keep the decline in spelling in mind. And perhaps every information technology terminal should carry a sign to this effect: "Having trouble with your search? Check your spelling!"

January, 2000
James H. Sweetland
North American Convenor, Library Link


