What the {em#&quo}?

Ever since the dawn of the Internet, computers have had a hard time dealing with words with diacritics, or accents. Even today, you may see text online with odd characters like # or {} or &em or ?@ in the space where an accented letter should appear. Catalogs, indexes, and programs of all kinds handle accents in different ways, or sometimes not at all.

Now for most of us, perhaps, this is no big deal. We read and write in English, right? But words from other languages have crept into English. Think coup d'état or soufflé from French. We also encounter authors' names that include diacritical marks. The chair of the Hopkins German Department several years ago was named Rüdiger Campe. His name could really give you fits if you were looking for books he had written, depending on whether or not you tried to resolve the umlaut into "ue".

One blogger has amusingly written that the Internet hates her name. This isn't far from the truth. But perhaps the problem with accented letters stems from the fact that accents change pronunciation, that is, spoken language, and aren't all that important for simple writing and reading. You can make sense of a text in French that contains no accents whatsoever. But speaking it without them would be severely impaired.

The basic rule in searching online is to ignore accents. That is, don't even try to type them in. When searching our own libraries' catalog, for example, you can leave them out. So for books or articles by Rüdiger Campe, just type in his name without the umlaut over the "u". For books or articles about Gabriel García Márquez, you can likewise ignore the accents in his name.
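For the curious, here is a rough idea of how a search tool might ignore accents behind the scenes. This is only a sketch in Python, not a description of how our catalog actually works: the function names strip_accents and same_name are made up for illustration. It splits each accented letter into a base letter plus a separate accent mark, then throws the marks away.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining accent marks removed (hypothetical helper)."""
    # NFD normalization splits "ü" into "u" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_name(query: str, entry: str) -> bool:
    """Compare two strings while ignoring accents and letter case."""
    return strip_accents(query).casefold() == strip_accents(entry).casefold()


print(strip_accents("Rüdiger Campe"))           # Rudiger Campe
print(strip_accents("Gabriel García Márquez"))  # Gabriel Garcia Marquez
print(same_name("rudiger campe", "Rüdiger Campe"))  # True
```

Note that this approach turns "ü" into "u", not "ue", so a search for "Ruediger" would still need separate handling.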

But when you write, you should use the correct accented letters. There are several systems for adding diacritics to digital and online texts like email, Word documents, and Web pages. I keep one chart, based on the ASCII codes, pinned to my bulletin board, and have memorized many of the codes I use every day. Here is the Windows system based on ANSI standards. If you are coding in HTML (anyone still do that?), use these codes. Here's a handy chart I found that has all three systems.
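If you would rather look a code up than memorize it, a few lines of Python can do it. This is a minimal sketch, and the helper names to_html_entities and windows_alt_code are my own inventions. It assumes the Windows "ANSI" chart corresponds to the Windows-1252 character set, where the number typed after Alt (with a leading zero) is the character's position in that set.

```python
from html import unescape
from html.entities import codepoint2name


def to_html_entities(text: str) -> str:
    """Replace non-ASCII characters with HTML codes (hypothetical helper).

    Plain ASCII, including & and <, is passed through unchanged.
    """
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            pieces.append(ch)                           # ordinary letter
        elif code in codepoint2name:
            pieces.append(f"&{codepoint2name[code]};")  # named code, e.g. &uuml;
        else:
            pieces.append(f"&#{code};")                 # numeric fallback
    return "".join(pieces)


def windows_alt_code(ch: str) -> str:
    """Return the Alt+0nnn keystroke, assuming the Windows-1252 character set."""
    position = ch.encode("cp1252")[0]
    return f"Alt+0{position}"


print(to_html_entities("Rüdiger Campe"))  # R&uuml;diger Campe
print(to_html_entities("soufflé"))        # souffl&eacute;
print(unescape("coup d'&eacute;tat"))     # coup d'état
print(windows_alt_code("ü"))              # Alt+0252
```

The unescape call goes the other direction, turning HTML codes back into the accented letters a reader should see.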

We have a ways to go before the Internet speaks a truly universal language. The problem with accents is but one small stumbling block that is slowly being corrected.

About Sue Waterman

Librarian for German and Romance Languages and Literature, the Humanities Center, and the Program in Jewish Studies. Curator for Modern European Literature.
