
Typing non-English text in LaTeX sources

Traditionally, non-English and accented characters are typed in LaTeX sources using special macro commands rather than natively. For example, to type "allée" you would put all\'ee in your LaTeX source instead of the native form. This has two disadvantages: the spell checker will not work on such words, and typing becomes cumbersome if the text contains many accented characters.
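
As a small illustration of the macro form, here is a minimal sketch of a document that uses only accent macros. Apart from "allée", the words are examples chosen here and are not taken from the text above.

```latex
\documentclass{article}
\begin{document}
% Accent macros: the source is pure ASCII, so it does not depend
% on the encoding of the file.
all\'ee      % e with an acute accent
na\"{\i}ve   % dotless i with a diaeresis
gar\c{c}on   % c with a cedilla
\end{document}
```

A spell checker sees "all\'ee" as broken fragments rather than as a word, which is the first disadvantage mentioned above.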

Typing accented characters directly in the LaTeX source is in general not a good idea: if the encoding of the LaTeX source file does not match the encoding of the font, garbage comes out instead of the accented characters. Even if the encodings match on your platform, the LaTeX file is not portable, since they might not match on other platforms.

As a solution you can use the inputenc LaTeX package. This package lets you declare the encoding of the LaTeX source file. Once this is declared, you can happily type accented characters without using the macros and still have a portable LaTeX file: even if compiled on another platform, the correct output will be produced. Note however that the source file might be displayed as garbage on some other platforms (if the platform's encoding does not match the source file encoding), but it will still compile to the correct output.

There are several encodings defined in the inputenc package. The main ones are: latin1 for ISO Latin-1, ascii for pure ASCII, ansinew and cp1252 (they are synonyms) for Windows 3.1 ANSI (an MS extension of ISO Latin-1) and applemac for Apple Macintosh.

For example, to use the latin1 encoding, add the following to the LaTeX preamble:

\usepackage[latin1]{inputenc}
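
To put that line in context, a complete minimal document might look like the sketch below. It assumes that the file itself is saved as ISO Latin-1; the sample sentence is only an illustration.

```latex
% Save this file in the ISO Latin-1 (ISO 8859-1) encoding.
\documentclass{article}
\usepackage[latin1]{inputenc}  % declares the encoding of this source file
\begin{document}
% The accented characters are typed natively, without macros.
Une allée bordée d'arbres.
\end{document}
```

If the editor saves the file in a different encoding than the one declared, the declaration no longer matches the bytes in the file and the accented characters will come out wrong.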

It is also possible to change the encoding at an arbitrary point in the document, using the following command:

\inputencoding{encoding name}

This is seldom used, since normally one would write the entire document using the same encoding.
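
One situation where switching can be useful is including a file that was written on another platform. The following is a sketch under that assumption; chapter-mac.tex is a hypothetical file name, standing for a file saved in the Apple Macintosh encoding.

```latex
% Main file, saved as ISO Latin-1.
\documentclass{article}
\usepackage[latin1]{inputenc}
\begin{document}
Une allée.

% chapter-mac.tex is a hypothetical file saved in the Macintosh encoding.
\inputencoding{applemac}
\input{chapter-mac}

% Switch back before the rest of the main file is read.
\inputencoding{latin1}
Un café.
\end{document}
```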

The inputenc package comes with 8-bit encodings only. Notably it does not support the UTF-8 Unicode encoding. That can be solved by using the unicode package. After installing it, you can type the document in UTF-8 encoding by adding

\usepackage[utf8]{inputenc}
in the preamble. To access other Unicode functionality you can add
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
instead. The unicode package can be found on any CTAN server; on the main CTAN site it is distributed as unicode.tar.gz.
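
As a rough sketch, a complete UTF-8 document using the unicode package could look as follows. It assumes that the package is installed, that the file is saved as UTF-8, and that the option name provided by the ucs package is utf8x; the sample sentence is only an illustration.

```latex
% Save this file in the UTF-8 encoding.
\documentclass{article}
\usepackage{ucs}               % from the unicode package (assumed installed)
\usepackage[utf8x]{inputenc}   % assumption: utf8x is the option supplied by ucs
\begin{document}
Une allée, un café.
\end{document}
```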

Last Modified: Tuesday, 22-Mar-2005 13:37:41 W. Europe Standard Time


Diego Santa Cruz