What is LaTeX?
Useful packages
Editing LaTeX files
Doing Linguistics in LaTeX
A fairly quick introduction
Paul Hagstrom
Linguistics Program
Dept. of Romance Studies
Boston University
May 28, 2008
1 What is LaTeX?
  Basic LaTeX
  The structure of a LaTeX document
2 Useful packages
  Bibliography management
  Linguistic examples
  Tree diagrams
  Presentations
3 Editing LaTeX files
  Editors
  Basic LaTeX editing
  Semantics
  Tables
4 Miscellaneous notes
LaTeX
LaTeX is a typesetting system.
It is not a word processor.
It is free. It is mature. It is actively developed.
Documents are written in plain text files.
Writing is more about what you mean, and less about how it
looks.
It is very extensible and many things linguists need to do are
available as add-ons.
It has a bit of a learning curve.
LaTeX source
A LaTeX source file for a very short paper:
\documentclass{article}
\title{The word order of Japanese}
\author{Paul Hagstrom}
\begin{document}
\maketitle
\section{Introduction}
Japanese is the primary language spoken in Japan.
\section{Conclusion}
In Japanese, the verb comes at the end.
\end{document}
LaTeX output
This is the result:
The word order of Japanese
Paul Hagstrom
May 28, 2008
1 Introduction
Japanese is the primary language spoken in Japan.
2 Conclusion
In Japanese, the verb comes at the end.
Macros
Much of the action in a LaTeX document happens through
macros. These are instructions that start with a backslash (\)
character. For example, the macro \ldots produces “…”,
and \LaTeX produces the LaTeX logo (in the “official” way).
A LaTeX document begins with a \documentclass instruction,
which identifies the type of document being produced. The
most common class is probably article, but there are also classes
for other things (books, reports). By setting the document
class, you determine how sections are numbered
and typeset, how title pages are produced, and so on.
Preamble
After the \documentclass{article} instruction, other
information about the document appears. This section is
referred to as the “preamble.”
For example, the \title{Title} macro designates the title of
the paper, \author{Author} designates the author.
Some macros take an argument, like \title and \author do.
The argument is specified after the macro in braces. Some
macros, like \LaTeX and \ldots do not need an argument.
You can also define your own macros, or use packages in the
preamble.
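As a minimal sketch of how these pieces fit together, a complete preamble might look like the following. The \date line and the \term macro are illustrative assumptions, not part of the example paper above.

```latex
\documentclass{article}

% --- preamble: everything between \documentclass and \begin{document} ---
\title{The word order of Japanese}   % macro with one argument
\author{Paul Hagstrom}
\date{May 28, 2008}                  % optional; defaults to today's date
\usepackage{natbib}                  % load a package
\newcommand{\term}[1]{\textit{#1}}   % hypothetical macro, for illustration only

\begin{document}
\maketitle
\LaTeX{} and \ldots{} take no argument; \term{preamble} takes one.
\end{document}
```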
Packages
Packages define new macros, over and above the basic macros
included in LaTeX. These provide specialized commands to
accomplish various useful things, like drawing trees, aligning
glosses, improving the looks of tables, and many other things.
Most of the specialized linguistics-related things we do are
defined in packages.
Usual installations of LaTeX make a very large number of
packages available for use.
Any packages not already installed can usually be downloaded
and installed pretty easily.
Writing your own macros and packages is also relatively easy.
Defining your own commands
If you can think of a macro that would make things easier, but
which doesn’t already exist, it is easy to create it.
It is nice to italicize the wh part of the phrase wh-word.
To italicize something, you can either write
\textit{this is italic} or \emph{this is italic}.
Note the semantics. \textit marks italic, \emph
marks emphatic (which is often italic, but could
instead be red). It seems to me that the wh part of
“wh-word” is properly italic, not emphatic.
I generally define a macro \wh to set wh in italics.
\newcommand{\wh}{\textit{wh}}
Then, \wh-word comes out as “wh-word.”
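The same mechanism extends to macros that take arguments. The sketch below is illustrative only: the macro name \obj is hypothetical, and the optional [1] says how many arguments the macro expects.

```latex
\documentclass{article}
\newcommand{\wh}{\textit{wh}}       % no argument, as above
\newcommand{\obj}[1]{\textit{#1}}   % hypothetical: one argument, referred to as #1
\begin{document}
A \wh-word such as \obj{who} or \obj{what} moves to the left edge.

% Directly before a hyphen, \wh needs no special care. Before a space,
% write \wh{} so that the space is not swallowed.
The \wh{} phrase moves.
\end{document}
```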
Using packages
In the preamble, you will often have several commands like
\usepackage{pst-jtree}
\usepackage{natbib}
\usepackage{gb4e, cgloss}
\usepackage{comment}
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{mathptmx}
\usepackage{soul}
\usepackage{wasysym}
\usepackage{stmaryrd}
These extend the basic LaTeX system to include all kinds of
other useful macros or typesetting behaviors. I’ll talk about
these more individually as we continue.
The document
After the preamble, the document itself begins with
\begin{document}
The first thing that happens is generally the following, which
transforms the information entered in with \title, \author,
and \date into a standard title block or page.
\maketitle
And, when the masterpiece is finished, it ends with
\end{document}
Crossreferencing
Sections and subsections are introduced with the \section
and \subsection commands. They are automatically
numbered. You can attach a label to the section number and
refer to it by name elsewhere (meaning you are then free to
shuffle things around).
This is accomplished by adding a \label command.
\section{Name of section} \label{nomenclature}
Then, in the text, you can recall the section number like this:
As we saw in section \ref{nomenclature}, \ldots
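A complete file using this pattern might look like the sketch below (the section titles are placeholders). References are resolved through an auxiliary file, so the document needs two runs; after the first run, an unresolved reference prints as “??”.

```latex
\documentclass{article}
\begin{document}
\section{Introduction}
The terminology is laid out in section~\ref{nomenclature}. % forward reference

\section{Name of section} \label{nomenclature}
Definitions go here.

\section{Analysis}
As we saw in section~\ref{nomenclature}, \ldots
\end{document}
```

The ~ is a non-breaking space, which keeps “section” and its number on the same line.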
Bibliography management
One of the single most annoying parts of writing a paper is
managing the bibliography.
Do not do this by hand. You will make errors and you will
waste a lot of time that could be much better spent.
The way bibliography management is usually accomplished in
LaTeX is with BibTeX.
Outside of LaTeX, you will want bibliography management
software. I use BibDesk for the Mac, which is excellent (and
free).
http://bibdesk.sourceforge.net/
BibDesk
Citations
Each reference in your BibTeX database has an identifier. I
usually call them things like “chomsky1995” though you can
call them whatever you want, so long as they are unique.
To insert a citation in running text, you use
\citet{chomsky1995}, which results in a citation like
Chomsky (1995).
To insert a citation within parentheses, you use
\citep{chomsky1995}, which results in a citation more like
this (Chomsky 1995).
There are some extended citation commands that I usually
define, lifted from the style sheet of Semantics & Pragmatics,
that allow use of \pgcitet{chomsky1995}{48} to get
Chomsky (1995:48), and \posscitet{chomsky1995} for
Chomsky’s (1995). See next slide.
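For comparison, natbib's built-in commands already accept optional notes, which covers some of the same ground without extra definitions. This is a sketch assuming natbib is loaded and a chomsky1995 entry exists; the exact punctuation of the output depends on the bibliography style and any \bibpunct settings.

```latex
% In the preamble: \usepackage{natbib}
\citet{chomsky1995}            % textual citation: Chomsky (1995)
\citep{chomsky1995}            % parenthetical citation: (Chomsky 1995)
\citep[48]{chomsky1995}        % note "48" placed after the year
\citep[see][48]{chomsky1995}   % prenote "see" and postnote "48"
\citeauthor{chomsky1995}       % author name only
\citeyear{chomsky1995}         % year only
\citealt{chomsky1995}          % author and year, no parentheses
```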
Extensions to citation commands
You can put these in the preamble to extend the citation
commands. (These require sp.bst as well.)
\usepackage{natbib}
\bibpunct[: ]{(}{)}{;}{a}{}{,}
\newcommand{\BIBand}{\&}
% \doi uses \href, so the hyperref package must be loaded before \doi is used
\providecommand{\doi}[1]{\href{http://dx.doi.org/#1}{doi:#1}}
\setlength{\bibsep}{0pt}
\setlength{\bibhang}{0.25in}
\bibliographystyle{sp}
\newcommand{\posscitet}[1]{\citeauthor{#1}'s (\citeyear{#1})}
\newcommand{\possciteauthor}[1]{\citeauthor{#1}'s}
\newcommand{\pgposscitet}[2]{\citeauthor{#1}'s (\citeyear{#1}:~#2)}
\newcommand{\secposscitet}[2]{\citeauthor{#1}'s (\citeyear{#1}:~$\S$#2)}
\newcommand{\pgcitealt}[2]{\citealt{#1}:~#2}
\newcommand{\seccitealt}[2]{\citealt{#1}:~$\S$#2}
\newcommand{\pgcitep}[2]{(\citealt{#1}:~#2)}
\newcommand{\seccitep}[2]{(\citealt{#1}:~$\S$#2)}
\newcommand{\pgcitet}[2]{\citeauthor{#1} (\citeyear{#1}:~#2)}
\newcommand{\seccitet}[2]{\citeauthor{#1} (\citeyear{#1}:~$\S$#2)}
The natbib package
The standard citation and bibliography format used by BibTeX
is the one often used in other sciences, where citations are
simply indicated by a number [1], and listed by number in the
bibliography.
This is not generally what we want in Linguistics papers.
To get the more standard Linguistics citation styles, add the
following to the preamble:
\usepackage{natbib}
This will give you citations in the author-year format instead.
Producing the bibliography
At the end, generating the bibliography is easy. Just:
\bibliography{general-linguistics-utf8}
In the example above, the name of the bibliography database
is “general-linguistics-utf8” (that is, the file general-linguistics-utf8.bib),
which is where BibTeX will search for the references.
To get the references right, LaTeX needs to be run several times. Often, the
editing program you use will do this automatically. The sequence is: run LaTeX on the
file once to collect a list of everything you cite, run BibTeX on that list to
fetch the references, and then run LaTeX again (usually twice) to insert the references
where they belong (in the inline citations and in the bibliography at the end).
The style the bibliography appears in is governed by the
\bibliographystyle macro. I usually use the one from
Semantics & Pragmatics (which is in a file called sp.bst),
although there is also a popular one based on the Linguistic
Inquiry style.
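To see the whole cycle in one place, here is a self-contained sketch. The file name demo.bib and the choice of plainnat (a style that ships with natbib) are assumptions made for illustration; substitute your own database and sp.bst if it is installed.

```latex
% Written out as demo.bib when the file is compiled (file name is an assumption).
\begin{filecontents}{demo.bib}
@book{chomsky1995,
  author    = {Chomsky, Noam},
  title     = {The Minimalist Program},
  publisher = {MIT Press},
  address   = {Cambridge, MA},
  year      = {1995}
}
\end{filecontents}
\documentclass{article}
\usepackage{natbib}
\bibliographystyle{plainnat} % author-year style distributed with natbib
\begin{document}
\citet{chomsky1995} proposes \ldots
\bibliography{demo}          % looks for demo.bib
\end{document}
% Typical run sequence for a file called paper.tex:
%   latex paper, bibtex paper, latex paper, latex paper
```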
Numbered examples
Numbering linguistic examples in the way with which we are
familiar is not a built-in behavior of LaTeX, so we need a
package to help out with this.
There are a few package options. Probably the simplest one is
linguex, although the one I generally use is gb4e.
To use linguex, you include the following in your preamble:
\usepackage{linguex}
Then, you enter examples like this:
\ex. \a. This is an example. \label{example}
\b. *This example an is.
This will result in something like:
(1) a. This is an example.
    b. *This example an is.
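For comparison, the same example in gb4e might be written roughly as follows. This is a sketch of gb4e's environment-based syntax, in which judgment marks go in the optional argument of \ex.

```latex
\documentclass{article}
\usepackage{gb4e}
\begin{document}
\begin{exe}
  \ex \label{example}
  \begin{xlist}
    \ex This is an example.
    \ex[*]{This example an is.}
  \end{xlist}
\end{exe}
As (\ref{example}) shows, \ldots
\end{document}
```

gb4e is more verbose than linguex, but the explicit environments make nested examples easier to keep track of.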
Glossing examples
Lining up glosses is one of the main things that made me turn
to LaTeX: it is a pain in the neck to do this properly in Word,
but it is easy with linguex (or cgloss).
To line up glosses, you use the \exg macro instead of \ex,
and provide three lines. The first line is the target language,
the second line is the word-by-word gloss, and the third line is
the paraphrase. The first and second lines will be lined up
based on where the spaces are.
\exg. *Wen liebt seine Mutter?\\
Whom loves his mother\\
`Who does his mother love?'
(2) *Wen liebt seine Mutter?
    Whom loves his mother
    ‘Who does his mother love?’
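The gb4e equivalent uses \gll for the two aligned lines and \glt for the free translation. A minimal sketch, assuming only that gb4e is installed:

```latex
\documentclass{article}
\usepackage{gb4e}
\begin{document}
\begin{exe}
  \ex[*]{\gll Wen liebt seine Mutter?\\
         whom loves his mother\\
         \glt `Who does his mother love?'}
\end{exe}
\end{document}
```

As with linguex, words are paired up by counting spaces, so each line needs the same number of space-separated items.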
Drawing trees
Drawing syntactic trees can also be a pain in the neck in Word.
There are a few different packages available for drawing trees
in LaTeX.
jTree: This is the one I use. The syntax can be kind of
complicated, but it is concise.
qtree: This one can do simple trees pretty easily.
Source for a tree with qtree
A LaTeX source file for a tree in qtree:
\documentclass{article}
\usepackage{qtree}
\begin{document}
\Tree
[.IP [ Roses ].NP_i
[.I\1 [ are ].I\0
[.VP t_i
[ [ going ].V\0
\qroof{out of style}.PP
].V\1 ].VP ].I\1 ]
\end{document}
LaTeX output
This is the result:
[Tree diagram: [IP [NP_i Roses] [I′ [I0 are] [VP t_i [V′ [V0 going] [PP out of style]]]]], with a triangle over the PP]
A simpler tree with qtree
\documentclass{article}
\usepackage{qtree}
\begin{document}
\Tree
[.S This [.VP [.V is ]
\qroof{a simple tree}.NP ] ]
\end{document}
[Tree diagram: [S This [VP [V is] [NP a simple tree]]], with a triangle over the NP]
Source for a tree with jTree
A LaTeX source file for a tree in jTree:
\documentclass{article}
\usepackage{pst-jtree-beta}
\begin{document}
\jtree[xunit=2.45em,yunit=1.4em,dirA=(1:-1),nodesep=0]
\def\\{[labelgapb=-4pt]}%
\def\V{$\rm \overline V$}%
\! = {S}
<wideleft>{S}!a ^<vert>{and} ^<wideright>{S}
:({NP}<shortvert>{Fred}) {\V}
:({V}\\{knows}) {NP}
<tri>{a man} ^<right>
<right>[scaleby=3.5 1,branch=\blank]{NP}@A3 !b ^{S}
:({NP}<shortvert>{who}) {\V}@A2
<left>({V}\\{repairs}).
\!a = :({NP}<shortvert>{John}) {\V}@A1
<left>{V}\\{sells}.
\!b = <vartri>{washing machines}.
\nccurve[angleB=150,ncurvB=1.4]{A2:b}{A3:t}
\nccurve[angleB=135,ncurvA=.5,ncurvB=2.6]{A1:b}{A3:t}
\endjtree
\end{document}
LaTeX output
This is the result:
[Tree diagram: two clauses coordinated by “and” (“John sells” and “Fred knows a man who repairs”), with curves linking the V-bar of “sells” and the V-bar of “repairs” to the shared NP “washing machines”]
A simpler tree with jTree
\documentclass{article}
\usepackage{pst-jtree-beta}
\begin{document}
\jtree \! = {S} :{This} {VP}
:{V}(<vert>{is}) {NP}(<vartri>{a simple tree}) .
\endjtree
\end{document}
[Tree diagram: [S This [VP [V is] [NP a simple tree]]], with a triangle over the NP]
A tree with movement arrows in jTree
\documentclass{article}
\usepackage{pst-jtree-beta} \begin{document}
\jtree \! = {S} :{John}@S {VP}
:{V}(<vert>{fell}) {NP}(<vert>{t}@T ) .
\nccurve[dirA=(-1:-1),dirB=(-1:-5),ncurv=1.2]
{->}{T}{S}
\endjtree \end{document}
[Tree diagram: [S John [VP [V fell] [NP t]]], with a curved arrow from t to John]
Technical note about pstricks
Adobe created the “PostScript” language to communicate with
printers. PostScript includes commands defining points, lines,
curves, colors.
Many graphics packages for LaTeX (for, e.g., movement arrows)
generate PostScript instructions using the pstricks package.
PDF has since become standard, but PDF does not recognize
PostScript instructions.
If you compile LaTeX directly to PDF (pdflatex), no
pstricks-generated graphics will appear.
Most editors have a way to first compile a PostScript file and
then “distill” it to a PDF file (which preserves the graphics).
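If your editor does not do this for you, the route through PostScript can be run by hand. The sketch below draws a single arrow with pstricks; the file name arrow.tex is an assumption, and the command sequence appears in the leading comments.

```latex
% Compile via PostScript, for a file called arrow.tex:
%   latex arrow       produces arrow.dvi
%   dvips arrow       produces arrow.ps
%   ps2pdf arrow.ps   produces arrow.pdf
\documentclass{article}
\usepackage{pstricks}
\begin{document}
\begin{pspicture}(0,0)(3,1)
  \psline{->}(0,0)(3,1) % a straight arrow from (0,0) to (3,1)
\end{pspicture}
\end{document}
```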
Presentations
You can also use LaTeX as a replacement for PowerPoint, to do
presentations.
This presentation is created using beamer, which is probably
the most popular presentation package.
The result is a PDF file that can be presented with most PDF
viewers (Acrobat, Preview, Skim).
There is a presenter called KeyJNote that I hope to explore
more, which should run on various platforms, and allows
interesting slide transitions.
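A beamer file is an ordinary LaTeX document in which each slide is a frame environment. The following is a minimal sketch rather than the actual source of this presentation:

```latex
\documentclass{beamer}
\title{Doing Linguistics in \LaTeX}
\author{Paul Hagstrom}
\date{May 28, 2008}
\begin{document}
\begin{frame}
  \titlepage
\end{frame}
\begin{frame}{What is \LaTeX?}
  \begin{itemize}
    \item \LaTeX{} is a typesetting system.
    \item It is not a word processor.
  \end{itemize}
\end{frame}
\end{document}
```

Frames that contain verbatim material need the fragile option, as in \begin{frame}[fragile].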
LaTeX editors
LaTeX files are just text, so you can edit them with any text
editing program.
There are several “front-ends” available, however, that can help
make the process a lot easier. Ideally, your editor will have:
syntax coloring
bracket balancing
autocompletion
macro definitions (keyboard shortcuts)
preview capability
pdfsync support
On the Mac, I use TextMate (which is nice, but not free).
Most people use TeXShop (which is free, and also nice).
AUCTeX is a plugin for Emacs (including Aquamacs).
Texmaker is a cross-platform editor that works
anywhere and is quite complete.
Basic LaTeX editing
In general, “whitespace” does not matter. Putting 1, 3, or 47
spaces between words does not change how far apart they are
rendered.
Paragraphs are separated by blank lines.
You can put in comments (and if you do something
complicated, you should—otherwise you will forget why you
did it that way) by using the % character. Everything
including and following the % will be ignored by the compiler.
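Some characters are special, like \, %, &. If you want to type
those, you generally need to “escape” them by putting a \ in
front of them (e.g., \% yields %, and % by itself starts a
comment). The backslash is the exception: \\ means a line break, so a
literal backslash is typed \textbackslash.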
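These rules can be tried out in a small test file. The sentences below are placeholders chosen only to exercise whitespace, comments, paragraph breaks, and escapes:

```latex
\documentclass{article}
\begin{document}
These     words   are    spaced
normally. % this comment is ignored by the compiler

A blank line started this new paragraph. Prices rose 5\% at Smith \& Sons,
and a literal backslash is typed \textbackslash.
\end{document}
```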
Basic LaTeX editing
Em-dashes are constructed by typing three hyphens (--- yields
—). En-dashes are constructed by typing two hyphens (-- yields –).
Left and right quote marks are typed differently: left is
backtick (`) and right is apostrophe ('). Quotation marks are just
two of them together (``indeed'' yields “indeed”). You do
not want to use " on both sides, or it will look ”goofy.”
Macros like \LaTeX will sometimes “eat” the following space. It
is often necessary to put macros inside their own group, like
{\LaTeX}. This is particularly important when glossing (since
spaces are important for lining things up).
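A short test file makes the differences easy to see side by side. This is only a sketch; each source line is followed by a comment saying what it demonstrates.

```latex
\documentclass{article}
\begin{document}
pages 10--20 \par        % two hyphens: en-dash
yes---or no \par         % three hyphens: em-dash
``indeed'' \par          % two backticks, two apostrophes: double quotes
`single' \par            % one of each: single quotes
\LaTeX is fun \par       % the space after \LaTeX is swallowed
{\LaTeX} is fun \par     % grouping keeps the space
\LaTeX{} is fun          % an empty group works too
\end{document}
```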
Semantic formulae
LaTeX (and TeX) is particularly well-suited to typing
mathematical and logical formulae.
There is a special “math mode” that you put LaTeX into
when typesetting these.
To enter math mode you can either put $ at the beginning
and end of the math section, or put the whole thing inside the
argument of \ensuremath.
In math mode, ^{x} produces a superscript x, _{x} produces
a subscript x, and letters are treated as variables (italicized).
Spacing ceases to matter; everything is closed up.
Symbols like → (\rightarrow) and ∃ (\exists) and ∀
(\forall) and α (\alpha) can be used.
If you need text within math mode, you mark it off with the
\mbox macro like this: ∀x.x sings.
($\forall{x}.x\mbox{ sings.}$).
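Putting these pieces together, a slightly larger formula might be typed as follows. The predicates are invented for illustration; only the commands are standard.

```latex
\documentclass{article}
\begin{document}
% Every linguist admires some theory.
$\forall x [\mbox{linguist}(x) \rightarrow
   \exists y [\mbox{theory}(y) \wedge \mbox{admire}(x, y)]]$

% Subscripts, superscripts, and Greek letters
$x_{i}$, $t^{n}$, $\alpha \rightarrow \beta$

% \ensuremath works whether or not math mode is already active
\ensuremath{\lambda x . \mbox{sings}(x)}
\end{document}
```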
Semantic formulae
A particularly useful command that I always define is one that
sets the semantic evaluation function. This requires that
you also have \usepackage{stmaryrd} in your preamble,
since it defines the \llbracket and \rrbracket
characters.
\newcommand{\evalfun}[2][]{\ensuremath{
\left\llbracket \mbox{#2} \right\rrbracket^{#1}}}
With this defined, you can just type
\evalfun{everyone sings} and get ⟦everyone sings⟧, or
type \evalfun[w,g]{everyone sings} and get
⟦everyone sings⟧^{w,g} (with w,g set as a superscript).
Tables in LaTeX
Making tables in LaTeX is pretty straightforward.
You first open a tabular environment with
\begin{tabular} and close it with \end{tabular}.
Inside the tabular environment, you use & to separate
columns, and \\ at the end of each row.
The \begin{tabular} macro requires that you also specify
the structure of your table, and lets you say how it aligns with the
surrounding text. The first (optional) parameter sets the vertical
alignment (e.g., [t] for “top”), and the second is a list of columns and
formatting instructions (e.g., {rcl}, which means that the
first column is right-justified, the second is centered, and the
third is left-justified).
Tables in LaTeX
right | A   B  | label
------+--------+------------------
    A | *   ok | interesting!
    B | ok  *  | less interesting!
\begin{tabular}[t]{r|cc|l}
right & A & B & label \\ \hline
A & * & ok & interesting! \\
B & ok & * & less interesting!
\end{tabular}
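In a paper, a tabular is usually wrapped in the separate table environment, which floats the table and gives it a numbered caption. A sketch, in which the caption text and the label name are assumptions:

```latex
\documentclass{article}
\begin{document}
\begin{table}
  \centering
  \begin{tabular}{r|cc|l}
    right & A  & B  & label \\ \hline
    A     & *  & ok & interesting! \\
    B     & ok & *  & less interesting!
  \end{tabular}
  \caption{Judgments by condition} % caption text is an assumption
  \label{tab:judgments}            % hypothetical label name
\end{table}
Table~\ref{tab:judgments} summarizes the pattern.
\end{document}
```

The \label goes after the \caption so that \ref picks up the table number.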
Using Times
Out of the box, LaTeX will provide you with documents set in
Computer Modern. So, when you look at the document you
produced, you will find that it just screams “I was made with
LaTeX!” Personally, I’m not a big fan of that. Computer Modern
looks ok, but I worked with Word for so long
that I’ve come to like the look of Times.
\usepackage{mathptmx} % Times
Using smileys and frownies
I’ve occasionally wanted to use ☺ and ☹. To get access to
those:
\usepackage{wasysym}
% for \smiley and \frownie
Then, you can place a smiley with {\smiley} and a frownie
with {\frownie}. The wasysym package provides other
symbols as well, but these are the ones I use so far.
Convenience macros
When glossing, I like to set syntactic morphemes in small caps,
and I set up a macro to allow me to type {\ACC} rather than
\textsc{acc} or \caps{acc} (the latter version depends on
the soul package).
To do this, you just define a new command. The way to do
that is with the \newcommand command.
\newcommand{\ACC}{\caps{acc}}
Any time after encountering the line above, LaTeX will turn any
\ACC it sees into \caps{acc}. Capitalization matters, so \ACC
is different from \acc or \Acc. I have defined a whole slew of
these (\NOM, \TOP, \PAST, . . . ).
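Here is a sketch of how such macros might be used in a glossed example with linguex. It uses \textsc rather than \caps so that it does not depend on soul, and the Japanese sentence is supplied purely for illustration.

```latex
\documentclass{article}
\usepackage{linguex}
\newcommand{\NOM}{\textsc{nom}}
\newcommand{\ACC}{\textsc{acc}}
\newcommand{\PAST}{\textsc{past}}
\begin{document}
\exg. Taroo-ga hon-o yon-da.\\
Taro-{\NOM} book-{\ACC} read-{\PAST}\\
`Taro read a book.'

\end{document}
```

The braces around each macro keep it from swallowing the space that separates one gloss word from the next, and the blank line ends the linguex example.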
UTF8
For special characters like é, á, and more exotic ones, it is
very useful to have your files in a Unicode format. UTF8 is the
most straightforward form of this.
LaTeX itself is mostly Unicode compatible.
Be sure your editor can save in UTF8 format.
Include the following in your preamble.
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}
I have not had much luck with XeTeX, particularly in
combination with pstricks.
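The utf8x option comes from the separate ucs package. If that package is not installed, the standard utf8 option handles accented Latin characters; a minimal sketch, assuming the file itself is saved as UTF-8:

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % standard option; UTF-8 is the default in LaTeX releases since 2018
\usepackage[T1]{fontenc}
\begin{document}
café, á, naïve, Gödel
\end{document}
```

Characters outside what the loaded fonts provide may still need utf8x or a dedicated package.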