Formatting information

A beginner's introduction to typesetting with LATEX

Chapter 3 — Basic document structures

Peter Flynn

Silmaril Consultants
Textual Therapy Division


v. 3.6 (March 2005)

Contents

Introduction
Foreword
Preface
  1. Installing TEX and LATEX
  2. Using your editor to create documents
  3. Basic document structures
  4. Typesetting, viewing and printing
  5. CTAN, packages, and online help
  6. Other document structures
  7. Textual tools
  8. Fonts and layouts
  9. Programmability (macros)
  10. Compatibility with other systems
  A. Configuring TEX search paths
  B. TEX Users Group membership
  C. The ASCII character set
  D. GNU Free Documentation License
References
Index

This edition of Formatting Information was prompted by the generous help I have received from TEX users too numerous to mention individually. Shortly after TUGboat published the November 2003 edition, I was reminded by a spate of email of the fragility of documentation for a system like LATEX which is constantly under development. There have been revisions to packages; issues of new distributions, new tools, and new interfaces; new books and other new documents; corrections to my own errors; suggestions for rewording; and in one or two cases mild abuse for having omitted package X which the author felt to be indispensable to users.

I am grateful as always to the people who sent me corrections and suggestions for improvement. Please keep them coming: only this way can this book reflect what people want to learn. The same limitation still applies, however: no mathematics, as there are already a dozen or more excellent books on the market, as well as other online documents, dealing with mathematical typesetting in TEX and LATEX in finer and better detail than I am capable of.

The structure remains the same, but I have revised and rephrased a lot of material, especially in the earlier chapters where a new user cannot be expected yet to have acquired any depth of knowledge. Many of the screenshots have been updated, and most of the examples and code fragments have been retested.

As I was finishing this edition, I was asked to review an article for The PracTEX Journal, which grew out of the Practical TEX Conference in 2004. The author specifically took the writers of documentation to task for failing to explain things more clearly, and as I read more, I found myself agreeing, and resolving to clear up some specific problem areas as far as possible. It is very difficult for people who write technical documentation to remember how they struggled to learn what has now become a familiar system. So much of what we do is second nature, and a lot of it actually has nothing to do with the software, but more with the way in which we view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge, please let me know so that I can correct it.

Peter Flynn is author of The HTML Handbook and Understanding SGML and XML Tools, and editor of The XML FAQ.

CHAPTER 3

Basic document structures

  1. The Document Class Declaration
  2. The document environment
  3. Titling
  4. Abstracts and summaries
  5. Sections
  6. Ordinary paragraphs
  7. Table of contents

LATEX's approach to formatting is to aim for consistency. This means that as long as you identify each element of your document correctly, it will be typeset in the same way as all the other elements like it, so that you achieve a consistent finish with minimum effort. Consistency helps make documents easier to read and understand.

Elements are the component parts of a document, all the pieces which make up the whole. Almost everyone who reads books, newspapers, magazines, reports, articles, and other classes of documents will be familiar with the popular structure of chapters, sections, subsections, subsubsections, paragraphs, lists, tables, figures, and so on, even if they don't consciously think about it.

Consistency is also what publishers look for. They have a house style, and often a reputation to keep, so they rightly insist that if you do something a certain way once, you should do it the same way each time.

To help achieve this consistency, every LATEX document starts by declaring what document class it belongs to.

3.1 The Document Class Declaration

To tell LATEX what class of document you are going to create, you type a special first line into your file which identifies it.[1] To start a report, for example, you would type the \documentclass command like this as your first line:

\documentclass{report}
      

There are four built-in classes provided, and many others that you can download (some may already be installed for you):

report
  for business, technical, legal, academic, or scientific reports;

article
  for white papers, magazine or journal articles, reviews, conference papers, or research notes;

book
  for books and theses;

letter
  for letters.[2]

The article class in particular can be used (some would say ‘abused’) for almost any short piece of typesetting by simply omitting the titling and layout (see below).

The built-in classes are intended as starting-points, especially for drafts and for compatibility when exchanging documents with other LATEX users, as they come with every copy of LATEX and are therefore guaranteed to format identically everywhere. They are not intended as final-format publication-quality layouts. For most other purposes, especially for publication, you use add-in packages to extend these classes to do what you need:

Books and journals are not usually printed on office-size paper. Although LATEX's layouts are designed to fit on standard A4 or Letter stationery for draft purposes, it makes them look odd: the margins are too wide, or the positioning is unusual, or the font size is too small, because the finished job will normally be trimmed to a different size entirely — try trimming the margins of the PDF version of this book to 185mm by 235mm (the same as The LATEX Companion series) and you'll be amazed at how it changes the appearance!

  1. Readers familiar with SGML, HTML, or XML will recognize the concept as similar to the Document Type Declaration.
  2. The built-in letter class is rather idiosyncratic: there are much better ones you can download, such as the memoir and komascript packages.

3.1.1 Document class options

The default layouts are designed to fit as drafts on US Letter size paper.[3] To create documents with the correct proportions for standard A4 paper, you need to specify the paper size in an optional argument in square brackets before the document class name, e.g.

\documentclass[a4paper]{report}
        

The two most common options are a4paper and letterpaper. However, many European distributions of TEX now come preset for A4, not Letter, and this is also true of all distributions of pdfLATEX.

The other default settings are for: a) 10pt type (all document classes); b) two-sided printing (books) or one-sided printing (reports, articles, and letters); and c) a separate title page (books and reports only). These can be modified with the following document class options which you can add in the same set of square brackets, separated by commas:

11pt
  to specify 11pt type (headings, footnotes, etc. get scaled up or down in proportion);

12pt
  to specify 12pt type (again, headings scale);

oneside
  to format one-sided printing for books and reports;

twoside
  to format articles for two-sided printing;

titlepage
  to force articles to have a separate title page;

draft
  makes LATEX indicate hyphenation and justification problems with a small square in the right-hand margin of the problem line so they can be located quickly by a human.

If you were using pdfLATEX for a report to be in 12pt type on Letter paper, but printed one-sided in draft mode, you would use:

\documentclass[12pt,letterpaper,oneside,draft]{report}
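
As a further sketch of how the options combine, the line below assumes you want an article (rather than a report) in 11pt type on A4 paper, laid out for two-sided printing with a separate title page. The particular combination is only an illustration; every option in it comes from the list above.

```latex
% Illustrative combination only: 11pt type, A4 paper,
% two-sided layout, and a separate title page for an article
\documentclass[11pt,a4paper,twoside,titlepage]{article}
```

The order of the options inside the square brackets does not matter.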
	

There are extra preset options for other type sizes which can be downloaded separately, but 10pt, 11pt, and 12pt between them cover probably 99% of all document typesetting. In addition there are the hundreds of add-in packages which can automate other layout and formatting variants without you having to program anything by hand or even change your text.


  Exercise 1. Create a new document

  1. Use your editor to create a new document.

  2. Type in a Document Class Declaration as shown above.

  3. Add a font size option if you wish.

  4. In North America, omit the a4paper option or change it to letterpaper.

  5. Save the file (make up a name) ensuring the name ends with .tex


  3. Letter size is 8½″×11″, which is the trimmed size of the old Demi Quarto, still in use in North America. The other common US office size is ‘Legal’, which is 8½″×14″, a bastard cutting close to the old Foolscap (8¼″×13¼″). ISO standard ‘A’, ‘B’, and ‘C’ paper sizes are still virtually unknown in many parts of North America.

3.2 The document environment

After the Document Class Declaration, the text of your document is enclosed between two commands which identify the beginning and end of the actual document:

\documentclass[11pt,a4paper,oneside]{report}

\begin{document}
...
\end{document}
      

(You would put your text where the dots are.) The reason for marking off the beginning of your text is that LATEX allows you to insert extra setup specifications before it (where the blank line is in the example above: we'll be using this soon). The reason for marking off the end of your text is to provide a place for LATEX to be programmed to do extra stuff automatically at the end of the document, like making an index.

A useful side-effect of marking the end of the document text is that you can store comments or temporary text underneath the \end{document} in the knowledge that LATEX will never try to typeset them.

...
\end{document}
Don't forget to get the extra chapter from Jim!
      

This \begin ...\end pair of commands is an example of a common LATEX structure called an environment. Environments enclose text which is to be handled in a particular way. All environments start with \begin{...} and end with \end{...} (putting the name of the environment in the curly braces).
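
As a sketch of the same pattern with a different environment, the file below uses quotation, which the standard document classes define for displayed quotations. The wording inside it is made up purely for illustration.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

% quotation is an environment nested inside the document environment
\begin{quotation}
The text in here is set off from the surrounding
paragraphs as a displayed quotation.
\end{quotation}

\end{document}
```

Environments are closed in the reverse order of opening, so \end{quotation} has to come before \end{document}.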


  Exercise 2. Adding the document environment

  1. Add the document environment to your file.

  2. Leave a blank line between the Document Class Declaration and the \begin{document} (you'll see why later).

  3. Save the file.


3.3 Titling

The first thing you put in the document environment is almost always the document title, the author's name, and the date (except in letters, which have a special set of commands for addressing which we'll look at later). The title, author, and date are all examples of metadata or metainformation (information about information).

\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2004}
\maketitle

\end{document}
      

The \title, \author, and \date commands are self-explanatory. You put the title, author name, and date in curly braces after the relevant command. The title and author are usually compulsory; if you omit the \date command, LATEX uses today's date by default.

You always finish the metadata with the \maketitle command, which tells LATEX that it's complete and it can typeset the titling information at this point. If you omit \maketitle, the titling will never be typeset. This command is reprogrammable so you can alter the appearance of titles (like I did for the printed version of this document).

The double backslash (\\) is the LATEX command for forced linebreak. LATEX normally decides by itself where to break lines, and it's usually right, but sometimes you need to cut a line short, like here, and start a new one. I could have left it out and just used a comma, so my name and my company would all appear on the one line, but I just decided that I wanted my company name on a separate line. In some publishers' document classes, they provide a special \affiliation command to put your company or institution name in instead.

When this file is typeset, you get something like this (I've cheated and done it in colour for fun — yours will be in black and white for the moment):


  Exercise 3. Adding the metadata

  1. Add the \title, \author, \date, and \maketitle commands to your file.

  2. Use your own name, make up a title, and give a date.

The order of the first three commands is not important, but the \maketitle command must come last.


The document isn't really ready for printing like this, but if you're really impatient, look at Chapter 4 to see how to typeset and display it.
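
Going back to the \author line discussed above, here is a minimal sketch of the alternative mentioned there, with the name and company on a single line separated by a comma. The \date command is also left out, purely to illustrate the default of today's date.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\title{Practical Typesetting}
% No \\ here, so name and company are set on one line
\author{Peter Flynn, Silmaril Consultants}
% No \date command: LATEX supplies today's date
\maketitle

\end{document}
```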

3.4 Abstracts and summaries

In reports and articles it is normal for the author to provide a Summary or Abstract, in which you describe briefly what you have written about and explain its importance. Abstracts in articles are usually only a few paragraphs long. Summaries in reports can run to several pages, depending on the length and complexity of the report and the readership it's aimed at.

In both cases (reports and articles) the Abstract or Summary is optional (that is, LATEX doesn't force you to have one), but it's rare to omit it because readers want and expect it. In practice, of course, you go back and type the Abstract or Summary after having written the rest of the document, but for the sake of the example we'll jump the gun and type it now.

\documentclass[11pt,a4paper,oneside]{report}
\usepackage[latin1]{inputenc}
\renewcommand{\abstractname}{Summary}
\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2004}
\maketitle

\begin{abstract}
This document presents the basic concepts of 
typesetting in a form usable by non-specialists. It 
is aimed at those who find themselves (willingly or 
unwillingly) asked to undertake work previously sent 
out to a professional printer, and who are concerned 
that the quality of work (and thus their corporate 
æsthetic) does not suffer unduly.
\end{abstract}

\end{document}
      

After the \maketitle you use the abstract environment, in which you simply type your Abstract or Summary, leaving a blank line between paragraphs if there's more than one (see section 3.6 for this convention).

In business and technical documents, the Abstract is often called a Management Summary, or Executive Summary, or Business Preview, or some similar phrase. LATEX lets you change the name associated with the abstract environment to any kind of title you want, using the \renewcommand command to give the command \abstractname a new value:

\renewcommand{\abstractname}{Executive Summary}
      

  Exercise 4. Using an Abstract or Summary

  1. Add the \renewcommand as shown above to your Preamble.

    The Preamble is at the start of the document, in that gap after the \documentclass line but before the \begin{document} (remember I said we'd see what we left it blank for: see the panel ‘The Preamble’ in section 3.4).

  2. Add an abstract environment after the \maketitle and type in a paragraph or two of text.

  3. Save the file (no, I'm not paranoid, just careful).


Notice how the name of the command you are renewing (here, \abstractname) goes in the first set of curly braces, and the new value you want it to have goes in the second set of curly braces (this is an example of a command with two arguments). The environment you use is still called abstract (that is, you still type \begin{abstract}...\end{abstract}). Redefining \abstractname changes only the name that gets displayed and printed, not the name of the environment you store the text in.

If you look carefully at the example document, you'll see I sneakily added an extra command to the Preamble. We'll see later what this means (Brownie points for guessing it, though, if you read section 2.7).

3.5 Sections

In the body of your document, LATEX provides seven levels of division or sectioning for you to use in structuring your text. They are all optional: it is perfectly possible to write a document consisting solely of paragraphs of unstructured text. But even novels are normally divided into chapters, although short stories are often made up solely of paragraphs.

Chapters are only available in the book and report document classes, because they don't have any meaning in articles and letters. Parts are also undefined in letters.[4]

Depth  Division             Command         Notes
-----  -------------------  --------------  -----------------
 −1    Part                 \part           Not in letters
  0    Chapter              \chapter        Books and reports
  1    Section              \section        Not in letters
  2    Subsection           \subsection     Not in letters
  3    Subsubsection        \subsubsection  Not in letters
  4    Titled paragraph     \paragraph      Not in letters
  5    Titled subparagraph  \subparagraph   Not in letters

In each case the title of the part, chapter, section, etc. goes in curly braces after the command. LATEX automatically calculates the correct numbering and prints the title in bold. You can turn section numbering off at a specific depth: details in section 3.5.1.

\section{New recruitment policies}
...
\subsection{Effect on staff turnover}
...
\chapter{Business plan 2005--2007}
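
To show how the divisions nest, here is a sketch of a report skeleton using commands from the table above in order of depth. The titles are borrowed from the fragments in this section, except the \paragraph title, which is made up for illustration.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\chapter{Business plan 2005--2007}       % depth 0

\section{New recruitment policies}       % depth 1
Text of the section.

\subsection{Effect on staff turnover}    % depth 2
Text of the subsection.

\paragraph{Exceptions}                   % depth 4, title made up
Text following the paragraph title.

\end{document}
```

You do not have to use every level: the sketch goes straight from a subsection to a titled paragraph.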
      

There are packages[5] to let you control the typeface, style, spacing, and appearance of section headings: it's much easier to use them than to try and reprogram the headings manually. Two of the most popular are the section and sectsty packages.

Headings also get put automatically into the Table of Contents, if you specify one (it's optional). But if you make manual styling changes to your heading, for example a very long title, or some special line-breaks or unusual font-play, this would appear in the Table of Contents as well, which you almost certainly don't want. LATEX allows you to give an optional extra version of the heading text which only gets used in the Table of Contents and any running heads, if they are in effect. This optional alternative heading goes in [square brackets] before the curly braces:

\section[Effect on staff turnover]{An analysis of the 
effect of the revised recruitment policies on staff 
turnover at divisional headquarters}
      

  Exercise 5. Start your document text

  1. Add a \chapter command after your Abstract or Summary, giving the title of your first chapter.

  2. If you're planning ahead, add a few more \chapter commands for subsequent chapters. Leave a few blank lines between them to make it easier to add paragraphs of text later.

  3. By now I shouldn't need to tell you what to do after making significant changes to your document file.


  4. It is arguable that chapters have no place in reports either, as these are conventionally divided into sections as the top-level division. LATEX, however, assumes your reports have chapters, but this is only the default, and can be changed very simply (see section 9.6).
  5. Details of how to use LATEX packages are in section 5.1.

3.5.1 Section numbering

All document divisions get numbered automatically. Parts get Roman numerals (Part I, Part II, etc.); chapters and sections get decimal numbering like this document, and Appendixes (which are just a special case of chapters, and share the same structure) are lettered (A, B, C, etc.).

You can change the depth to which section numbering occurs, so you can turn it off selectively. In this document it is set to 3. If you only want parts, chapters, and sections numbered, not subsections or subsubsections etc., you can change the value of the secnumdepth counter using the \setcounter command, giving the depth value from the table in section 3.5:

\setcounter{secnumdepth}{1}
        

A related counter is tocdepth, which specifies what depth to take the Table of Contents to. It can be reset in exactly the same way as secnumdepth. The current setting for this document is 2; the example below sets it to 3, which takes the Table of Contents down to subsubsections.

\setcounter{tocdepth}{3}
	

To get an unnumbered section heading which does not go into the Table of Contents, follow the command name with an asterisk before the opening curly brace:

\subsection*{Shopping List}
        

All the divisional commands from \part* to \subparagraph* have this ‘starred’ version which can be used on special occasions for an unnumbered heading when the setting of secnumdepth would normally mean it would be numbered.
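
Putting the pieces of this section together, here is a sketch of how the two counters and a starred heading might sit in one file. The counter values are chosen only for illustration, the chapter and section titles are made up, and the \tableofcontents command is the one described in section 3.7.

```latex
\documentclass[11pt,a4paper,oneside]{report}

% Number parts, chapters, and sections only
\setcounter{secnumdepth}{1}
% List headings in the Table of Contents down to subsections
\setcounter{tocdepth}{2}

\begin{document}

\tableofcontents

\chapter{Household management}   % numbered: depth 0
\section{Provisioning}           % numbered: depth 1
\subsection{Weekly routine}      % unnumbered: deeper than secnumdepth,
                                 % but still listed in the ToC
\subsection*{Shopping List}      % unnumbered and not in the ToC

\end{document}
```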

3.6 Ordinary paragraphs

After section headings comes your text. Just type it and leave a blank line between paragraphs. That's all LATEX needs.

The blank line means ‘start a new paragraph here’: it does not (repeat: not) mean you get a blank line in the typeset output. Now read this paragraph again and again until that sinks in.

The spacing between paragraphs is a separately definable quantity, a dimension or length called \parskip. This is normally zero (no space between paragraphs, because that's how books are normally typeset), but you can easily set it to any size you want with the \setlength command in the Preamble:

\setlength{\parskip}{1cm}
      

This will set the space between paragraphs to 1cm. See section 2.8.1 for details of the various size units LATEX can use. Leaving multiple blank lines between paragraphs in your source document achieves nothing: all extra blank lines get ignored by LATEX because the space between paragraphs is controlled only by the value of \parskip.

White-space in LATEX can also be made flexible (what Lamport calls ‘rubber’ lengths). This means that values such as \parskip can have a default dimension plus an amount of expansion minus an amount of contraction. This is useful on pages in complex documents where not every page may be an exact number of fixed-height lines long, so some give-and-take in vertical space is useful. You specify this in a \setlength command like this:

\setlength{\parskip}{1cm plus4mm minus3mm}
      

Paragraph indentation can also be set with the \setlength command, although you would always make it a fixed size, never a flexible one, otherwise you would have very ragged-looking paragraphs.

\setlength{\parindent}{6mm}
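
As a sketch of how these two settings might sit together in a Preamble, with the arithmetic for the flexible length spelled out in the comments. The values are simply the ones used above, not recommendations.

```latex
\documentclass[11pt,a4paper,oneside]{report}

% Natural space of 1cm between paragraphs. LATEX may shrink it
% by at most 3mm (down to 7mm) and aims to stretch it by no more
% than 4mm (up to 14mm) when fitting text to the page
\setlength{\parskip}{1cm plus4mm minus3mm}
% Indentation is a fixed length, with no plus or minus part
\setlength{\parindent}{6mm}

\begin{document}
First paragraph of text.

Second paragraph of text.
\end{document}
```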
      

By default, the first paragraph after a heading follows the standard Anglo-American publishers' practice of no indentation. Subsequent paragraphs are indented by the value of \parindent (default 15pt at the default 10pt type size).[6] You can change this in the same way as any other length.

In the printed copy of this document, the paragraph indentation is set to 12pt and the space between paragraphs is set to 0pt. These values do not apply in the Web (HTML) version because not all browsers are capable of that fine a level of control, and because users can apply their own stylesheets regardless of what this document proposes.


  Exercise 6. Start typing!

  1. Type some paragraphs of text. Leave a blank line between each. Don't bother about line-wrapping or formatting — LATEX will take care of all that.

  2. If you're feeling adventurous, add a \section command with the title of a section within your first chapter, and continue typing paragraphs of text below that.

  3. Add one or more \setlength commands to your Preamble if you want to experiment with changing paragraph spacing and indentation.


To turn off indentation completely, set it to zero (but you still have to provide units: it's still a measure!).

\setlength{\parindent}{0in}
      

If you do this, though, and leave \parskip set to zero, your readers won't be able to tell easily where each paragraph begins! If you want to use the style of having no indentation with a space between paragraphs, use the parskip package, which does it for you (and makes adjustments to the spacing of lists and other structures which use paragraph spacing, so they don't get too far apart).
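
A minimal sketch of loading the package in the Preamble, assuming parskip is installed on your system (section 5.1 explains how packages are used). No \setlength commands are needed in this case.

```latex
\documentclass[11pt,a4paper,oneside]{report}
% Loads the parskip package: no paragraph indentation,
% with vertical space between paragraphs instead
\usepackage{parskip}

\begin{document}
First paragraph of text.

Second paragraph, set flush left with space above it.
\end{document}
```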

  6. Paragraph spacing and indentation are cultural settings. If you are typesetting in a language other than English, you should use the babel package, which alters many things, including the spacing and the naming of sections, to conform with the standards of different countries and languages.

3.7 Table of contents

All auto-numbered headings get entered in the Table of Contents (ToC) automatically. You don't have to print a ToC, but if you want to, just add the command \tableofcontents at the point where you want it printed (usually after the Abstract or Summary).

Entries for the ToC are recorded each time you process your document, and reproduced the next time you process it, so you need to re-run LATEX one extra time to ensure that all ToC page-number references are correctly calculated.

We've already seen in section 3.5 how to use the optional argument to the sectioning commands to add text to the ToC which is slightly different from the one printed in the body of the document. It is also possible to add extra lines to the ToC, to force extra or unnumbered section headings to be included.


  Exercise 7. Inserting the table of contents

  1. Go back and add a \tableofcontents command after the \end{abstract} command in your document.

  2. You guessed.


The commands \listoffigures and \listoftables work in exactly the same way as \tableofcontents to automatically list all your tables and figures. If you use them, they normally go after the \tableofcontents command.
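
As a sketch of the usual order, the skeleton below places the three listing commands after the Abstract, reusing the titling from section 3.3. The abstract text and chapter title are placeholders.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2004}
\maketitle

\begin{abstract}
Summary text goes here.
\end{abstract}

% Contents first, then the two lists
\tableofcontents
\listoffigures
\listoftables

\chapter{First chapter}
Text of the chapter.

\end{document}
```

The two lists will show only their headings until the document actually contains figures and tables, and like the ToC they need LATEX to be run a second time before their entries appear.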

The \tableofcontents command normally shows only numbered section headings, and only down to the level defined by the tocdepth counter (see section 3.5.1), but you can add extra entries with the \addcontentsline command. For example if you use an unnumbered section heading command to start a preliminary piece of text like a Foreword or Preface, you can write:

\subsection*{Preface}
\addcontentsline{toc}{subsection}{Preface}
      

This will format an unnumbered ToC entry for ‘Preface’ in the ‘subsection’ style. You can use the same mechanism to add lines to the List of Figures or List of Tables by substituting lof or lot for toc.
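
A short sketch of that substitution. The entry texts are made up for illustration; the second argument names the kind of entry whose style is to be used, here figure and table.

```latex
% Hypothetical entry: adds a line reading `Frontispiece' to the
% List of Figures, formatted like an ordinary figure entry
\addcontentsline{lof}{figure}{Frontispiece}

% The same for the List of Tables, formatted like a table entry
\addcontentsline{lot}{table}{Summary of abbreviations}
```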

