HTML to LaTeX

sujato edited this page Aug 7, 2022 · 6 revisions

For publications, we transform HTML to LaTeX. Here we document the transformations.

paragraphs

<p>foo</p>
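
The page gives no LaTeX output for this case. As a minimal sketch, assuming plain paragraphs need no special command: LaTeX treats a blank line as a paragraph break, so each paragraph becomes its text followed by a blank line. The second paragraph, bar, is added here only for illustration.

```latex
% Assumed mapping for <p>foo</p><p>bar</p>: text separated by blank lines.
foo

bar
```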

transform elements with classes into commands

In certain cases, paragraphs or other elements are assigned classes for styling. LaTeX has no equivalent concept, so each class maps to a custom command defined in the preamble (the LaTeX counterpart of the HTML head).

Here is a list of all such elements:

<p class='namo'>
<p class='endsection'>
<p class='endsutta'>
<p class='endbook'>
<p class='endkanda'>
<p class='end'>
<p class='uddana-intro'>
<p class='endvagga'>
<p class='rule'>
<p class='add'>
<span class='evam'>
<span class='speaker'>

Derive each command name from the class name. Because class names may clash with existing LaTeX commands, add the prefix sc. Strip any hyphens, since LaTeX command names may contain only letters: uddana-intro becomes \scuddanaintro.

\newcommand*{\scnamo}[1]{\begin{center}\textit{#1}\end{center}}
\newcommand{\scendsection}[1]{\begin{center}\textit{#1}\end{center}}
\newcommand{\scendsutta}[1]{\begin{center}\textit{#1}\end{center}}
\newcommand{\scendbook}[1]{\begin{center}\uppercase{#1}\end{center}}
\newcommand{\scendkanda}[1]{\begin{center}\textbf{#1}\end{center}}
\newcommand{\scend}[1]{\begin{center}\textit{#1}\end{center}}
\newcommand{\scuddanaintro}[1]{\textit{#1}}
\newcommand{\scendvagga}[1]{\begin{center}\textbf{#1}\end{center}}
\newcommand{\scrule}[1]{\textbf{#1}}
\newcommand{\scadd}[1]{\textit{#1}}
\newcommand*{\scevam}[1]{\caps{#1}}
\newcommand*{\scspeaker}[1]{\hspace{2em}\textit{#1}}

In document:

\scnamo{This is scnamo}

\scendsection{This is scendsection}

\scendsutta{This is scendsutta}

\scendbook{This is scendbook}

\scendkanda{This is scendkanda}

\scend{This is scend}

\scuddanaintro{This is scuddanaintro}

\scendvagga{This is scendvagga}

\scrule{This is scrule}

\scadd{This is scadd}

\scevam{So have I heard.}

\scspeaker{said the Buddha}
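
Note that \caps is not part of the LaTeX kernel. The sketch below assumes it comes from the soul package, which defines a \caps command for letterspaced small caps; the page itself does not say which package is intended. It shows a complete, minimal document using two of the commands above.

```latex
\documentclass{book}
\usepackage{soul} % assumption: the source of \caps

% Two of the class commands from the list above.
\newcommand*{\scevam}[1]{\caps{#1}}
\newcommand{\scuddanaintro}[1]{\textit{#1}}

\begin{document}
\scevam{So have I heard.}

\scuddanaintro{This is scuddanaintro}
\end{document}
```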

data-counter

unordered lists

<ul>
<li>foo</li>
<li>foo</li>
</ul>
\begin{itemize}
  \item foo
  \item foo
\end{itemize}

ordered lists

<ol>
<li>foo</li>
<li>foo</li>
</ol>
\begin{enumerate}
  \item foo
  \item foo
\end{enumerate}

definition lists

<dl>
<dt>foo</dt>
<dd>bar</dd>
<dt>foo</dt>
<dd>bar</dd>
</dl>
\begin{description}
   \item[foo] bar
   \item[foo] bar
\end{description}

verses

<blockquote class='gatha'>
<p>a line of verse<br>
and another line</p>
<p>and a second verse<br>
and now it ends</p>
</blockquote>
\begin{verse}
a line of verse\\
and another line

and a second verse\\
and now it ends
\end{verse}

quotations

<blockquote>
<p>That’s what I said!</p>
<p>And continue to say.</p>
</blockquote>
\begin{quotation}
That’s what I said!

And continue to say.
\end{quotation}

emphasis

That <em>is</em> what I said.
That \emph{is} what I said.

cite

The HTML <cite> tag has no direct parallel in LaTeX. Use \textit.

That book is called <cite>Great Things to Know</cite>.
That book is called \textit{Great Things to Know}.

inline foreign words (i.e. not titles)

For languages that use Roman script, especially Pali (pli) and Sanskrit (san), use \textit:

<i lang='pli'>evaṁ</i>
\textit{evaṁ}

For languages that do not use Roman script, especially Literary Chinese (lzh), define a macro per language:

<i lang='lzh'>漢語</i>
\textlzh{漢語}
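
The page does not show how \textlzh is defined. One possible definition, sketched under the assumption that the document is compiled with XeLaTeX or LuaLaTeX and uses the fontspec package, is given below. The font name and the helper \lzhfont are placeholders, not taken from the original.

```latex
% Compile with XeLaTeX or LuaLaTeX; fontspec does not work with pdfLaTeX.
\documentclass{book}
\usepackage{fontspec}

% Assumption: any installed font that covers Chinese will do.
% "Noto Serif CJK TC" is only a placeholder name.
\newfontfamily\lzhfont{Noto Serif CJK TC}

% Switch to the Chinese font for the argument only.
\newcommand*{\textlzh}[1]{{\lzhfont #1}}

\begin{document}
The term \textlzh{漢語} appears inline.
\end{document}
```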

paragraph reference numbers

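<p id="ud2.6:2.1">foo</p>
\marginnote[ud2.6:2.1]{ud2.6:2.1}foo

In lists and verses, put the note at the start of the item or line:

\begin{enumerate}
  \item \marginnote[ud2.6:2.1]{ud2.6:2.1}foo
\end{enumerate}
\begin{description}
   \item[foo] \marginnote[ud2.6:2.1]{ud2.6:2.1}bar
\end{description}
\begin{verse}
\marginnote[ud2.6:2.1]{ud2.6:2.1}a line of verse
\end{verse}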
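
The \marginnote command comes from the marginnote package. Its optional argument is the text used when the note falls in the left margin, and its mandatory argument is the text used in the right margin, which is why the reference appears twice. A minimal sketch follows; the \marginfont setting is an assumption about styling, not something the page specifies.

```latex
\documentclass{book}
\usepackage{marginnote}

% Assumed styling: set reference numbers in a small size.
\renewcommand*{\marginfont}{\footnotesize}

\begin{document}
\marginnote[ud2.6:2.1]{ud2.6:2.1}foo
\end{document}
```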

notes

foo<a epub:type="noteref" href="#note-5" id="noteref-5" role="doc-noteref">5</a>

…

<li epub:type="endnote" id="note-5">
<p>bar  <a epub:type="backlink" href="#noteref-5" role="doc-backlink">↩</a></p>
</li>
foo\footnote{bar}

main structure

<section id="frontmatter">
\frontmatter
<section id="mainmatter">
\mainmatter
<section id="backmatter">
\backmatter
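
These three commands are provided by the standard book class. A minimal skeleton, assuming the book class and using placeholder chapter titles, might look like this:

```latex
\documentclass{book}

\begin{document}

\frontmatter  % roman page numbers
\chapter*{Preface}   % placeholder title
Front matter text.

\mainmatter   % page numbering restarts in arabic
\chapter*{foo}
Main text.

\backmatter
\chapter*{Colophon}  % placeholder title
Back matter text.

\end{document}
```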

new pages

<section epub:type="imprint" id="imprint">
<section epub:type="halftitlepage" id="halftitlepage">
\newpage

epigraph

<article class="epigraph">
<blockquote class="epigraph-text">
<p>a line of verse <br/>
and another line <br/>
and a third line <br/>
and now it ends.</p>
</blockquote>
<p class="epigraph-attribution">With Mucalinda (Mucalindasutta), Ud 2.1</p>
</article>
\epigraph{a line of verse \\and another line \\and a third line \\and now it ends. }{\textit{With Mucalinda (Mucalindasutta), Ud 2.1}}
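
The two-argument \epigraph{text}{source} form is provided by the epigraph package. The sketch below assumes that package; the width setting is illustrative only.

```latex
\documentclass{book}
\usepackage{epigraph} % provides \epigraph{text}{source}

% Assumed width; adjust to suit the page design.
\setlength{\epigraphwidth}{0.6\textwidth}

\begin{document}
\epigraph{a line of verse \\and another line}{\textit{With Mucalinda (Mucalindasutta), Ud 2.1}}
\end{document}
```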

headings

Use the unnumbered (starred) form for sectional headings. LaTeX numbers sections by default, and it leaves starred headings out of both the ToC and the running headers, so control these explicitly:

  • star the sectional command
  • \addcontentsline{toc}{chapter}{foo}: this adds it to the contents
  • \markboth{foo}{foo}: this makes it available for page headers.

For snp (the Suttanipāta):

Add every h1 to the ToC, regardless of class or other attributes.

<h1 {all}>foo</h1>
\chapter*{foo}
\addcontentsline{toc}{chapter}{foo}
\markboth{foo}{foo}

Add h2 to ToC, but only if they are sutta titles.

<h2>foo</h2>
\section*{foo}
<h2 class="sutta-title heading">foo</h2>
\section*{foo}
\addcontentsline{toc}{section}{foo}
\markboth{foo}{foo}
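
Since the same three lines recur for every heading that goes into the ToC, they could be wrapped in helper commands. The following is only a sketch: the names \sctocchapter and \sctocsection are hypothetical and do not appear in the original page.

```latex
\documentclass{book}

% Hypothetical helpers: starred heading + ToC entry + running header.
\newcommand*{\sctocchapter}[1]{%
  \chapter*{#1}%
  \addcontentsline{toc}{chapter}{#1}%
  \markboth{#1}{#1}}
\newcommand*{\sctocsection}[1]{%
  \section*{#1}%
  \addcontentsline{toc}{section}{#1}%
  \markboth{#1}{#1}}

\begin{document}
\tableofcontents

\sctocchapter{foo}          % h1: always listed in the ToC
\section*{foo}              % plain h2: not listed
\sctocsection{foo}          % h2 with class sutta-title: listed
\end{document}
```

The ToC only reflects new entries after a second LaTeX run, because the entries are written to an auxiliary file on the first pass.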

titlepage

This is the first page in the PDF.

  • define a special command for each element
  • use the names from the HTML file, in camelCase, and prefixed with titlepage.
  • add vertical space between each element. The starred \vspace* keeps the space even at the top of a page, where a plain \vspace would be discarded. We can customize the exact quantity later.

In preamble:

\newcommand*{\titlepageTranslationTitle}[1]{{\begin{center}\begin{large}{#1}\end{large}\end{center}}}

\newcommand*{\titlepageCreatorName}[1]{{\begin{center}\begin{normalsize}{#1}\end{normalsize}\end{center}}}

In document:

\begin{document}

\frontmatter

\setlength{\parindent}{0cm}

\pagestyle{empty}

\vspace*{1em}

\titlepageTranslationTitle{Anthology of Discourses}

\vspace*{1em}

\titlepageCreatorName{Bhikkhu Sujato}

\newpage

imprint

\setstretch{1.05}

\begin{footnotesize}

\textit{Anthology of Discourses} is a translation of the Suttanipāta by Bhikkhu Sujato.

\medskip

Creative Commons Zero (CC0)

To the extent possible under law, Bhikkhu Sujato has waived all copyright and related or neighboring rights to \textit{Anthology of Discourses}.

\medskip

This work is published from Australia.

\begin{center}
\textit{This translation is an expression of an ancient spiritual text that has been passed down by the Buddhist tradition for the benefit of all sentient beings. It is dedicated to the public domain via Creative Commons Zero (CC0). You are encouraged to copy, reproduce, adapt, alter, or otherwise make use of this translation. The translator respectfully requests that any use be in accordance with the values and principles of the Buddhist community.}
\end{center}

\medskip

\begin{description}
\item[Web publication date] 2021
\item[This edition] 2022-07-20 09:59:17
\item[Publication type] html
\item[Edition] ed6
\item[Number of volumes] 1
\item[Publication ISBN] 978-1-76132-013-2
\item[Publication URL] 
\item[Source URL] https://github.com/suttacentral/bilara-data/tree/master/translation/en/sujato/sutta/kn/snp
\item[Publication number] scpub17
\end{description}

\medskip

Published by SuttaCentral

\medskip

\textit{SuttaCentral,\\
c/o Alwis \& Alwis Pty Ltd\\
Kaurna Country,\\
Suite 12,\\
198 Greenhill Road,\\
Eastwood,\\
SA 5063,\\
Australia}

\end{footnotesize}

\newpage

halftitlepage

Use the same approach as with the titlepage.

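  • Note that \halftitlepageFleuron requires OpenType features of Arno. Comment it out until we are ready to use Arno. \halftitlepagePublisher also calls a command with an Arno name, \ArnoProNoLigatures, so it needs the same care.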

In preamble:

\newcommand*{\halftitlepageTranslationTitle}[1]{\setstretch{2.5}{\begin{center}\begin{Huge}\uppercase{\so{#1}}\end{Huge}\end{center}}}

\newcommand*{\halftitlepageTranslationSubtitle}[1]{\setstretch{1.2}{\begin{center}\begin{large}{#1}\end{large}\end{center}}}

%\newcommand*{\halftitlepageFleuron}[1]{{\begin{center}\begin{large}\ArnoProornmZero{{#1}}\end{large}\end{center}}}

\newcommand*{\halftitlepageByline}[1]{{\begin{center}\begin{normalsize}\textit{{#1}}\end{normalsize}\end{center}}}

\newcommand*{\halftitlepageCreatorName}[1]{{\begin{center}\begin{LARGE}{\caps{#1}}\end{LARGE}\end{center}}}

\newcommand*{\halftitlepageVolumeNumber}[1]{{\begin{center}\begin{normalsize}{#1}\end{normalsize}\end{center}}}

\newcommand*{\halftitlepageVolumeAcronym}[1]{{\begin{center}\begin{normalsize}{#1}\end{normalsize}\end{center}}}

\newcommand*{\halftitlepageVolumeTranslationTitle}[1]{{\begin{center}\begin{normalsize}{#1}\end{normalsize}\end{center}}}

\newcommand*{\halftitlepageVolumeRootTitle}[1]{{\begin{center}\begin{normalsize}{#1}\end{normalsize}\end{center}}}

\newcommand*{\halftitlepagePublisher}[1]{{\begin{center}\begin{LARGE}{\ArnoProNoLigatures\caps{#1}}\end{LARGE}\end{center}}}
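
These definitions rely on commands that the kernel does not provide. As a sketch, assuming \setstretch comes from the setspace package and \so and \caps from the soul package, the preamble would also need the lines below. The empty fallback for \ArnoProNoLigatures is an assumption that lets the file compile before any Arno setup exists; remove it once the real definition is in place.

```latex
\usepackage{setspace} % assumption: source of \setstretch
\usepackage{soul}     % assumption: source of \so and \caps

% Temporary fallback: does nothing, and is skipped if the
% command has already been defined earlier in the preamble.
\providecommand*{\ArnoProNoLigatures}{}
```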

In document:


\vspace*{-2em}

\halftitlepageTranslationTitle{Anthology of Discourses}

\vspace*{-1em}

\halftitlepageTranslationSubtitle{A refreshing translation of the Suttanipāta}

\vspace*{1em}

%\halftitlepageFleuron{•}

\vspace*{1em}

\halftitlepageByline{translated by}

\vspace*{-1.5em}

\halftitlepageCreatorName{Bhikkhu Sujato}

\vspace*{1em}

\halftitlepageVolumeNumber{}

\halftitlepageVolumeAcronym{Snp}

\halftitlepageVolumeTranslationTitle{}

\halftitlepageVolumeRootTitle{}

\vspace*{\fill}

\halftitlepagePublisher{SuttaCentral}

\newpage