title | category | permalink | redirect_from | date |
---|---|---|---|---|
Conversion from (La)TeX to HTML |
misc |
/FAQ-LaTeX2HTML |
/FAQ-latex2html |
2018-05-25 |
TeX and LaTeX are well suited to producing electronically publishable documents. However, it is important to realize the difference between page layout and functional markup. TeX is capable of extremely detailed page layout; HTML is not, because HTML is a functional markup language not a page layout language. HTML's exact rendering is not specified by the document that is published but is, to some degree, left to the discretion of the browser. If you require your readers to see an exact replication of what your document looks like to you, then you cannot use HTML and you must use some other publishing format such as PDF. That is true for any HTML authoring tool.
TeX's excellent mathematical capabilities remain a challenge in the business of conversion to HTML. Originally there were only two generally reliable techniques for generating mathematics on the web: creating bitmaps of bits of typesetting that can't be translated, and using symbols and table constructs. Neither technique is entirely satisfactory. Bitmaps lead to a profusion of tiny files, are slow to load, and are inaccessible to those with visual disabilities. The symbol fonts offer poor coverage of mathematics, and their use requires configuration of the browser.
Today, with native MathML rendering in some browsers and high quality math rendering available via JavaScript and CSS in all modern graphical browsers there are several possibilities.
The LaTeX to HTML converters listed below all handle mathematics to some extent, and further math-specific details are discussed in Math on the Web.
For today, possible packages are:
-
TeX4ht
a compiled program that supports either LaTeX or Plain TeX, by processing a DVI file; it uses bitmaps for mathematics, but can also use other technologies where appropriate. Written by Eitan Gurari, it parses the DVI file generated when you run (La)TeX over your file withtex4ht
s macros included. As a result, it's pretty robust against the macros you include in your document, and it's also pretty fast. CTAN.Configuring and calling
TeX4ht
can be quite complicated, Michal Hoftich'smake4ht
system provides an alternative easier calling convention, using thetex4ht
converter internally. -
LaTeXML
From NIST is a perl program that can parse most TeX code, including complicated macro definitions. It was used for generating the web version of DLMF from LaTeX sources. Currently it is distributed from the NIST site, and is not in standard TeX distributions. -
lwarp
by Brian Dunn is a recent TeX to HTML converter that uses TeX to parse the input document. Documents may be produced by LaTeX, LuaLaTeX, or XeLaTeX. A texlua script removes the need for system utilities such as make and gawk, and also supports xindy and latexmk. CTAN. -
LaTeX2HTML
aPerl
script package that supports LaTeX only, and generates mathematics (and other "difficult" things) using bitmaps. The original version was written by Nikos Drakos for Unix systems, but the package now sports an illustrious list of co-authors and is also available for Windows systems. Michel Goossens and Janne Saarela published a detailed discussion ofLaTeX2HTML
, and how to tailor it, in TUGboat 16(2). CTAN.A mailing list for users may be found via TUG.
-
Hevea
a compiled program that supports LaTeX only, and uses the font/table technique for equations (indeed its entire approach is very similar toTtH
). It is written in Objective CAML by Luc Maranget.Hevea
isn't archived on CTAN; details (including download points) are available via INRIA. -
TtH
a compiled program that supports either LaTeX or Plain TeX, and uses the font/table technique for representing mathematics. It is written by Ian Hutchinson, usingflex
. The distribution consists of a single C source (or a compiled executable), which is easy to install and very fast-running. However the resulting HTML does not really reach modern standards, and only very simple mathematics can be converted. -
plasTeX
a Python-based LaTeX document processing framework. It gives DOM-like access to a LaTeX document, as well as the ability to generate multiple output formats (e.g. HTML, DocBook, tBook, etc.). -
TeXpider
a commercial program from Micropress, which is described on the Micropress web site; it uses bitmaps for equations.