LyX Export via LaTeXML
Frédéric WANG
2014-05-11 17:12:25 UTC
Dear LyX developers,

I've just built the git development version of LyX and had a look at the
(X)HTML export options. As far as I can see, the current possibilities are
(please tell me if I missed anything):

1) HTML export: this is done by tex4ht and generates images of math
formulas.
2) LyXHTML export: this is done by tex4ht and generates MathML output by
default for math formulas.

Moreover, the Document => Parameters => Output menu lets you configure
2) to output HTML, images or the LaTeX source instead of MathML.

LaTeXML 0.8 has recently been released with new exciting features, and
I'd like to add new export modes using LaTeXML. My idea is:

1) HTML5 export (LaTeXML)
2) XHTML export (LaTeXML)
3) EPUB3 export (LaTeXML)

In the Document => Parameters => Output menu, I wish to add the
following configurations for LaTeXML:

1) Export for maths

a) MathML (default)
b) MathML + CSS fallback
c) MathML + MathJax fallback
d) PNG

For b) and c), see
https://developer.mozilla.org/en-US/docs/Web/MathML/Authoring#Fallback_for_Browsers_without_MathML_support
c) will probably be ignored for the EPUB export, as that would mean
having a copy of the MathJax library available to lyx and packaging a
copy into each ebook...

2) Split the document into multiple pages, see --split-at
http://dlmf.nist.gov/LaTeXML/manual/usage/usage.splitting.html
a) do not split (default)
b) chapter
c) section
d) subsection
e) subsubsection
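
To make the mapping concrete, here is a minimal Python sketch of how the options above could be turned into a latexmlc command line. The function name and its parameters are hypothetical, and the flag spellings (--format, --destination, --pmml, --mathimages, --javascript, --splitat) are taken from memory of the LaTeXML manual, so they should be checked against `latexmlc --help` before being relied on. The fallback script for b) and c) is left to the caller, since no particular URL is settled here.

```python
from typing import List, Optional

# Values proposed in this message; nothing here is existing LyX code.
FORMATS = {"html5", "xhtml", "epub"}
SPLIT_LEVELS = {"chapter", "section", "subsection", "subsubsection"}


def build_latexml_command(tex_file: str, out_file: str,
                          fmt: str = "html5",
                          math: str = "mathml",
                          split_at: Optional[str] = None,
                          fallback_script: Optional[str] = None) -> List[str]:
    """Return an argument list for latexmlc (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError("unsupported format: %s" % fmt)
    cmd = ["latexmlc", "--format=" + fmt, "--destination=" + out_file]

    # 1) Export for maths: MathML by default, PNG images otherwise.
    if math == "mathml":
        cmd.append("--pmml")
    elif math == "png":
        cmd.append("--mathimages")
    else:
        raise ValueError("unsupported math output: %s" % math)

    # Options b) and c): a fallback script supplied by the caller.
    if fallback_script is not None:
        cmd.append("--javascript=" + fallback_script)

    # 2) Splitting: no flag at all means "do not split" (the default).
    if split_at is not None:
        if split_at not in SPLIT_LEVELS:
            raise ValueError("unsupported split level: %s" % split_at)
        cmd.append("--splitat=" + split_at)

    cmd.append(tex_file)
    return cmd
```

For example, build_latexml_command("doc.tex", "doc.html", split_at="chapter") yields a list that can be passed to subprocess.run as is, without going through a shell.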

What do you think about this proposal? And do you have any hints /
recommendations about how to do that?

Well, if I don't hear any strong complaints, I think I'll just go ahead
and try to write a patch implementing that proposal...

Thanks,

PS: Other formats such as DOC (MS Word) or ODT (Open/LibreOffice Writer)
might be possible in the future: https://github.com/KWARC/LaTeXML-Plugin-Doc

PPS: I see that the XHTML output also has a "scale math" parameter. I
guess the rationale is to adjust the size to fix inconsistencies between
text & math fonts. As you probably know, the OpenType MATH table has
started to be implemented in Gecko/WebKit native MathML, and I expect
LaTeXML will get some options to set up and handle (Web) fonts better.
So it would be possible in the future to get consistent text & math
fonts and hopefully a MathML rendering close to XeTeX/LuaTeX...

Frédéric Wang
maths-informatique-jeux.com/blog/frederic
Richard Heck
2014-05-11 22:01:19 UTC
Hi, Frederic,

I'll reply in more detail later, as I've got a bunch of grading to do.
Post by Frédéric WANG
1) HTML export: this is done by tex4ht and generates images of math
formulas.
This is done by whatever converter LyX finds and decides to use. It
could be tex4ht, or latex2html, or eLyXer, or other things, too.
Post by Frédéric WANG
2) LyXHTML export: this is done by tex4ht and generates MathML output
by default for math formulas.
This does not involve tex4ht. The export routine is part of LyX itself.
See all the xhtml() methods scattered through the code.
Post by Frédéric WANG
Moreover, the Document => Parameters => Output menu allows to
configure 2), to output HTML, Images or the LaTeX source instead of
MathML.
Right.

Richard
Frédéric WANG
2014-05-11 22:12:31 UTC
Thanks Richard,

I just had a quick look at the code. Indeed, I realized that the case of
LyXHTML is a bit special. So as a first step, I'll just focus on
EPUB3/HTML5 export via LaTeXML without any special options. As far as I
can see, this essentially means modifying lib/configure.py. What I have
done so far:

- merging the checkViewer calls for XHTML & HTML and adding the HTML5
format to them.
- adding a LaTeX -> HTML5 converter, with LaTeXML as the program.

That seems to work, except that LaTeXML fails to process the .tex
document generated by LyX (I've reported
https://github.com/brucemiller/LaTeXML/issues/487).
--
Frédéric Wang
maths-informatique-jeux.com/blog/frederic
Guenter Milde
2014-05-12 08:23:21 UTC
Dear Frédéric,

thanks for your efforts merging LyX and LaTeXML - this looks promising.
Post by Frédéric WANG
Thanks Richard,
I just had a quick look at the code. Indeed, I realized that the case of
LyXHTML is a bit special. So in a first step, I'll just focus on
EPUB3/HTML5 export via LaTeXML without special option. I see this would
be essentially modifying lib/configure.py.
This is, IMO, the second step.

First, I recommend trying a custom converter:

Open LyX and go to

Tools>Preferences>File Handling>Converters

Add converters for the desired conversion. You may have to add dummy
file formats for different ways to create the same format (HTML, say).
(See how this is done with PDF1 ... PDF5.)

This is also described in Chapter 3 of the Help>Customization documentation.
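
As a rough illustration of what such a custom converter amounts to, the Python sketch below renders a \converter line of the kind LyX stores in its preferences. The helper name is hypothetical, the target format name "html5" is a dummy format that would still have to be defined in the dialog, and the latexmlc flags are assumptions to be checked against the LaTeXML documentation. $$i and $$o are the LyX placeholders for the input and output file.

```python
def converter_line(src: str, dst: str, command: str, flags: str = "") -> str:
    """Render a LyX-style \\converter entry (hypothetical helper).

    The four quoted fields are: source format, target format,
    command line and converter flags.
    """
    for field in (src, dst, command, flags):
        if '"' in field:
            # Keep the sketch simple: no escaping of embedded quotes.
            raise ValueError("double quotes are not supported: %r" % field)
    return '\\converter "%s" "%s" "%s" "%s"' % (src, dst, command, flags)


if __name__ == "__main__":
    # "html5" is an assumed dummy format name, not one LyX ships with.
    print(converter_line("latex", "html5",
                         "latexmlc --format=html5 --destination=$$o $$i"))
```

The same string is what an auto-configuration step would eventually have to emit, so a converter that works when entered by hand can be moved over largely unchanged.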


If this works, report back here or at the LyX wiki.
Then as a next step an auto-configuration can be set up.

Günter
Frédéric WANG
2014-05-12 08:39:05 UTC
Post by Guenter Milde
This is, IMO, the second step.
First, I recommend trying a custom converter:
Open LyX and go to
Tools>Preferences>File Handling>Converters
Add converters for the desired conversion. You may have to add dummy
file formats for different ways to create the same format (HTML, say).
(See how this is done with PDF1 ... PDF5.)
This is also described in Chapter 3 of the Help>Customization documentation.
If this works, report back here or at the LyX wiki.
Then as a next step an auto-configuration can be set up.
Günter
Thanks, Günter. Well, given that I have already built the development
version and started to modify configure.py, I think it will be more
convenient for me to continue like that instead of using the LyX user
interface.
However, I see that the latter will be useful to allow users to use
LaTeXML on the LyX release, so I'll try & report that too when I submit
a patch for configure.py.
--
Frédéric Wang
maths-informatique-jeux.com/blog/frederic
Frédéric WANG
2014-05-12 10:10:43 UTC
Hi all,

I've attached a small patch with what I think is essentially needed for
LaTeXML export in LyX:

1) merging the checkViewer calls for the HTML viewers and adding the
HTML5 format to them.
2) adding a checkViewer call for EPUB. At the moment, the viewer is
Firefox and it is assumed that an EPUB reader add-on is installed.
3) adding "latexmlc --format=html4" as another HTML converter.
4) adding latexml as an EPUB3 and HTML5 converter. More could be added
later; in particular, I'm aware of
https://github.com/michal-h21/tex4ebook, an extension of tex4ht with
EPUB export.
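
The viewer detection in 2) boils down to picking the first program from a candidate list that is actually installed. A minimal standalone sketch in Python follows; it is not the checkViewer function from configure.py, the helper name is hypothetical, and apart from firefox the candidate names are only illustrative guesses.

```python
import shutil
from typing import Iterable, Optional


def first_available(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate found on PATH, or None (hypothetical helper)."""
    for name in candidates:
        if shutil.which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    # Illustrative candidate list: dedicated readers first, then the
    # Firefox fallback, which still requires an EPUB add-on.
    print(first_available(["ebook-viewer", "fbreader", "firefox"]))
```

Putting dedicated readers ahead of Firefox in the list would avoid depending on an add-on whenever a real EPUB reader is present.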

I tested the HTML5 and EPUB3 export. The files are generated and I can
preview them in Firefox (even if the document contains babel errors).
Things that remain to be considered:

a) Fix LaTeXML bug with babel
https://github.com/brucemiller/LaTeXML/issues/487 and do more testing to
see how LaTeXML deals with LaTeX files generated by LyX.
b) Improve detection of EPUB reader, perhaps add more EPUB readers in
the list.
c) Check if we need to handle directories instead of a single file for
HTML5/HTML export (LaTeXML generates png & css files, so I wonder if the
ext_copy.py script should be used).
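
For c), the underlying question is how to export the main file together with the images and style sheets generated next to it. The sketch below shows the idea in plain Python; it is not LyX's ext_copy.py, the function name is hypothetical, and the default extension list is only an assumption about what LaTeXML writes.

```python
import os
import shutil
from typing import List, Sequence


def copy_with_companions(src_file: str, dest_dir: str,
                         extensions: Sequence[str] = (".png", ".css")) -> List[str]:
    """Copy src_file plus same-directory files with the given extensions.

    Returns the names of the copied files in sorted order.
    """
    src_dir = os.path.dirname(os.path.abspath(src_file))
    main_name = os.path.basename(src_file)
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # subdirectories are ignored in this sketch
        ext = os.path.splitext(name)[1].lower()
        if name == main_name or ext in extensions:
            shutil.copy2(path, os.path.join(dest_dir, name))
            copied.append(name)
    return copied
```

A split export would additionally need the extra HTML pages, and possibly subdirectories, which this sketch deliberately leaves out.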
--
Frédéric Wang
maths-informatique-jeux.com/blog/frederic