
The main workhorse is LaTeXML (https://dlmf.nist.gov/LaTeXML/), as you can see once you reach the Engrafo repository at https://github.com/arxiv-vanity/engrafo.

LaTeXML converts TeX to XML with its own Perl implementation of TeX's macro expansion and processing. It emulates the engine rather than running latex and working on the DVI output.

Nevertheless, this is a hard job, so I will look into the Engrafo code soon, because I want to apply this to a book we have written.



Yup, @kpsns nailed it. LaTeXML does the heavy lifting in converting TeX to XML. From there, some post-processing converts the XML into a nice responsive template (that part is done by Engrafo / the ArXiv Vanity team).
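To make the two stages concrete, here is a minimal sketch of a plain TeX to XML to HTML conversion using the stock `latexml` and `latexmlpost` commands. This is not Engrafo's actual invocation (it adds its own options, templates and styling). The helper names `conversion_commands` and `convert` are made up for illustration, and running `convert` assumes LaTeXML is installed and on the PATH.

```python
import subprocess
from pathlib import Path


def conversion_commands(tex_path: str) -> list[list[str]]:
    """Build the two command lines for a basic TeX -> XML -> HTML conversion.

    Hypothetical helper: it only assembles argument lists and runs nothing.
    """
    tex = Path(tex_path)
    xml = tex.with_suffix(".xml")
    html = tex.with_suffix(".html")
    return [
        # Stage 1: LaTeXML digests the TeX source and writes its XML format.
        ["latexml", f"--dest={xml}", str(tex)],
        # Stage 2: the post-processor turns that XML into HTML.
        ["latexmlpost", f"--dest={html}", str(xml)],
    ]


def convert(tex_path: str) -> None:
    """Run both stages in order, stopping at the first failing command."""
    for cmd in conversion_commands(tex_path):
        subprocess.run(cmd, check=True)
```

Calling `convert("paper.tex")` would leave `paper.xml` and `paper.html` next to the source. A responsive template like the one described here is layered on top of that second stage.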

We love OSS at AI2, and are looking to collaborate with the Engrafo / ArXiv Vanity team as we expand the functionality.


So I dug into the code (the engrafo repository) and was quite surprised that, contrary to the suggestive title, the method inherits all the problems LaTeXML already has. Chief among them: compared with, for instance, the TeX Live distribution, tons of widely used .sty files lack a LaTeXML binding, so the conversion fails for a wide range of papers. Converting a TeX document to XML with LaTeXML really requires a lot of debugging. Ideally you start from a plain LaTeX paper or book and compile with pdflatex and latexml side by side, making sure nothing breaks.
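One cheap pre-flight check before that debugging is to list which packages a document loads that have no LaTeXML binding. LaTeXML bindings are files named `<package>.sty.ltxml`, so the rough sketch below compares `\usepackage` names against a list of binding file names. The function names and the sample data are hypothetical, the regex only handles single-line `\usepackage` commands, and a missing binding is only a hint of trouble, not proof that the conversion will fail.

```python
import re

# Matches \usepackage{a,b} and \usepackage[opts]{a,b} on a single line.
USEPACKAGE = re.compile(r"\\usepackage(?:\[[^\]]*\])?\{([^}]*)\}")


def used_packages(tex_source: str) -> list[str]:
    """Return package names loaded via \\usepackage, in order of first use."""
    names: list[str] = []
    for line in tex_source.splitlines():
        code = re.sub(r"(?<!\\)%.*", "", line)  # drop TeX comments
        for group in USEPACKAGE.findall(code):
            for name in group.split(","):
                name = name.strip()
                if name and name not in names:
                    names.append(name)
    return names


def packages_without_binding(tex_source: str, binding_files: list[str]) -> list[str]:
    """List used packages that have no '<name>.sty.ltxml' in binding_files."""
    suffix = ".sty.ltxml"
    bound = {f[: -len(suffix)] for f in binding_files if f.endswith(suffix)}
    return [p for p in used_packages(tex_source) if p not in bound]


# Made-up example data, not a statement about which bindings LaTeXML ships.
sample = r"""
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath, mycustom}
% \usepackage{commented}
\usepackage{anotherpkg}
"""
bindings = ["inputenc.sty.ltxml", "amsmath.sty.ltxml", "article.cls.ltxml"]
missing = packages_without_binding(sample, bindings)
```

In practice `binding_files` would come from listing the binding directory of the local LaTeXML installation, whose location depends on how it was installed.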


Yeah, it's tough work. We're hoping to invest more in the conversion library (and to support the Engrafo team in doing so).

It's going to take a lot of time and elbow grease to get it to where it needs to be!



