Document Liberation project announces EPUB exporting tool

David Tardon of LibreOffice and Document Liberation fame has started working on a new library for exporting EPUB files, the format typically used for distributing ebooks.

The library, predictably called libepubgen, uses the HTML and SVG content generators from librevenge to create EPUB files and provides an API for slightly fancier things, such as splitting HTML data into multiple files.

David explained a few technical details in his blog post, but there's always more to find out, and he kindly agreed to answer a few questions.

According to the commit log, the tool should produce valid EPUB 2.0 files. In your blog post, though, you say "What is still missing is handling of foreign binary objects... I think I will convert these to SVG images". But SVG support is an EPUB 3.0 feature, and the same goes for MathML equations. So what's up with that?

I have not mentioned it explicitly, but I am only interested in EPUB 2.0. In my opinion, EPUB 3.0 went the same way as XSLT 2.0: over-complication rather than improvement. Also, my PRS-505 does not handle EPUB 3.0 :-)

(Editor's note: the Book Industry Study Group's compatibility grid for EPUB 3.0, although somewhat outdated, suggests that EPUB 3.0 is still not widely supported by major vendors.)

As for SVG and MathML, that kind of inline code is indeed only supported in EPUB 3.0. However, SVG images, that is, SVG files included through the HTML "img" tag, can be used in 2.0, as SVG is one of the OPS Core Media Types.
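
(Editor's note: to make the distinction concrete, here is a minimal Python sketch that packs an EPUB 2.0-style container in which an SVG file is referenced through "img" and declared in the package manifest as image/svg+xml. It uses only the standard library and has nothing to do with how libepubgen itself works. The file names, the title, and the placeholder identifier are made up for illustration, and the result has not been run through a validator such as epubcheck.)

```python
import io
import zipfile

# A tiny stand-alone SVG file; it is referenced from XHTML, not inlined.
SVG = ('<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
       '<circle cx="50" cy="50" r="40"/></svg>')

# The chapter pulls the SVG in through a plain <img> element.
CHAPTER = '''<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><p><img src="figure.svg" alt="A circle"/></p></body>
</html>'''

# META-INF/container.xml points the reading system at the package file.
CONTAINER = '''<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>'''

# The package file: the manifest lists the SVG with its media type.
OPF = '''<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>SVG demo</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="fig1" href="figure.svg" media-type="image/svg+xml"/>
  </manifest>
  <spine toc="ncx"><itemref idref="ch1"/></spine>
</package>'''

# The NCX table of contents that EPUB 2.0 reading systems expect.
NCX = '''<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head><meta name="dtb:uid" content="urn:uuid:00000000-0000-0000-0000-000000000000"/></head>
  <docTitle><text>SVG demo</text></docTitle>
  <navMap>
    <navPoint id="np1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.xhtml"/>
    </navPoint>
  </navMap>
</ncx>'''


def build_epub() -> bytes:
    """Return the bytes of a small EPUB-style ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry must come first and must be stored uncompressed.
        zf.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, data in [
            ("META-INF/container.xml", CONTAINER),
            ("OEBPS/content.opf", OPF),
            ("OEBPS/toc.ncx", NCX),
            ("OEBPS/chapter1.xhtml", CHAPTER),
            ("OEBPS/figure.svg", SVG),
        ]:
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


if __name__ == "__main__":
    with open("svg-demo.epub", "wb") as out:
        out.write(build_epub())
```

(Running the script writes svg-demo.epub to the current directory. The parts relevant to the answer above are the "img" element in the chapter and the image/svg+xml manifest entry.)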

How useful would libepubgen be for a project with its own document model, one that is different from OpenDocument?

I would say that converting the internal document model of an application to the librevenge model and then using libepubgen is simpler than generating EPUB directly, unless the document model is already very close to EPUB. But it is really a matter of tradeoffs.

Libepubgen, being based on librevenge, offers a reasonable (we think :-) API for creating the document, and it shields the user from the gory details of HTML, CSS and EPUB structure. So if the person implementing the export is already familiar with the librevenge API, the libepubgen way seems better. On the other hand, if the implementer has detailed knowledge of HTML, CSS, and EPUB structure, the advantage of libepubgen is much smaller.

Also, using libepubgen means an additional dependency for the project, which might or might not be a problem.

One additional advantage of using libepubgen is that, once one has implemented the conversion from the document model to the librevenge model, it can be reused to export ODF too, through libodfgen.
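
(Editor's note: the reuse argument can be illustrated with a small Python sketch. The application walks its own document model once and emits a stream of calls on an abstract interface; any generator implementing that interface can then produce its own output format. The class and method names below are hypothetical stand-ins chosen for illustration. They are not the actual librevenge, libepubgen, or libodfgen API, which is C++ and considerably richer.)

```python
from abc import ABC, abstractmethod
from html import escape


class TextGenerator(ABC):
    """Hypothetical callback interface that every output backend implements."""

    @abstractmethod
    def start_document(self) -> None: ...

    @abstractmethod
    def open_paragraph(self) -> None: ...

    @abstractmethod
    def insert_text(self, text: str) -> None: ...

    @abstractmethod
    def close_paragraph(self) -> None: ...

    @abstractmethod
    def end_document(self) -> None: ...


class XhtmlGenerator(TextGenerator):
    """Backend that turns the call stream into an XHTML fragment."""

    def __init__(self) -> None:
        self.parts = []

    def start_document(self) -> None:
        self.parts.append("<body>")

    def open_paragraph(self) -> None:
        self.parts.append("<p>")

    def insert_text(self, text: str) -> None:
        self.parts.append(escape(text))  # escape &, <, > for markup

    def close_paragraph(self) -> None:
        self.parts.append("</p>")

    def end_document(self) -> None:
        self.parts.append("</body>")

    def result(self) -> str:
        return "".join(self.parts)


class PlainTextGenerator(TextGenerator):
    """Second backend fed by the same call stream: one line per paragraph."""

    def __init__(self) -> None:
        self.parts = []

    def start_document(self) -> None:
        pass

    def open_paragraph(self) -> None:
        pass

    def insert_text(self, text: str) -> None:
        self.parts.append(text)

    def close_paragraph(self) -> None:
        self.parts.append("\n")

    def end_document(self) -> None:
        pass

    def result(self) -> str:
        return "".join(self.parts)


def emit(paragraphs, generator: TextGenerator) -> None:
    """The application-side conversion, written once for all backends."""
    generator.start_document()
    for paragraph in paragraphs:
        generator.open_paragraph()
        generator.insert_text(paragraph)
        generator.close_paragraph()
    generator.end_document()


if __name__ == "__main__":
    model = ["Fish & chips", "Second paragraph"]
    for backend in (XhtmlGenerator(), PlainTextGenerator()):
        emit(model, backend)
        print(backend.result())
```

(The emit function plays the role of the "document model to librevenge model" conversion: it is the only application-specific code, and adding another output format means adding another generator rather than another conversion.)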

That was the general view. Now, some thoughts about concrete applications. Scribus bundles the external libraries it uses, so the barrier to adding a new one is higher than in other projects.

On the other hand, Franz [Schmid] at least has had some experience with the librevenge API. He has already implemented the conversion from librevenge to the internal model (for the libmspub and libvisio import), so, if I simplify this a bit, it is only a matter of "reversing" the code to go the other way. Also, they have ODG import, but not export. So the possibility of gaining export to two formats with one piece of code might be appealing to them.

What about adding this exporter to LibreOffice?

It is rather easy to add a new library to our build system. The only concern is the increase in size of the installation sets. So the main question is whether it is needed: there already are two extensions that handle conversion to EPUB, so adding a third way to do it might be excessive :-)

There would also have to be someone willing to write the integration code (maybe a GSoC task to write conversion code from the UNO API to the librevenge API for all four kinds of documents librevenge supports, and then use that to provide EPUB export through libepubgen).

About those two other projects — are you planning any sort of collaboration with e.g. eLAIX?

No. But I tried both eLAIX and writer2epub in the past. And I heard from other people who have used them; most of the comments were not flattering...

Anyway, there is a difference in scope and intended use: eLAIX is apparently written in StarBasic, and is therefore very LibreOffice-specific. libepubgen, however, is usable by any tool that can produce the librevenge document stream. This is most easily achieved by plugging one of our import libraries into it, like writerperfect does, but an application can produce it directly too.

So, any application that wishes to export EPUB can do so by producing the document stream. But it also allows applications that can import EPUB to use our import libraries. In that case EPUB is used as an intermediate format, in the same way as LibreOffice and Calligra use ODF (through libodfgen) or Inkscape uses SVG (through the internal librevenge::RVNGSVGDrawingGenerator).

But that is really just a rationalization. What it boils down to is: I thought an EPUB generator was a good idea. I had some time. I did it .-)


Alessandro Rimoldi, who's been working on EPUB exporting in Scribus for the past few years, commented:

Since I have a good understanding of how EPUB, HTML, and CSS work, David himself suggests that his library holds little interest for me.

That said, since Scribus is already using librevenge for reading formats, I think it makes sense to use libepubgen... This is also true for long term maintainability and also for providing ODT export!

So libepubgen won't help me for now, but I think it would probably be a good investment to move to it in the future.

The source code of libepubgen is available on SourceForge. The first official release is expected to be announced later this year.

1 Response. Comments closed for this entry.

  1. Hopefully this helps push towards a single ebook format standard.