Microsoft Publisher support makes its way to LibreOffice
Reading Microsoft Publisher documents in LibreOffice Draw has just become possible thanks to the Google Summer of Code program.
Fridrich Strba has officially announced the public availability of libmspub, a free library for reading and converting MS Publisher documents. The code will first be used in LibreOffice 3.7.
So far the library can read files from Publisher 2003 and newer, with support for bitmaps, basic text formatting (typeface, font size and color), and shapes with fills.
Here is an example from stocklayouts.com opened with LibreOffice Draw:
Some issues with the built-in SVG converter are easy to predict. First of all, SVG doesn't yet have pagination, and according to Tavmjong Bah, Inkscape's representative in the W3C SVG working group, it's a low-priority feature at this point.
SVG also doesn't have a notion of linked text frames, although this could be solved thanks to Adobe's recent work on CSS. And then there is the whole sad story of flowed text in SVG. The example below is a good illustration of that; LibreOffice, by contrast, renders the text in frames just fine.
It is important to note, however, that libmspub's job is only to make sure that as many features of Publisher files as possible are understood, so that anyone can later plug in code for converting those features to SVG. The library will also provide an API for requesting single pages. As for LibreOffice Draw, it simply imports all pages.
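To make that division of labor concrete, here is a minimal sketch in Python. It is not libmspub's actual API (the library's real interface is not shown in this article); every name in it (Shape, Page, Document, page_to_svg, convert) is hypothetical. It only illustrates the idea: a parser fills a format-neutral document model, single pages can be requested from it, and a converter is plugged in separately and run page by page because SVG has no pagination.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Shape:
    """One parsed drawing object (hypothetical model: a filled rectangle)."""
    x: float
    y: float
    width: float
    height: float
    fill: str = "#cccccc"


@dataclass
class Page:
    """One page of the parsed document (hypothetical model)."""
    width: float
    height: float
    shapes: List[Shape] = field(default_factory=list)


@dataclass
class Document:
    """What a parser would hand over once it has understood the file."""
    pages: List[Page] = field(default_factory=list)

    def page(self, index: int) -> Page:
        """Request a single page by zero-based index."""
        return self.pages[index]


def page_to_svg(page: Page) -> str:
    """Pluggable converter: one page in, one standalone SVG string out."""
    rects = "".join(
        f'<rect x="{s.x}" y="{s.y}" width="{s.width}" '
        f'height="{s.height}" fill="{s.fill}"/>'
        for s in page.shapes
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{page.width}" height="{page.height}">{rects}</svg>'
    )


def convert(doc: Document, converter: Callable[[Page], str]) -> List[str]:
    """Run any converter over every page, producing one output per page."""
    return [converter(p) for p in doc.pages]
```

A caller that wants only one page would pass doc.page(0) to page_to_svg, while an importer that takes everything, as LibreOffice Draw does, would walk the whole list of pages.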
This project is being worked on by Brennan Vincent, a Google Summer of Code student who is co-mentored by Fridrich Strba of the LibreOffice team and Valek Filippov of yours truly's re-lab team. Fridrich also keeps working on both Corel DRAW and Visio support in LibreOffice.
The libmspub library is the third collaborative project between LibreOffice and re-lab. Architecturally, it's a lot like the other two libraries and has pretty much the same prerequisites: libwpd, libwpg, writerperfect. All source code is in a public Git repository.
The story of the libmspub project dates back to late 2010, when the Scribus team expressed an interest in at least a basic reverse-engineered specification of Microsoft Publisher files. The re-lab project delivered that, but the Scribus team turned out to be too short-staffed to have a go at a converter.
Hence the work on reverse-engineering .pub was temporarily put on hold. However, the OLE Toy app, which was specifically created for examining .pub files, eventually started supporting all kinds of proprietary file formats such as Visio, Corel DRAW, Macromedia Freehand, etc.
Today OLE Toy is the central part of the reverse-engineering workflow in both teams, and with this GSoC project it's destined to fulfill its original role. Better late than never.