Mobileread
[Plugin] ODTImport
#1  Doitsu 05-25-2016, 04:52 PM
ODTImport: Import ODT documents into Sigil as epubs.
(based on Writer2LaTeX)

Current Version: "0.3.2"

This plugin is a very simple Writer2LaTeX 1.4 wrapper, which allows you to import OpenOffice ODT files.

Credits: Since I'm not that familiar with the ODT file format and Writer2LaTeX, I asked st_albert for help with the configuration and stylesheet files and he kindly provided sample configuration/stylesheet files (config.xml and epub.css) as well as an ODF text document template file (custom-styles.ott).
If you want to test the template file, unzip Custom-Stylesheet.zip and add the custom styles in LibreOffice/OpenOffice via Styles > Load Styles > From file > ODF Text Document Template > custom-styles.ott. This will add a number of custom styles that all start with custom, e.g. custom-body-text.
Note that these styles are not intended as all-purpose styles; they're merely provided to demonstrate how to map custom styles to stylesheet classes.
For example, the following line in config.xml maps the LibreOffice/OpenOffice custom-body-text style to the p.custom-bodytext class in epub.css.
Code
<xhtml-style-map name="custom-body-text" family="paragraph" element="p" css="custom-bodytext" />
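To check which styles a given config.xml maps, the style-map entries can be listed with a few lines of Python. This is only a sketch: the helper name list_style_maps is hypothetical (it is not part of the plugin), and it assumes the config file uses the xhtml-style-map attributes shown above (name, family, element, css).

```python
import xml.etree.ElementTree as ET

def list_style_maps(config_xml):
    """Return (style name, family, CSS selector) tuples for each style map.

    config_xml is the text of a Writer2LaTeX config.xml file.
    """
    root = ET.fromstring(config_xml)
    maps = []
    for node in root.iter('xhtml-style-map'):
        element = node.get('element', '')
        css = node.get('css', '')
        # Assumption: an empty or "(none)" css attribute means no class is attached
        selector = element if css in ('', '(none)') else element + '.' + css
        maps.append((node.get('name'), node.get('family'), selector))
    return maps

sample = '''<config>
<xhtml-style-map name="custom-body-text" family="paragraph" element="p" css="custom-bodytext" />
</config>'''

for name, family, selector in list_style_maps(sample):
    print(name, family, selector)
```

For the sample line this prints the style name, its family and the selector p.custom-bodytext, which is the class that epub.css would need to define.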
If you don't want to use the default config/css files, you can simply delete them from the plugin folder to force Writer2LaTeX to use the default settings. (To display the ODTImport plugin folder select Edit > Preferences > Open Preferences Location.)
In this case you might find the RemoveInLineStyles plugin helpful, which will allow you to convert all inline styles to classes.

Warning: Like all other input plugins this plugin will destroy the contents of the currently loaded ePub; make sure to only run it when no epub is loaded.

System requirements:

If you're using an older Sigil version, you'll need to install a Python interpreter and select its path in the Manage Plugins dialog box. Since Writer2LaTeX is a Java app, you'll also need to install Java, if you haven't already done so, and the Java executable must be on the system path. That is, if you enter java -version in a DOS/terminal window, you should get a version number. Otherwise the plugin won't work.
If you don't get a version number, please read the troubleshooting section of the epubcheck (Java) plugin.
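The same check can be done from Python, which is roughly what the plugin depends on when it launches Java. The following is a minimal sketch, not code from the plugin; the function name java_version is made up for illustration.

```python
import shutil
import subprocess

def java_version(executable='java'):
    """Return the first line of `<executable> -version`, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        return None  # the executable is not on the system path
    try:
        result = subprocess.run([path, '-version'],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)
    except OSError:
        return None  # found, but could not be started
    # Java usually prints its version banner to stderr rather than stdout
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else None

version = java_version()
print(version if version else 'Java not found on the system path')
```

If this reports that Java was not found, the plugin will most likely fail in the same way.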

Installation

1. Select Manage Plugins from the Plugins menu. In the Manage Plugins dialog box, select Use Bundled Python, if it isn't already selected. (If your Sigil version doesn't have a Use Bundled Python option, click one of the Auto buttons to detect the path or Set to manually select the Python interpreter path.)
2. Click Add Plugin and select the downloaded plugin zip file (e.g. ODTImport_v0.1.zip). This will install the plugin, which you can then run via Plugins > input > ODTImport.

(ODTImport_v0.3.2.zip comes with Writer2LaTeX 1.6.)

Troubleshooting: If you get a WindowsError: [Error 2] error message, the Java binary couldn't be found. Installing/updating Java and re-booting your machine should take care of this problem.

License: GNU General Public License v3 (GPL-3)
[zip] ODTImport_v0.1.zip (524.5 KB, 1219 views)
[zip] Custom-Stylesheet.zip (1.7 KB, 1247 views)
[zip] ODTImport_0.3.2.zip (571.9 KB, 1077 views)

#2  st_albert 05-25-2016, 10:11 PM
I just gave it a pretty good workout in 0.9.5 on Kubuntu 16.10 and it works great. I was even able to replace the config.xml and epub.css with my own production versions with no trouble.

This makes it essentially as easy for me to go from LibreOffice to Sigil as it would be to use the LO writer2xhtml extension, but with greater flexibility, and with no worries that the extension stops working when the LO version gets ahead of the writer2xhtml extension version (as has happened from time to time in the past).

For that matter, I suppose the import plugin should work with .odt files from other sources than LibreOffice or OpenOffice, perhaps with some tweaks of the config.xml file.

Great work!

Albert

#3  roger64 05-26-2016, 03:04 AM
Hi

Congratulations!

As a first try, both of you succeeded pretty well. I imported quite a complex odt file with some titles, lots of footnotes... It was converted straight out of the box using the provided "custom" style-sheet.

Now I will make further attempts to adapt it to my own style-sheet and style-mappings. As we do not have the initial front panel that writer2xhtml provides, I think it will probably be necessary to use the metadata editor of Sigil to complete the missing fields we can't fill in LibreOffice.


#4  roger64 05-26-2016, 10:52 AM
I did a second test with the same -complex- odt file.

This time I changed three elements:
- writer2latex.jar for version 1.5.2.alpha which is quite stable BTW
- config.xml from my -renamed- writer2xhtml.xml file
- epub.css from my -renamed- writer2xhtml-styles.css

The result was even better than the previous one because I could use my custom settings.

To get 100% of what I have with writer2xhtml, I should find a way to select some missing options. With writer2xhtml, these are usually selected on the panel (see screenshot). Using writer2latex there is probably a different way to select or express our choice before processing.

It mainly concerns:
- EPUB2 or EPUB3?
- values in px or cm
- one font-family for all the styles (and not for each style)
- Document division (splitting the html file along h1 or h2 headings, etc.)
- producing a toc.ncx

Have you an idea how to do it?
options.png 

#5  Doitsu 05-26-2016, 12:06 PM
Quote roger64
It mainly concerns:
- EPUB2 or EPUB3?
- values in px or cm
- one font-family for all the styles (and not for each style)
- Document division (splitting the html file along h1 or h2 headings, etc.)
- producing a toc.ncx

Have you an idea how to do it?
As I've mentioned in the plugin description, I'm only familiar with the basic features of Writer2LaTeX. Also, using an alpha version in a production environment is generally not advisable.

Hopefully, st_albert will chime in with some helpful feedback.

As for your points:

- EPUB2 or EPUB3?

For ePub3 output you'll need to change the -epub command line parameter to -epub3.

Locate the block that defines the command line arguments in plugin.py and replace it with the following version:

Code
# define command line arguments
if os.path.isfile(w2l_config_path) and os.path.isfile(w2l_css_path):
    args = ['java', '-jar', w2l_path, '-epub3', '-config=' + w2l_config_path, '-stylesheet=' + w2l_css_path, odt_file]
elif os.path.isfile(w2l_config_path) and not os.path.isfile(w2l_css_path):
    args = ['java', '-jar', w2l_path, '-epub3', '-config=' + w2l_config_path, odt_file]
elif not os.path.isfile(w2l_config_path) and os.path.isfile(w2l_css_path):
    args = ['java', '-jar', w2l_path, '-epub3', '-stylesheet=' + w2l_css_path, odt_file]
else:
    args = ['java', '-jar', w2l_path, '-epub3', odt_file]
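
The four branches above differ only in which optional arguments are added, so the same argument list could also be built incrementally. The following is a sketch of that idea, not the plugin's actual code: build_w2l_args and its epub_flag parameter are hypothetical names, while the -epub, -epub3, -config= and -stylesheet= arguments are the ones used in the block above.

```python
import os

def build_w2l_args(w2l_path, odt_file, config_path, css_path, epub_flag='-epub3'):
    """Build the Writer2LaTeX command line, adding optional files only if present."""
    args = ['java', '-jar', w2l_path, epub_flag]
    if os.path.isfile(config_path):
        args.append('-config=' + config_path)
    if os.path.isfile(css_path):
        args.append('-stylesheet=' + css_path)
    # the input document always comes last
    args.append(odt_file)
    return args

print(build_w2l_args('writer2latex.jar', 'book.odt', 'config.xml', 'epub.css'))
```

Passing epub_flag='-epub' would give the ePub2 command line again, so switching between the two formats would need only one change.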


- values in px or cm

Should be enabled with the following line in config.xml:

Code
<option name="convert_to_px" value="true" />
(It was set to false in the sample config file.)

- one font-family for all the styles (and not for each style)

No idea. Why don't you define a document font for the body element?

- Document division (splitting the html file along h1 or h2 headings, etc.)

AFAIK, this is defined via the following config.xml parameter:

Code
<option name="split_level" value="1" />
1 = h1; change this to the desired heading level number.

- producing a toc.ncx

Should be enabled by the following option:

Code
<option name="include_ncx" value="true" />
Note that I haven't tested these settings with the alpha version.
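
If you would rather not edit config.xml by hand, the option values mentioned in this post could be changed with a small script. This is a rough sketch under the assumption that config.xml consists of option elements with name and value attributes, as in the lines quoted above; set_options is a made-up helper name and is not part of the plugin or of Writer2LaTeX.

```python
import xml.etree.ElementTree as ET

def set_options(config_xml, **options):
    """Return config_xml with the given <option> values set (added if missing)."""
    root = ET.fromstring(config_xml)
    for name, value in options.items():
        for node in root.iter('option'):
            if node.get('name') == name:
                node.set('value', value)
                break
        else:
            # the option is not present yet: append a new element
            ET.SubElement(root, 'option', {'name': name, 'value': value})
    return ET.tostring(root, encoding='unicode')

sample = '<config><option name="convert_to_px" value="false" /></config>'
updated = set_options(sample, convert_to_px='true', split_level='1', include_ncx='true')
print(updated)
```

Note that ElementTree's default parser discards XML comments, so a config file rewritten this way would lose any comments it contained.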

#6  roger64 05-26-2016, 02:26 PM
@Doitsu
Thanks very much for your detailed answer. I will try all this tomorrow.

I already select a font-family in the body rule of epub.css, but that's not enough to get rid of all the unwanted font-family declarations which are set in the individual styles. It's the same with writer2xhtml, so there must be an option somewhere to request this explicitly.

As for this alpha version, I have been using it without any trouble for nearly a year... as a hobbyist.

#7  st_albert 05-26-2016, 02:40 PM
I think Doitsu covered most everything I know about configurations. As for controlling splits in the epub, I will add that there is another config option which more-or-less determines how to handle explicit page breaks in the ODF document. It is
Code
 <option name="page_break_split" value="explicit" />
I say "more or less" because in my experience it seems to interact with the "split level" option, and with whether or not the specified heading level includes a page-break option in the LO style definition. If you find you're getting blank pages in the epub, or you are not getting splits where you think you should, tweak these values.

@roger64: By the way, what does ver 1.5.2-alpha bring to the table vs. 1.4?

I know both Doitsu and Roger64 are already familiar with this, but for those new to writer2latex, here is the user manual:
[zip] user-manual.odt.zip (56.3 KB, 564 views)

#8  st_albert 05-26-2016, 08:45 PM
Oh, and as for metadata, I see from the user manual that there is a way to add it to the .odt file via the writer2xhtml extension, but I don't see how to do so if using the stand-alone (and therefore, the Sigil plug-in).

This is an area where I'm blind. I never used it in the LO/OO extension, because I just edit the content.opf directly in Sigil. I gave my boss a "skeleton" file with blank metadata entries that we need to use. For a given book, she fills in what she wants, saves it as a .txt file, and I copy and paste it into content.opf as needed. There is a lot of metadata included. But, as JSWolf will no doubt point out, no one makes use of it at present. My boss wants it there, though, for the future.

My feeling is, if you will excuse a slightly off-color analogy, that it is like losing bladder control while wearing a dark suit. It gives you a nice warm feeling, but nobody notices.

That said, if anyone finds the way to include metadata via the plugin, I'd like to know how.

#9  roger64 05-26-2016, 09:02 PM
Up to now (till I find another odt file with more complex and untried features), everything seems to be working fine.

Two things are missing but can be easily dealt with:
1. There seems to be no way to automatically add "resources" (fonts we wish to embed). We can, however, automatically insert any @font-face declaration.
2. There is no metadata editor in the plugin, but there is no real need for one (see further down).


The missing option I was looking for was:
Code
 <option name="use_default_font" value="true" />
@st_albert
road map
Henrik Just published his road map from 1.4 to 1.6.

For writer2xhtml, the two planned items have been implemented in version 1.5.2.alpha:
- Support for EPUB3 (including metadata)
- A toolbar for launching writer2xhtml or writer2latex from LibreOffice

For writer2latex, I do not know, since this is the first day I've used it.

metadata editor
Of course, since Sigil has implemented a brand new metadata editor, it seems superfluous to duplicate it in the ODT plugin.

writer2xhtml has one which is useful because it's used within LibreOffice (on the lower part of the screenshot above, click on "Edit document properties").

user manual
I was aware of the existence of this manual, but every time I tried to read it, I found it overwhelming.

#10  roger64 05-30-2016, 12:54 PM
Using this plugin, I had a closer look at my config.xml file (coming from writer2xhtml). It was a total mess.

So I reordered all its options following the order of the user manual. I added the names of each group, using also the names provided by the user manual.

Code
<?xml version="1.0" encoding="UTF-8" ?>
<config>
<!-- Style options -->
<option name="template_ids" value=",,," />
<option name="pretty_print" value="true" />
<option name="no_doctype" value="false" />
<option name="encoding" value="UTF-8" />
<option name="hexadecimal_entities" value="true" />
<option name="use_named_entities" value="false" />
<option name="add_bom" value="false" />
<option name="multilingual" value="false" />
<option name="separate_stylesheet" value="false" />
<option name="custom_stylesheet" value="" />
<!-- Control the conversion of formatting -->
<option name="formatting" value="ignore_hard" />
<option name="frame_formatting" value="ignore_hard" />
<option name="section_formatting" value="convert_all" />
<option name="table_formatting" value="convert_all" />
<option name="table_size" value="relative" />
<option name="list_formatting" value="css1" />
<option name="tabstop_style" value="2em" />
<option name="use_default_font" value="true" />
<option name="default_font_name" value="Linux Libertine O" />
<!-- Handling of dimensions -->
<option name="convert_to_px" value="false" />
<option name="scaling" value="100%" />
<option name="column_scaling" value="100%" />
<option name="image_size" value="relative" />
<option name="relative_font_size" value="true" />
<option name="font_scaling" value="100%" />
<!-- Options for special content -->
<option name="formulas" value="image+starmath" />
<option name="use_mathjax" value="false" />
<option name="embed_svg" value="true" />
<option name="embed_img" value="false" />
<option name="endnotes_heading" value="Notes" />
<option name="footnotes_heading" value="Notes" />
<option name="use_dublin_core" value="true" />
<option name="notes" value="true" />
<option name="display_hidden_text" value="false" />
<option name="include_toc" value="true" />
<option name="include_ncx" value="true" />
<option name="float_objects" value="true" />
<!-- AutoCorrect options -->
<option name="ignore_double_spaces" value="false" />
<option name="ignore_empty_paragraphs" value="false" />
<option name="ignore_hard_line_breaks" value="false" />
<!-- File options -->
<option name="external_toc_depth" value="3" />
<option name="split_level" value="2" />
<option name="repeat_levels" value="5" />
<option name="page_break_split" value="styles" />
<option name="split_after" value="0" />
<option name="image_split" value="none" />
<option name="cover_image" value="true" />
<option name="save_images_in_subdir" value="false" />
<option name="uplink" value="" />
<!-- Options specific for spreadsheet documents -->
<option name="calc_split" value="false" />
<option name="display_hidden_sheets" value="false" />
<option name="display_hidden_rows_cols" value="false" />
<option name="display_filtered_rows_cols" value="false" />
<option name="apply_print_ranges" value="false" />
<option name="use_title_as_heading" value="true" />
<option name="use_sheet_names_as_headings" value="true" />
<!-- Options for batch conversion -->
<option name="directory_icon" value="" />
<option name="document_icon" value="" />
<!-- Style maps -->
<xhtml-style-map after="" before="" block-css="(none)" block-element="dl" css="(none)" element="dt" family="paragraph" name="List Heading" />
<xhtml-style-map after="" before="" block-css="(none)" block-element="" css="(none)" element="address" family="paragraph" name="Sender" />
<xhtml-style-map after="" before="" block-css="(none)" block-element="dl" css="(none)" element="dd" family="paragraph" name="List Contents" />
<xhtml-style-map after="" before="" block-css="(none)" block-element="blockquote" css="(none)" element="p" family="paragraph" name="Quotations" />
<xhtml-style-map after="" before="" block-css="(none)" block-element="" css="(none)" element="hr" family="paragraph" name="Horizontal Line" />
<xhtml-style-map after="" before="" block-css="(none)" block-element="" css="(none)" element="pre" family="paragraph" name="Preformatted Text" />
<xhtml-style-map after="" before="" block-css="(none)" block-element="" css="(none)" element="p" family="paragraph" name="Text body" />
<xhtml-style-map after="" before="" css="(none)" element="em" family="text" name="Emphasis" />
<xhtml-style-map after="" before="" css="(none)" element="var" family="text" name="Variable" />
<xhtml-style-map after="" before="" css="(none)" element="code" family="text" name="Source Text" />
<xhtml-style-map after="" before="" css="(none)" element="strong" family="text" name="Strong Emphasis" />
<xhtml-style-map after="" before="" css="(none)" element="kbd" family="text" name="User entry" />
<xhtml-style-map after="" before="" css="(none)" element="samp" family="text" name="Example" />
<xhtml-style-map after="" before="" css="(none)" element="dfn" family="text" name="Definition" />
<xhtml-style-map after="" before="" css="(none)" element="cite" family="text" name="Citation" />
<xhtml-style-map after="" before="" css="(none)" element="tt" family="text" name="Teletype" />
<xhtml-style-map css="(none)" element="sup" family="attribute" name="superscript" />
<xhtml-style-map css="(none)" element="b" family="attribute" name="bold" />
<xhtml-style-map css="(none)" element="i" family="attribute" name="italics" />
</config>

The option values are mine and can of course be modified. Consult the user manual from page 25 onward.