Mobileread
Failure to convert from mobi to epub
#1  jlmwrite 01-06-2011, 06:30 PM
Long-time Calibre user here. Anyway, I recently converted all 700+ books from mobi to epub -- using v 0.7.37 -- and 4 of them failed to convert. I've never had Calibre fail me before, and I thought I'd post to see if there are any suggestions.

"Grapes of Wrath" was one of the books that failed, so I took it into my head to mess with it specifically again today. Converted it from mobi to mobi (thinking there might be a problem with the original non-DRM file) then took that output and converted it to epub. It still failed with this error detail:
ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (The Grapes of Wrath)

Convert book 1 of 1 (The Grapes of Wrath)
Resolved conversion options
calibre version: 0.7.37
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0.0,
'book_producer': None,
'change_justification': u'original',
'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']",
'chapter_mark': u'pagebreak',
'comments': None,
'cover': 'c:\\users\\lee\\appdata\\local\\temp\\calibre_0.7.37_tmp_vit7yl\\calibre_0.7.37_6rqesh.jpeg',
'debug_pipeline': None,
'disable_font_rescaling': False,
'dont_split_on_page_breaks': False,
'epub_flatten': False,
'extra_css': None,
'extract_to': None,
'flow_size': 260,
'font_size_mapping': None,
'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x056D6CD0>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0.0,
'linearize_tables': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.NookColorOutput object at 0x056E6090>,
'page_breaks_before': u"//*[name()='h1' or name()='h2']",
'prefer_metadata_cover': False,
'preprocess_html': False,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': 'c:\\users\\lee\\appdata\\local\\temp\\calibre_0.7.37_tmp_vit7yl\\calibre_0.7.37_hbw8g8.opf',
'remove_first_image': False,
'remove_footer': False,
'remove_header': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'series': None,
'series_index': None,
'smarten_punctuation': False,
'tags': None,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: MOBI Input running
on C:\Users\Lee\Calibre Library\John Steinbeck\The Grapes of Wrath (253)\The Grapes of Wrath - John Steinbeck.mobi
Extracting text...
Adding anchors...
Extracting images...
Cleaning up HTML...
Parsing HTML...
Converting style information to CSS...
Creating OPF...
Parsing all content...
Parsing The_Grapes_of_Wrath.html ...
Forcing The_Grapes_of_Wrath.html into XHTML namespace
Parsing styles.css ...
Generating default TOC from spine...
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Cleaning up manifest...
Trimming unused files from manifest...
Parsing stylesheet.css ...
Trimming 'images/00001.jpg' from manifest
Trimming 'images/00002.jpg' from manifest
Creating EPUB Output...
Splitting on page-break
Splitting on page-break
Looking for large trees in The_Grapes_of_Wrath.html...
Found large tree #1
Splitting...
Split point: {http://www.w3.org/1999/xhtml}div /*/*[2]/*[5539]
Split tree too small
Splitting...
Split point: {http://www.w3.org/1999/xhtml}div /*/*[2]/*[5538]
Split tree too small
Splitting...
Python function terminated unexpectedly
Could not find reasonable point at which to split: The_Grapes_of_Wrath.html Sub-tree size: 1843 KB (Error Code: 1)
Traceback (most recent call last):
File "site.py", line 103, in main
File "site.py", line 85, in run_entry_point
File "site-packages\calibre\utils\ipc\worker.py", line 107, in main
File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
File "site-packages\calibre\ebooks\conversion\plumber.py", line 965, in run
File "site-packages\calibre\ebooks\epub\output.py", line 169, in convert
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 56, in __call__
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 66, in split_item
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 188, in __init__
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 392, in split_to_size
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 392, in split_to_size
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 385, in split_to_size
calibre.ebooks.oeb.transforms.split.SplitError: Could not find reasonable point at which to split: The_Grapes_of_Wrath.html Sub-tree size: 1843 KB

#2  kovidgoyal 01-06-2011, 07:19 PM
Use the preprocessing option under structure detection.

#3  jlmwrite 01-06-2011, 07:50 PM
Hi,

Thanks for the hint. I did that, but it still failed with:
ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (The Grapes of Wrath)

Convert book 1 of 1 (The Grapes of Wrath)
Resolved conversion options
calibre version: 0.7.37
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0.0,
'book_producer': None,
'change_justification': u'original',
'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']",
'chapter_mark': u'pagebreak',
'comments': None,
'cover': 'c:\\users\\lee\\appdata\\local\\temp\\calibre_0.7.37_tmp_vit7yl\\calibre_0.7.37_wk4wwt.jpeg',
'debug_pipeline': None,
'disable_font_rescaling': False,
'dont_split_on_page_breaks': False,
'epub_flatten': False,
'extra_css': None,
'extract_to': None,
'flow_size': 260,
'font_size_mapping': None,
'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x056D6CD0>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0.0,
'linearize_tables': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.NookColorOutput object at 0x056E6090>,
'page_breaks_before': u"//*[name()='h1' or name()='h2']",
'prefer_metadata_cover': False,
'preprocess_html': True,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': 'c:\\users\\lee\\appdata\\local\\temp\\calibre_0.7.37_tmp_vit7yl\\calibre_0.7.37_zbw47m.opf',
'remove_first_image': False,
'remove_footer': False,
'remove_header': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'series': None,
'series_index': None,
'smarten_punctuation': False,
'tags': None,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: MOBI Input running
on C:\Users\Lee\Calibre Library\John Steinbeck\The Grapes of Wrath (253)\The Grapes of Wrath - John Steinbeck.mobi
Extracting text...
Adding anchors...
Extracting images...
Cleaning up HTML...
Parsing HTML...
Converting style information to CSS...
Creating OPF...
Parsing all content...
Parsing The_Grapes_of_Wrath.html ...
Forcing The_Grapes_of_Wrath.html into XHTML namespace
Parsing styles.css ...
Generating default TOC from spine...
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Cleaning up manifest...
Trimming unused files from manifest...
Parsing stylesheet.css ...
Trimming 'images/00001.jpg' from manifest
Trimming 'images/00002.jpg' from manifest
Creating EPUB Output...
Splitting on page-break
Splitting on page-break
Looking for large trees in The_Grapes_of_Wrath.html...
Found large tree #1
Splitting...
Split point: {http://www.w3.org/1999/xhtml}div /*/*[2]/*[5539]
Split tree too small
Splitting...
Split point: {http://www.w3.org/1999/xhtml}div /*/*[2]/*[5538]
Split tree too small
Splitting...
Python function terminated unexpectedly
Could not find reasonable point at which to split: The_Grapes_of_Wrath.html Sub-tree size: 1843 KB (Error Code: 1)
Traceback (most recent call last):
File "site.py", line 103, in main
File "site.py", line 85, in run_entry_point
File "site-packages\calibre\utils\ipc\worker.py", line 107, in main
File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
File "site-packages\calibre\ebooks\conversion\plumber.py", line 965, in run
File "site-packages\calibre\ebooks\epub\output.py", line 169, in convert
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 56, in __call__
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 66, in split_item
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 188, in __init__
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 392, in split_to_size
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 392, in split_to_size
File "site-packages\calibre\ebooks\oeb\transforms\split.py", line 385, in split_to_size
calibre.ebooks.oeb.transforms.split.SplitError: Could not find reasonable point at which to split: The_Grapes_of_Wrath.html Sub-tree size: 1843 KB

#4  DoctorOhh 01-06-2011, 08:18 PM
Quote jlmwrite
Could not find reasonable point at which to split: The_Grapes_of_Wrath.html Sub-tree size: 1843 KB (Error Code: 1)

calibre.ebooks.oeb.transforms.split.SplitError: Could not find reasonable point at which to split: The_Grapes_of_Wrath.html Sub-tree size: 1843 KB
Calibre could not find a place to split the HTML into bite-sized chunks. The split is needed so that, once converted to EPUB, the internal HTML files won't choke the reader. Typically the chunks have to be less than 300 KB. By default calibre aims for less than 260 KB.
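To see which internal files are over the limit, you can inspect an EPUB directly, since it is a zip archive. The following is a minimal sketch using only the Python standard library; the function name `oversized_html` is made up for illustration, and the 260 KB default simply mirrors the `flow_size` value shown in the log. It measures uncompressed size and is not how calibre itself decides where to split.

```python
import zipfile

FLOW_SIZE_KB = 260  # matches 'flow_size': 260 in the conversion log


def oversized_html(epub_path, limit_kb=FLOW_SIZE_KB):
    """Return (name, size_kb) for each HTML file in the EPUB above limit_kb.

    Sizes are uncompressed sizes, rounded to the nearest KB.
    """
    results = []
    with zipfile.ZipFile(epub_path) as zf:
        for info in zf.infolist():
            if info.filename.lower().endswith((".html", ".xhtml", ".htm")):
                size_kb = info.file_size / 1024
                if size_kb > limit_kb:
                    results.append((info.filename, round(size_kb)))
    return results
```

Calling `oversized_html("book.epub")` on a converted book returns an empty list when every internal HTML file is within the limit.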

You can run the conversion with a directory set in the Debug section of the conversion dialog. This will write the HTML as it stands at various points in the process into that directory.
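The same debug output is available from the command line through `ebook-convert` and its `--debug-pipeline` option. Here is a small sketch that builds such a command from Python; the file and directory names are placeholders, and the helper names are made up for illustration.

```python
import subprocess


def debug_convert_command(src, dst, debug_dir):
    """Build an ebook-convert call that dumps pipeline stages into debug_dir."""
    return ["ebook-convert", src, dst, "--debug-pipeline", debug_dir]


def run(cmd):
    """Run the command; needs calibre's command-line tools on PATH."""
    subprocess.run(cmd, check=True)


# Placeholder names; substitute your own book and output folder.
cmd = debug_convert_command("book.mobi", "book.epub", "debug_out")
print(" ".join(cmd))
```

Pass `cmd` to `run()` to actually perform the conversion.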

Open up the first (input directory?) HTML file in Sigil, then insert chapter breaks to split up the HTML. Then save the file as an EPUB. I then run this EPUB through an EPUB-to-EPUB conversion in calibre to add the cover and book jacket and to get the font size, indents and spacing the way I like them.

If you try this you can further inquire in the Sigil forum.

#5  theducks 01-06-2011, 09:12 PM
Hint: temporarily raise the 260K limit to about 400K.
Convert
Now set it back to 260K
Convert EPUB to EPUB
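For anyone using the command line instead of the GUI, the two passes above might look like the sketch below. It assumes `ebook-convert` is on the PATH and uses its `--flow-size` option (in KB) for EPUB output; the file names and the helper name `two_pass_commands` are placeholders for illustration.

```python
def two_pass_commands(src, intermediate, final, raised_kb=400, normal_kb=260):
    """Return the two ebook-convert calls for the raise-then-reset approach.

    Pass 1 converts with a raised split limit so the conversion can finish.
    Pass 2 converts EPUB to EPUB with the normal limit restored.
    """
    first = ["ebook-convert", src, intermediate, "--flow-size", str(raised_kb)]
    second = ["ebook-convert", intermediate, final, "--flow-size", str(normal_kb)]
    return [first, second]


# Placeholder file names; substitute your own.
for cmd in two_pass_commands("book.mobi", "book_pass1.epub", "book.epub"):
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` in order; whether the second pass succeeds still depends on the book having usable split points.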

#6  jlmwrite 01-06-2011, 09:13 PM
I took the coward's way out and found another copy of the book. Edited the metadata slightly, changed to a better cover, and it converted to epub without a hiccup.

Interesting that the original file sizes were fairly similar; I still don't understand why calibre choked on the first one but not the second. Anyway, problem sidestepped...

#7  DoctorOhh 01-06-2011, 09:27 PM
Quote jlmwrite
I took the coward's way out and found another copy of the book. Edited the metadata slightly, changed to a better cover, and it converted to epub without a hiccup.
Best choice.

Quote jlmwrite
Interesting that the original file sizes were fairly similar;
Book size is immaterial. Your first book had a large expanse of HTML without any natural breaks.

Quote jlmwrite
I still don't understand why calibre choked on the first one but not the second. Anyway, problem sidestepped...
The first book was badly formatted.

#8  jlmwrite 01-06-2011, 09:56 PM
Quote dwanthny
Best choice. Book size is immaterial. Your first book had a large expanse of html without any natural breaks. The first book was badly formatted.
Ahhhh -- that makes sense. Thanks!

BTW, tried the tips posted earlier on the other books that had previously refused to convert. Two converted, and one refuses to convert, no matter what I try. Off to the wild blue yonder to try and find another copy of it.

#9  theducks 01-06-2011, 10:06 PM
Quote jlmwrite
Ahhhh -- that makes sense. Thanks!

BTW, tried the tips posted earlier on the other books that had previously refused to convert. Two converted, and one refuses to convert, no matter what I try. Off to the wild blue yonder to try and find another copy of it.
Remember to do the second conversion or your reader may choke on too large a file.

#10  DoctorOhh 01-06-2011, 11:12 PM
Quote jlmwrite
BTW, tried the tips posted earlier on the other books that had previously refused to convert. Two converted, and one refuses to convert, no matter what I try.
If you can't find another source file, what I wrote in post 4 works every time.