Mobileread
mobi2oeb
#11  kovidgoyal 03-04-2008, 07:24 PM
I appreciate the gesture, but I have to say I like 'em with a leetle more meat on the bones

#12  brecklundin 03-07-2008, 01:12 AM
your wish is our command oh great code breaker...


#13  IceHand 03-07-2008, 09:41 AM
Nice work, thanks! One question though: is it normal that the exploded html file has only three lines? Line one is always "<html><head>", line two is "<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />" and line three is the rest. It's no problem to make some breaks with par, but the resulting html code is not very clearly arranged for manual editing.

#14  llasram 03-07-2008, 10:56 AM
Quote IceHand
Nice work, thanks! One question though: is it normal that the exploded html file has only three lines? Line one is always "<html><head>", line two is "<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />" and line three is the rest. It's no problem to make some breaks with par, but the resulting html code is not very clearly arranged for manual editing.
All of the pre-.epub HTML-based e-book formats seem to do this – strip out all “unnecessary” whitespace to save space. ConvertLIT tries to fix this for LIT files by adding whitespace to the generated HTML, but it gets it wrong often enough to be troublesome. For adding whitespace and otherwise cleaning up grody HTML, check out HTML Tidy.

#15  IceHand 03-07-2008, 11:33 AM
Thanks for the tip, but I already knew of HTML Tidy and it won't generate a cleaned up version if the source file has errors – which includes most exploded Mobipocket html files.

Anyway, I had a closer look at the html code and it seems that running a search and replace for "> <" with ">\n<" does the trick. Maybe an idea for the next mobi2oeb version?

#16  kovidgoyal 03-07-2008, 03:20 PM
Quote IceHand
Thanks for the tip, but I already knew of HTML Tidy and it won't generate a cleaned up version if the source file has errors – which includes most exploded Mobipocket html files.

Anyway, I had a closer look at the html code and it seems that running a search and replace for "> <" with ">\n<" does the trick. Maybe an idea for the next mobi2oeb version?
That's not quite safe, what if you have something like
Code
<font size=4>W</font><font size=2>ord</font>

#17  IceHand 03-07-2008, 04:36 PM
Quote kovidgoyal
That's not quite safe, what if you have something like
Code
<font size=4>W</font><font size=2>ord</font>
Then nothing will happen for that line. Only "> <" (with a space between) would be replaced with ">\n<" (a line break), which gives the same output.

>< with no space between should of course not be separated by a line break.

#18  kovidgoyal 03-07-2008, 07:26 PM
Are there spaces in the output HTML? Seems odd there would be, if the creation tools are stripping unneeded whitespace characters.

#19  IceHand 03-08-2008, 06:59 AM
Yes, there are. To me it doesn't look like the creation tools are stripping unneeded whitespace characters. Rather, either they are converting line breaks to spaces (which would seem odd to me) or the script used for exploding to html misinterprets line breaks as spaces (that's only a guess, of course).

Here's a small sample output from mobi2oeb for a self-made mobi file. Notice that wherever there is "> <", there should have been a line break:

Code
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<guide></guide></head><body><br/><br/> <h1 align="center"><b>Book Title</b></h1> <br/> <h2 align="center">Author Name</h2> </body></html>
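
For anyone who wants to try the replacement on their own files in the meantime, here is a minimal Python sketch of the "> <" to ">\n<" idea applied to the third line of the sample above. The function name break_between_tags is made up for illustration, and this is not how mobi2oeb itself does it or will do it. It only touches places where a space already sits between two tags, so adjacent inline tags such as the font example earlier in the thread are left alone.

```python
def break_between_tags(html: str) -> str:
    """Replace every "> <" (tag end, one space, tag start) with ">\\n<".

    Hypothetical helper for illustration only. A space and a line break
    between tags normally render the same, so the output should look
    identical in a reader while being easier to edit by hand.
    """
    return html.replace("> <", ">\n<")


# Third line of the sample output quoted in this post.
sample = (
    '<guide></guide></head><body><br/><br/> <h1 align="center">'
    '<b>Book Title</b></h1> <br/> <h2 align="center">Author Name</h2> '
    '</body></html>'
)

# Adjacent inline tags with no space between them (the case raised earlier).
inline = "<font size=4>W</font><font size=2>ord</font>"

if __name__ == "__main__":
    print(break_between_tags(sample))
    # The inline example contains no "> <", so it comes back unchanged.
    print(break_between_tags(inline) == inline)
```

One caveat: inside a <pre> block a space and a line break are not equivalent, so a plain string replace like this could change the rendering there.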

#20  kovidgoyal 03-08-2008, 02:39 PM
OK will be in next release.
