Mobileread
KindleUnpack (MobiUnpack): Extracts text, images and metadata from Kindle/Mobi files
#511  pdurrant 03-13-2013, 05:32 AM
I shall have to leave detailed discussion of DATP sections to KevinH and DiapDealer.

Nice to see you back at MobileRead. Thanks again for the original code that's been developed into KindleUnpack.

#512  Hitch 03-13-2013, 06:51 PM
Quote adamselene
So I actually decided to try the splitting feature, to see how much space it would save on a Kindle loaded with converted ePub books. The answer turned out to be about 11.5% on my corpus of 1277 books. But that's not what this post is about…

I compared a KF8 stripped by KindleUnpack with one generated from the same original file by Amazon's Personal Document Service, and found that, aside from some minor changes in the metadata (including addition of the atv:kin:1 tag that they harvest and upload to track documents), it does something different with DATP sections near the end of the document. There are two in the original file and the KindleUnpack KF8, but one is removed in the Amazon KF8, and it's put in a slightly different location in the file. I wonder whether this matters. Do you have any idea what this section is for? It looks like a table of offsets.
Hi:

Most of this is beyond me, but if I may, you can't actually "generate" a K8 file from Amazon's PDS. If you email a file to your own Kindle addy, or use the PDS in any other way, what you get back is not a K8; it's the old mobi (prc) format. So, you can't compare apples-to-apples (in any sense) for an actual K8 created by KindleGen/KP versus the "mobi" (prc) file that you'll get from PDS. You can test this yourself by sending a K8 with, say, an embedded font to the PDS--what you get back will be equivalent to a book made with MBPC. Then sideload the same K8 file with an embedded font to a Fire device directly, either by USB or via an actual (not faux) wifi connection. The "send to Kindle by wifi" prompt you can see on your computer does not use Wifi; it emails the document/book via the PDS. So when I say "wifi," I mean an app like "Wifi File Explorer," which is genuine wifi. You'll see the difference; the USB or wifi-d book will have the embedded font; the PDS book will not.

Hope that helps. The DATP stuff is too deep for yours truly, but I thought before you tried to sort this, you should use files that are equivalent.

Hitch

#513  adamselene 03-13-2013, 07:06 PM
Quote Hitch
Most of this is beyond me, but if I may, you can't actually "generate" a K8 file from Amazon's PDS. If you email a file to your own Kindle addy, or use the PDS in any other way, what you get back is not a K8; it's the old mobi (prc) format.
That turns out not to be true. I emailed some combo KF7/KF8 files made by KindleGen, and PDS turned them into standalone KF8s for transfer to the Kindle Paperwhite.

I note that the file size listed on Amazon site indicates that it stored a combo file in the cloud. I haven't tested downloading to an older device without KF8, but presumably it gets stripped to KF7.

#514  adamselene 03-13-2013, 07:16 PM
Quote adamselene
I haven't tested downloading to an older device without KF8, but presumably it gets stripped to KF7.
I just tested that, and it works as I expected. In this case, the only difference from the KindleUnpacked version is in the metadata; all the data sections are identical to the Amazon stripped version.

#515  Hitch 03-13-2013, 08:08 PM
Quote adamselene
That turns out not to be true. I emailed some combo KF7/KF8 files made by KindleGen, and PDS turned them into standalone KF8s for transfer to the Kindle Paperwhite.

I note that the file size listed on Amazon site indicates that it stored a combo file in the cloud. I haven't tested downloading to an older device without KF8, but presumably it gets stripped to KF7.
Well, if that's true, that's new. As of merely 6 weeks ago, the K8 files that were emailed via PDS were still being converted as if they were K7 (or K6, or whatever). The files mailed to my Fire were not converting properly, and I discussed this with Amazon, and my Tech. Account Manager, in...the end of January, I think it was. I'll check it again.

ETA: Yup--I just sent a K8-formatted book to my Fire, and now it's working. That's very cool, thank you for this discussion--I wouldn't have found out for ages, given that I'd stopped using the PDS for this very reason. (That, and it's faster to just wifi it, but, still...this way I can tell my clients to email the files to their devices. It will save me untold brain-damage. COOL!)

Hitch

#516  adamselene 03-13-2013, 09:00 PM
Well, it would hardly be the first time Amazon quietly changed something without bothering to tell anyone.

Transferring over USB is why I wanted to strip the files myself. Using PDS has the advantage that more content can be kept in the cloud and fetched from the device, and it syncs reading location. (The latter two things don't seem to work on my Kindle 2, but content can be pushed from the web site.)

#517  Hitch 03-14-2013, 06:35 AM
Quote adamselene
Well, it would hardly be the first time Amazon quietly changed something without bothering to tell anyone.

Transferring over USB is why I wanted to strip the files myself. Using PDS has the advantage that more content can be kept in the cloud and fetched from the device, and it syncs reading location. (The latter two things don't seem to work on my Kindle 2, but content can be pushed from the web site.)
No: it certainly wouldn't (be the first time Amazon changed something...);

I could harangue for days over the horsepucky with the SRL change in/around December, which doesn't show up until after the Publishing Workflow (in other words, after the book is put on sale)...and then only in books for which there's no discernible or describable or document-able criterion. I've had not less than 20 back and forth emails with the Mgr of Digital Operations about this one, because it's just WHACK.

Anyway, though: thanks again. I really wouldn't have found out for ages, simply because it's not a method we ever used a lot, and on the rare occasions we did, post the advent of K8, the doc conversion was still old-school.

Hitch

#518  nickredding 03-16-2013, 10:34 PM
I'm occasionally getting a codec error unpacking calibre-generated news downloads:
Code
...
Write ncx
Find link anchors
Insert data into html
Insert hrefs into html
Remove empty anchors from html
Insert image references into html
Write opf
Error: 'ascii' codec can't decode byte 0xe2 in position 84: ordinal not in range(128)
Error: Unpacking Failed
I can't determine where in the unpacking code this is happening (I'm assuming it's in WriteOPF since there is no OPF file after this crash).

It would be nice if KindleUnpack would report (including where the offending byte data is) and then ignore this type of error and carry on instead of terminating. I suppose it's possible there is an error somewhere in calibre, but the resulting files work fine on kindles, ipads, etc., so whatever it is it's harmless, and anyway there is no way to figure out where the issue might be in calibre without some useful information from KindleUnpack.
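To make the "report and carry on" idea concrete, here is a minimal sketch. It is not KindleUnpack code: the helper name `decode_reporting` and its parameters are made up for illustration, and it is written for Python 3. It relies only on the standard attributes of `UnicodeDecodeError` (`start`, `end`, `reason`) to say where the offending byte is, then falls back to a lossy decode instead of terminating.

```python
def decode_reporting(raw, encoding="ascii", context=16):
    """Hypothetical helper: decode raw bytes, and on failure report
    where the offending byte sits, then continue with a lossy decode."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as err:
        # err.start / err.end delimit the undecodable bytes in raw.
        lo = max(err.start - context, 0)
        hi = min(err.end + context, len(raw))
        report = "%s at offset %d; nearby bytes: %r" % (
            err.reason, err.start, raw[lo:hi])
        # Undecodable bytes become U+FFFD so processing can go on.
        return raw.decode(encoding, errors="replace"), report


# Example input: a title containing a UTF-8 right single quote (E2 80 99).
text, report = decode_reporting(b"Title: \xe2\x80\x99")
if report is not None:
    print("Warning:", report)
```

The report carries the byte offset and a slice of the surrounding data, which would be enough to track the problem back to a specific title or link target.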

Normally I would try to isolate the issue in KindleUnpack myself, but the code has changed and grown so much since I last worked on it that it would be a major project for me to get back into it. Hopefully, someone who is up to speed on the current code can deal with this.

#519  KevinH 03-16-2013, 11:28 PM
Hi Nick,
That error typically can be generated deep inside the python library code when unicode data is passed between threads but somehow the default python encoding is used and on some platforms this is ascii which causes an error. I thought all of those were fixed in the very latest version of KindleUnpack. Perhaps not. Or perhaps some full unicode data is used in a filename or book title or link target, that should have been properly converted to utf-8 before being written to the opf. Either way please post a zip archive of the problem news feed ebook and I will try to track down what is happening and get it fixed.
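As a rough illustration of the failure mode (a sketch in Python 3, not the actual KindleUnpack code path; the sample title and the `<dc:title>` snippet are made up): byte 0xe2 is the first byte of many UTF-8 punctuation characters such as curly quotes, so decoding such bytes as ascii fails, while an explicit utf-8 decode and encode round-trips cleanly.

```python
# A title with a curly apostrophe, stored as UTF-8 bytes: b'It\xe2\x80\x99s'
title_bytes = "It\u2019s".encode("utf-8")

# Decoding as ascii (what an implicit default-encoding conversion
# amounts to on platforms where the default is ascii) fails on 0xe2.
try:
    title_bytes.decode("ascii")
    message = None
except UnicodeDecodeError as err:
    message = str(err)

# Being explicit about utf-8 on the way in and on the way out avoids it.
title = title_bytes.decode("utf-8")
opf_bytes = ("<dc:title>%s</dc:title>" % title).encode("utf-8")

print(message)
```

The point is only that every conversion between bytes and text names its encoding explicitly, so nothing falls back to a platform default.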

KevinH

#520  nickredding 03-17-2013, 12:12 AM
Kevin - attached is a file that generates this fault.
[zip] np.zip (13.95 MB, 46 views)