Mobileread
Annotations: Capture/Convert Kobo-Kindle (uses OpenWith and Annotations Plugins)
#1  EnergyLens 11-05-2014, 01:00 PM
When I started using both a Kobo and a Kindle I still wanted to be able to upload my Annotations to Clippings.io for sorting and tagging...

This Python script can be run using the Calibre OpenWith Plugin. It depends on annotations having already been imported into a Calibre column by the Annotations Plugin.

By default this script searches first for an Annotations column, and failing that for Annotations in the Comments column. It can be configured for any column.
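
To illustrate that lookup order, here is a minimal sketch, not the code from the attached script. It assumes a book's metadata is available as a plain dict keyed by column lookup name. The names `#annotations` and `comments` are assumptions, and `find_annotations` is a hypothetical helper.

```python
# Hypothetical helper: the column names below are assumptions, not the
# values used by the attached script.
DEFAULT_COLUMNS = ("#annotations", "comments")


def find_annotations(metadata, columns=DEFAULT_COLUMNS):
    """Return (column, text) for the first column holding non-empty text.

    `metadata` is a plain dict mapping column lookup names to values.
    Returns (None, None) when no listed column has any content.
    """
    for column in columns:
        value = metadata.get(column)
        if isinstance(value, str) and value.strip():
            return column, value
    return None, None
```

Passing a different tuple, for example `columns=("#my_notes",)`, would point the search at any other column.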

By default the script will, when run from within Calibre, generate a .clip file and insert it into the Calibre database. You can change it to any extension, but I chose clip so that I didn't have conflicts with other .txt files.

If you stick with the .clip extension you just need to have your OS associate the file extension with your favorite text editor.

There are settings in the script that let you write all exported annotations to a directory of your choice without updating the Calibre database at all. This is what I do when I am converting Kobo annotations for uploading to Clippings.io.
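
For anyone wondering what the directory-only mode amounts to, a rough sketch follows. It is not lifted from the script: `write_clip` and its parameters are hypothetical names, and the file-name cleanup is just one reasonable choice.

```python
import re
from pathlib import Path


def write_clip(export_dir, title, text, extension=".clip"):
    """Write one book's annotations to <export_dir>/<safe title><extension>."""
    # Replace characters that are awkward in file names on common systems.
    safe_title = re.sub(r'[\\/:*?"<>|]+', "_", title).strip() or "untitled"
    target = Path(export_dir) / (safe_title + extension)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

Nothing here touches the Calibre database. The `extension` argument is where `.clip` could be swapped for something else.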

Also, if you run the script from the command line in the directory where your Calibre database is located, it will export all Annotations found in the database.
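
A sketch of what a whole-library export can look like when reading the database file directly with `sqlite3`. This is an illustration under assumptions, not the script's actual approach. It assumes the usual layout of calibre's metadata.db (a `custom_columns` table mapping each label to an id, with values stored in `custom_column_<id>`), and that the Annotations column has the label `annotations`. Check both against your own library.

```python
import sqlite3


def export_all_annotations(conn, label="annotations"):
    """Return [(book_id, title, text)] for books with a value in the column.

    Assumes a `custom_columns` table mapping each label to an id, and the
    values in a table named `custom_column_<id>` with `book` and `value`.
    """
    row = conn.execute(
        "SELECT id FROM custom_columns WHERE label = ?", (label,)
    ).fetchone()
    if row is None:
        return []
    # The id comes from the database itself; int() keeps the table name safe.
    table = "custom_column_%d" % int(row[0])
    query = (
        "SELECT books.id, books.title, c.value FROM books "
        "JOIN {0} AS c ON c.book = books.id "
        "WHERE c.value IS NOT NULL AND c.value != '' "
        "ORDER BY books.id"
    ).format(table)
    return list(conn.execute(query))
```

Run from the library directory, the connection would come from `sqlite3.connect("metadata.db")`. Working on a copy of the file is the safer option, since Calibre may have it open.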

~~~~~~~~~

In the future I plan to establish an Annotations database that is independent from the Calibre database because I find that with the volatility of news downloads there is the danger of Highlighted Content being lost.

There is the possibility that a Kindle's My Clippings.txt could be recreated from exported Annotations, so that when books are removed from a Kindle and later copied back, the annotations could be restored (for example, if the My Clippings.txt file is corrupted or deleted).

~~~~~~~~~

I also noticed a question from turelur about storing annotations within ebook files. I'm going to think about this some more, but off the top of my head I would save an ORIGINAL_EPUB, then generate an EPUB with Annotations from the .clip file I'm currently generating (or in the future a database) and then use something like EPUB MERGE...
[zip] annotations.py.zip (3.1 KB, 290 views)

#2  turelur 11-07-2014, 02:35 AM
Wow! That is a very interesting lead! Thanks!

#3  eschwartz 11-07-2014, 05:34 PM
Here is an interesting idea.

Perhaps you could crack the format of the Kindle .mbp1/.azw3r/.azw3f files, extract an up-to-date listing of Kindle annotations, and convert it to and from a plaintext representation to keep in a custom column. That would allow syncing annotations between Kindle devices at least, without resorting to My Clippings.txt -- which is nice in terms of listing annotations, but it would be even nicer if the annotations were incorporated into the book itself.

I think the annotations plugin would really benefit from some serious work done in that area, but no one seems to be interested. (Including me, sadly, though I would be grateful for anyone else's efforts.)

#4  davidfor 11-07-2014, 07:25 PM
The intention of the Annotations plugin was purely to record the annotations from the devices. It has the ability to merge them from multiple devices, but doesn't collect enough information to restore them anywhere. I have never been happy with that, but as my main use for annotations is to mark errors to fix or something to look up when I'm at my PC, I haven't had much desire to do anything.

I have thought about it a little, but haven't been happy with anything I could come up with. If the calibre viewer supported annotations, I would probably use that as the method. At the moment, creating an ADE annotations file is the most attractive method. That might be transportable to other RMSDK based readers. One problem I have is keeping the data associated with the book. I haven't come up with a solution that I like.

#5  davidfor 11-07-2014, 08:08 PM
EnergyLens: Very nice idea. I hadn't seen that clipping service, but I can see a use for it.

A couple of comments.

I have to admit I cringed when I saw the code to get the annotations column. But I'm so used to writing calibre plugins that I hadn't thought about other ways to do this. It would be safer to get the name of the annotations column from the annotations configuration. That is plugins/annotations.json in the calibre configuration directory.
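
Something along these lines would do it. This is only a sketch: the key name `cfg_annotations_destination_field` is an assumption that should be checked against the contents of your own annotations.json, and `annotations_column_from_config` is a hypothetical helper. The calibre configuration directory has to be supplied by the caller.

```python
import json
from pathlib import Path

# Assumed key name; check your own annotations.json before relying on it.
ASSUMED_KEY = "cfg_annotations_destination_field"


def annotations_column_from_config(config_dir, key=ASSUMED_KEY):
    """Return the column name stored in plugins/annotations.json, or None."""
    config_file = Path(config_dir) / "plugins" / "annotations.json"
    try:
        with config_file.open(encoding="utf-8") as handle:
            prefs = json.load(handle)
    except (OSError, ValueError):
        # Missing or unreadable file: let the caller fall back to a default.
        return None
    value = prefs.get(key) if isinstance(prefs, dict) else None
    return value or None
```

Returning None rather than raising lets a script keep its current guessing logic as a fallback when the plugin has never been configured.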

This would be great as a plugin. Collecting the annotations and adding the file to calibre can be done with no problems. Doing this for all books in the library is easy. And adding something to collect and upload the files to the service would be easy as well.

And when I say that, it would actually be natural to add this to the Annotations plugin. That could generate the file when retrieving the annotations. Or an export function could generate them to be saved elsewhere.

In any case, if there is something the annotations plugin could do differently that would help, I am willing to consider it. Especially if it comes in the form of a patch.

One thing I don't understand is:
Quote EnergyLens
In the future I plan to establish an Annotations database that is independent from the Calibre database because I find that with the volatility of news downloads there is the danger of Highlighted Content being lost.
How do the news downloads affect this? Unless news articles are being annotated on the devices, there shouldn't be an issue. And if they are, it should only be these annotations that are at risk.

#6  EnergyLens 11-08-2014, 03:57 AM
@ eschwartz
I can't even find the .mbp1/.azw3r/.azw3f files, though I only gave it a squiz.

@ davidfor
I'm really just a hack, and a green one at that.

I do annotate news articles on my devices! Doesn't everyone?

The reason I see volatility with annotations is that I follow a number of blogs that have voluminous commentary, which is constantly expanding. I typically download the "news" feed once a day, and I want to be sure to capture the highlights/annotations from each download of the same blog entry. Then after a period of time I download an anthology of the blog and want to merge all of the highlights back into the anthology. My best hack (and longest bit of python code) was to automate the interleaving of author/moderator responses at the correct positions (directly following the comments they answer), which was necessary to make sense of the content when reading on an eReader, where it is impossible to scroll back and forth constantly. (http://www.mobileread.com/forums/sho...d.php?t=249514)

I'd be happy if you were inspired to take any ideas from my annotations hack to extend the Annotations Plugin, because as you say it would be natural to add the generate/export function.

With regard to News Article Annotations, I've noticed that the Annotations plugin works great with annotations from News "books" that I read on my Kobo, but that it doesn't find annotations for News "books" that I read on my Kindle. Why is that?

My long term goal is to enable some serious natural language processing/phrase frequency analysis of my content, initially in the process of capturing content as it is downloaded, then later within arbitrary books in my Calibre database. I'd like to be able to generate Lombardi networks from book bibliographies and other references (URLs/book titles/Authors) I've pulled from web content. Right now I'm just cutting my teeth.

P.S. I lived on several bays around Sydney Harbor for a few years in the late '90s. Wonderful memories of riding the ferries to work every day!

#7  davidfor 11-08-2014, 05:45 AM
Quote EnergyLens
@ eschwartz
I can't even find the .mbp1/.azw3r/.azw3f files, though I only gave it a squiz.

@ davidfor
I'm really just a hack, and a green one at that.

I do annotate news articles on my devices! Doesn't everyone?
I don't read news on my devices, so I don't know.
Quote
The reason I see volatility with annotations is that I follow a number of blogs that have voluminous commentary, which is constantly expanding. I typically download the "news" feed once a day, and I want to be sure to capture the highlights/annotations from each download of the same blog entry. Then after a period of time I download an anthology of the blog and want to merge all of the highlights back into the anthology. My best hack (and longest bit of python code) was to automate the interleaving of author/moderator responses at the correct positions (directly following the comments they answer), which was necessary to make sense of the content when reading on an eReader, where it is impossible to scroll back and forth constantly. (http://www.mobileread.com/forums/sho...d.php?t=249514)
OK, I can see the concern. Moving to a separate database for that would be a good idea.
Quote
I'd be happy if you were inspired to take any ideas from my annotations hack to extend the Annotations Plugin, because as you say it would be natural to add the generate/export function.
I'm going to park it in the back of my brain and see what it comes up with. My subconscious is a lot smarter than the bit that does the typing.
Quote
With regard to News Article Annotations, I've noticed that the Annotations plugin works great with annotations from News "books" that I read on my Kobo, but that it doesn't find annotations for News "books" that I read on my Kindle. Why is that?
I think it is something in how the books are put onto the Kindle. The Kobo devices don't differentiate between news and books. But, I think the Kindle can. Or at least I think that's what Kovid has said when asked.
Quote
My long term goal is to enable some serious natural language processing/phrase frequency analysis of my content, initially in the process of capturing content as it is downloaded, then later within arbitrary books in my Calibre database. I'd like to be able to generate Lombardi networks from book bibliographies and other references (URLs/book titles/Authors) I've pulled from web content. Right now I'm just cutting my teeth.
That's very interesting and sounds like a lot of work. I'll look forward to seeing the result when finished.
Quote
P.S. I lived on several bays around Sydney Harbor for a few years in the late '90s. Wonderful memories of riding the ferries to work every day!
It's a long time since I've been on a ferry. On a nice day, it would be a great way to commute. Though I think I'd be tempted to stay on and go for another ride.

#8  eschwartz 11-09-2014, 03:26 PM
Quote davidfor
The intention of the Annotations plugin was purely to record the annotations from the devices. It has the ability to merge them from multiple devices, but doesn't collect enough information to restore them anywhere. I have never been happy with that, but as my main use for annotations is to mark errors to fix or something to look up when I'm at my PC, I haven't had much desire to do anything.

I have thought about it a little, but haven't been happy with anything I could come up with. If the calibre viewer supported annotations, I would probably use that as the method. At the moment, creating an ADE annotations file is the most attractive method. That might be transportable to other RMSDK based readers. One problem I have is keeping the data associated with the book. I haven't come up with a solution that I like.
Yeah, I know it was never designed to do so. I agree, it would be nice if it could.

#9  eschwartz 11-09-2014, 03:28 PM
Quote EnergyLens
@ eschwartz
I can't even find the .mbp1/.azw3r/.azw3f files, though I only gave it a squiz.
They are kept next to the .mobi and .azw3 files, on a Kindle. On recent Kindles (>=KT) they will be in a subdirectory "{book_filename_minus_ext}.sdr/"
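
If it helps, the two locations can be checked with a few lines of Python. This is a rough sketch only: `find_sidecar_files` is a hypothetical helper, the extension list is just the three named in this thread, and it assumes that older sidecars share the book's base name.

```python
from pathlib import Path

# Extensions named in this thread; other firmware may use different ones.
SIDECAR_EXTENSIONS = (".mbp1", ".azw3r", ".azw3f")


def find_sidecar_files(book_path):
    """Return existing sidecar files for a book, checking both layouts."""
    book = Path(book_path)
    found = []
    # Older layout: sidecar sits next to the book and shares its base name.
    for ext in SIDECAR_EXTENSIONS:
        candidate = book.with_suffix(ext)
        if candidate.is_file():
            found.append(candidate)
    # Newer layout: files live in "<book name without extension>.sdr/".
    sdr_dir = book.parent / (book.stem + ".sdr")
    if sdr_dir.is_dir():
        found.extend(
            p for p in sorted(sdr_dir.iterdir())
            if p.suffix in SIDECAR_EXTENSIONS
        )
    return found
```

Point it at the path of a .mobi or .azw3 file on the mounted Kindle. An empty list means neither layout turned up anything for that book.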

#10  EnergyLens 11-10-2014, 04:40 PM
Quote eschwartz
They are kept next to the .mobi and .azw3 files, on a Kindle. On recent Kindles (>=KT) they will be in a subdirectory "{book_filename_minus_ext}.sdr/"
Thanks!

Funny, I was looking (in terminal) for .hidden directories and saw .sdr as a file!
