Mobileread
Calibre2OPDS Catalog --> Dropbox Upload Time
#1  Mickey330 10-19-2010, 02:05 PM
Hi.

My question is: Is it supposed to take a long time to upload a new Calibre2OPDS catalog to Dropbox? I wanted to see how well everything still worked if I made a new catalog, so I made one. Now, it's taking "forever" to sync to Dropbox.

Background:

1. I put my Calibre library in (on?) my Dropbox account. No problems accessing it or in using Calibre to look at my books.

2. I created a catalog using Calibre2OPDS and it seems that it works fine. I can get to both the html catalog and the xml catalog (for Stanza on my iPad). The library has 1360 books and that is what is reflected in the catalog(s). I can download books (epubs - if that matters) to Stanza or to any compatible reader via the web catalog.

3. So, I wanted to test to see what happens when a) I add books to Calibre and b) then create a new catalog to reflect/show those new books.

4. I added 3 epubs to my Calibre library. Then I generated a new catalog by running Calibre2OPDS. It didn't take very long to create the catalog, and then I told it to sync.

5. This was about 45 minutes ago and Dropbox is still telling me that it is uploading 1300+ files... Calibre is closed (no system tray running) and I've not accessed the Dropbox.

I would have thought it would be a simple matter of syncing the new catalog to Dropbox. Or - am I confused in what is happening? All my books will have to re-sync anytime I make a new catalog?

I can live with that ... if that is what happens. Just seems odd that a "simple" catalog change will create an entire re-sync.

Your thoughts? Am I mucking it up or is it working as expected?

Thanks in advance for any sage/wise advice.

Marilyn

#2  suecsi 10-19-2010, 03:02 PM
AFAIK it makes a brand new catalog every time - mine is uploading 800 files at the mo .....

#3  itimpi 10-19-2010, 07:19 PM
At the moment a new catalog is generated every time so there are a lot of files to upload.

We have been thinking of ways to minimise the overheads. The first simple optimisation that was tested was to assume that if a file's size was unchanged it did not need copying. This worked for the vast majority of the files, but not all - and the ones that went wrong were often important ones. A shame, as it really made the copy phase run a lot faster.

One possible option is to check each generated file at generate time to see whether the new file has identical contents to the old one, and if so leave the old one in place. This would add a significant cost at generate time, but in a scenario like this it might well be more than gained back in reduced upload time. Definitely something to think about.
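The idea is simply "write only if changed": if the freshly generated content is byte-identical to the old file, skip the write so the file's modification time never changes and Dropbox sees nothing to upload. A minimal sketch in Python (calibre2opds itself is a Java program; `write_if_changed` is an invented helper name, not anything in the real tool):

```python
import os

def write_if_changed(path, new_content):
    """Write new_content to path only if it differs from what is
    already on disk. If the contents are identical, the old file is
    left untouched, so a sync client like Dropbox will not see a
    modified file and will not re-upload it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            if f.read() == new_content:
                return False  # identical contents: leave old file in place
    with open(path, "wb") as f:
        f.write(new_content)
    return True  # file was created or actually changed
```

The cost itimpi mentions is visible here: every generated file has to be read back once to do the comparison, which is what the hash scheme discussed below the quote tries to avoid.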

These sorts of optimisations can easily be ignored on small libraries but become more valuable as libraries get larger.

#4  Xenophon 10-19-2010, 10:46 PM
Have you considered computing a hash value for each file? It'd be easy to do while generating, and would save the cost of reading the old one -- you'd only need to read the old hash value. And if you store the hash values in a file of their own, that's only one extra file to read (and re-generate).
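A scheme along these lines could look roughly like the sketch below (Python for illustration only; the manifest file name `hashes.json` and all function names are invented, not part of calibre2opds): hash each file's content as it is generated, compare against the hash stored by the previous run, and only rewrite the file when the hash differs.

```python
import hashlib
import json
import os

MANIFEST = "hashes.json"  # hypothetical single extra file holding all hashes

def load_manifest(catalog_dir):
    """Read the hash values saved by the previous generation run."""
    try:
        with open(os.path.join(catalog_dir, MANIFEST)) as f:
            return json.load(f)
    except (OSError, ValueError):
        return {}  # first run, or manifest missing/corrupt: rewrite everything

def generate_file(catalog_dir, name, content, manifest):
    """Hash the freshly generated content in memory; only touch the
    file on disk when the hash differs from last time."""
    digest = hashlib.sha256(content).hexdigest()
    if manifest.get(name) != digest:
        with open(os.path.join(catalog_dir, name), "wb") as f:
            f.write(content)
    manifest[name] = digest

def save_manifest(catalog_dir, manifest):
    """Persist the hashes so the next run can skip unchanged files."""
    with open(os.path.join(catalog_dir, MANIFEST), "w") as f:
        json.dump(manifest, f)
```

As Xenophon notes, the advantage over comparing file contents directly is that the old files never need to be re-read: only the one manifest file is read at the start and rewritten at the end.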

#5  ldolse 10-20-2010, 12:18 AM
There has been discussion on adding a hash function to Calibre in general to be able to track changed files, but I don't think anyone has jumped on doing that yet. If it got implemented it would work well with this requirement...

#6  itimpi 10-20-2010, 02:55 AM
Quote Xenophon
Have you considered computing a hash value for each file? It'd be easy to do while generating, and would save the cost of reading the old one -- you'd only need to read the old hash value. And if you store the hash values in a file of their own, that's only one extra file to read (and re-generate).
We had thought of doing this. It would still have a cost: because of the way files end up being generated, it is probably not easy to do this dynamically, so it would involve re-reading the generated files to compute the hash values. It would definitely be faster, though, in that, as you say, if the results are saved the original files do not need to be re-read. This is particularly important in the case of a large library stored remotely on a network share (as mine is).

#7  Mickey330 10-20-2010, 04:38 PM
Snicker.

I got the part where, essentially, the catalog upload time is what it is ... for now. And I can live with that, especially now that school's started back up and I really don't have time to mess about with my books as much as I did in the summer.

Thanks for the answers!

I snicker because after y'all said "that's the way it works, for now - oh, and we are working on making it better," y'all WAY lost me in your discussion! I recognized the words "the" and "and" in your conversation, but that's about it.

Just a quick reminder to me not to get too uppity when I think I've figured something out...

In all honesty, this putting ebooks in a cloud and creating a catalog to access that library seems a bit like magic to me. I am just loving all the new technology and the education I am getting about it. So, my thanks to the developers of this Calibre2OPDS program and to those of you who posted here to answer my question.

I am always amazed that when I don't know the answer to an e-book related question, MobileRead posters are there for me. You guys rock!

Thanks.
