New programmer seeking to work on the lack of vertical/Japanese ebook formatting.
#1  Moosatronic 09-06-2020, 04:42 PM
Well, I lost the draft of this thread I was writing, so I guess I'll write the treatise again.

I'm a new programmer who wants to work on the lack of vertical/Japanese ebook reading capability in Calibre. (As far as I've gathered, this is something that hasn't been solved in Calibre yet.)
I want to work on this because Calibre's dictionary tool is very powerful and would make learning Japanese from Japanese ebooks very easy. Plus, I guess this would make Calibre more useful for Japanese readers in general.

Currently in Calibre (from what I understand):

1. Calibre will not always display vertical text correctly.

2. Calibre, when going from page to page, will instead skip from chapter to chapter, making it impossible to read many Japanese ebooks.

I would like to try to solve these problems, presumably by changing how Calibre recognizes and displays Japanese ebooks. I'm new to open source, so I was wondering what steps I can take to work on this, and what languages and other things I will need to learn to work on Calibre's ebook reader. Currently I'm learning C++ and have experience with Python. Sorry if this is a dumb thing for me to ask about and aspire to fix as a new programmer, but I feel like I'll learn more working on an actual problem that I want to fix and that will have an effect on other people.

I guess the roadmap currently is to:

1. See how other ereaders (Kindle and BookWalker) format Japanese ebooks, and determine from the ebook contents how those readers know to display Japanese content differently from other content.

2. Program Calibre to recognize Japanese ebooks and then display them correctly.
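For step 1, a rough way to inspect what a book itself declares is to scan the EPUB (which is a zip archive) for the two standard layout hints: a CSS `writing-mode` declaration with a vertical value, and the EPUB 3 `page-progression-direction="rtl"` attribute on the spine in the OPF file. This is only a minimal sketch under those assumptions. The function name `vertical_text_hints` is made up for illustration, it is not part of Calibre, and it says nothing about how Kindle or BookWalker actually decide.

```python
import re
import zipfile

# Matches writing-mode declarations with a vertical value. Because the pattern
# is not anchored, it also matches prefixed forms such as -epub-writing-mode.
VERTICAL_CSS = re.compile(r"writing-mode\s*:\s*vertical-(rl|lr)", re.IGNORECASE)
# Matches the EPUB 3 spine attribute that requests right-to-left page turning.
RTL_SPINE = re.compile(r"page-progression-direction\s*=\s*[\"']rtl[\"']", re.IGNORECASE)


def vertical_text_hints(epub_path):
    """Return the files inside an EPUB that hint at vertical or RTL layout.

    epub_path may be a filesystem path or a file-like object.
    """
    hints = {"vertical_css": [], "rtl_spine": []}
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            lower = name.lower()
            if not lower.endswith((".css", ".opf", ".xhtml", ".html")):
                continue
            text = book.read(name).decode("utf-8", errors="replace")
            if lower.endswith(".opf"):
                # The package document carries the page progression direction.
                if RTL_SPINE.search(text):
                    hints["rtl_spine"].append(name)
            elif VERTICAL_CSS.search(text):
                # Stylesheets and content documents may carry writing-mode.
                hints["vertical_css"].append(name)
    return hints
```

Running this over a handful of Japanese and Western books and comparing the results would give a first picture of how consistently publishers mark vertical text.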

Anyway, let me know if this is dumb or whatever; I'm relatively new to this. I just want to see what I can do. Even if I can't really do anything, this will probably be a good exercise for the future. Thanks. Also, if this is actually already solved in Calibre, then please ignore this thread and/or let me know how to display Japanese ebooks correctly.

#2  Moosatronic 09-06-2020, 04:57 PM
Just to anticipate some potential answers: I am going through the Calibre wiki and starting with trying to set up a Calibre development environment. It also seems that Python will likely be all I'll need to work on this issue. So I guess I'll set up the development environment, try to find the ebook viewer part of the code, and go from there.

#3  jackie_w 09-06-2020, 05:26 PM
Setting up your calibre development environment is the essential first step so you're off to a good start.

Before you put too much effort in, you might like to know that calibre v5 is imminent. You can have a closer look in the beta thread.

As an English-only speaker/reader I have to confess that I have no idea how well calibre v5 supports vertical text, but it might be good to find out before you spend a lot of time on your project.

My own experience with calibre development for personal projects is limited to calibre plugins. Python is all I've ever needed for my stuff, but I believe the Viewer code also uses a lot of JavaScript.

#4  Moosatronic 09-06-2020, 05:39 PM
OK, thanks jackie_w. I'll first check whether someone is already working on this or has fixed it for v5. If I do work on this, it'll be a slow project in any case. I think I first need to get the lay of the land as to how the style sheets are set out in various Japanese ebooks and compare that to Western books.

I wonder if working on this as a plug-in might be an easier approach. Depending on how plug-ins work compared to changing Calibre proper, a plug-in might be more manageable at my skill level. I am also concerned that if I change how Calibre proper displays books, I may also have to deal with how Calibre converts Japanese books into other formats. So I might have to get hold of Japanese books in different formats and see how Calibre converts them and whether it preserves the vertical text. Vertical text support might also not have an easy solution, and may involve changing how Calibre displays text in general. So first I'm just going to try to understand the Calibre code and Japanese ebooks, and then slowly work out how Calibre can be changed, in keeping with its design philosophy, to present Japanese ebooks well.

#5  jhowell 09-06-2020, 05:47 PM
mwgabby-li made some changes for Vertical RTL Book Reading Support on GitHub and these are included in the recent beta releases for calibre 5. I don’t know how well this works or whether more work needs to be done on this.

#6  Moosatronic 09-06-2020, 06:01 PM
Well, it looks like mwgabby-li has fixed the problems I listed, so I guess never mind my whole project, lol. I'll find something else to work on. The only thing I've noticed is that furigana (basically phonetic superscript that tells you how to read some characters) doesn't format correctly, but that isn't exactly a huge problem.

I also know that the Japanese equivalent of Project Gutenberg, Aozora Bunko, has its own .txt format that could possibly be supported in Calibre if it isn't already. There are other apps that can convert the files, but it would obviously be helpful if it worked in Calibre. That seems like something that would fit well as a plug-in.
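To give a feel for what such a converter would involve, here is a minimal sketch of just one piece of the Aozora Bunko text format: furigana written as `base《reading》`, optionally with a full-width bar `｜` marking where the base text starts. It assumes that when no bar is present the base is the run of kanji directly before `《`, and it approximates "kanji" with a single Unicode range. The function name `aozora_ruby_to_html` is hypothetical and not an existing Calibre API. Editorial annotations, headers, and character encoding (the files are commonly Shift_JIS) are not handled.

```python
import html
import re

# Rough approximation of "kanji": CJK Unified Ideographs plus the iteration mark.
KANJI = r"[\u4e00-\u9fff\u3005]"
# Explicit form: the full-width bar marks where the annotated base text begins.
EXPLICIT = re.compile(r"｜([^《》｜]+)《([^《》]+)》")
# Implicit form: the base is the run of kanji immediately before the reading.
IMPLICIT = re.compile(r"(" + KANJI + r"+)《([^《》]+)》")
RUBY_HTML = r"<ruby>\1<rt>\2</rt></ruby>"


def aozora_ruby_to_html(line):
    """Convert Aozora-style ruby notation in one line of text to HTML ruby markup."""
    # Escape &, < and > first so the source text cannot inject markup.
    line = html.escape(line, quote=False)
    # Handle explicitly delimited bases before falling back to kanji runs.
    line = EXPLICIT.sub(RUBY_HTML, line)
    return IMPLICIT.sub(RUBY_HTML, line)


print(aozora_ruby_to_html("吾輩《わがはい》は猫である"))
```

A real plug-in would need to cover the rest of the format as well, so this only shows that the ruby part maps cleanly onto HTML.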

#7  Moosatronic 09-06-2020, 06:04 PM
But I'm sure mwgabby-li is probably aware of the problem.

Just for those who don't know what I'm talking about and happen to be curious, here's an image with the formatting issues circled.

#8  mwgabby-li 09-25-2020, 02:20 AM
@Moosatronic Yeah, I'm aware of the furigana issue, but good eye! Um, if you have any clever ideas to solve the problem, let me know. The issue is that the HTML library Calibre uses doesn't split furigana across columns correctly.

I was thinking of at least adding a feature to toggle furigana on and off. I was using this CSS in the "Styles" options in the reader to disable them locally, so I think this could be pretty easily modified to add something in the reader settings somewhere:
rt {
("rt" is the ruby text tag, nested inside "ruby", which is what these annotations are called in HTML)
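As a different approach from the CSS override described above, furigana can also be removed from the markup itself, for example before conversion. The sketch below is an assumption for illustration, not what the Calibre viewer does: the function name `strip_furigana` is made up, and a regex is used only because ruby markup in ebooks is usually simple. A proper implementation would use an HTML parser.

```python
import re

# Matches <rt>...</rt> (the reading) and <rp>...</rp> (fallback parentheses).
RT_OR_RP = re.compile(r"<(rt|rp)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
# Matches the opening and closing <ruby> and <rb> wrapper tags themselves.
RUBY_WRAPPER = re.compile(r"</?(ruby|rb)\b[^>]*>", re.IGNORECASE)


def strip_furigana(markup):
    """Remove ruby readings from an HTML fragment, keeping only the base text."""
    without_readings = RT_OR_RP.sub("", markup)
    return RUBY_WRAPPER.sub("", without_readings)


print(strip_furigana("<ruby>漢字<rp>(</rp><rt>かんじ</rt><rp>)</rp></ruby>を読む"))
```

Unlike a viewer setting, this changes the book permanently, so a CSS-based toggle is the better fit for reading.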
