Mobileread
Russian dictionary in Kindle 4.1.0
#11  PoP 01-07-2013, 04:44 PM
And for more fun, here is what happens with the book Also sprach Zarathustra by Friedrich Nietzsche, which has the metadata language de(1031):

1) With the Deutsches Universal-wörterbuch by Duden (a definition dictionary freely downloadable from Amazon), which has metadata language de(1031), input language de(1031), output language de(1031).
show attachment »

2) With the German-English Translation dictionary by Michael Sheldon, which has identical metadata language de(1031), input language de(1031), output language de(1031) (though it would be more representative to set the output language to en(1033)).
show attachment »
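
The numbers in parentheses look like Windows-style language IDs, where the low 10 bits hold the primary language and the remaining bits hold the sublanguage (that reading is an assumption on my part, the thread never says so explicitly). A minimal sketch that splits the codes quoted in this thread; the small lookup table only covers the languages mentioned here:

```python
# Primary language IDs for the languages seen in this thread (Windows LCID convention).
PRIMARY_LANGUAGES = {0x07: "de", 0x09: "en", 0x19: "ru"}


def split_language_code(code):
    """Split a numeric language code into (primary language, sublanguage)."""
    return code & 0x3FF, code >> 10


def describe(code):
    """Return a short human-readable description of a language code."""
    primary, sub = split_language_code(code)
    name = PRIMARY_LANGUAGES.get(primary, "?")
    return f"{name}(primary={primary}, sub={sub})"


for code in (1031, 1033, 25, 9):
    print(code, describe(code))
```

Under that assumption, de(1031) and en(1033) carry a sublanguage, while ru(25) and en(9) are bare primary-language codes.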

Survey: When the two dictionaries are present in the Kindle which one would get chosen to pop up the definition/translation of einmal:
a) Always the first one you loaded?
b) Always the second one you loaded?
c) Either one at random?
d) The first one if it contains the word searched, otherwise the second one?
e) None of them since Lab126 code would be too confusing to maintain?
f) All of them since Lab126 coding skills are paramount?
g) The Chinese dictionary since nobody would be able to tell the difference?

#12  muzzex 01-07-2013, 05:18 PM
Great!
I got it working.
The first Russian book I was testing was a .txt file. It has no metadata, so I guess dictionary lookup might not be possible with it. But I also learned how to create .mobi or .prc files from .txt and then change the language settings, so now I have it working with that book too.

Thanks a lot for the help!

#13  knc1 01-07-2013, 05:46 PM
Quote PoP
Survey: When the two dictionaries are present in the Kindle which one would get chosen to pop up the definition/translation of einmal:
a) Always the first one you loaded?
b) Always the second one you loaded?
c) Either one at random?
d) The first one if it contains the word searched, otherwise the second one?
e) None of them since Lab126 code would be too confusing to maintain?
f) All of them since Lab126 coding skills are paramount?
g) The Chinese dictionary since nobody would be able to tell the difference?
Inverted Urdu. (a.k.a: f above)

#14  wakawaka 01-22-2013, 12:24 PM
Quote PoP
Yes it does, ixtab is right:
show attachment »
I used the Russian-English Dictionary by A. I. Smirnitsky & A. L. Smirnitsky, which has the metadata language ru(25), input language ru(25), output language en(9). The book is 1001 by Sergei Aleksandrovich Rachinskii, which has the metadata language ru(25) and which I obtained from Project Gutenberg.
Hey PoP, thanks for the tip! Very cool to get the popup dictionary working with Russian. However, I haven't been able to find a dictionary that works well with all the declensions and conjugations in Russian. Where did you find the Russian-English Dictionary by A. I. Smirnitsky & A. L. Smirnitsky? Any thoughts on getting something like a 'closest match' lookup, i.e. finding the dictionary word that most closely matches a given word if no exact match is found?

Thanks for any input or ideas!

#15  PoP 01-22-2013, 02:01 PM
Quote wakawaka
Hey PoP, thanks for the tip! Very cool to get the popup dictionary working with Russian. However, I haven't been able to find a dictionary that works well with all the declensions and conjugations in Russian. Where did you find the Russian-English Dictionary by A. I. Smirnitsky & A. L. Smirnitsky? Any thoughts on getting something like a 'closest match' lookup, i.e. finding the dictionary word that most closely matches a given word if no exact match is found?

Thanks for any input or ideas!
I found it from a link posted here. I think that the dictionary has to be created with all the possible inflections for the declensions and conjugations to be searchable. AFAIK the Kindle does not look up "closest matches"... A closest or partial match would certainly be useful... searching *all* dictionaries too... I'm afraid it would require a rewrite of the Kindle framework.
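
To make the "closest match" idea concrete, here is a minimal sketch of such a lookup outside the Kindle. It is not something the Kindle framework does; the `lookup` helper, the cutoff value and the tiny headword list are all made up for illustration, and the fuzzy matching comes from Python's standard `difflib` module:

```python
import difflib


def lookup(word, headwords, cutoff=0.7):
    """Return an exact match if present, otherwise the closest headword or None."""
    if word in headwords:
        return word
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0).
    matches = difflib.get_close_matches(word, headwords, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical headword list; a real dictionary index would be far larger.
headwords = ["абажур", "книга", "читать"]

print(lookup("абажур", headwords))  # exact hit
print(lookup("книги", headwords))   # inflected form, resolved by similarity
print(lookup("xyz", headwords))     # nothing close enough
```

String similarity is only a rough stand-in for real morphology: it can pick a wrong but similar-looking headword, which is one reason a dictionary with explicit inflection data is preferable.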

#16  wakawaka 01-23-2013, 08:36 AM
Quote PoP
I found it from a link posted here. I think that the dictionary has to be created with all the possible inflections for the declensions and conjugations to be searchable. AFAIK the Kindle does not look up "closest matches"... A closest or partial match would certainly be useful... searching *all* dictionaries too... I'm afraid it would require a rewrite of the Kindle framework.
That dictionary is perfect, just what I was looking for. Thanks! Somehow it does work with various declensions/conjugations, though I'm not exactly sure how; it looks like there's only a single entry per word. Other Ru-En dictionaries I've found so far, for example http://www.the-ebook.org/forum/viewtopic.php?p=483630#483630, haven't worked with declensions/conjugations. I'm interested to figure out what the differences between the two are. Anyway, thanks again!

#17  PoP 02-04-2013, 09:32 PM
Quote wakawaka
[snip]... Somehow it does work with various declensions/conjugations, though I'm not exactly sure how, it looks like there's only a single entry per word... [snip]... interested to figure out what the differences... [snip]
Humm, me too! Maybe a more knowledgeable dictionary developer could shed some light on this?

#18  PoP 02-05-2013, 10:22 PM
Quote wakawaka
[snip]Other Ru-En dictionaries I've found so far, for example http://www.the-ebook.org/forum/viewtopic.php?p=483630#483630, haven't worked with declensions/conjugations [snip]
Agreed, the Smirnitsky dictionary seems to have a single entry per definition but still appears to resolve inflections... I couldn't download your previous problematic dictionary, the URL gives me
show attachment »
and I can't test further. Any chance for another public or PM link?

#19  PoP 02-06-2013, 04:01 PM
Read a bit more, posting status.

The Kindle Publishing Guidelines (attached), available from the Kindle Publishing Programs, describe in section 7 how to code inflections in dictionaries.

As I thought:
Spoiler Warning below

7.3 Inflections for Dictionaries
When building dictionaries, you may have multiple inflected forms of a single root word that should access the same entry. However, adding all of these inflected forms under the orthography (pronunciation) of a single entry leads to the generation of a large index, which negatively affects performance and user experience. Kindle has a disinflection engine that uses a set of rules for disinflecting any given word to its headword. The index then has only the headword to look up.
To generate the set of disinflection rules for the dictionary, the input must include some information about the inflections. There are two ways to provide this information: simplified inflection syntax and advanced inflection syntax.
7.3.1 Advanced inflection syntax
Inflections are handled by the inflection index, which is built into the dictionary based on the inflected forms which are tagged in the content using the <idx:infl> tag. Inflections are attached to the orthography of the entry. They must be specified inside of an <idx:orth> tag. If an entry has multiple orthographies, each must have its own inflections.
Example:
Code
<idx:orth>record
  <idx:infl inflgrp="noun">
    <idx:iform name="plural" value="records" />
  </idx:infl>
  <idx:infl inflgrp="verb">
    <idx:iform name="present participle" value="recording" />
    <idx:iform name="past participle" value="recorded" />
    <idx:iform name="present 3ps" value="records" />
  </idx:infl>
</idx:orth>
The inflgrp and name attributes are optional. The idx:infl and idx:iform tags and the value attribute are mandatory.
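
As an illustration of the advanced syntax, here is a small sketch that generates this kind of markup from a plain data structure. The `orth_entry` helper and its input format are hypothetical (not part of any Amazon tool); only the tag and attribute names come from the guidelines quoted above:

```python
from xml.sax.saxutils import escape, quoteattr


def orth_entry(headword, groups):
    """Build an <idx:orth> element with one <idx:infl> per inflection group.

    groups maps a group name (e.g. "noun") to a list of (form name, value) pairs.
    """
    lines = [f"<idx:orth>{escape(headword)}"]
    for group, forms in groups.items():
        lines.append(f"  <idx:infl inflgrp={quoteattr(group)}>")
        for name, value in forms:
            # quoteattr adds the surrounding quotes and escapes special characters.
            lines.append(
                f"    <idx:iform name={quoteattr(name)} value={quoteattr(value)} />"
            )
        lines.append("  </idx:infl>")
    lines.append("</idx:orth>")
    return "\n".join(lines)


print(orth_entry("record", {
    "noun": [("plural", "records")],
    "verb": [
        ("present participle", "recording"),
        ("past participle", "recorded"),
        ("present 3ps", "records"),
    ],
}))
```

Generating the tags from data like this is mainly useful for heavily inflected languages such as Russian, where writing every form by hand would be impractical.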
7.3.2 Simplified inflection syntax
For English dictionaries, simplified inflection syntax is a very simple way of giving information about the inflections. Previous versions of the file format supported using the infl attribute in either the <idx:orth> or the <idx:gramgrp> tag and specifying a comma-separated list of inflected forms. This syntax is now deprecated, as it is not as accurate when disinflecting, particularly for non-English languages.
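
To picture what "the index has only the headword to look up" means, here is a toy sketch of the idea. It is not how the Kindle's disinflection engine works internally (the guidelines say it derives rules, whereas this uses a plain form-to-headword table), and all names are made up for illustration:

```python
def build_inflection_map(entries):
    """Map every inflected form (and the headword itself) to its headword(s)."""
    form_to_headwords = {}
    for headword, forms in entries.items():
        for form in [headword, *forms]:
            form_to_headwords.setdefault(form, set()).add(headword)
    return form_to_headwords


def find_headwords(word, form_to_headwords):
    """Return the headwords a selected word resolves to, or an empty list."""
    return sorted(form_to_headwords.get(word, ()))


# Hypothetical dictionary content: headword -> known inflected forms.
entries = {"record": ["records", "recording", "recorded"]}
inflection_map = build_inflection_map(entries)

print(find_headwords("recorded", inflection_map))
print(find_headwords("unknown", inflection_map))
```

The definitions themselves stay keyed by headword only; the extra table just redirects inflected forms to that single entry.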

So it must be that the Smirnitsky dictionary has these defined. I am attempting to decompile it so I can verify by inspecting the source .opf.

So far, the Calibre conversion from .mobi to .htmlz shows single entries, and the Calibre conversion from .mobi to .epub never completes.

To be continued...

#20  PoP 02-06-2013, 06:23 PM
...Continued

I used Kindle Mobi Unpack to successfully extract the source .html from the .mobi. Yay!

For the lampshade entry search
show attachment »

Here is the extracted html. Please note the value= field
show attachment »

Since the file is UTF-8 encoded, here are the escaped Unicode values \u0430\u0431\u0430\u0436\u0443\u0440 for абажур:
show attachment »

According to the Kindle Publishing Guidelines document mentioned earlier, value= is the hidden label stored in the index, i.e. what the user enters in the search box to pop up the dictionary reference. Shown in hex, one sees that it matches the Unicode:
show attachment »
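
For anyone who wants to reproduce the comparison without the attachments, a short sketch using only Python built-ins. It prints the escaped code points for абажур and the raw UTF-8 bytes in hex; whether the hex view in the attachment shows UTF-8 bytes or code points is an assumption, so both are printed:

```python
word = "абажур"

# Escaped code points, in the same \uXXXX notation used above.
escaped = word.encode("unicode_escape").decode("ascii")
print(escaped)

# Raw UTF-8 bytes in hex, as a hex viewer would show them (two bytes per Cyrillic letter).
utf8_hex = word.encode("utf-8").hex(" ")
print(utf8_hex)

# Round trip: decoding the escaped form gives back the original word.
print(escaped.encode("ascii").decode("unicode_escape") == word)
```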

Humm, all entries in the dictionary are similar and I see no trace of HTML <idx:infl> inflection tags.
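
Rather than eyeballing the extracted file, the check for inflection tags can be automated. A minimal sketch, assuming the unpacked source is ordinary text markup; the `count_idx_tags` helper and the sample entry are made up for illustration and are not the actual content of the Smirnitsky dictionary:

```python
import re
from collections import Counter

# Matches opening tags such as <idx:orth ...>, but not closing tags like </idx:orth>.
TAG_RE = re.compile(r"<(idx:[a-z]+)\b", re.IGNORECASE)


def count_idx_tags(html):
    """Count occurrences of each <idx:...> opening tag in the markup."""
    return Counter(tag.lower() for tag in TAG_RE.findall(html))


# Hypothetical entry, shaped like a dictionary record with a value= attribute.
sample = '<idx:entry><idx:orth value="абажур">абажур</idx:orth> lampshade</idx:entry>'

counts = count_idx_tags(sample)
print(counts)
print("has inflection tags:", counts["idx:infl"] > 0)
```

For a real file, the string would come from reading the extracted .html with `open(path, encoding="utf-8").read()` instead of the inline sample.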

I am still puzzled

To be continued...
