Mobileread
conversion pyglossary pdf
#11  DNSB 09-09-2022, 10:59 PM
If you can't post the whole text file here as an attachment to your message, then snip a chunk of text and post that. It'll make looking at your issues a lot simpler.

To attach the file, either use the paperclip next to the smiley icon at the top of the message entry box or the Manage Attachments in the Attach files box below the message entry box. A .txt file is limited to 1MB but you can attach a .zip file of up to 20MB.

#12  Sarmat89 09-10-2022, 01:57 AM
It should be simple.

Get yourself an editor with regex support, like Notepad++ or VSCode.

Replace
Code
^([^[]+?) *(?=\[)
with
Code
\1\t
.
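
For anyone who would rather script this than do it in an editor, here is a minimal Python sketch of the same find-and-replace. The names `HEADWORD` and `tab_after_headword` are made up for illustration, and the sample string is a shortened stand-in for the real file, so treat it as a starting point rather than a finished converter.

```python
import re

# Same pattern as in the post above. The "[" inside the character class is
# escaped here, which matches the same characters as the unescaped form.
# re.MULTILINE lets "^" match at the start of every line, not just the file.
HEADWORD = re.compile(r"^([^\[]+?) *(?=\[)", re.MULTILINE)


def tab_after_headword(text: str) -> str:
    """Replace the spaces between a headword and its '[' with one tab."""
    return HEADWORD.sub(r"\1\t", text)


# Hypothetical two-entry sample modelled on the examples in this thread.
sample = "cours [kur] n.m. definition\ncoursier [kursje] n.m. definition"
print(tab_after_headword(sample))
```

In plain words: `^([^\[]+?)` captures everything from the start of a line up to the bracket (the headword), ` *` swallows the spaces in front of the bracket, and `\1\t` writes the headword back followed by a tab. Note that the pattern reacts to the first `[` it finds, so a continuation line with a bracket in it, such as "art [manuel]" in the sample in post #17, may get a tab as well; this sketch does not handle that.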

#13  Doitsu 09-10-2022, 02:03 AM
Sarmat beat me to the answer.

#14  pzack 09-10-2022, 11:41 AM
Dear Sarmat89,

Thank you for the information and for responding. The code is Greek to me.

Here is an example of how my text file looks (I did not build this file):

cours [kur] n.m. definition........................................ .................................
.................................................. .................................................. .........
.................................................. .................................................. ...........

.................................................. .................................................. ..........
.................................................. .................................................. .............
coursier [kursje] n.m. definition........................................ .............................
.................................................. .................................................. .............
.................................................. .................................................. ..............

Thus, you have: headword, space, [pronunciation], gender, definition.
The definitions can be in separate paragraphs, and a long definition sometimes runs to several paragraphs. I think it is the bracketed pronunciation, with its headword before it, that delimits the definitions.

If I understand tab-delimiting correctly, then the headword and brackets would be separated by a tab, but I don't know where to place the tab or how to actually insert it into the text.

There are over 100,000 words with definitions (6,000 pages plus), so the program has to run through the whole file placing the tab somewhere. Or tabs?

If your code example applies here, how would I plug my actual format into it? That is, looking at my example, what represents what in your code?

I don't know what regex is or how it works. I have Notepad++ under Windows 11, and I have never formatted a text file, let alone built a tab-delimited one.

I assume that the problem in pyglossary is getting the headword with the brackets tabbed so that StarDict can find the word.

Very cordially,
pz

#15  Markismus 09-10-2022, 11:54 AM
Dear pzack,

This is not working. The example given repeats the problem as you've described it, but it is not a sample of the file. We have already given you multiple solutions to that problem, but they don't seem to help you.

Zip the text file, post it on a file host such as Dropbox or pCloud, and share the link. Maybe we can help you if you stop repeating the same information.

#16  pzack 09-10-2022, 11:57 AM
Dear Sarmat89,

Adding to what I just posted in reply to your response: would you need to write the code so that the opening bracket is the indicator of the headword, or more exactly of the beginning of the line that contains the headword? Sometimes there are two headwords on the same line when there are masculine and feminine endings. But there is always at least one headword before the first (leading) bracket. These brackets do not appear in the text of the definitions. There may be parentheses, but not brackets, which are used only for the pronunciation of the headword.

Thus, do we need a find-and-replace (or insert?) instruction that will put a tab somewhere, keyed on the leading bracket and the headword before it?

pz

#17  pzack 09-10-2022, 01:27 PM
Dear Markismus (and Sarmat89),

Here is the real thing. I couldn't figure out how to attach a file, but below is the actual text taken from the full text file.

zymogène [zims3en] adj. (de zymo- et de
-gène, du gr.gennân, engendrer, produire ;
1888, Larousse, comme qualificatif d’une
substance qui produit un ferment soluble,
par une transformation spontanée ; sens
actuel, 1964, Larousse). Pouvoir zymogène,
propriété des cellules de fabriquer leurs
propres enzymes ; propriété des glandes
spécialisées de produire les enzymes néces-
saires à l'organisme.


© n. m. (1964, Robert). Précurseur inactif
d'un enzyme. (Syn. PROENZYME.)


zymotechnie [zimotekni] n. f. (de zymo-
et de -fechnie, du gr. tekhné, art [manuel],
industrie, métier ; 1762, Acad.). Art de
produire et de diriger une fermentation.


zymotechnique [zimoteknik] adj. (de
zymotechnie ; 1872, Littré). Qui se rapporte
à la zymotechnie.


zymotique [zimotik] adj. (gr. zumôtikos,
propre à faire fermenter, de zumôtos, fer-
menté, dér. de zumoün, faire fermenter, de
zum, levain ; 1855 [d'après Robert, 1977],
puis 1868, Souviron, 585). Qui se rapporte
aux ferments solubles.


zythum {zitsm] ou zython [zit5] n.m.
(lat. zythum, bière, boisson faite avec de
l'orge, du gr. zuthos, décoction d'orge,
bière ; 1710, Richelet — additions —
[zythum], et 1923, Larousse [zython]). Bière
que les Égyptiens préparaient avec de l’orge
fermentée.

Very cordially,
pz

#18  pzack 09-10-2022, 01:30 PM
Dear Markismus,

I wanted to make clear that the text I just sent is the actual text as it appears in the full text file, copied from Notepad (Bloc-notes) on Windows 11. No alterations on my part.

pz

#19  DNSB 09-10-2022, 03:13 PM
zymotique [zimotik] adj. (gr. zumôtikos,


zythum {zitsm] ou zython [zit5] n.m.

Is there a reason for the use of [ and {? An error in your original .xml file?

Again, click the manage attachments button, click on browse. Locate and select the file that you want to attach (it must be one of the supported file types). Once you have selected the file you want, click on upload.

#20  Sarmat89 09-10-2022, 05:13 PM
You need to unfold the lines first. Try replacing
Code
(?<=\S)\n(?=\S)
(insert "\r" before "\n" if the expression fails) with a space.
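
The same unfolding step can be sketched in Python. This is only an illustration under assumptions: the function name `unfold` is made up, and the sample is a few lines copied from post #17. It normalises Windows line endings first, which covers the "\r" case mentioned above, then joins any line break that has non-space text on both sides.

```python
import re


def unfold(text: str) -> str:
    """Join wrapped lines with a space; blank lines between entries stay."""
    # Normalise Windows line endings so a single pattern is enough.
    text = text.replace("\r\n", "\n")
    # A newline with a non-space character directly before and after it is
    # a wrapped line; a newline next to another newline is left alone.
    return re.sub(r"(?<=\S)\n(?=\S)", " ", text)


sample = (
    "zymotechnique [zimoteknik] adj. (de\n"
    "zymotechnie ; 1872, Littré). Qui se rapporte\n"
    "à la zymotechnie.\n"
    "\n"
    "\n"
    "zymotique [zimotik] adj. (gr. zumôtikos,"
)
print(unfold(sample))
```

After this each entry sits on one line, with the blank lines between entries kept. Words hyphenated across a line break (for example "néces-" / "saires") end up as "néces- saires"; rejoining those is a separate step that this sketch does not attempt.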
