#16 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Sarmat89,
Adding to what I just posted in reply to your response: would you need to write the code so that the opening bracket is the indicator of the headword, or more exactly of the beginning of the line that contains the headword? Sometimes there are two headwords on the same line, when there are masculine and feminine endings. But there is always at least one headword before the first (leading) bracket. These brackets do not appear in the text of the definitions. There may be parentheses, but not brackets, which are used only for the pronunciation of the headword. So do we need a find-and-replace (or insert?) instruction that puts a tab somewhere, keyed on the leading bracket and the headword before it? pz |
#17 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Markismus (and Sarmat89),
Here, enclosed, is the real skinny. I couldn't figure out how to attach a file, but below is the actual text taken from the full text file.

zymogène [zims3en] adj. (de zymo- et de -gène, du gr.gennân, engendrer, produire ; 1888, Larousse, comme qualificatif d’une substance qui produit un ferment soluble, par une transformation spontanée ; sens actuel, 1964, Larousse). Pouvoir zymogène, propriété des cellules de fabriquer leurs propres enzymes ; propriété des glandes spécialisées de produire les enzymes néces- saires à l'organisme.

© n. m. (1964, Robert). Précurseur inactif d'un enzyme. (Syn. PROENZYME.)

zymotechnie [zimotekni] n. f. (de zymo- et de -fechnie, du gr. tekhné, art [manuel], industrie, métier ; 1762, Acad.). Art de produire et de diriger une fermentation.

zymotechnique [zimoteknik] adj. (de zymotechnie ; 1872, Littré). Qui se rapporte à la zymotechnie.

zymotique [zimotik] adj. (gr. zumôtikos, propre à faire fermenter, de zumôtos, fer- menté, dér. de zumoün, faire fermenter, de zum, levain ; 1855 [d'après Robert, 1977], puis 1868, Souviron, 585). Qui se rapporte aux ferments solubles.

zythum {zitsm] ou zython [zit5] n.m. (lat. zythum, bière, boisson faite avec de l'orge, du gr. zuthos, décoction d'orge, bière ; 1710, Richelet — additions — [zythum], et 1923, Larousse [zython]). Bière que les Égyptiens préparaient avec de l’orge fermentée.

Very cordially, pz |
#18 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Markismus,
I wanted to make clear that the text just sent to you is the actual text as it appears in the full text file, copied in bloc-notes under Win 11. No alterations on my part. pz |
#19 |
Bibliophagist
Posts: 44,675
Karma: 168431851
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
zymotique [zimotik] adj. (gr. zumôtikos,
zythum {zitsm] ou zython [zit5] n.m.

Is there a reason for the use of both [ and {? An error in your original .xml file?

Again, click the manage attachments button, then click on browse. Locate and select the file that you want to attach (it must be one of the supported file types). Once you have selected the file you want, click on upload. |
#20 |
Fanatic
Posts: 515
Karma: 2268308
Join Date: Nov 2015
Device: none
|
You need to unfold the lines first. Try replacing
Code:
(?<=\S)\n(?=\S) |
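For readers without a regex-capable editor, the same idea can be tried in Python. This is only a sketch: the post does not say what the match should be replaced with, so a single space is assumed here, and the line breaks in `sample` are invented for illustration (the real wrapping of the file is not known). The function name `unfold` is hypothetical.

```python
import re

# Pattern from the post above: a newline that has a non-space
# character immediately before it and immediately after it.
UNFOLD = re.compile(r"(?<=\S)\n(?=\S)")

def unfold(text, joiner=" "):
    """Join wrapped lines. Blank lines between articles are left alone,
    because a newline next to another newline does not match."""
    return UNFOLD.sub(joiner, text)

# Assumed wrapping, for illustration only.
sample = (
    "zymotechnique [zimoteknik] adj. (de\n"
    "zymotechnie ; 1872, Littré). Qui se\n"
    "rapporte à la zymotechnie.\n"
    "\n"
    "zymotique [zimotik] adj. (gr. zumôtikos,\n"
    "propre à faire fermenter\n"
)

print(unfold(sample))
```

One caveat: a line that ends in a trailing space before its line break is not joined by this pattern, since the character before the newline is then whitespace.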
#21 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear David (DNSB),
Thank you for responding. I didn't notice the change in bracket style; however, I think that what matters is that the headword(s) will be located before the first leading bracket, whatever the style. I guess the code could be written to look for the first leading bracket in either of the two styles. I hope that this helps. Cordially, pz |
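The idea of looking for the first leading bracket in either style could be sketched as follows in Python. The function names `split_entry` and `to_tab_line` are hypothetical, and the sketch assumes each article is already on a single line. It only illustrates the splitting rule; it is not a tested converter for the full file.

```python
import re

# First opening bracket of either style: "[" or "{".
FIRST_BRACKET = re.compile(r"[\[{]")

def split_entry(line):
    """Return (headword, rest), split at the first '[' or '{'.
    Return None when the line has no bracket or nothing before it."""
    match = FIRST_BRACKET.search(line)
    if match is None:
        return None
    headword = line[:match.start()].strip()
    rest = line[match.start():].strip()
    if not headword:
        return None
    return headword, rest

def to_tab_line(line):
    """Put a tab between headword and definition; leave other lines as-is."""
    parts = split_entry(line)
    if parts is None:
        return line
    return parts[0] + "\t" + parts[1]

print(to_tab_line("zymotique [zimotik] adj. (gr. zumôtikos, ...)"))
print(to_tab_line("zythum {zitsm] ou zython [zit5] n.m. (lat. zythum, ...)"))
print(split_entry("© n. m. (1964, Robert). Précurseur inactif d'un enzyme."))
```

Two limits are worth keeping in mind. With two headwords on a line, only the text before the first bracket becomes the key, so "zython" stays in the definition. A line that has no pronunciation but contains a bracket later on, such as "art [manuel]", would be split in the wrong place.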
#22 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Sarmat89,
Thank you for responding. I must ask you to forgive my obtuseness when it comes to programming and code. For example, "unfold the lines first": how do I "unfold a line"? "Replacing code": where, and what code, am I replacing? Please don't assume that I know the technical language that you are probably comfortable with. May I ask you, since you have the real sample of the text to work with, to illustrate what you suggest needs to be done, using the provided text? You are communicating with a first-grader when it comes to this type of programming. I have to be led by the hand here. Hopefully, you have the patience to walk me through this. I sense some impatience among some of the respondents, and I understand this, but I am not at the level of expertise of my respondents. Very cordially, pz |
#23 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Sarmat89,
Adding to the message just sent to you: I also don't know what text editor you are using, or what program (and how to obtain it) would be doing the text modifications. I have bloc-notes under Win 11. Perhaps I need a different text editor? pz |
#24 |
Bibliophagist
Posts: 44,675
Karma: 168431851
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
I would suggest installing Notepad++ (not absolutely certain but is bloc-notes the same as notepad?).
As for unfolding lines, what is meant is to take:
Code:
zymotique [zimotik] adj. (gr. zumôtikos,
propre à faire fermenter, de zumôtos, fer-
menté, dér. de zumoün, faire fermenter, de
zum, levain ; 1855 [d'après Robert, 1977],
puis 1868, Souviron, 585). Qui se rapporte
aux ferments solubles.
and turn it into:
Code:
zymotique [zimotik] adj. (gr. zumôtikos, propre à faire fermenter, de zumôtos, fer- menté, dér. de zumoün, faire fermenter, de zum, levain ; 1855 [d'après Robert, 1977], puis 1868, Souviron, 585). Qui se rapporte aux ferments solubles. |
#25 |
Guru
Posts: 942
Karma: 149883
Join Date: Jul 2013
Location: Rotterdam
Device: HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
So the problem is of course in the assumptions.
Conversion to csv-file
I've used sublime3, because it supports Perl regex. However, with a bit of googling you'll find the slight differences in regex implementation between editors. I've also included the perl commands. If you use the following substitutions in order, you get a csv-file.
Find --> Replace ALL, e.g. perl -0777 -pe 's/\n\n+/||/sg'
'\n\n+' --> '||' , masking of the lines separating articles
'\n' --> ' ' , removal of the <EOL>-characters inside an article
'\|\|' --> '\n' , insertion of an <EOL>-character at the end of an article. The article is now on 1 line.
'^(\S+)' --> '$1|,|$1' , repeating the first word and introducing a delimiter, e.g. |,|. The reason for a complex delimiter is that it will not occur naturally in the article.
'^(\S+)' --> '$1,' , splitting off the first word and introducing a comma
The last two replacements are alternatives. I've added the original text-file and the intermediate results. You can recreate them with the commands
Code:
# -0777 makes perl read the whole file at once; with plain -p each line is
# handled separately, so '\n\n+' could never match across lines.
perl -0777 -pe 's/\n\n+/\|\|/sg' <original.txt >output1.txt
perl -pe 's/\n/ /sg' <output1.txt >output2.txt
perl -pe 's/\|\|/\n/sg' <output2.txt >output3.txt
perl -pe 's/^(\S+)/$1,/sg' <output3.txt >output4.csv
The result is:
Code:
zymogène, [zims3en] adj. (de zymo- et de -gène, du gr.gennân, engendrer, produire ; 1888, Larousse, comme qualificatif d’une substance qui produit un ferment soluble, par une transformation spontanée ; sens actuel, 1964, Larousse). Pouvoir zymogène, propriété des cellules de fabriquer leurs propres enzymes ; propriété des glandes spécialisées de produire les enzymes néces- saires à l'organisme.
©, n. m. (1964, Robert). Précurseur inactif d'un enzyme. (Syn. PROENZYME.)
zymotechnie, [zimotekni] n. f. (de zymo- et de -fechnie, du gr. tekhné, art [manuel], industrie, métier ; 1762, Acad.). Art de produire et de diriger une fermentation.
zymotechnique, [zimoteknik] adj. (de zymotechnie ; 1872, Littré). Qui se rapporte à la zymotechnie.
zymotique, [zimotik] adj. (gr. zumôtikos, propre à faire fermenter, de zumôtos, fer- menté, dér. de zumoün, faire fermenter, de zum, levain ; 1855 [d'après Robert, 1977], puis 1868, Souviron, 585). Qui se rapporte aux ferments solubles.
zythum, {zitsm] ou zython [zit5] n.m. (lat. zythum, bière, boisson faite avec de l'orge, du gr. zuthos, décoction d'orge, bière ; 1710, Richelet — additions — [zythum], et 1923, Larousse [zython]). Bière que les Égyptiens préparaient avec de l’orge fermentée.
So what's the problem? You now have an article with the key '©' that has a quite new meaning. Apparently, there are articles that have subsections separated from the main article in the same way that articles are separated.

Stardict
Using my script I've added to the txt-file a csv-extension and ran it using
Code:
perl pocketbookdic.pl zymogène.S-delimiter .txt.csv fr '|,|'
The screen output (with '$isTestingOn = 1;' in the script) is like this:
Last edited by Markismus; 09-11-2022 at 08:50 AM. |
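For anyone who cannot easily run the perl one-liners (for instance because of how the Windows command prompt handles quoting), the same substitutions could be sketched in Python. This is a rough equivalent under stated assumptions, not Markismus's script: the function names `articles_to_lines` and `add_delimiter` are hypothetical, articles are assumed to be separated by one or more blank lines, and the line breaks in `sample` are invented for illustration.

```python
import re

def articles_to_lines(text):
    """Substitutions 1-3: split on blank lines, then put each article
    on one line by turning its inner line breaks into spaces."""
    blocks = re.split(r"\n\n+", text.strip("\n"))
    return [block.replace("\n", " ") for block in blocks if block.strip()]

def add_delimiter(line, delimiter="|,|", repeat=True):
    """Substitution 4: mark off the first word.
    repeat=True  mirrors '$1|,|$1' (first word repeated after the delimiter).
    repeat=False mirrors '$1,' when delimiter is ','."""
    def replacement(match):
        word = match.group(1)
        return word + delimiter + (word if repeat else "")
    return re.sub(r"^(\S+)", replacement, line, count=1)

# Assumed wrapping, for illustration only.
sample = (
    "zymotechnie [zimotekni] n. f. (de zymo- et de -fechnie,\n"
    "du gr. tekhné, art [manuel], industrie, métier ; 1762, Acad.).\n"
    "Art de produire et de diriger une fermentation.\n"
    "\n"
    "zymotechnique [zimoteknik] adj. (de\n"
    "zymotechnie ; 1872, Littré). Qui se rapporte à la\n"
    "zymotechnie.\n"
)

for line in articles_to_lines(sample):
    print(add_delimiter(line))
```

Like the perl version, this treats every blank-line-separated block as its own article, so a subsection such as the one starting with '©' would still end up as a separate key.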
#26 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear David (DNSB),
It's Sunday and I don't know if you want to "work" on Sunday. I see your example, thank you. Now, what is the reason for the one line, and how do I actually do this in notepad++ and have it go through the more than 100,000 listed words and their attached definitions? Looking at the text, you see that some lengthy definitions are separated into paragraphs with space between the paragraphs; how would the separate paragraphs be included in the one line? After everything is put on one line for each headword, what would be the next step for getting pyglossary to convert the file to stardict? Would I be putting a tab somewhere, and if so, how would this be done? Do I need a special "sub-editor" to work inside notepad++? As for notepad++, it may be the same as bloc-notes; however, I will try to install notepad++. I assume, then, that you would prefer to have me work under Win 11 with notepad++ rather than under Linux. Whatever is simplest is best for me. Cordially, pz |
#27 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Markismus,
You have put no small amount of work into your response, and I am very appreciative of your efforts to help me. Let me try to understand what you are proposing.

Firstly, I need to build a csv file. You applied the four lines of perl code to convert the sample text to a csv-formatted file. So do I plug the original text file name into the first of your four lines of code (perl -pe 's/\n\n+/\|\|/sg' <original.txt> output1.txt) and then follow through to the fourth line, inserting the actual file names? Would this then give me a complete csv file of the full text file of which you have the example?

Secondly, and I quote you: "Stardict Using my script I've added to the txt-file a csv-extension and ran it using Code: perl pocketbookdic.pl zymogène.S-delimiter .txt.csv fr '|,|'". I thought that we had already built the csv file with your four lines of perl code. What txt file are you now adding a csv extension to? And what am I doing with the .xml and .zip files? Have I created these with your code? Do I understand correctly that pyglossary will convert the csv file created? Does this side-step the tab-delimiting of the text file, or was this accomplished in your code?

Where do I find "perl", and is it an instruction set to be used in a particular text editor? Is this under a Linux terminal? What text editor are you using? Is sublime3 the editor? I am a little confused about what I actually need in order to implement what you want me to do. I hope that my understanding (or what little there is of it) is not completely off base! Cordially, pz |
#28 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Markismus,
Adding to the just-sent message this Sunday, I have installed notepad++ and have installed Perl in it. This is under windows 11. cordially, pz |
#29 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Dear Markismus,
I was finally able to install ActivePerl for Windows. I copied your first line of code into the command line for perl to execute it, and it gave me back this message:

[ActiveState/ActivePerl-5.28] C:\Users\k\ActivePerl-5.28>perl -pe 's/\n\n+/\|\|/sg' grandl.txt output1.txt
'\' n’est pas reconnu en tant que commande interne ou externe, un programme exécutable ou un fichier de commandes.

This means that '\' is not recognised as an internal or external command, an executable program, or a batch file. How, then, do I execute the code that you wrote? I didn't want to impose upon you, but would you convert the full text file that I have into a stardict dictionary for me? Otherwise, I can continue on this way, with your guidance. It is a learning experience in any event. So, I think I have Perl installed under Windows, but I am stuck executing the code that you wrote. pz |
#30 |
Guru
Posts: 942
Karma: 149883
Join Date: Jul 2013
Location: Rotterdam
Device: HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
You can post a link to the full txt-file.
|