#61
Evangelist
Posts: 431
Karma: 2146264
Join Date: Nov 2015
Device: none

Your source is too low quality to rely on automatic procedures. You already have a file with long lines, so open it in Notepad++ or a similar editor, paste the expressions into the "Search" and "Replace" boxes, select the "Regular expression" option, and press "Replace All". Then find the places where the expression failed and fix those manually.
What "spaces" are you talking about?

#62
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa

Good afternoon M Samart89,
Thank you for your response. Would you give me some concrete examples of the search and replace commands in Notepad++? I have never really worked with a text in Notepad. The "spaces" are the empty lines between the definition text; I don't know why a definition is not on consecutive lines of text, or whether these lines have to be eliminated. In plain English, the instructions to be turned into code for tab-delimiting the text file would be something like this: look for the headword string at the beginning of the line, followed by the bracket "["; if found, insert a tab; otherwise, continue to the next line. I am not sure where you would insert the tab. What would a "high quality" text file have been? Are you really suggesting manual modification of this voluminous text file? Cordially, dk

#63
Evangelist
Posts: 431
Karma: 2146264
Join Date: Nov 2015
Device: none

Your file contains OCR errors.
Code:
^([^[]+?) *(?=\[)
Code:
\1\t
If you are going to use perl, try
Code:
perl -pe "s:^([^[]+?) *(?=\[):\1\t:" <your-file-here >destination.tsv
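
For anyone without Perl installed, the same substitution can be sketched in Python. This is only a rough equivalent of the one-liner above, not a tested converter: the helper name `tab_delimit` and the sample lines are made up for illustration, the "[" inside the character class is escaped for Python's benefit, and the line numbers where the pattern did not match are collected so those lines can be fixed by hand.

```python
import re

# Same idea as the Perl pattern: the shortest run of non-"[" characters at the
# start of the line, optional spaces, then a "[" that is checked but not consumed.
HEADWORD = re.compile(r"^([^\[]+?) *(?=\[)")


def tab_delimit(lines):
    """Hypothetical helper: return (converted_lines, failed_line_numbers)."""
    converted, failed = [], []
    for number, line in enumerate(lines, start=1):
        # Replace "headword + spaces" with "headword + TAB", at most once per line.
        new_line, count = HEADWORD.subn(lambda m: m.group(1) + "\t", line, count=1)
        if count == 0:
            failed.append(number)  # no headword followed by "[": line left unchanged
        converted.append(new_line)
    return converted, failed


if __name__ == "__main__":
    # Invented sample lines, only to show the behaviour.
    sample = ["abaisser [abese] v. tr. Faire descendre.", "suite de la definition"]
    out, failed = tab_delimit(sample)
    print(out)
    print("lines without a match:", failed)
```

The list of failed line numbers is the part that matters for an OCR'd source: those are the lines to look at manually.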

#64
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa

Hello M. Samart89,
Thank you for your response and the Perl code. I think you have me creating a TSV file. However, as far as I know, pyglossary supports csv files for conversion, not tsv. According to GitHub, tsv is not among the extensions pyglossary supports. Any suggestions? Cordially, pz

#65
Evangelist
Posts: 431
Karma: 2146264
Join Date: Nov 2015
Device: none

Works fine here.

#66
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa

pyglossary conversion of tsv file
Good afternoon M. Samart89,
Pyglossary did indeed accept the tsv file. I have included two images (I hope they are attached correctly). There are lots of "no tab" errors, not on the headwords but on lines that are just text. And the index .idx file is probably corrupt. I didn't put the files into KOReader to try them, thinking it useless to do so. Please take a look at the enclosed images, and please note the question marks on the index and synonym files. Cordially, pz
Last edited by pzack; 09-18-2022 at 11:37 AM.
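
The "no tab" messages most likely come from lines that hold only definition text, or nothing at all, since a tab-delimited dictionary file is expected to carry a headword, a tab, and the definition on each line. Below is a minimal sketch of one way to tidy that up, assuming a tab-less line belongs to the entry above it. `fold_continuations` is a hypothetical helper, the sample lines are invented, and joining with a single space is an arbitrary choice.

```python
def fold_continuations(lines):
    """Merge tab-less lines into the preceding entry and drop blank lines.

    Returns (entries, orphans). Orphans are (line_number, text) pairs for
    tab-less text that appears before any entry and needs manual attention.
    """
    entries, orphans = [], []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # empty line between definitions: skip it
        if "\t" in line:
            entries.append(line)  # headword<TAB>definition starts a new entry
        elif entries:
            entries[-1] += " " + line.strip()  # assumed continuation of the definition
        else:
            orphans.append((number, line))  # nothing to attach it to
    return entries, orphans


if __name__ == "__main__":
    # Invented sample, only to show the behaviour.
    sample = ["mot\t[mo] n. m. premiere ligne", "", "deuxieme ligne", "autre\t[otr] adj."]
    entries, orphans = fold_continuations(sample)
    for entry in entries:
        print(entry)
    print("orphans:", orphans)
```

If the tab-less lines are in fact damaged headword lines rather than continuations, folding them would hide the problem, so it is worth checking a few by eye first.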

#67
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa

pyglossary tsv file conversion message 2
M. Sarmat89,
Here is another screenshot of the pyglossary tsv conversion summary. Cordially, pz

#69
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa

Pyglossary conversion to stardict
Hello M. Sarmat89,
I think that I have been trying to drown a fish; the pyglossary conversion of my file(s) has not been working. I will admit that there may be "operator error", myself to be exact. You have been gracious enough to give me some Perl code to try to convert the csv and txt files, but the indexes are not constructed properly. I have an .xml file (I am not sure whether it is the complete dictionary), but pyglossary wants one of the formats and extensions it supports, listed here:

Format Extension Read Write
ABBYY Lingvo DSL .dsl X
AppleDict Source .xml X
Babylon .bgl X
Babylon Source .gls X
DictionaryForMIDs X
DICTD dictionary server .index X X
FreeDict .tei X
Gettext Source .po X X
SQLite MDic .m2 Sib .sdb X X
Octopus MDic .mdx X
Octopus MDic Source .txt X X
Omnidic X X
PMD X X
Sdictionary Binary .dct X
Sdictionary Source .sdct X
SQL X
StarDict .ifo X X
Tabfile .txt, .dic X X
TreeDict X
XDXF .xdxf X
xFarDic .xdb

The xml that I have is probably not in AppleDict Source format; I can't say for certain. I tried it in pyglossary, but pyglossary seems to hang and nothing appears in the window. There are the stardict and penelope converters, but I cannot seem to get anyone to tell me how to install them on Windows or Linux, much less how to work with them. Plus, I am not sure that these apps would succeed where pyglossary did not.

I am not a programmer, which puts me at a disadvantage, and I am certainly thankful for the time and help that you and M. Markismus have given me. I am disappointed, to say the least, that I couldn't succeed in getting this dictionary into stardict under KOReader. The pdf version is a searchable file but has nowhere near the convenience of use of stardict under KOReader. I don't know what else can be done. Very cordially, pz
Last edited by pzack; 09-21-2022 at 12:38 PM.
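
One way to get past "can't say for certain" is to look at the root element of the .xml file. The sketch below reads only the start of the file and reports the root tag. The namespace constant is what Apple's dictionary sources use as far as I know, so treat it as an assumption and compare it with your own file; the function names are made up for this post.

```python
import io
import xml.etree.ElementTree as ET

# Assumed namespace of Apple dictionary source files (the "d:" prefix).
# This is from memory, not from the thread; verify it against a known sample.
APPLE_NS = "http://www.apple.com/DTDs/DictionaryService-1.0.rng"


def root_tag(source):
    """Return the tag of the root element without parsing the whole file.

    `source` may be a file name or a binary file object.
    """
    for _event, element in ET.iterparse(source, events=("start",)):
        return element.tag  # the first "start" event is the root element
    return None


def looks_like_appledict(source):
    """Rough check: is the root element <d:dictionary> in the assumed namespace?"""
    return root_tag(source) == "{%s}dictionary" % APPLE_NS


if __name__ == "__main__":
    # Tiny inline sample; pass the path of a real .xml file instead to check it.
    sample = io.BytesIO(b"<dictionary><entry/></dictionary>")
    print(root_tag(sample))
```

A root tag that is something else entirely would suggest the file is in another XML dialect, which could explain why pyglossary does nothing useful with it.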

#70
Guru
Posts: 864
Karma: 144987
Join Date: Jul 2013
Location: Netherlands
Device: HiSenseA5ProCC, OnyxNotePro, Note5, Kobo Glo

Well, you could upload the file and have us take a look at it.

#71
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa

Good evening, M Markismus,
Good to hear from you. I will try to upload a folder containing several files, including txt and xml. Or please let me know if it would be easier to upload a torrent file containing all the files. Cordially, pz

#72
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa

up torrent of grand dictionnaire for conversion to stardict
Good evening M Markismus, M. Sarmat89,
Perhaps you will succeed where I did not. It would give me great pleasure if you could convert this dictionary to stardict for use under KOReader. Have at it! And good luck! Please kindly confirm that you have received the attached torrent. Very cordially, pz

#73
Grand Sorcerer
Posts: 12,026
Karma: 71684510
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS

Before sharing that dictionary, is it public domain or copyrighted?
If it is not public domain, you should not be sharing it with other people.

#74
Guru
Posts: 864
Karma: 144987
Join Date: Jul 2013
Location: Netherlands
Device: HiSenseA5ProCC, OnyxNotePro, Note5, Kobo Glo

@pzack No torrent is attached. If the source pdf file is copyrighted, it would be better to send a link via PM, so that MobileRead isn't hosting data derived from copyrighted material.

#75
Evangelist
Posts: 431
Karma: 2146264
Join Date: Nov 2015
Device: none

What exactly did you try to do, what was the result, and what went wrong, step by step?