09-13-2022, 04:22 PM   #46
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Good evening, M. Markismus,

Well, I've read your message and it is written in the King's English, but my brain has shorted out trying to understand what you are saying.

But, first things first. Pyglossary gave me these files: .idx, .ifo, .dict, and .syn. I put these files in a folder named GrandL and installed the folder on my Kobo in koreader.

I did a word look-up and the dictionary did not give back the word. I tried several common words and it was no-go, nada.

I noticed that in my other stardict dictionary folders the .dict file is actually *.dict.dz. There isn't any .dict file but all the dictionaries have the .dict.dz ending.

Is this the reason that the words are not found or is the answer found in your last response?

I would be unable to do what you suggest to do in your last message; way above my expertise.

I thought, at least, that the csv file conversion would get me "through the door" and I'd have a stardict dictionary, not perfect but serviceable.

You mention not searching for keywords; do you mean "headwords"? If that is the case, then I won't find any words! And if so, what has all this work that you helped me with accomplished?

Please let me know about this .dict.dz file; it seems that should be among the stardict files for the word look-up.

One more thing; bizarrely, my koreader lists this dictionary as output4.csv. But the folder I installed containing the stardict files is called GrandL; output4.csv was not installed.

Bizarre.

cordially,
pz

Last edited by pzack; 09-13-2022 at 04:25 PM.
09-13-2022, 05:57 PM   #47
Sarmat89
Evangelist
Sarmat89 ought to be getting tired of karma fortunes by now.
 
Posts: 482
Karma: 2267928
Join Date: Nov 2015
Device: none
Use 'dictzip' utility to make .dict.dz files. On Arch, it is available in the 'dictd' package.
09-13-2022, 09:50 PM   #48
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Good evening M. Sarmat89,

Thank you for responding and following this thread. I was able to create the .dict.dz file; however, koreader-stardict is not locating words in this dictionary.

Koreader lists the dictionary file folder as output4.csv, but the actual folder that contains the stardict files is named GrandL, and it is this name that should have appeared in my dictionary list under koreader.

I don't have any idea how koreader-stardict is listing the folder as output4.csv.

I got pyglossary to make the stardict files from the csv with one info error saying that pyglossary could not read the source language. I don't know what that means or how to correct it, and I am not sure whether this caused the failure of stardict to search the dictionary in koreader.

M.Markismus suggested using the stardict convertor/stardict-tools but I am not able to get it installed in linux with all the necessary packages and dependencies.

cordially,
pz

Last edited by pzack; 09-13-2022 at 09:55 PM.
09-13-2022, 10:35 PM   #49
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Good morning M. Markismus,

I was able to create the .dict.dz file; M. Sarmat89 gave a suggestion on how to do this.

But koreader-stardict is still listing this dictionary as output4.csv.

The output4.csv file is not installed on the Kobo, so where and how is koreader or stardict picking up this name? The files in the GrandL folder have gl as the prefix before the stardict file type; "output4" is nowhere to be seen in this folder.

Perhaps, then, stardict is not seeing the stardict files made from the csv file.
Cordially,
pz

Last edited by pzack; 09-13-2022 at 10:38 PM.
09-14-2022, 01:22 AM   #50
Markismus
Guru
Markismus causes much rejoicing

Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
There is a tagged field for dictionary name. Apparently, pyglossary filled it with the original csv-filename.

You could try pointing Goldendict to the dictionary with F3. Maybe it can work with it.
You could try Penelope on GitHub for the conversion.
You could try Stardict-editor which is part of Stardict-tools.

I don't know what happened. Either pyglossary failed or Koreader or you did. Can't help you without the actual data. Sorry.
What you could try is converting it to a human-readable format, such as xdxf or Stardict-xml. You could then assess the result and see what went wrong.
09-14-2022, 11:34 AM   #51
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Good afternoon M. Markismus,

Thank you for your response.

I am curious to know if you have and use an e-reader with koreader or some other program that uses stardict which allows word look-up in the text that you are reading.

Because, it would be difficult for you to investigate this problem without duplicating what I use on my end. And you do have the actual data; I sent you a small chunk of the file which is the same throughout the file. You had, I assume, run the csv file that you created with perl through pyglossary? You could install the converted csv file on your e-reader in stardict and search one of the words in the snippet that was converted. You could see if Pyglossary tags a file differently when you convert the csv file.

I have used pyglossary on another dictionary conversion of a .dsl file which went rather smoothly. Pyglossary may have tagged a field with output4 but it does not appear in the four files produced as stardict files. Thus, does that really explain this occurrence?

Goldendict is not stardict, as I am sure you know, as the files are different. This does not help me, as Goldendict is a stand-alone application for windows and maybe linux and is not used on e-readers, at least not on mine with koreader installed. Koreader uses Stardict. And Stardict has the instant word look-up feature.

I have Goldendict with some dictionaries installed on my windows machine.

If you are willing, or someone reading this post is willing, to give me CLEAR!, step-by-step instructions for installing stardict-tools and stardict-editor on either windows or linux (whichever is easiest), I would try Stardict. I tried to install these applications without success; there was always something missing or I could not get it to run. Googling was not a help. And it seems Stardict has been substituted by Pyglossary.

Well, I could convert it to stardict XML but I don't know how and I would need help doing it. However, you could do this since you have the txt snippet and you have the csv. But, again, if you can't duplicate stardict in koreader then I think that you are limited in achieving a functioning stardict dictionary with my file.

I really do appreciate your time and efforts to help me and I hate to give up on this.

I assume that the csv file is a good file to use in pyglossary; the headwords are seen and searchable in the files converted to stardict.

Cordially,
pz

Last edited by pzack; 09-14-2022 at 11:42 AM.
09-14-2022, 02:29 PM   #52
Markismus
Guru
Markismus causes much rejoicing

Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
I am one of the owners of the repository of Koreader. It's a claim that mainly emphasizes that I have some knowledge of how Koreader might be used. As we're on a forum called mobileread one can also conclude that some kind of e-reader might have been on the horizon for some users. Let's assume it's happened on my horizon, too.

If you've tested the files I've uploaded, you should have them working with both Koreader and Goldendict. If not, it's not due to the uploaded dictionary-snippet. (Clearly, that was not the data I was alluding to.)

For installing and working with Stardict-tools, I would suggest Google. It's really much easier than converting a dictionary.
09-14-2022, 07:45 PM   #53
Sarmat89
Evangelist
Sarmat89 ought to be getting tired of karma fortunes by now.
 
Posts: 482
Karma: 2267928
Join Date: Nov 2015
Device: none
You can edit .ifo file directly in a text editor. Check "bookname=" and "description=" keys.
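For anyone who would rather script that edit than open a text editor, here is a minimal Python sketch. It assumes the .ifo file is a small UTF-8 text file of key=value lines. The helper names `set_ifo_key` and `rename_dictionary`, and the file name `gl.ifo` in the usage note, are made up for this example and should be replaced with whatever is actually in the dictionary folder.

```python
from pathlib import Path


def set_ifo_key(ifo_text: str, key: str, value: str) -> str:
    """Return the .ifo text with `key=` set to `value` (appended if missing)."""
    lines = ifo_text.splitlines()
    prefix = key + "="
    for i, line in enumerate(lines):
        if line.startswith(prefix):
            lines[i] = prefix + value  # overwrite the existing key
            break
    else:
        lines.append(prefix + value)   # key was not present, so add it
    return "\n".join(lines) + "\n"


def rename_dictionary(ifo_path: str, new_name: str) -> None:
    """Rewrite the bookname= line of an .ifo file in place."""
    path = Path(ifo_path)
    text = path.read_text(encoding="utf-8")
    updated = set_ifo_key(text, "bookname", new_name)
    path.write_bytes(updated.encode("utf-8"))  # keep plain \n line endings
```

Usage would look like `rename_dictionary("GrandL/gl.ifo", "GrandL")`, after making a backup copy of the original file.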

When you open the .idx file in a text editor, can you see the correct headwords inside?
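A note on what to expect there: the .idx file is binary, so a text editor shows the headwords with unreadable bytes between them. As a rough sketch, the Python below lists the first few headwords instead. It assumes the standard StarDict layout of a UTF-8 headword, a NUL byte, then a 4-byte offset and a 4-byte size (both big-endian), which holds unless the .ifo file declares idxoffsetbits=64. The function name `read_idx_entries` is made up for this example.

```python
import struct


def read_idx_entries(idx_bytes: bytes, limit: int = 20):
    """Return up to `limit` (headword, offset, size) tuples from .idx data."""
    entries = []
    pos = 0
    while pos < len(idx_bytes) and len(entries) < limit:
        end = idx_bytes.index(b"\x00", pos)            # headword ends at the NUL byte
        word = idx_bytes[pos:end].decode("utf-8", errors="replace")
        offset, size = struct.unpack(">II", idx_bytes[end + 1:end + 9])
        entries.append((word, offset, size))
        pos = end + 9                                   # skip NUL + two 4-byte integers
    return entries
```

To try it on a real file, read the bytes with `open("gl.idx", "rb").read()` (the file name is a placeholder) and print the result. If the headwords come out looking like whole definitions or fragments, the split between headword and definition went wrong before the conversion.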
09-14-2022, 08:22 PM   #54
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Hello M. Markismus,

Thank you for your kind reply. As far as I know, stardict and goldendict, though sharing similarities, are somewhat different in file format support. I think that Goldendict will support stardict files but you must convert Goldendict files for use in Stardict.

Goldendict does not help me here as I indicated earlier Koreader integrates Stardict in on-the-fly word lookup. I am not aware of any Goldendict installation on an e-reader, in my case, Kobo(sage,elipsa,forma).

You state that the files should be working with Koreader and Goldendict but the issue is not with Goldendict; I need the files to work in Stardict!

The question then is were you able to test the files in Stardict under Koreader on an e-reader?

I am still mystified why stardict-koreader is listing output4.csv. It seems stardict sees this file, though I didn't install the file in stardict, and hence Stardict is not accessing the stardict files that were installed in a folder. I have a folder for each stardict dictionary in koreader.

Please believe me when I tell you that google is no help for installing stardict-tools. I don't need the stardict editor, that is, the stardict dictionary shell, but the stardict tools for converting a file into stardict files.

Another question: does Stardict-tools conversion support the csv file format?

Would it not be so much easier to tell me how to install stardict tools, perhaps by sending me the file or files with installation instructions? I only see the stardict program, stardict.exe, for download, and I have that, but it is not a conversion app.

Not sure what you mean about much easier than converting a dictionary. What is much easier?

Cordially,
pz

Last edited by pzack; 09-14-2022 at 08:37 PM.
09-15-2022, 02:19 PM   #55
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Hello M. Markismus,

I was able to get koreader to see the dictionary. But, the index was skewed and nothing matched the word searched.

Here's what I did; I deleted all preliminary material in the txt file and just had words with definitions. I did this in notepad++ and I did not change lines but notepad changed the line count.

In pyglossary I got a lot of "[error]invalid row["]" messages and sometimes the same type of message appeared but text appeared in the [] brackets instead of just ["].
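One way to look for such rows before converting is sketched below in Python. It rests on an assumption that is not confirmed anywhere in this thread: that "invalid row" means a row with fewer than two fields (a headword and a definition). The function name `find_short_rows` is made up for this example.

```python
import csv
import io


def find_short_rows(csv_text: str, min_fields: int = 2):
    """Return (line_number, row) for non-empty rows with too few fields."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    bad = []
    for row in reader:
        # Empty rows come back as [] and are skipped here.
        if row and len(row) < min_fields:
            bad.append((reader.line_num, row))
    return bad
```

Reading the csv file as UTF-8 text and passing it to `find_short_rows` gives a list of suspect rows with their line numbers, which can then be inspected in a text editor. A stray double quote can also make the csv parser swallow many following lines into one field, so a reported row that is very long is itself a hint.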

This time pyglossary really worked the text taking 691.4 seconds to convert rather than the 3 seconds the other time before I edited the text file. And this time, at least, the dictionary was seen in stardict even though I couldn't get a match of the word found to the word searched.

I also received an [info] message in pyglossary telling me it failed to detect sourceLang and targetLang from the csv glossary name (I named the csv file l-nantes.csv).

The .idx file had a "?" mark on the file icon that is seen in file manager.

In the dictionary list in koreader the dictionary name still has a .csv tag.

There is, then, a partial success. Can you suggest a solution to getting stardict to find the correct word in the data file?

Slowly, but surely, I think we're getting to a functioning stardict dictionary.

cordially,
pz

Last edited by pzack; 09-15-2022 at 02:22 PM.
09-15-2022, 06:01 PM   #56
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Hello Sarmat89,

I thought that maybe you could help me with this: how could I tab-delimit the first word of each line in my text file? That is, how could I treat the first word of every line of text in my dictionary text file as a searchable headword? This would be my tab-delimited file to use in pyglossary for a conversion to stardict.

I have notepad++ and I have perl installed in linux. Would you have a cut-and-paste line or lines of code that would do this on every line of my text file? It could run either in a linux terminal or in a text editor like notepad in windows. Or perhaps you have something else that I could use.

Cordially,
pz
09-15-2022, 06:50 PM   #57
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Hello Sarmat89,

This message cancels the message I just sent you on Sept. 15.

I thought that maybe you could help me with this: how could I tab-delimit the first word of each line in my text file, where that word ends with a space and an opening bracket "["? The first word would be consecutive characters, including letters or other signs, but all of it would end with a space and then the bracket [.

This would be my tab-delimited file to use in pyglossary for a conversion to stardict.

I have notepad++ and I have perl installed in linux. Would you have a cut-and-paste line or lines of code that would do this on every line of my text file? It could run either in a linux terminal or in a text editor like notepad in windows. Or perhaps you have something else that I could use.

Cordially,
pz
09-15-2022, 07:04 PM   #58
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Dear DNSB-David,

I have come back to your message about unfolding a line, specifically, placing the headword and definition all on one line in my text file. There are some quite lengthy definitions in this text file.

Do you have some cut-and-paste code that I could use to do this on my dictionary text file with over 100,000 words? The file would then need to be tab-delimited on each line so that I would have a tab-delimited text file for a pyglossary conversion to stardict.

I have notepad++ in windows and Perl in linux if this is something that can be used. Or, if you have something else.

cordially,
pz
09-15-2022, 08:32 PM   #59
Sarmat89
Evangelist
Sarmat89 ought to be getting tired of karma fortunes by now.
 
Posts: 482
Karma: 2267928
Join Date: Nov 2015
Device: none
1. Use the output4 file as a starting point and check that every line in that file contains a complete definition. If the definition is split between several lines, join them with <BR> (or <BR><BR>, if you want an empty line between them).

2. Find and replace all misplaced curly brackets { with [.

3. Use the code from that post to separate the headwords with tabs.

4. Search for ^[^\t]+$ to fix the remaining lines without a tab.
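The check in step 4 can also be scripted. The Python sketch below applies the same regular expression line by line and reports the line numbers that still lack a tab; the function name `lines_without_tab` is made up for this example, and blank lines are deliberately not reported because the pattern needs at least one character.

```python
import re

# Same pattern as in step 4: a non-empty line containing no tab character.
NO_TAB = re.compile(r"^[^\t]+$")


def lines_without_tab(text: str):
    """Return (line_number, line) for every non-empty line with no tab."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if NO_TAB.match(line)
    ]
```

An empty result means every non-blank line already has a headword separated from its definition by a tab. Any lines it does report are the ones to fix by hand or to join onto the previous entry.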
09-15-2022, 10:23 PM   #60
pzack
Connoisseur
pzack began at the beginning.
 
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
Evening M. Sarmat89,

I should explain again that my idea is to tab-delimit all the lines in output4 that have a word that begins the line followed by a space and then a "[". All the text that follows this, on that line and the lines after it, is the definition for that headword, until the next line that begins with a word followed by a space and then the bracket "[". So it's:

Headword (a word beginning a line, made of any type of consecutive characters), then a space, then an opening [. All that follows goes on one line until the next headword as defined above.
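The rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code referred to elsewhere in the thread. It assumes a headword is a run of non-space characters at the start of a line followed by a space and "[", that continuation lines are joined with <BR>, that blank lines are dropped, and that anything before the first headword is discarded. The function name `unfold` and the sample entries in the usage note are made up.

```python
import re

# Headword = non-space characters at line start, then one space, then "[".
HEADWORD = re.compile(r"^(\S+) (\[.*)$")


def unfold(lines):
    """Turn folded dictionary lines into 'headword<TAB>definition' entries."""
    entries = []
    head, parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n").replace("\t", " ")  # stray tabs would break the format
        match = HEADWORD.match(line)
        if match:
            if head is not None:
                entries.append(head + "\t" + "<BR>".join(parts))
            head, parts = match.group(1), [match.group(2)]
        elif head is not None and line.strip():
            parts.append(line.strip())                # continuation of the definition
    if head is not None:
        entries.append(head + "\t" + "<BR>".join(parts))
    return entries
```

For example, the three lines `abaisser [v.] to lower`, `to reduce` and `abat [n.] slaughter` become two entries, with `to reduce` joined onto the first one by <BR>. Headwords made of several words separated by spaces will not be recognised by this pattern and end up inside the previous definition, so the output still needs a read-through.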

This may not be perfect but I think that I should see most of the headwords.

I assume that you only tab-delimit the headwords for pyglossary to set up a searchable stardict file.

May I trouble you to list the code that I can paste that would unfold the line and tab-delimit the headword as explained above. I know that you have listed some code in previous messages but it would be easier for me if you could list the codes here in one place. Also, I think that you may be changing your code given the above.

I need the code that uses the <BR> function that eliminates the spaces in certain long definitions. Or can these spaces remain in the unfolded line of definition?

Thus, I am asking if you would kindly list the codes for this and I suppose I should see output4 somewhere in the code, otherwise, how does the code know what file to work on?

Is the code Perl code to be used in linux terminal? Stupid question perhaps, but I wanted to confirm this. I don't know Perl code so I can only cut and paste what you give me.

Thank you again for taking the time to help me with this file conversion.

Cordially,
pz

Last edited by pzack; 09-15-2022 at 10:36 PM.
Tags
pyglossary