#136 |
Guru
Posts: 864
Karma: 144987
Join Date: Jul 2013
Location: Netherlands
Device: HiSenseA5ProCC, OnyxNotePro, Note5, Kobo Glo
|
If you put the file from archive.com through ABBYY FineReader 15, you can choose to save it as HTML-formatted text, without headers/footers, without pictures, and without line breaks or hyphens.
The result of the first 100 pages looks like this. Simply put, converting a file to another format works best when the source file was itself generated by an algorithm; otherwise there are too many irregularities. With algorithmic output, everything is delimited consistently and is easy to recognise. Now every entry is enclosed in consecutive p-tags. A new entry starts in bold, followed either by [a word between brackets] or by 'ou', another word, and [a word between brackets]. You can filter out the very large capitals at the start of each new letter because they have a distinctly large font, in this case 61pt. I'll have a go at it in the coming week to see how far these simple rules get.
Last edited by Markismus; 10-02-2022 at 04:32 PM.
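The rules described in this post could be sketched roughly as follows in Python. This is only a minimal sketch, not the poster's actual script. The markup it assumes (entries in `<p>` tags, headwords in `<b>`, the large capitals in a `<span>` whose style contains `font-size:61pt`) is a guess at what an HTML export might look like, and `EntryParser`, `extract_entries`, and the sample text are hypothetical names and made-up data.

```python
import re
from html.parser import HTMLParser

# Assumption: the large section capitals carry this font size in an inline style.
DROP_CAP_STYLE = "font-size:61pt"

# headword, optional "ou <alternative>", then "[word between brackets]", then the rest.
ENTRY_RE = re.compile(
    r"^(?P<head>\S.*?)(?:\s+ou\s+(?P<alt>[^\[]+?))?\s*"
    r"\[(?P<bracket>[^\]]+)\]\s*(?P<rest>.*)$",
    re.DOTALL,
)


class EntryParser(HTMLParser):
    """Collect each <p> as a list of (text, is_bold) chunks, skipping drop caps."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._current = None    # chunks of the paragraph being read, or None
        self._bold_depth = 0    # > 0 while inside <b> or <strong>
        self._skip_stack = []   # one flag per open <span>: True if it is a drop cap

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = []
        elif tag in ("b", "strong"):
            self._bold_depth += 1
        elif tag == "span":
            style = (dict(attrs).get("style") or "").replace(" ", "").lower()
            self._skip_stack.append(DROP_CAP_STYLE in style)

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            self.paragraphs.append(self._current)
            self._current = None
        elif tag in ("b", "strong") and self._bold_depth:
            self._bold_depth -= 1
        elif tag == "span" and self._skip_stack:
            self._skip_stack.pop()

    def handle_data(self, data):
        if self._current is None or any(self._skip_stack):
            return
        self._current.append((data, self._bold_depth > 0))


def extract_entries(html):
    """Group paragraphs into entries using the 'bold headword + [brackets]' rule."""
    parser = EntryParser()
    parser.feed(html)
    parser.close()
    entries = []
    for chunks in parser.paragraphs:
        text = "".join(t for t, _ in chunks).strip()
        if not text:
            continue  # e.g. a paragraph that held only a drop cap
        starts_bold = next((bold for t, bold in chunks if t.strip()), False)
        match = ENTRY_RE.match(text) if starts_bold else None
        if match:
            entries.append({
                "headword": match["head"].strip(),
                "alt": (match["alt"] or "").strip() or None,
                "bracket": match["bracket"].strip(),
                "definition": match["rest"].strip(),
            })
        elif entries:
            # A paragraph that does not start a new entry continues the previous one.
            entries[-1]["definition"] += " " + text
    return entries


# Made-up sample data in the assumed markup.
SAMPLE_HTML = """
<p><span style="font-size:61pt;">A</span></p>
<p><b>alpha</b> [al-fa] first sample definition</p>
<p>continued on a second paragraph</p>
<p><b>beta</b> ou <b>betta</b> [be-ta] second sample definition</p>
"""

if __name__ == "__main__":
    for entry in extract_entries(SAMPLE_HTML):
        print(entry)
```

Paragraphs that do not begin with bold text are treated as continuations of the previous entry. That is a design choice of this sketch, and real OCR output would likely need extra rules for the cases these simple ones miss.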
#137 |
Grumpy old git
Posts: 10,959
Karma: 5053935
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
In effect you are saying "Because I am not selling or distributing the dictionary I can do what I want." Well, no. Downloading copyrighted content doesn't give you any right to use it unless you can claim a "fair use exemption", which, by the way, the EU doesn't seem to recognize (see this FAQ answer, which makes it clear that the copyright owner must approve).

I agree that you didn't know in the beginning that the dictionary was pirated. Regardless, you and MobileRead now know. Section 8 of MobileRead's rules says that no one can help someone use pirated material.

An example: I download a copy of a copyrighted ebook for personal use, where I don't have the physical book and am therefore not format-shifting. In this case I have "pirated" the book, and MobileRead's rules rule out assisting me with reformatting or otherwise making this downloaded book "better" for my own use. Simply asking can get me permanently banned from MobileRead.

In the end this is a question of whether you think copyright is valid when dealing with electronic copies. I know people who say "No, it isn't. There isn't any harm." I know others (including me) who say "The author gets to choose." I am not a MobileRead moderator, so I cannot make any final decisions, but the moderators are here looking at things.
#138 |
null operator
Posts: 19,029
Karma: 22381319
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Moderator Notice
Given that this thread hasn't mentioned calibre since Doitsu's post #2, I'm moving it to the Workshop Forum. BR
#139 |
Bibliophagist
Posts: 24,260
Karma: 111597957
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Quote:
Quote:
Quote:
As for making an issue of it? I would prefer that MobileRead not be involved in a copyright infringement case.
Quote:
This does not fall under the case where someone is attempting to make an ebook from a physical book they own, which, as far as I am aware, is not permitted within the EU. What we have here is a plain and simple act of piracy committed by an incompetent.
#140 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Hello, M Sarmat89,
I tried the new code on the original text file. The index file still had that question mark; some words were found, others not. In some cases several lines of the definition were included, but these came from definitions that are quite long. As you said, the file would need a lot of manual modification, something that I am not going to devote time to. Please let me know if you have any other suggestions. You might think of something, and I am certainly open to trying anything that has a chance of working. But manually editing such a large file is, I think, out of the question.
Very cordially, pz
Last edited by pzack; 10-03-2022 at 01:38 PM.
#141 |
Connoisseur
Posts: 79
Karma: 10
Join Date: Aug 2022
Device: kobo sage,elipsa
|
Hello M Markismus,
Sorry to get back to you a little late; I was finally able to try some new code that M Sarmat89 gave me. Unfortunately, nothing really changed. I looked at your file and it looks really nice, better than the PDF! You have put the file into a different format, but of course we will need something that PyGlossary will accept for conversion to StarDict. I look forward to what you might come up with in a week's time.

On another note, the moderator is moving this thread as it wasn't about or related to Calibre. Frankly, this thread was never about Calibre; I certainly never mentioned Calibre. I don't know what happens when it is moved, how to find it, or how to contact you if I can't connect to the thread. Would you, at your earliest convenience, explain this to me?

I am glad that you are still sticking with helping me with a StarDict conversion. Thank you, and thanks to M Sarmat89, who, as you have, has put a lot of effort into this conversion, or rather attempted conversion, up to now.
Very cordially, pz
#142 |
Connoisseur
Posts: 55
Karma: 221652
Join Date: Sep 2007
Device: ipaq
|
Quote:
Regarding the moved thread: You found it. jmurphy |
#143 |
o saeclum infacetum
Posts: 18,585
Karma: 212489934
Join Date: Oct 2010
Location: New England
Device: H2O, Aura One, PW5
|
Moderator Notice
Closing the thread. MR does not condone pirating copyrighted works nor helping those who pirate. This also serves as a warning; a similar request will result in a ban. |
Tags |
pyglossary |