|
|
#1 |
|
Junior Member
Posts: 2
Karma: 10
Join Date: Apr 2026
Device: none
|
Converting from pdf to docx & epub - Suggestion
This is a suggestion!
Sometimes I download PDF files that are perfectly readable, but when you convert them to DOCX or EPUB you get garbage, because these PDFs were created WITHOUT SPACES between words. As a result you get a solid block of characters. Yes, you can split them manually, but that takes a lot of time. Two applications handle this correctly: Adobe Acrobat and the online service online2pdf.com. All the other converters I tried (paid and free) FAILED. By the way, I spent a couple of days on this and found an open source library that solves the problem. It is SymSpell, which has been ported to different languages including C++, C#, Rust, Python... I wrote a simple application for myself where I paste the problem text and get fixed text back. It is not OCR at all. I have no idea why this absolutely simple solution is not a standard feature of ALL CONVERTERS. And the second suggestion: Tesseract is an open source project. Why not include it in the conversion from PDF? P.S. This forum doesn't allow me to attach such problem files. If the developers want to see samples of such files, please reply to this post. |
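To illustrate the idea of restoring spaces in a solid block of characters, here is a minimal sketch in Python. It is not SymSpell's code and not what any converter actually does. It is a plain dynamic-programming word segmenter that picks the split with the most frequent dictionary words. The tiny `WORD_COUNTS` table, the `UNKNOWN_PENALTY` value, and the function names are assumptions made up for this example; a real tool would load a large frequency dictionary.

```python
import math

# Hypothetical toy frequency table; a real tool would load a large dictionary.
WORD_COUNTS = {
    "the": 500, "quick": 20, "brown": 15, "fox": 10,
    "jumps": 8, "over": 60, "lazy": 5, "dog": 12,
}
TOTAL = sum(WORD_COUNTS.values())
MAX_WORD_LEN = max(len(word) for word in WORD_COUNTS)
UNKNOWN_PENALTY = 20.0  # assumed cost per character of an unknown chunk


def word_cost(word):
    """Cost of one chunk: negative log frequency, or a penalty if unknown."""
    count = WORD_COUNTS.get(word)
    if count is None:
        return UNKNOWN_PENALTY * len(word)
    return -math.log(count / TOTAL)


def segment(text):
    """Insert spaces into text by choosing the cheapest split into chunks."""
    text = text.lower()
    n = len(text)
    best = [0.0] + [math.inf] * n  # best[i] = cheapest cost of text[:i]
    back = [0] * (n + 1)           # back[i] = start of the last chunk in text[:i]
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            cost = best[start] + word_cost(text[start:end])
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Walk the back-pointers from the end to recover the chunks.
    words = []
    pos = n
    while pos > 0:
        words.append(text[back[pos]:pos])
        pos = back[pos]
    return " ".join(reversed(words))


print(segment("thequickbrownfoxjumpsoverthelazydog"))
# prints: the quick brown fox jumps over the lazy dog
```

A sketch like this only works as well as its dictionary: words that are missing from the table, or strings that can be split in more than one valid way, will come out wrong.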
|
|
|
|
|
#2 |
|
Junior Member
Posts: 2
Karma: 10
Join Date: Apr 2026
Device: none
|
Sorry, it is not that simple. About 95% is recognized correctly, but about 5% comes out wrong.
|
|
|
|
|