|
|
#1 |
|
Junior Member
Posts: 8
Karma: 10
Join Date: Nov 2025
Device: none
|
Plugin Epub2Text to export an Epub ebook to Text
I've written a simple plugin to extract the text from an Epub.
Only the text is extracted from the HTML files; links, images, etc. are dropped. Before exporting the ebook, I always run Tools - Reformat HTML - Mend and Prettify All HTML Files.
The output file has the same filename as the Epub, but with the extension .txt. Existing files are not overwritten.
I've used the plugin a few times myself and it works for me. I've also compared the output to the Calibre text output. There are differences (I can't remember exactly, but I think Calibre replaces non-breaking spaces (and ligatures?), keeps spaces at the beginning of a line, etc.), but I still like my output.
I hope some of you will find the tool useful. Feedback is appreciated.
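For readers curious how this kind of export can work, here is a minimal sketch using only the Python standard library. It is not the plugin's actual code: the names `TextOnly`, `html_to_text`, `txt_path_for` and `write_without_overwrite` are made up for illustration, and dropping leading spaces and empty lines is an assumption based on the description above.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextOnly(HTMLParser):
    """Collect character data; ignore tags and anything inside head/script/style."""

    SKIP = {"script", "style", "head"}  # assumption: these carry no body text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text, one non-empty, stripped line per source line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def txt_path_for(epub_path):
    """Same filename as the Epub, but with the extension .txt."""
    return Path(epub_path).with_suffix(".txt")


def write_without_overwrite(path, text):
    """Write text to path; return False and leave the file alone if it exists."""
    try:
        # Mode "x" makes open() fail instead of truncating an existing file.
        with open(path, "x", encoding="utf-8") as f:
            f.write(text)
    except FileExistsError:
        return False
    return True
```

Because the sketch splits on the line breaks already present in the source, it relies on prettified HTML (each block element on its own line), which is what running Mend and Prettify first provides.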
|
|
|
|
|
#2 |
|
Well trained by Cats
Posts: 31,536
Karma: 62543878
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
??? What does this do differently from the builtin "Output to" Conversion setting(s)?
|
|
|
|
|
|
#3 |
|
Bibliophagist
Posts: 50,507
Karma: 178402706
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Where do we find the 'builtin "Output to" conversion' setting? Are you perchance thinking about calibre's convert to text? The output differs between calibre and this plugin, though I haven't decided which, if either, I prefer.
Last edited by DNSB; 11-25-2025 at 05:20 PM. |
|
|
|
|
|
#4 |
|
Well trained by Cats
Posts: 31,536
Karma: 62543878
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Dang, this is the second time I gave a Calibre answer in Sigil
Last edited by theducks; 11-25-2025 at 08:36 PM. Reason: oops |
|
|
|
|
|
|
#5 |
|
Junior Member
Posts: 8
Karma: 10
Join Date: Nov 2025
Device: none
|
Epub2Text plugin update
I have updated the Epub2Text plugin. The new version 0.1.4 produces the same output as the initial version 0.1.0, but is much faster for large ebooks.
A few examples of the performance improvement (for the HTML parser step only, but that was where the performance problem was):
Code:
Text size     old          new
----------------------------------
 200 KB       0,06 sec.    0,03 sec.
 450 KB       2,80 sec.    0,32 sec.
 900 KB       4,53 sec.    0,09 sec.
1100 KB       3,81 sec.    0,06 sec.
1600 KB       75 sec.      0,99 sec.
2000 KB       153 sec.     1,39 sec.
9850 KB       391 sec.     0,84 sec.
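Timings like these can be collected by measuring the parser step in isolation. The following is only a sketch of one way to do that: `extract_text` is a stand-in built on Python's `html.parser`, not the plugin's parser, and `synthetic_html` generates made-up test input, so the numbers it prints will not match the table.

```python
import time
from html.parser import HTMLParser


class _Collector(HTMLParser):
    """Stand-in parser: gathers all character data into a list."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(html):
    """Hypothetical parser step to be timed."""
    parser = _Collector()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def time_parser(parse, html, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse(html)
        best = min(best, time.perf_counter() - start)
    return best


def synthetic_html(target_kb):
    """Build a dummy document of at least target_kb kilobytes."""
    para = "<p>Lorem ipsum dolor sit amet.</p>\n"
    count = (target_kb * 1024) // len(para) + 1
    return "<html><body>\n" + para * count + "</body></html>"


if __name__ == "__main__":
    for kb in (200, 450, 900):
        doc = synthetic_html(kb)
        print(f"{kb:>5} KB  {time_parser(extract_text, doc):.2f} sec.")
```

Taking the best of several runs reduces noise from other processes; real ebooks would be a better test input than repeated dummy paragraphs.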
|
|
|