01-27-2024, 10:05 AM | #1 |
Member
Posts: 17
Karma: 10
Join Date: Jan 2024
Device: none
|
Build up an epub file from text file somewhat nicely
Hey there,
I've transcribed an audiobook with Whisper, and the end result is a huge wall of text with no indication of paragraphs or of whether a word should be bold or italic. Aside from some general punctuation, you can't really tell what it should look like. So my question is: without a ton of manual work, is there a way to turn this wall of text into a nice-looking epub from scratch? I was thinking maybe a script or automation of some sort. Thank you. Last edited by Pocok; 01-27-2024 at 10:10 AM. |
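For the "from scratch" part, here is a minimal sketch of packing a list of paragraphs into an EPUB 3 container using only the Python standard library. The function names (build_epub, xhtml_page), the single-chapter layout and the OEBPS folder name are all assumptions made for illustration. The output has not been run through epubcheck, so treat it as a starting point rather than a finished tool.

```python
import uuid
import zipfile
from datetime import datetime, timezone
from html import escape

# Tells the reading system where the package document (OPF) lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def xhtml_page(title, body):
    """Wrap an already-escaped body fragment in a bare XHTML document."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml" '
        'xmlns:epub="http://www.idpf.org/2007/ops">\n'
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body>\n</html>\n"
    )


def build_epub(path, title, paragraphs, language="en"):
    """Write a one-chapter EPUB 3 file (hypothetical helper, not a library API)."""
    book_id = f"urn:uuid:{uuid.uuid4()}"
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    safe_title = escape(title)

    opf = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">{book_id}</dc:identifier>
    <dc:title>{safe_title}</dc:title>
    <dc:language>{language}</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="text" href="text.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="text"/>
  </spine>
</package>
"""
    # Each paragraph becomes one <p>; the text is escaped so "&" and "<" stay valid XML.
    body = f"<h1>{safe_title}</h1>\n" + "\n".join(
        f"<p>{escape(p)}</p>" for p in paragraphs
    )
    nav = (
        '<nav epub:type="toc"><h1>Contents</h1><ol>'
        f'<li><a href="text.xhtml">{safe_title}</a></li></ol></nav>'
    )

    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry has to come first and be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", opf)
        zf.writestr("OEBPS/nav.xhtml", xhtml_page(title, nav))
        zf.writestr("OEBPS/text.xhtml", xhtml_page(title, body))
```

Calling build_epub("book.epub", "My Book", paragraphs) with a list of paragraph strings writes the file. Working out where the paragraphs start and end is the hard part, and this sketch does nothing to solve it.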
01-27-2024, 12:09 PM | #2 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Without some indicator or predictor of paragraph end points, you are pretty much out of luck. You might be able to infer paragraph end points from longer "pauses" if timings are included.
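A rough sketch of that pause idea, assuming an .srt file with standard timing lines is available. The names parse_srt and paragraphs_from_pauses and the 1.2 second threshold are made up for illustration; the threshold in particular is a guess that would need tuning per narrator. A paragraph break is only suggested when a long pause follows text that ends in sentence punctuation.

```python
import re

# Matches an SRT timing line such as "00:01:02,500 --> 00:01:04,000".
TIMING = re.compile(
    r"(\d+):(\d\d):(\d\d)[,.](\d{3})\s*-->\s*(\d+):(\d\d):(\d\d)[,.](\d{3})"
)


def to_seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text):
    """Return a list of (start_seconds, end_seconds, text) tuples."""
    cues = []
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                parts = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                if text:
                    cues.append(
                        (to_seconds(*parts[:4]), to_seconds(*parts[4:]), text)
                    )
                break
    return cues


def paragraphs_from_pauses(cues, min_pause=1.2):
    """Start a new paragraph when a long silence follows a finished sentence.

    min_pause is an assumed threshold in seconds, not a known good value.
    """
    paragraphs, current, previous_end = [], [], None
    for start, end, text in cues:
        long_pause = previous_end is not None and start - previous_end >= min_pause
        # Ignore trailing quote marks when checking for sentence-ending punctuation.
        ends_sentence = bool(current) and current[-1].rstrip("\"'”’").endswith(
            (".", "!", "?")
        )
        if long_pause and ends_sentence:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        previous_end = end
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Reading the file and calling paragraphs_from_pauses(parse_srt(text)) gives a first guess at paragraphs. Narrators also pause for effect mid-paragraph, so the result would still need a proofreading pass.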
|
01-27-2024, 12:11 PM | #3 |
Well trained by Cats
Posts: 29,809
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Oh this has so many possible issues and I am not even a medium Audio Book user. Or is this just a book from a lecture?
The only time I see it having a chance is if the original was a TTS version of the printed book. Many audiobooks are 'voice performances'. The "He said" or character name is now missing because the 'voice' is used to imply which character is speaking. Then there needs to be some sort of speech-pause indication, provided by the speech-to-text tool, for each paragraph. It would take a really smart app to mark different words for emphasis (bold or italic) or even punctuation. Help! Just peachy. |
01-27-2024, 12:22 PM | #4 |
Member
Posts: 17
Karma: 10
Join Date: Jan 2024
Device: none
|
Background story: my hearing is not that great, and the book was not available as a purchasable ebook, so my only option was to get the audiobook and transcribe it.
It's a narrated book: the person just reads out the text on the page with (most likely) proper tone and volume. The transcription is pristine: Whisper produced a huge text file plus .srt, .tsv and .vtt files with timestamps, without any formatting whatsoever. Yeah, going through it all by hand is not going to be easy, and since my hearing is not that great, making out how a word is said (emphasis, pauses, tone, etc.) would perhaps be an even bigger challenge. If you guys want a closer look, I can provide the files. Last edited by Pocok; 01-27-2024 at 12:27 PM. |
01-27-2024, 01:54 PM | #5 |
Wizard
Posts: 1,103
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Probably don't need to see the files that were created as most of us can imagine pretty closely how the output looks. There will be no software or secret method to fix that. It will be a case of fixing the book as you read it which would be pretty difficult in this case without an original source to refer to.
I would be interested in knowing which book is available as an audiobook but not as an ebook. |
01-27-2024, 02:12 PM | #6 | |
Member
Posts: 17
Karma: 10
Join Date: Jan 2024
Device: none
|
Quote:
This. Also there are a couple others as well like this (Marvel stuff) |
|
01-27-2024, 02:34 PM | #7 | |
Wizard
Posts: 1,103
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
I had a quick hunt around, and it does not seem available anywhere except for some overpriced print editions. |
|
01-27-2024, 02:37 PM | #8 | |
Member
Posts: 17
Karma: 10
Join Date: Jan 2024
Device: none
|
Quote:
Have a look, there is a TON of material; unfortunately, the majority are long-OOP books. |
|
01-27-2024, 03:32 PM | #9 | |
Wizard
Posts: 1,103
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
I saw archive.org had quite a few Marvel books listed, but not your particular one. My local second-hand bookshop is reopening tomorrow after the Christmas break, and I was going to head there for a browse in the next couple of days. I'll check if that book is on the shelves. |
|
01-27-2024, 06:56 PM | #10 | |
Member
Posts: 17
Karma: 10
Join Date: Jan 2024
Device: none
|
Quote:
Also, if you see some other Marvel goodies and they are cheap, perhaps get those as well. I know of at least one more (Fantastic Four: Doomgate) that has no ebook variant, but others might lurk around too. |
|
01-29-2024, 12:54 AM | #11 | |
Wizard
Posts: 1,103
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
Did find a few Michael Crichton books (Micro, State of Fear and The Terminal Man) and a James Patterson (The Jester), for a total of $15, so not a wasted trip for me. |
|
01-30-2024, 05:31 AM | #12 |
Member
Posts: 17
Karma: 10
Join Date: Jan 2024
Device: none
|
Thank you for looking!
|
02-19-2024, 09:13 PM | #13 |
Enthusiast
Posts: 39
Karma: 10
Join Date: Jul 2023
Device: none
|
You could try ChatGPT with careful thought as to the prompt. You would probably also need to split the text into blocks of a few thousand words.
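A small sketch of the splitting step, assuming the transcript is one long string with ordinary sentence punctuation. The function name split_into_blocks and the default of 2000 words are placeholders, not recommendations. Blocks are cut only between sentences, so a paragraph-guessing prompt never sees half a sentence.

```python
import re


def split_into_blocks(text, max_words=2000):
    """Split text into blocks of at most max_words words.

    Cuts happen only after ., ! or ? followed by whitespace. A single
    sentence longer than max_words is kept whole as its own block.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    blocks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n = len(sentence.split())
        # Close the current block before it would exceed the word budget.
        if current and count + n > max_words:
            blocks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        blocks.append(" ".join(current))
    return blocks
```

Each block can then be sent with the same prompt and the replies joined in order. It is worth comparing word counts before and after, since a model may silently drop or reword text.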
|