Looking for a clean chapter-splitting workflow (for custom audiobook creation)
Hello everyone,
I’m working on a personal project: I’d like to create my own audiobooks from DRM-free ePubs, with content I choose, using text-to-speech (Amazon Polly long-form).
I’m a developer, so I’m comfortable with a somewhat clunky workflow that mixes Python scripts and Calibre plugins. I understand this is a messy problem, but here’s what I’ve tried and planned so far:
• I experimented with the EpubSplit plugin, but got frustrated that it “pollutes” my library with all the split parts.
• My goal was to combine EpubSplit with a CLI utility I found on GitHub, epub2txt2, which works nicely once chapters are properly separated.
• The idea: split the ePub → run epub2txt2 on each split ePub → get one clean text file per chapter.
• The next step would be cleaning each chapter of footnotes and other artifacts, which I’m planning to handle with AI.
• Finally, I’ll feed those cleaned chapter texts into Amazon Polly to generate high-quality long-form audio.
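For that last step, I expect I may need to break each chapter into smaller pieces if the service caps the size of a single request (something I still have to verify in the Polly documentation). Here is a minimal sketch of what I have in mind; `chunk_text` and `max_chars` are my own placeholder names, and the limit value you pass in is an assumption, not a documented quota:

```python
import re


def chunk_text(text, max_chars):
    """Split text into chunks of at most max_chars characters.

    Breaks at sentence boundaries where possible; a single sentence longer
    than max_chars is cut hard at the limit.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one chunk by itself.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be sent to the TTS service as its own request, and the resulting audio files concatenated per chapter.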
So my questions:
• Is there a way to use EpubSplit (or another approach) without cluttering the main Calibre library with all the split sub-books?
• Has anyone built or seen a plugin/workflow specifically for “chapter-per-chapter export” to external files?
• Or is the recommended way to script this entirely outside Calibre, by parsing the ePub directly?
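On that last option, this is roughly what I imagine a Calibre-free version would look like, using only the Python standard library. It is a sketch under some assumptions: it treats each spine item as one "chapter" (many books don't map that cleanly), it assumes the content files are UTF-8, and `iter_spine_texts` and `html_to_text` are names I made up for illustration:

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from html.parser import HTMLParser
from urllib.parse import unquote

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <head>, <script> and <style>."""

    SKIP = {"head", "script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")  # block elements start a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(markup):
    """Strip tags and collapse whitespace, one block element per line."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    lines = (" ".join(ln.split()) for ln in "".join(parser.parts).splitlines())
    return "\n".join(ln for ln in lines if ln)


def iter_spine_texts(epub):
    """Yield (href, text) for each spine item, in reading order."""
    with zipfile.ZipFile(epub) as zf:
        # container.xml points at the OPF package document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).attrib["full-path"]
        opf_dir = posixpath.dirname(opf_path)
        package = ET.fromstring(zf.read(opf_path))
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in package.findall("opf:manifest/opf:item", NS)
        }
        # The spine lists manifest ids in reading order.
        for itemref in package.findall("opf:spine/opf:itemref", NS):
            href = manifest[itemref.attrib["idref"]]
            member = posixpath.normpath(posixpath.join(opf_dir, unquote(href)))
            # Assumption: content documents are UTF-8 encoded.
            yield href, html_to_text(zf.read(member).decode("utf-8"))
```

Writing one file per chapter would then be a loop over `iter_spine_texts("book.epub")` that saves each text under a numbered name such as `chapter_001.txt`. Footnote handling is not covered here; the text comes out with whatever note markers the book contains.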
Thanks in advance for any guidance!