MobileRead Forums - Looking for clean chapter splitting workflow (for custom audiobook creation)

bernardsirius · 08-20-2025, 02:31 PM

Hello everyone,

I’m working on a personal project: I’d like to create my own audiobooks from DRM-free ePubs, with content I choose, using text-to-speech (Amazon Polly long-form).

I’m a developer, so I’m comfortable with a somewhat clunky workflow that mixes Python scripts and Calibre plugins. I understand this is a messy problem — but here’s what I’ve tried so far:
• I experimented with the EpubSplit plugin, but got frustrated that it “pollutes” my library with all the split parts.
• My goal was to combine EpubSplit with a CLI utility I found on GitHub, epub2txt2, which works nicely once chapters are properly separated.
• The idea: split the ePub → run epub2txt2 on each split ePub → get one clean text file per chapter.
• The next step would be cleaning each chapter of footnotes and other artifacts, which I’m planning to handle with AI.
• Finally, I’ll feed those cleaned chapter texts into Amazon Polly to generate high-quality long-form audio.
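To make the cleanup step more concrete: before handing anything to an AI model, a cheap deterministic pre-pass could strip the most obvious footnote markers. The sketch below is only an illustration. The marker shapes it looks for (bracketed numbers like "[12]", superscript digits, and bare digits glued to sentence-ending punctuation) are my assumptions about what extracted text tends to contain, and the function name `strip_footnote_markers` is made up for this example. It will miss some markers and may occasionally remove a legitimate number, so the output still needs a review pass.

```python
import re

# Assumed marker shapes; real books vary, so treat these as starting points.
BRACKETED = re.compile(r"\s*\[(?:\d{1,3}|[a-z]|\*)\]")          # "[12]", "[a]", "[*]"
SUPERSCRIPT = re.compile(r"[\u00b9\u00b2\u00b3\u2070\u2074-\u2079]+")  # superscript digits
# 1-3 digits directly after "letter + punctuation", e.g. the "14" in "behind.14 Then"
TRAILING_DIGITS = re.compile(r"(?<=[A-Za-z][.,;:!?])\d{1,3}(?=\s|$)")


def strip_footnote_markers(text: str) -> str:
    """Remove common inline footnote markers (heuristic, not exhaustive)."""
    text = BRACKETED.sub("", text)
    text = SUPERSCRIPT.sub("", text)
    text = TRAILING_DIGITS.sub("", text)
    # Collapse runs of spaces/tabs left behind by the removals.
    return re.sub(r"[ \t]{2,}", " ", text)


if __name__ == "__main__":
    print(strip_footnote_markers("He left.[3] She stayed behind.14 Then it rained."))
```

Years and other standalone numbers are left alone because the last pattern only fires when digits directly follow a letter plus punctuation.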

So my questions:
• Is there a way to use EpubSplit (or another approach) without cluttering the main Calibre library with all the split sub-books?
• Has anyone built or seen a plugin/workflow specifically for “chapter-per-chapter export” to external files?
• Or is the recommended way to script this entirely outside Calibre, by parsing the ePub directly?
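For reference, here is roughly what I imagine the "parse the ePub directly" route would look like, using only the Python standard library. It reads `META-INF/container.xml` to find the package (OPF) file, walks the spine in reading order, and writes one text file per spine document. Big caveat: this assumes one spine document equals one chapter, which is not guaranteed (a file can hold several chapters, or a chapter can span several files). The function names `spine_documents` and `export_chapters` and the output naming scheme are made up for this sketch.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <head>, <script> and <style>."""

    SKIP = {"head", "script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")  # keep block boundaries as line breaks

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def spine_documents(epub_path):
    """Yield (href, raw_bytes) for each spine item, in reading order."""
    with zipfile.ZipFile(epub_path) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).attrib["full-path"]
        opf = ET.fromstring(zf.read(opf_path))
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in opf.findall("opf:manifest/opf:item", NS)
        }
        base = posixpath.dirname(opf_path)  # hrefs are relative to the OPF file
        for itemref in opf.findall("opf:spine/opf:itemref", NS):
            href = manifest[itemref.attrib["idref"]]
            member = posixpath.normpath(posixpath.join(base, unquote(href)))
            yield href, zf.read(member)


def export_chapters(epub_path, out_dir):
    """Write one .txt file per non-empty spine document; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, (href, raw) in enumerate(spine_documents(epub_path), start=1):
        parser = _TextExtractor()
        parser.feed(raw.decode("utf-8", errors="replace"))
        parser.close()
        lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
        text = "\n".join(line for line in lines if line)
        if not text:
            continue  # e.g. an image-only cover page
        target = out / f"{i:03d}_{Path(href).stem}.txt"
        target.write_text(text + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Calling `export_chapters("book.epub", "chapters/")` would leave the Calibre library untouched, since nothing is imported back into it. Splitting on the table of contents instead of the spine would be more accurate for books that pack several chapters into one file, but it is also more work.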

Thanks in advance for any guidance!
