Old 09-17-2015, 04:43 PM   #1
SBT
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
EPUB programming API?

For those of us who believe GUIs are just a passing fad:
Does anybody know of a good (scripting) language toolset for creating and manipulating EPUBs? Something that handles things like NCX, OPF, zipping, etc.? I found gepub for Ruby, which seems fairly powerful.
Old 09-17-2015, 05:04 PM   #2
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Haven't really thought about it, to be honest.

But I imagine you could do quite a bit using calibre's APIs from a custom calibre-debug script. Of course it only supports EPUB2.

Scripted creation of EPUBs seems a bit counterintuitive, since at its core an EPUB GUI like Sigil or calibre's ebook-edit is an API for automatically resolving name changes, wrapped around a plaintext editor and a ZIP compressor.

And it doesn't get much more basic than a plaintext editor.
Old 09-18-2015, 07:17 AM   #3
SBT
Fanatic
Quote:
Originally Posted by eschwartz View Post
But I imagine you could do quite a bit using calibre's APIs from a custom calibre-debug script. Of course it only supports EPUB2.
Haven't looked at that, I admit. Though a great admirer of the sheer power of Calibre, I'm not particularly fond of what it does to the epub code.
Quote:
Scripted creation of EPUBs seems a bit counterintuitive, since at its core an EPUB GUI like Sigil or calibre's ebook-edit is an API for automatically resolving name changes, wrapped around a plaintext editor and a ZIP compressor.

And it doesn't get much more basic than a plaintext editor.
Well, I actually use UNIX sed quite a lot.
I repeat my belief in GUIs being just a passing fad. More seriously, I'm very fond of something more batch-like. My workflow currently looks something like this:
  • OCR
  • grep for obvious OCR errors
  • Insert single-character "home-made" tags for headings, images, footnotes, etc.
  • Filter the text through scripts to join words split across lines, handle footnotes, images, etc., and convert to XHTML.
  • Automatically split text into chapters & generate NCX, OPF and stuff.
  • Edit stylesheet.
  • Zip up & proofread.
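The "join words split over lines" step could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the OCR output marks a split word with a hyphen directly before the line break, and the helper name `join_split_words` is made up here, not taken from any existing script.

```python
import re

# Hypothetical helper: rejoin words that OCR split across a line break.
# Assumption: a split is marked by a hyphen directly before the newline.
_SPLIT_WORD = re.compile(r"(\w)-\n[ \t]*(\w)")


def join_split_words(text: str) -> str:
    """Turn 'exam-\\nple' into 'example'; other line breaks are left alone."""
    return _SPLIT_WORD.sub(r"\1\2", text)


if __name__ == "__main__":
    sample = "An exam-\nple of OCR out-\n  put.\nNext line."
    print(join_split_words(sample))
```

A genuine compound that happens to break at its hyphen would be joined wrongly by this rule, so the proofreading pass at the end of the list is still needed.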
If I feel the need for something closer to WYSIWYG editing, I have the relevant xhtml file(s) open in a browser, edit them in an editor, and refresh the browser as required.
Currently I have a shell script to do the job, but it's a bit clunky, and generally I find that any good ideas I come up with regarding software can already be found on GitHub.
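For the "zip up" step, a scripted version might look like the sketch below. It assumes an already unpacked EPUB directory containing `mimetype`, `META-INF/container.xml` and the content files, and that the output file lives outside that directory; the function name `zip_epub` is hypothetical. The one EPUB-specific detail is that the `mimetype` entry goes first in the archive and is stored uncompressed.

```python
import zipfile
from pathlib import Path


def zip_epub(source_dir: str, epub_path: str) -> None:
    """Pack an unpacked EPUB directory into an .epub archive.

    Assumes source_dir holds 'mimetype' at its top level and that
    epub_path is NOT inside source_dir.
    """
    root = Path(source_dir)
    mimetype = root / "mimetype"
    with zipfile.ZipFile(epub_path, "w") as zf:
        # 'mimetype' must be the first entry and must not be compressed.
        zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(root.rglob("*")):
            if path.is_file() and path != mimetype:
                zf.write(path, path.relative_to(root).as_posix(),
                         compress_type=zipfile.ZIP_DEFLATED)
```

Called as `zip_epub("book_dir", "book.epub")`, where both names are placeholders, it replaces the zipping part of a shell script without changing anything inside the book.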

Apart from being old-fashioned to an almost bloody-minded degree, I do find there are definite advantages to this methodology. First and foremost, licking an OCR text into shape with the least effort requires access to an astounding variety of command-line tools, which are harder to reach from even a sensibly designed GUI system. Secondly, it makes the separation of content and presentation very natural. Thirdly, it'll always beat a GUI tool for flexibility, for example when editing several volumes at once. My chief gripe against GUI systems is that they make tasks easy to learn, but not easy to do.
Old 09-18-2015, 11:40 AM   #4
eschwartz
Ex-Helpdesk Junkie
Three cheers for sed!

As far as calibre goes, the editor APIs, which are what I was thinking of, don't do anything to your code. Conversion is what mangles your code.

You can open up an ebook container, add files as-is (calibre will then take care of managing the OPF), rewrite all your scripts to use Python or use subprocess.check_output() in convoluted ways, build the ToC through XPath, etc.
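To give a feel for "build the ToC through XPath", here is a small sketch that uses only the Python standard library rather than calibre's own modules, so none of it should be read as calibre's API. It assumes well-formed XHTML without named entities, that chapter headings are `h1` elements carrying an `id`, and the function name `toc_entries` is invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace prefix map used in the XPath expression below.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def toc_entries(xhtml: str, href: str):
    """Return (title, target) pairs for every h1 that carries an id.

    Assumes well-formed XHTML; 'href' is the file's path inside the book.
    """
    root = ET.fromstring(xhtml)
    entries = []
    # ElementTree supports a limited XPath subset, enough for this query.
    for heading in root.findall(".//x:h1[@id]", XHTML_NS):
        title = "".join(heading.itertext()).strip()
        entries.append((title, f"{href}#{heading.get('id')}"))
    return entries


if __name__ == "__main__":
    page = (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        '<h1 id="ch1">Chapter <em>One</em></h1><p>text</p>'
        '<h1>No id</h1><h1 id="ch2">Chapter Two</h1>'
        "</body></html>"
    )
    for title, target in toc_entries(page, "ch01.xhtml"):
        print(title, "->", target)
```

The resulting pairs are the raw material for NCX navPoints or an EPUB3 nav document; headings without an `id` are skipped here because there would be nothing to link to.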


I'm not saying this is the easiest way for you -- you have to learn Python first (if you don't know it), and migrate a bunch of stuff that works already, but it is certainly *possible*.

Which is all you asked: do such scripting tools exist?
Answer: calibre is one such tool, being possessed of extensive Python modules for manipulating ebooks.


And just for the perversity: You can turn all that into an editor plugin that can be used from within the ebook-edit GUI.
Then, what is the difference between the GUI and the command line?
Other than processing many books in one pass. (Answer: calibre plugins can have CLI modes. Rarely do their makers create one, but hey, it's possible.)

You could make a fair argument that the GUI is not just a passing fad.
But then, I could also poke fun at you for using a GUI to write the preceding post, if I was that determined to poke fun at you...


As a fellow believer in flexible automation, can I tempt you with Vimperator or VimFx?
Old 09-18-2015, 05:57 PM   #5
PeterT
Grand Sorcerer
Posts: 13,291
Karma: 78876004
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
Is https://pypi.python.org/pypi/EbookLib/0.15 of interest?
Old 09-19-2015, 05:56 AM   #6
SBT
Fanatic
Quote:
Originally Posted by eschwartz View Post
But then, I could also poke fun at you for using a GUI to write the preceding post, if I was that determined to poke fun at you...
Grmf. You knew that would make me waste my precious time trying to answer your post using telnet, didn't you?
Thanks for putting me straight on the Calibre API. Wow, they even have documentation!

@PeterT: Thanks for the tip. It is definitely of interest.

The problem shared by all the EPUB libraries/APIs I have found is that it's just too much fun making your own... However, it's definitely useful to peek at other solutions; ideally, my own stuff should do the job better than what is already available.
Old 09-19-2015, 11:31 PM   #7
eschwartz
Ex-Helpdesk Junkie
Quote:
Originally Posted by SBT View Post
Grmf. You knew that would make me waste my precious time trying to answer your post using telnet, didn't you?
Thanks for putting me straight on the Calibre API. Wow, they even have documentation!
Or cURL.

Guilty as charged.
Old 10-11-2015, 05:42 AM   #8
skreutzer
Software Developer
Posts: 190
Karma: 89000
Join Date: Jan 2014
Location: Germany
Device: PocketBook Touch Lux 3
If the manipulations you have in mind are of general usefulness and not specific to your projects, I might be interested in implementing them for my toolchain. However, it isn't very flexible (it isn't written in a scripting language, though ports and new parts could be), and manipulation is mostly done with XML libraries rather than search-and-replace or regexes, although some primitive text parsers are part of the package too.

The “architecture” has three levels:
  • Standalone CLI tools
  • CLI workflows which group tools together for automation
  • Optional GUIs for tools and workflows
Old 10-11-2015, 05:23 PM   #9
SBT
Fanatic
I'll keep you posted, skreutzer.

I am trying to code something that'll make everything work always...

I'm thinking in terms of an OO programming project, with the core pretty similar to the libraries out there already: set basic metadata, add files, spine entries and TOC entries, and methods for outputting EPUB2/3 & KF8.
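A core of that kind could be sketched in Python along these lines. Everything named here (`Book`, `add_file`, `to_opf`, the `bookid` and `ncx` ids) is an illustrative assumption rather than an existing library's API, and only the EPUB 2 package document is covered; TOC entries and KF8 output are left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("", OPF_NS)   # serialise OPF elements without a prefix
ET.register_namespace("dc", DC_NS)


@dataclass
class Book:
    """Hypothetical core object: basic metadata, manifest and spine."""
    title: str
    language: str
    identifier: str
    manifest: list = field(default_factory=list)  # (id, href, media type)
    spine: list = field(default_factory=list)     # manifest ids, reading order

    def add_file(self, item_id, href, media_type, in_spine=False):
        self.manifest.append((item_id, href, media_type))
        if in_spine:
            self.spine.append(item_id)

    def to_opf(self) -> str:
        """Serialise the book as an EPUB 2 package document."""
        pkg = ET.Element(f"{{{OPF_NS}}}package",
                         {"version": "2.0", "unique-identifier": "bookid"})
        meta = ET.SubElement(pkg, f"{{{OPF_NS}}}metadata")
        ET.SubElement(meta, f"{{{DC_NS}}}title").text = self.title
        ET.SubElement(meta, f"{{{DC_NS}}}language").text = self.language
        ident = ET.SubElement(meta, f"{{{DC_NS}}}identifier", {"id": "bookid"})
        ident.text = self.identifier
        manifest = ET.SubElement(pkg, f"{{{OPF_NS}}}manifest")
        for item_id, href, media_type in self.manifest:
            ET.SubElement(manifest, f"{{{OPF_NS}}}item",
                          {"id": item_id, "href": href, "media-type": media_type})
        # Assumption: the NCX was added to the manifest with the id "ncx".
        spine = ET.SubElement(pkg, f"{{{OPF_NS}}}spine", {"toc": "ncx"})
        for item_id in self.spine:
            ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})
        return ET.tostring(pkg, encoding="unicode")
```

Keeping the manifest and spine as plain lists means a script can reorder or filter them before calling `to_opf()`, which is the kind of flexibility a batch workflow tends to need.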

However, instead of having methods for adding files, OPF & NCX entries, I thought I'd use delegates. Then you could have plain vanilla delegates for your plain vanilla ebook, and when you're making a book with a really contorted TOC, you just make a contorted_TOC child class from your vanilla_TOC class. Or an entirely new class, for that matter. With the existing tools, you end up contorting the NCX more or less after the fact, and that often turns ugly.
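The delegate idea might take a shape like the following Python sketch. The class names echo the vanilla_TOC / contorted_TOC naming above, but the `build` method, the `DelegatingBook` class and the "titles starting with 'Part ' are top level" rule are all invented for illustration.

```python
class VanillaToc:
    """Default delegate: one flat TOC entry per chapter, in reading order."""

    def build(self, chapters):
        return [(1, title, href) for title, href in chapters]


class ContortedToc(VanillaToc):
    """Example override: titles starting with 'Part ' stay at depth 1,
    everything else is nested one level below."""

    def build(self, chapters):
        entries = []
        for title, href in chapters:
            depth = 1 if title.startswith("Part ") else 2
            entries.append((depth, title, href))
        return entries


class DelegatingBook:
    """Hypothetical book object that hands TOC construction to a delegate."""

    def __init__(self, toc=None):
        self.chapters = []
        self.toc = toc or VanillaToc()

    def add_chapter(self, title, href):
        self.chapters.append((title, href))

    def toc_entries(self):
        # The book never builds the TOC itself; it asks the delegate.
        return self.toc.build(self.chapters)
```

Swapping `DelegatingBook()` for `DelegatingBook(ContortedToc())` changes the TOC structure without touching the book class, so the odd cases live in their own classes instead of in after-the-fact NCX edits.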

My problem with all toolchains I've used so far is that I end up changing pretty basic functionality all the time to deal with strangely structured ebooks and/or strange source material. I want to make an architecture that's sufficiently flexible to lessen the chance of that happening, and that'll ensure I can reuse as much code as possible.
Old 10-12-2015, 05:03 AM   #10
skreutzer
Software Developer
I understand what you're looking for, and that the scope of my project is a little different and won't fit your requirements. The difference is that I have to build tools which are usable for non-programmers, but if you're able to write your own lambdas in order to customize a more general library to meet your specific project needs, then you indeed need something which provides such hooks. My “hooks” are that you could use your own custom XSLTs, or that you could add more tools to an existing workflow which perform certain tasks, but basically that's it. I could introduce lambdas or at least dependency injection, but I don't think you would be very pleased with Java, a byte-compiled language, for such customization, would you? ;-)
Old 10-13-2015, 03:50 AM   #11
Doitsu
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by SBT View Post
My problem with all toolchains I've used so far is that I end up changing pretty basic functionality all the time to deal with strangely structured ebooks and/or strange source material. I want to make an architecture that's sufficiently flexible to lessen the chance of that happening, and that'll ensure I can reuse as much code as possible.
I know that you're a hard-core terminal user, but you may want to give Sigil and the bundled Python plugin framework (doc link) another try.
Why reinvent the wheel when pretty much all the routines that you need for manipulating/creating .ncx, .opf and .xhtml files are already there?

Theoretically, they'd allow you to assemble a fully working epub2 or epub3 book without using a single Sigil menu option other than the Plugin Runner.

I.e., you could use Sigil to launch your scripts and preview the generated ePubs.

And if you find the provided well-documented Python routines too limited, you could easily extend them with Python scripts or shell scripts.
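As a rough idea of what such a plugin script looks like, here is a sketch of the `plugin.py` entry point for a Sigil edit plugin. It relies on the container methods `text_iter()`, `readfile()` and `writefile()` from the plugin framework; the cleanup itself (replacing two OCR ligature characters) and the `LIGATURES` table are made-up examples, and the accompanying plugin.xml manifest is omitted.

```python
# Sketch of a Sigil edit plugin entry point (plugin.py).
# Hypothetical cleanup: replace ligature characters left behind by OCR.
LIGATURES = {"\ufb01": "fi", "\ufb02": "fl"}


def run(bk):
    """Called by Sigil with the book container; return 0 on success."""
    changed = 0
    # text_iter() yields (manifest id, href) for each XHTML file.
    for manifest_id, href in bk.text_iter():
        html = bk.readfile(manifest_id)
        fixed = html
        for ligature, plain in LIGATURES.items():
            fixed = fixed.replace(ligature, plain)
        if fixed != html:
            bk.writefile(manifest_id, fixed)
            changed += 1
    print(f"Updated {changed} file(s)")
    return 0
```

Because `run` only talks to the container object it is handed, the same function can be exercised outside Sigil with a small stand-in object that provides those three methods.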

Last edited by Doitsu; 10-13-2015 at 03:54 AM.
Old 10-13-2015, 08:10 AM   #12
SBT
Fanatic
Quote:
Originally Posted by Doitsu View Post
I know that you're a hard-core terminal user, but you may want to give Sigil and the bundled Python plugin framework (doc link) another try.
Why reinvent the wheel when pretty much all the routines that you need for manipulating/creating .ncx, .opf and .xhtml files are already there?
...
And if you find the provided well-documented Python routines too limited, you could easily extend them with Python scripts or shell scripts.
Will do, Doitsu. Although making up your own code is more fun than looking at somebody else's... After all, when somebody nods when you're explaining your code to him, wake him up.

(And what's wrong with reinventing the wheel? Isn't it time somebody looked at it afresh to see if it can't have a better shape?)
All times are GMT -4.