Old 09-12-2009, 12:41 AM   #1
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Prince XML for creating mobile reader-sized PDFs?

I've started to experiment with the cross-platform (Windows, Mac, Linux, FreeBSD, etc.) command-line tool Prince XML for creating liseuse-sized PDFs from HTML or XML source.

It's free to download and run for personal use, though it does add a small watermark on the first page.

My road there has been the following. Those of you who followed the "PDF is not an ebook format" thread know that some of us who do still think PDF is an ebook format have been disappointed with the typography quality delivered by the renderers for most standard reflowable formats like ePub and mobi. In particular these tend not to support such things as end-of-line hyphenation, ligatures, font kerning, widow and orphan control, embedding font subsets, etc.

But at the same time we all must admit that the ability to customize/change the font size and page properties is a desirable feature.

This led us to discuss what options there might be for the best of both worlds. At least presently, it seems that the best looking ebooks are those generated by something like (pdf)LaTeX, made especially for the size of the device in question, and the font size preferred by the user. Some of us are still looking into the possibility of automating the process of generating appropriately-sized PDFs from LaTeX code, as in this thread.

One stumbling block is that LaTeX uses its own mark-up language, whereas most other ebook formats are HTML or XML based, and while conversion is certainly possible, it's unclear if any converters right now work well enough that the resulting code wouldn't have to be manually checked and corrected.

In researching the use of LaTeX for creating ebooks, I discovered that Feedbooks used to do something like this with LaTeX, but has switched to Prince XML -- so I decided to experiment with it myself.

Some interesting features:
  • Support for end-of-line hyphenation (using the same algorithm as TeX, with the possibility of specifying a custom pattern list)
  • Font kerning
  • The possibility of incorporating floats and footnotes, etc.
  • Limited support for MathML
  • The possibility of specifying whatever page-size you want, and whatever font/font size you want, to embed, etc.
  • Better/fuller support for CSS than any ePub renderer (perhaps even theoretically perfect up-to-spec ePub).

The results are interesting, so far. I don't think the results are as good as LaTeX, but it may just be that I'm less familiar with it. Still the possibility of more easily incorporating it within a conversion script--for example, from ePub--(their website even gives instructions on how to call it from within various programming/scripting languages), makes exploration of it worthwhile, in my opinion.

I'd be very interested in hearing about anyone else's experience with it, or opinion about its prospects.

My own experiments are just beginning, but I'll post some initial results in the next post.

Last edited by frabjous; 09-12-2009 at 01:35 AM.
Old 09-12-2009, 01:06 AM   #2
frabjous
Wizard
All right, just to give some initial results... I've been working on making Bertrand Russell's Introduction to Mathematical Philosophy (a public domain title) available in several different formats. To this end, I first generated HTML source, which I used as a master document for creating .mobi and .ePub files.

I posted these screenshots in the "PDF is not..." thread, in this post (which also had some additional screenshots), but to repeat:

Here's what my first attempt at a mobi looked like as a screenshot on a Kindle:



Things to note: it looks like crap generally, and the overlining (which you'll see below) doesn't even work since mobi doesn't support it. (I've since had to modify the original notation there... I hate mobi files.)

ePub is an improvement, since you can at least overline. Here's a screenshot from ADE from the ePub version:



This looks better than on my Sony, where I can't get full justification, but it still looks pretty bad. The variables are not in true italics, but a slanted roman, and hence run into the Sheffer strokes next to them. The Sheffer strokes are not properly spaced.

I took my HTML source and converted it to TeX, and can now make various sized PDFs from that source. Here's what it now looks like if I use 12pt Bitstream Charter for the font and size it for my reader:



Very nice: hyphenation, kerning, proper mathematical spacing, and a number of other improvements that are hard to list in full.

Nevertheless, converting it to LaTeX was a fair amount of work that I couldn't have fully scripted.

However, with Prince XML, I could have gotten pretty decent results just by sticking in a few things in the CSS of the HTML, in particular, adding just:

Code:
@page { size: 90mm 120mm; margin: 2mm 2mm 2mm 2mm; }
body
{
    hyphens: auto;
    font-family: "Charis SIL";
}
(I chose Charis SIL since it's based on Bitstream Charter, for ease of comparison.)

The result after running Prince, for the same page of the above book, looks like this:



Here's one page earlier, so you can see what a page of just text looks like:



This isn't as good as the LaTeX, I'll admit, but I haven't really put 1/100th as much work into it. I could probably do the Sheffer stroke spacing better with some MathML, and with the right code I might even be able to put the original pagination in the margins, as in the LaTeX versions (it has a lot of interesting options). The line spacing gap created by the footnote marker is definitely unsightly, but again, maybe with some tweaked CSS this could be fixed.

Still, it's much better than the original ePub (at least as displayed by ADE or on my Sony), and infinitely better than the .mobi. We've got hyphenation, kerning, a nicer looking font, true italics, justification that will work even on my Sony, etc.

Changing the font or font size would just be a matter of making one minor change to the CSS before running Prince. Indeed, some of this would be easier to automate than with LaTeX.
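To make that concrete, here is a minimal sketch of how the stylesheet could be generated from a few parameters before each run. The function name page_css and the file names reader.css, book.html and book.pdf are made up for illustration; only the -s and -o options and the CSS properties come from the examples in this thread.

```shell
# Hypothetical helper (not part of Prince): print a user stylesheet for a
# given page width, page height, font family and font size.
page_css() (
  width=$1; height=$2; font=$3; size=$4
  cat <<EOF
@page { size: $width $height; margin: 2mm; }
body {
  hyphens: auto;
  font-family: "$font";
  font-size: $size;
}
EOF
)

# Example use (assumes prince is installed and book.html exists):
#   page_css 90mm 120mm "Charis SIL" 12pt > reader.css
#   prince book.html -s reader.css -o book.pdf
```

Switching to another device or a larger font would then only mean calling page_css with different arguments and running Prince again.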

If we could get a script working with it to extract the (X)HTML from an ePub and convert it via Prince, I wonder if I'd ever use ePubs on my reader again... I may even begin working on such a script, perhaps even with a GUI, despite my very limited programming skills. Then again, I may not have enough free time to do anything of the sort.

Last edited by frabjous; 09-12-2009 at 01:23 AM.
Old 09-12-2009, 05:43 AM   #3
Jellby
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Prince XML looks quite interesting, especially if they're responsive in introducing new features (a pity it isn't open source).

Quote:
If we could get a script working with it to extract the (X)HTML from an ePub and convert it via Prince, I wonder if I'd ever use ePubs on my reader again... I may even begin working on such a script, perhaps even with a GUI, despite my very limited programming skills. Then again, I may not have enough free time to do anything of the sort.
This script might be a starting point.
Old 09-12-2009, 07:36 AM   #4
Valloric
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by Jellby View Post
(a pity it isn't open source)
A real pity indeed.
Old 09-12-2009, 10:09 AM   #5
ahi
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
My lack of faith in XML based solutions (or at least proprietary ones) comes primarily from my expectation that sooner or later (knowing me, rather sooner) I'll come across something they haven't thought of... and I'll either have to choose another tool, or try to hack it using the features/tools they already have available, which will probably (given unintended usage) look like crap.

Not to beat a dead horse... but Hungarian Runes written in boustrophedon *is* of genuine and ongoing relevance to me. So is Hanzi with good quality typography. So is weird stuff like being able to put arbitrary accents or subaccents on just about any character, whether Latin, Cyrillic, or even Hanzi.

These things are just off the top of my head... and I suspect all three of them make PrinceXML a non-starter for me.

Not to mention far simpler things that I suspect they have no proper support for:

Hungarian hyphenation. Including properly hyphenating doubled digraph consonants, and having some way to differentiate genuine doubled digraphs from a non-doubled digraph merely sitting beside the "wrong" single letter.

e.g.:

boccsal -> bocs-csal
bérccsoport -> bérc-cso-port
sasszárny -> sas-szárny
hosszú -> hosz-szú
et cetera
Or am I underestimating the tool?
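
As a rough illustration of why this is hard (a toy sketch in shell, not how Prince or TeX actually hyphenate), here is a naive rule that always expands a shortened doubled digraph at the break. The function name naive_split is made up, and the rule table only covers the eight two-letter digraphs.

```shell
# Naive rule: treat every "ccs", "ssz", ... as a shortened doubled digraph
# and expand it at the break (ccs -> cs-cs, ssz -> sz-sz, and so on).
naive_split() {
  printf '%s\n' "$1" | sed \
    -e 's/ccs/cs-cs/g' -e 's/ddz/dz-dz/g' -e 's/ggy/gy-gy/g' \
    -e 's/lly/ly-ly/g' -e 's/nny/ny-ny/g' -e 's/ssz/sz-sz/g' \
    -e 's/tty/ty-ty/g' -e 's/zzs/zs-zs/g'
}

# Try it on the four example words.
for w in boccsal hosszú bérccsoport sasszárny; do
  printf '%s -> %s\n' "$w" "$(naive_split "$w")"
done
```

The first two come out right (bocs-csal, hosz-szú), but the rule also produces bércs-csoport and sasz-szárny, where the correct breaks are bérc-cso-port and sas-szárny. Telling those cases apart needs knowledge of where the word's components meet, which a letter-level rule like this one doesn't have.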

- Ahi

Last edited by ahi; 09-12-2009 at 10:16 AM.
Old 09-12-2009, 10:56 PM   #6
frabjous
Wizard
No worries, Ahi... I haven't given up on contributing to a robust LaTeX for eBook automation script; I'm just exploring Prince as something that might fill a gap in the meantime, especially since it might not take much effort to get a pretty decent script for it together.

Jellby, thanks for recommending your ePub script; it did actually cross my mind. I'll take a closer look when I get a chance. Very busy with other things right now, unfortunately... which is tough when you're excited about "fun" projects like these.

Another interesting thing I noticed is Prince 7.0beta automatically uses ligatures. Neat.
Old 09-13-2009, 07:07 AM   #7
Jellby
frumious Bandersnatch
I've tried Prince XML and it's quite easy and powerful, at least at first sight. As a test, I've done the conversion of The Picture of Dorian Gray, and the result is attached.

All that was needed was this user css:

pdf_output.css
Code:
@page {
  size: 9cm 12cm;
  margin: 5mm 1mm 1mm 1mm;
  @top-left {
    border-bottom: solid 0.2pt #000;
    margin-bottom: 1mm;
    content: "";
  }
  @top-center {
    font-size: 60%;
    font-style: italic;
    border-bottom: solid 0.2pt #000;
    margin-bottom: 1mm;
    content: string(chaptitle);
  }
  @top-right {
    font-size: 50%;
    border-bottom: solid 0.2pt #000;
    margin-bottom: 1mm;
    content: counter(page) "/" counter(pages);
  }
}

@page:first {
  margin: 1mm 1mm 1mm 1mm;
  @top-left {
    border-width: 0;
    margin: 0;
    content: normal;
  }
  @top-center {
    border-width: 0;
    margin: 0;
    content: normal;
  }
  @top-right {
    border-width: 0;
    margin: 0;
    content: normal;
  }
}

@page title {
  margin: 1mm 1mm 1mm 1mm;
  @top-left {
    border-width: 0;
    margin: 0;
    content: normal;
  }
  @top-center {
    border-width: 0;
    margin: 0;
    content: normal;
  }
  @top-right {
    border-width: 0;
    margin: 0;
    content: normal;
  }
}

/* specific code for this image */
@page cover {
  size: 7.98cm 12cm;
  margin: 0 -2.01cm; /* make the virtual width 12cm */
}

body {
  font-size: 9.9pt;
  font-family: serif;
  text-align: justify;
  prince-image-resolution: 166dpi;
  hyphens: auto;
  prince-text-replace: " – " "—"       /*replace em-dashes*/
                       "st" "s\FEFFt"; /*disable st ligatures*/
}

body.cover {
  page: cover;
}

div.header {
  string-set: chaptitle content();
}

div.title, div.edition {
  page: title;
}
div.edition {
  float: bottom;
}

p.logo {
  display: none;
}

div.toc a {
  text-decoration: none;
}
div.toc a::after {
  content: leader('. ') target-counter(attr(href), page);
}

h1 {
  prince-bookmark-level: 1
}
(I also modified the main font.css file, to use FreeSans, and FreeSerif as default fonts for all conversions.)

Then, in the directory with the uncompressed ePUB files, I ran:

Code:
prince OEBPS/Cover.xhtml OEBPS/Title.xhtml OEBPS/Contents.xhtml OEBPS/Preface.xhtml OEBPS/Chapter-*.xhtml -s pdf_output.css -o test.pdf
The only modification in the .xhtml was adding class="cover" to the body of Cover.xhtml, which I'll probably be adding in future ePUBs. Note how I changed back, on the fly, from "space-endash-space" to "emdash". The "st" ligatures are probably a fancy of the FreeSerif font, which I didn't want here.

Now I have to separate the "universal" stuff from the settings and classes particular to this ebook or to my coding style. I'd like to place the latter in a separate CSS file in the ePUB, and maybe use some metadata container for it; then a converter could use this to automatically convert the ePUB to PDF...

By the way, the logo watermark on the first page is quite easy to remove if you output the PDF with --no-compress (the PDF can later be compressed with pdftk).
Attached Files
File Type: pdf test.pdf (1.22 MB, 1106 views)

Last edited by Jellby; 09-13-2009 at 04:42 PM.
Old 09-13-2009, 07:14 AM   #8
JSWolf
Resident Curmudgeon
Posts: 73,943
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by frabjous View Post
I've started to experiment with the cross-platform (Windows, Mac, Linux, FreeBSD, etc.) command-line tool Prince XML for creating liseuse-sized PDFs from HTML or XML source.
How do you create bright light (liseuse) sized PDF? In fact, how do you use a bright light as a computer?
Old 09-13-2009, 10:02 AM   #9
ahi
Wizard
Quote:
Originally Posted by JSWolf View Post
How do you create bright light (liseuse) sized PDF? In fact, how do you use a bright light as a computer?
By slaughtering a thousand ePubs and using their life-essence for washing.

- Ahi
Old 09-13-2009, 10:04 AM   #10
ahi
Wizard
Quote:
Originally Posted by frabjous View Post
No worries, Ahi... I haven't given up on contributing to a robust LaTeX for eBook automation script; I'm just exploring Prince as something that might fill a gap in the meantime, especially since it might not take much effort to get a pretty decent script for it together.
Oh, I'm not worried. I was just pointing out why, for me and for other users with broader than single language requirements, this sort of solution is less likely to be viable.

Obviously it has its benefits, and Feedbooks definitely shows its value very well.

- Ahi
Old 09-14-2009, 11:33 AM   #11
Jellby
frumious Bandersnatch
OK, here it is, the first version of epub2pdf, a bash script for converting ePUB books to PDF. Since it's a bash script, it needs bash (doh!). That means Linux users will have it easy, and for Mac OS users it will probably be similar, but Windows users will have to install Cygwin for the moment (although it should be easy to translate the script to Windows...).

I've tried it with some ePUBs of my own, as well as generated by Calibre, and uploaded by Zelda and Abecedary, and it seems to work great! These are the usage notes:

Code:
epub2pdf.sh [options] input.epub output.pdf

Where the options are:
  -s "style.css"  Use "style.css" as stylesheet instead of the default ~/.epub2pdf/default.css
  -v              Verbose output
  -h              Show this help
So you see, it's fairly easy. Place the included "default.css" file in ~/.epub2pdf and it will be used automatically for all conversions. You can also specify another CSS file (or modify default.css at will); it will be searched for in ~/.epub2pdf first, so you can keep different "profiles" there.
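
The lookup order could be sketched roughly like this. This is only an illustration, not the code in epub2pdf: the function name resolve_style is made up, and falling back to the path exactly as given is an assumption about what happens when no profile matches.

```shell
# Hypothetical lookup: prefer a profile stored in ~/.epub2pdf, otherwise
# fall back to the path as given on the command line (assumed behaviour).
resolve_style() (
  name=$1
  if [ -f "$HOME/.epub2pdf/$name" ]; then
    printf '%s\n' "$HOME/.epub2pdf/$name"
  elif [ -f "$name" ]; then
    printf '%s\n' "$name"
  else
    echo "stylesheet not found: $name" >&2
    exit 1   # only leaves the subshell, so the function returns 1
  fi
)

# Example: resolve_style sony505.css
# prints ~/.epub2pdf/sony505.css (expanded) if that profile exists.
```

With something like this, -s sony505.css and -s ./some/other.css can both work without the user typing the full profile path.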

I added a feature to use a book-specific stylesheet if found. This stylesheet should be included in the .epub and referenced thus:

1.- Include a .css with rules and selectors for Prince XML. These are not going to be used in the normal ePUB rendering, only when processing with Prince XML, so you can use everything supported by Prince XML (use !important to override the standard css rules).

2.- As with every file you include in the epub, there must be an entry in the <manifest> (in the .opf file).

3.- Add a <meta name="prince-style" content="XXXXX"> to the <metadata> block of the .opf file, where "XXXXX" is the id of the above .css file.

That's all; epub2pdf will use this .css file included in the .epub in addition to the default.css or whatever you use. As an example, I'm updating The Picture of Dorian Gray upload.
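
For anyone curious how the three steps fit together, here is a rough sketch of the lookup using only grep and sed. It is not the actual code (version 2.0 uses XMLStarlet for this), the function name prince_style_href is made up, and it assumes double-quoted attributes and unprefixed <meta> and <item> tags.

```shell
# Print the href of the manifest item named by <meta name="prince-style">.
# Fragile by design: plain text matching, no real XML parsing.
prince_style_href() (
  opf=$1
  # Put the whole file on one line so tags split across lines still match.
  flat=$(tr '\n' ' ' < "$opf")
  # Step 3 of the list: the meta's content attribute is the manifest id.
  id=$(printf '%s\n' "$flat" |
    grep -o '<meta [^>]*name="prince-style"[^>]*>' |
    sed -E 's/.*content="([^"]*)".*/\1/' | head -n 1)
  [ -n "$id" ] || exit 1
  # Step 2 of the list: find the manifest item with that id, take its href.
  href=$(printf '%s\n' "$flat" | grep -o '<item [^>]*>' |
    grep -F " id=\"$id\"" |
    sed -E 's/.*href="([^"]*)".*/\1/' | head -n 1)
  [ -n "$href" ] || exit 1
  printf '%s\n' "$href"
)
```

The printed href is relative to the directory that holds the .opf file, so it still has to be joined with that directory before being passed to Prince with -s.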

Please, try it and tell me what you think!

EDIT: Script updated to version 2.0 (now it uses XMLStarlet to process the metadata and "pdf-style" has been changed to "prince-style").

EDIT: Now updated to version 3.0

EDIT: The script is now available here.

Last edited by Jellby; 11-22-2009 at 07:52 AM.
Old 09-14-2009, 11:37 AM   #12
ahi
Wizard
Quote:
Originally Posted by Jellby View Post
OK, here it is, the first version of epub2pdf, a bash script for converting ePUB books to PDF.
This works by generating a PrinceXML source file and getting PrinceXML to do the PDF generation? Or how?

(Can't check it until later today... so forgive the question, if it is a dumb one.)

- Ahi
Old 09-14-2009, 12:18 PM   #13
Jellby
frumious Bandersnatch
Quote:
Originally Posted by ahi View Post
This works by generating a PrinceXML source file and getting PrinceXML to do the PDF generation? Or how?
Sort of. The good thing about Prince XML is that it works on standard XHTML files, so nothing has to be changed in the source ePUB. All the script does, actually, is uncompress the .epub file and call Prince XML on all the files in the spine in the right order. The formatting is done through .css files.
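
The spine step can be sketched in a few lines of shell. This is a simplified illustration rather than the script itself: the function name spine_files is made up, and the text matching assumes double-quoted attributes and unprefixed <item> and <itemref> tags.

```shell
# Print the hrefs of the spine documents, in reading order.
spine_files() (
  opf=$1
  # Flatten to one line so tags split across lines still match.
  flat=$(tr '\n' ' ' < "$opf")
  # All manifest entries, one tag per line.
  items=$(printf '%s\n' "$flat" | grep -o '<item [^>]*>')
  # Walk the spine in order and map each idref to its manifest href.
  printf '%s\n' "$flat" | grep -o '<itemref [^>]*>' |
    sed -E 's/.*idref="([^"]*)".*/\1/' |
    while IFS= read -r id; do
      printf '%s\n' "$items" | grep -F " id=\"$id\"" |
        sed -E 's/.*href="([^"]*)".*/\1/' | head -n 1
    done
)

# Example use from inside the unpacked OEBPS directory (breaks on file
# names containing spaces, since the list is word-split):
#   prince $(spine_files content.opf) -s pdf_output.css -o ../book.pdf
```

The manifest order is ignored on purpose; only the order of the <itemref> elements decides the order of the pages.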
Old 09-14-2009, 12:37 PM   #14
frabjous
Wizard
Wow, great job, Jellby. I had begun playing around with your epub-read script as a starting place, but I had a feeling you'd beat me to the punch.

Ahi, I haven't studied the script in too much detail, but it looks simpler still. It just extracts the (X)HTML source of the ePub, reads the contents of its spine and table of contents, and then processes those files in that order (you can include multiple files in a single PDF with Prince), adding only a CSS file that controls the page layout and some defaults (fonts, hyphenation pattern, etc.). (EDIT: oops... didn't see Jellby's reply...)

This seems to work well. Some notes though:

1. Don't know what Linux distro you're using, but dos2unix does not come standard on Ubuntu Jaunty; fixed by installing the tofrodos package. (Actually I did that earlier for your other script.)

2. Right now, if the CSS of the ePub chooses a different font/font size/justification setting, etc., it overrides the settings in default.css; this is perhaps as it should be, but a setting that would make default.css override these would be great. (This would be tougher to code, and perhaps dangerous in certain circumstances, depending on its aggression level...)

3. Defaulting to a 9.9pt font seems a little small...

Some things that would be nice:
  • A port to something like Python to make it a bit more platform-independent, though personally a bash script works fine for me.
  • A minimal GUI wrapper for editing default.css (or creating a new custom .css) in which you can choose page sizes (maybe even from a list of standard ones), borders, fonts, etc. I might work on this if no one else is interested. But maybe it's not worthwhile before it's ported. (Linux users may well be happy without one.)

Last edited by frabjous; 09-14-2009 at 12:53 PM.
frabjous is offline   Reply With Quote
Old 09-14-2009, 01:31 PM   #15
Jellby
frumious Bandersnatch
Quote:
Originally Posted by frabjous
1. Don't know what Linux distro you're using, but dos2unix does not come standard on Ubuntu Jaunty; fixed by installing the tofrodos package. (Actually I did that earlier for your other script.)
I'm using Mandriva, and it wasn't standard here either, but I installed it with urpmi. This is one of the things that needs some work: at the moment I process the spine with sed scripts, which rely on "correct" newlines (that's why I needed dos2unix in some cases). Ideally, the .opf file should be processed with some XML tool; do you know any?
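
One possible answer, sketched in Python rather than as a replacement for the sed scripts: the standard-library xml.etree.ElementTree parser can read the spine directly, and a real XML parser does not care whether the file uses DOS or Unix newlines. The function name spine_documents and the default opf_path are assumptions for illustration; the sketch also assumes hrefs in the manifest are plain relative paths (no URL escaping).

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespace of the OPF package document.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(opf_xml, opf_path="content.opf"):
    """Return the content documents of an OPF package in spine (reading) order.

    opf_xml is the text of the .opf file; opf_path is its location inside
    the ePub, used to resolve the manifest's relative hrefs.
    """
    package = ET.fromstring(opf_xml)
    # The manifest maps item ids to hrefs; the spine lists ids in reading order.
    hrefs = {item.get("id"): item.get("href")
             for item in package.findall("opf:manifest/opf:item", OPF_NS)}
    base = posixpath.dirname(opf_path)
    return [posixpath.normpath(posixpath.join(base, hrefs[ref.get("idref")]))
            for ref in package.findall("opf:spine/opf:itemref", OPF_NS)]
```

The resulting list could be handed to Prince as is, since the order of the spine is the order the files should appear in the PDF.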

Quote:
2. Right now, if the CSS of the ePub chooses a different font/font size/justification setting, etc., it overrides the settings in default.css; this is perhaps as it should be, but a setting that would make default.css override these would be great. (This would be tougher to code, and perhaps dangerous in certain circumstances, depending on its aggression level...)
The standard .epub settings (not those in the "special" pdf-style file) can be overridden by adding !important to the rules in the default.css file, at least according to the documentation. I could add another option to specify highest-priority rules (it would just mean adding another .css after the book-specific one on the prince command line).
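
To make the ordering idea concrete, here is a hedged Python sketch of how the command line might be assembled. It is not the actual script: prince_command and the override_css parameter are hypothetical names. It only uses Prince's -s (style sheet) and -o (output file) options, and it assumes, as described above, that a style sheet given later on the command line takes priority over an earlier one.

```python
def prince_command(documents, output_pdf, default_css,
                   book_css=None, override_css=None):
    """Build a Prince argument list with style sheets in priority order.

    documents is the list of (X)HTML files in spine order. book_css stands
    for the book-specific pdf-style file; override_css is the hypothetical
    file of highest-priority rules. Either may be omitted.
    """
    command = ["prince"]
    # Order matters here: defaults first, then the book-specific style,
    # then the optional highest-priority rules.
    for sheet in (default_css, book_css, override_css):
        if sheet:
            command += ["-s", sheet]
    command += list(documents)
    command += ["-o", output_pdf]
    return command
```

The list can be passed to subprocess.run(command, check=True) to actually run Prince, assuming the prince executable is on the PATH.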

Quote:
3. Defaulting to a 9.9pt font seems a little small...
Well, I'd expect each user to customize his/her own default.css. Some want headers, some don't; some like margins, some don't; some like serif, some sans... I just included my preferences.

Quote:
A port to something like Python to make it a bit more platform-independent, though personally a bash script works fine for me.
Yes, feel free to code it. Actually, I have vanishingly little experience with Python, Perl, or any other platform-independent scripting language (I've written some Perl scripts, but nothing in Python), so I'm afraid I'm not the right one to do it. As you can see, the way the script works is very simple, so it shouldn't be too hard to translate it into anything else.

Quote:
A minimal GUI wrapper for editing default.css (or creating a new custom .css) in which you can choose page sizes (maybe even from a list of standard ones), borders, fonts, etc. I might work on this if no one else is interested. But maybe it's not worthwhile before it's ported. (Linux users may well be happy without one.)
Oooh... a GUI, it makes me shudder. I think that's quite beyond my goal at the moment, but of course, it would be welcome.

For the moment, let's see if the introduction of this <meta name="pdf-style"> has any acceptance...