MobileRead Forums

neilmarr 11-13-2009 08:26 AM

Help with converting PDF to epub
 
[moved to a new thread -- Valloric]

Apologies in advance for being a dumb wordsmith and no technotop ...

I need to convert almost 100 PDF versions of my own wee house's paperback titles to ePub. Our own technical side is up to the gills in other work and we can't afford to hire in extra help ... so the job's down to me. And I'm a bloody technodunce. Pretty well everything in this thread, for instance, may be plain English to my MR pals, but it's way over my head.

So my question is simply this: Is there a simple 'Sigil for Dummies' tutorial of any sort that will explain, step-by-step, how I first convert my PDFs to ePub using the Sigil programme I've installed and then how to edit the result ready to go?

Many thanks. Neil

kjk 11-13-2009 01:33 PM

You will need Calibre to convert from PDF to ePub. Then use Sigil to edit the ePub.

Valloric 11-13-2009 01:56 PM

You have a tough road ahead of you.

First off, your source documents are PDF. That's the worst possible format to convert to epub. Epub is a reflowable format, meaning the paragraphs are "marked up": the text that needs to be displayed as a single paragraph is marked as such like this:

Quote:

<p>This is all one paragraph. This is all one paragraph. This is all one paragraph. This is all one paragraph. This is all one paragraph</p>
The <p> "tags" mark the boundaries of a paragraph. This is all XHTML. Other elements of the document are marked up with different tags. This is done so that the Reading System (a computer application or a hardware device) can adjust how the text displays. Hand-held devices have smaller screens and a paragraph appears "longer", whereas a computer monitor has a bigger screen so the paragraph takes up fewer lines. In essence, the display of the text adjusts to the size of the screen. So a book could have 300 pages when displayed on a computer screen, and 800 pages when displayed on the Sony Reader. It's the same book though, just displayed in different page/screen sizes, automatically.
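
To make the reflow idea concrete, here is a minimal sketch in Python. It is only an illustration: the two widths (30 and 80 characters) are made-up stand-ins for a hand-held screen and a computer monitor, and a real Reading System lays out text with fonts and CSS rather than by counting characters.

```python
import textwrap

# One logical paragraph, as it would sit between <p> and </p>.
paragraph = ("This is all one paragraph. " * 5).strip()


def reflow(text, width):
    """Wrap a single paragraph to a screen that is `width` characters wide."""
    return textwrap.wrap(text, width=width)


# Hypothetical screen widths, chosen only for illustration.
small_screen = reflow(paragraph, 30)
large_screen = reflow(paragraph, 80)

print(len(small_screen), "lines on the narrow screen")
print(len(large_screen), "lines on the wide screen")
```

The text is identical in both cases; only the number of lines changes, which is why the same book can come out at different page counts on different devices.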

PDF is a problem. It is not a reflowable document type. The "page" is fixed when the document is created. You cannot change it afterwards. Every character is effectively "burned in" on the virtual page. There is no semantic information about paragraphs, tables, images etc. Any converter (calibre, for instance) has to make guesses about the structure of your document, and these guesses often don't work. So converting PDF to any reflowable format is extremely difficult and error-prone.
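
As a rough illustration of what "making guesses" means, here is a small Python sketch of one possible heuristic. It is not how calibre or any other converter actually works; the rule and the sample lines are invented for the example. The rule assumes that a line ending in sentence punctuation closes a paragraph and that any other line continues onto the next.

```python
def guess_paragraphs(lines):
    """Guess paragraph boundaries in lines of text taken from fixed pages.

    Hypothetical rule: a line ending in '.', '!' or '?' closes a paragraph;
    any other line is assumed to continue onto the next one.
    """
    paragraphs, current = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        current.append(line)
        if line.endswith((".", "!", "?")):
            paragraphs.append(" ".join(current))
            current = []
    if current:  # keep any trailing text that never hit punctuation
        paragraphs.append(" ".join(current))
    return paragraphs


# Invented sample: in the printed book these three lines are ONE paragraph,
# but the first line happens to end exactly at a full stop.
extracted = [
    "The storm broke at dawn.",
    "Nobody in the village had",
    "slept.",
]

result = guess_paragraphs(extracted)
for number, text in enumerate(result, start=1):
    print(number, text)
```

The heuristic splits the sample into two paragraphs even though the original page had only one. That kind of wrong guess is what ends up needing manual cleanup after conversion.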

I would suggest you get access to the original source documents from which the PDF versions were created. The original documents were surely reflowable and conversion to epub from those formats would be much more accurate.

As kjk noted, you will need calibre or some other converter to convert your original documents to a format Sigil can import, like (X)HTML or epub. Calibre can also do this for PDF books, but the results are usually not pretty. Not because of calibre, but because of the PDF format itself.

So in a nutshell, you need to convert your documents to XHTML or epub and then open these files in Sigil for editing. That's it.
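
For a batch of around 100 titles, the conversion step could be scripted. The sketch below is in Python and only builds the command lines; it does not run them. It assumes calibre's command-line tool `ebook-convert` is installed (its basic form is `ebook-convert input_file output_file`) and that the source files are in a format calibre accepts as input. The file names and the `epub_out` folder are placeholders.

```python
from pathlib import PurePath


def build_commands(sources, output_dir):
    """Return one ebook-convert command (as an argument list) per source file."""
    commands = []
    for source in sources:
        # Same base name as the source, with an .epub extension.
        target = PurePath(output_dir) / (PurePath(source).stem + ".epub")
        commands.append(["ebook-convert", str(source), str(target)])
    return commands


# Placeholder file names; replace with the real list of source documents.
commands = build_commands(["first-title.docx", "second-title.docx"], "epub_out")
for command in commands:
    print(" ".join(command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` to perform the conversion, after which the resulting epub files can be opened in Sigil for editing.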

EowynCarter 11-13-2009 04:34 PM

Quote:

I would suggest you get access to the original source documents from which the PDF versions were created.
Yep, that would be some help.
The best PDF-to-something-else converter I've seen was the Mobipocket Creator. Calibre has some options that should work not too badly as well. Maybe you should also have a look at the paid stuff.

You're in for some hours of editing though....

neilmarr 11-14-2009 04:47 AM

Thanks, chaps. Your info is especially helpful and clear, Valloric. We should, indeed, have all the Word .doc source documents in the database, so we can work from there. The other editors on the team have already agreed to proof the eventual ePub versions of the titles they worked on, Eowyn, so that should be simply a matter of midnight oil for them and for me. I'll spend the weekend on this wee puzzle and report back, but thanks so much for explaining the basics and pointing me in the right direction (basically, convert Word source material to ePub using Calibre and then use Sigil to edit and produce the polished end result, if I've understood correctly). Best wishes. Happy weekend. Neil

PS: Do you think it realistic to think in terms of offering only PDF and ePub or do you reckon it would be worth considering other formats while we're on this marathon job? N

EowynCarter 11-14-2009 09:18 AM

Quote:

PS: Do you think it realistic to think in terms of offering only PDF and ePub or do you reckon it would be worth considering other formats while we're on this marathon job?
Mobi maybe, if you want Kindle users to read your book.

neilmarr 11-14-2009 10:26 AM

The problem with Kindle Store, Eowyn, is that it only carries books by publishers with a US presence (US address, social security and tax numbers, and banking). We're UK-based so -- although all our single edition paperbacks by authors worldwide and on which we hold international rights are still on offer at Amazon -- we, and hundreds of other small non-US houses -- are locked out and, after eight years, our ebooks disappeared overnight.

Amazon says there are 'no immediate plans' to change the Kindle Store system to allow foreign publishers a crack of the whip. No explanation ... just that. I have a feeling non-US houses are likely to see a similar lock-out on their ebook versions with Barnes & Noble now that *the nook* has launched. All our ebook versions have disappeared from their site.

I'm not too concerned. Amazon and B&N are predators and I'd prefer not to join them in their game. If I can make our ebooks more presentable in smaller, alternative and honest retail stores -- less monopolising, manipulative and downright greedy and with no nefarious vested interest in hardware and jealously guarded format and DRM -- I'll feel I'm doing my job as I always promised our authors and readers it would be done.

Hoots. Neil


All times are GMT -4.