11-14-2013, 01:23 AM | #1 |
Junior Member
Posts: 4
Karma: 10
Join Date: Nov 2013
Device: Nook Simple Touch
|
Converting pdfs into Nook Simple Touch Format
Hi,
I have some books in pdf format. I have been trying to convert them into Nook Simple Touch format. I've tried online tools that turn pdfs into .epubs, but they don't seem to display correctly once moved over to my Nook. Either the page numbering is off, several screens show the same page number, the page is too large for the screen, etc. I learned that an epub is a zip file with html and other files. I renamed one to .zip, and saw that one conversion had turned real highlightable text into .png files! Is there a tool, online or downloadable, that will convert my pdfs into a file optimized for the Nook Simple Touch? If so, is there one where editing, font changing, font size changing, and previewing are possible? Last edited by plaidrhino; 11-14-2013 at 01:26 AM. |
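Since an epub is a zip archive, the check described above (did the converter produce real text or only page images?) can be scripted instead of renaming the file by hand. The following is a minimal sketch, not a definitive tool: the function name `inspect_epub` is made up for illustration, the tag stripping is deliberately crude, and the file extensions it looks for are assumptions about what a converter typically writes.

```python
import re
import sys
import zipfile

# Assumed extensions; a given converter may use others.
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif")
PAGE_EXTS = (".html", ".xhtml", ".htm")


def inspect_epub(path):
    """Count image files and visible text characters inside an EPUB.

    An EPUB is a zip archive, so the standard zipfile module can read it
    without renaming or unpacking anything.
    """
    images = 0
    text_chars = 0
    with zipfile.ZipFile(path) as epub:
        for name in epub.namelist():
            lower = name.lower()
            if lower.endswith(IMAGE_EXTS):
                images += 1
            elif lower.endswith(PAGE_EXTS):
                markup = epub.read(name).decode("utf-8", errors="replace")
                # Crude tag removal: good enough to tell "some text" from
                # "no text", not a real HTML parser.
                text = re.sub(r"<[^>]+>", " ", markup)
                # Count non-whitespace characters only.
                text_chars += len("".join(text.split()))
    return {"images": images, "text_chars": text_chars}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_epub.py converted_book.epub
    print(inspect_epub(sys.argv[1]))
```

A result with many images and close to zero text characters suggests the conversion kept the pages as pictures, which matches the symptom of text turning into .png files.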
11-14-2013, 05:42 AM | #2 |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Had this same discussion last month in the PDF section:
https://www.mobileread.com/forums/sho...d.php?t=223817 Just be warned, PDF -> anything is the WORST conversion. There will be lots of errors from the OCR output, and it takes many hours of fixing to get it up to par. |
11-14-2013, 11:15 AM | #3 |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
If the PDF contains a text layer, i.e. if you can search it, you could try converting it with Calibre with all Heuristics Processing options enabled.
Activating all Heuristics Processing options often leads to somewhat better results. However, as Tex2002ans has already pointed out, PDF files are the worst input format for ePub conversion. |
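The same Calibre conversion can also be run from the command line with calibre's `ebook-convert` tool, where heuristic processing is switched on with `--enable-heuristics`. Below is a minimal sketch that wraps the call in Python; the helper names and the file names in the usage comment are hypothetical, and it assumes calibre is installed with `ebook-convert` on the PATH. As in the GUI, it only helps if the PDF actually has a usable text layer.

```python
import shutil
import subprocess


def build_convert_command(pdf_path, epub_path):
    """Return the ebook-convert argument list with heuristics enabled."""
    return ["ebook-convert", pdf_path, epub_path, "--enable-heuristics"]


def convert(pdf_path, epub_path):
    """Run the conversion, failing early if calibre is not installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("calibre's ebook-convert was not found on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_convert_command(pdf_path, epub_path), check=True)


# Hypothetical usage:
# convert("book.pdf", "book.epub")
```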
11-15-2013, 12:55 AM | #4 |
Junior Member
Posts: 4
Karma: 10
Join Date: Nov 2013
Device: Nook Simple Touch
|
Well, I tried using Calibre, turning on the heuristics.
It converted better, with a few anomalies:
1. Triple copies of the cover.
2. Numbering is off. A few pages show 1 of 4, then many show 2 of 4. There are over 200 pages though.
3. Margins are too large - too much white space around each page.
4. It converted all the text pages, which I could highlight in Adobe Reader, into png. Only 2 pages, the front and back, are graphical.
Trying to correct the above, I tried the "tweak book" tool and opened content.opf. I don't know much html, though I think knowing it would help. Am I on the right track here? |
11-15-2013, 07:55 AM | #5 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
You can try adapting the stylesheet. The best way is still to do OCR though.
|
11-25-2013, 11:13 PM | #6 |
Junior Member
Posts: 4
Karma: 10
Join Date: Nov 2013
Device: Nook Simple Touch
|
Toxaris,
Thanks for the idea. I think calibre used OCR, as it converted the text into images. Still hitting the same problems:
- Duplicate copies of the cover
- Numbering. It only shows x (1, 2, 3 or 4) of 4 pages, even though there are over 200 pages.
- Margins. Too large, extra left alternating with extra right, and overall too much around each page.
It is still readable, but I would like it to be a better conversion. If it's not against the site policy, I'll post the original pdf and the epub conversion that resulted from Calibre. Maybe someone can advise of a better technique.
Original book: http://www.mediafire.com/view/6rb06n...0of%20Mind.pdf
epub output: http://www.mediafire.com/download/o0...e_of_Mind.epub
Thanks. |
11-26-2013, 12:11 AM | #7 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Unfortunately, your PDF is made of pictures, probably with text layered behind it, which is why you can highlight it. Many PDFs are like that, and it means nothing, because...
The conversion pipeline in calibre can only read the .png (the main) form of the pages; this is one of the reasons why PDFs are the worst format to convert from.
The margins are likely built into the image, especially if they alternate. That is for the left side/right side pages in a paper book, once printed.
The cover image is the first "page", and calibre then adds the cover again, this time as a cover image.
If images are used heavily, there is less length of content in the html, which is probably why the page numbers are wonky; I get that in comic books all the time. It's treating each image as one line, which, to be fair, it is.
You will have to use OCR to get the text from the pictures. OCR is software that attempts to guess the text from pictures; calibre doesn't include such software, it can only use the actual content of the PDF.
Or you can copy and paste into a text file, using calibre's txt conversion to recognize paragraphs by the empty lines in between, use markdown to indicate the bold/headers (for the chapter titles)/italics, use the extracted cover image that calibre has already saved in the book listing, etc. I did this for a few short stories available online as free PDFs, and it is not something I would want to do a lot of.
Also, in future, you can attach documents to Mobileread by posting using Go Advanced ==> Additional Options, instead of using external hosting sites. And it's only against site policy to post these if it is a copyrighted book you don't have permission to share. Last edited by eschwartz; 11-26-2013 at 12:16 AM. |
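For the copy-and-paste route, text copied out of a PDF usually arrives with a hard line break at the end of every printed line, so it needs tidying before a txt conversion can recognize paragraphs by the empty lines between them. A minimal sketch of that cleanup follows. The function name `reflow` is made up for illustration, and it assumes paragraphs in the pasted text are already separated by at least one blank line. The hyphen handling is a guess: it rejoins words split across lines, but it will also wrongly merge genuine hyphenated compounds that happen to break at a line end.

```python
import re


def reflow(raw_text):
    """Join hard-wrapped lines into paragraphs separated by blank lines."""
    # Blocks are runs of lines separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", raw_text.strip())
    paragraphs = []
    for block in blocks:
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        joined = lines[0]
        for line in lines[1:]:
            if joined.endswith("-"):
                # Assumed to be a word hyphenated at the line break.
                joined = joined[:-1] + line
            else:
                joined += " " + line
        paragraphs.append(joined)
    # One blank line between paragraphs, ready for a txt conversion.
    return "\n\n".join(paragraphs)
```

Chapter titles and bold/italic markup would still have to be added by hand, which is where most of the time goes.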
11-26-2013, 07:07 AM | #8 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
Does original text show when you select text view in a program like Foxit Reader? You might be able to get at the text that way. What you will get is very hard to say, since PDFs are constructed for display and printing, not deconstruction.
|
11-26-2013, 03:06 PM | #9 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
It does, (the OP said so) but how would you recommend turning that into paragraphs/chapters? I've done that by hand for short stories (copy-pasting, with Paragraph Style: Block and markdown formatting for the bold/italics), but WHEW is that time-consuming.
|
11-26-2013, 03:08 PM | #10 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
It is a lot of work. But it might be less work than OCRing and correcting those errors.
Oh for the bad old days when everyone didn't think they were graphic artists and the text had to do the talking. |
11-26-2013, 03:15 PM | #11 |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
The text inside the file was likely OCR'd in the first place and is likely full of errors.
Dale |
11-27-2013, 04:54 AM | #12 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
My Word add-in might help you clean up the OCR text more quickly while retaining the formatting.
|
12-01-2013, 08:00 PM | #13 |
Junior Member
Posts: 4
Karma: 10
Join Date: Nov 2013
Device: Nook Simple Touch
|
Thank you all for the replies. I didn't get an email informing me of replies. I just changed the notification setting from weekly (default?) to instant.
Here are my replies:
eschwartz - thanks for the tips and the attachment option info.
mrmikel - I haven't used Foxit, just Adobe's pdf reader. Maybe I'll try it.
DaleDe - the text is fine. No errors that I've seen.
Toxaris - I might try your add-on if I really want to do the conversion.
For now, I'm reading it OK on my Nook. It's just not as clean or neat as I'd like it to be, but it works. Occasionally it loses the page I'm on, or I have to reboot, and then I have to press the up or down key a lot, like 50 times, since it only thinks there are 4 pages and the Nook's 'scroller' doesn't work.
Wondering, I used to use Adobe's pdf distiller to go from pdf to Word, or another text output - anyone familiar with that? I might use that again. |
12-02-2013, 11:11 AM | #14 | |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
Quote:
Dale |
|
Tags |
conversion, nook, pdf to epub, pdf to epub converter, simple touch |