10-05-2010, 10:19 PM | #1 |
Enthusiast
Posts: 43
Karma: 10
Join Date: Jun 2010
Device: Kindle 3
|
pdf to mobi conversion issue
i just got my kindle 3 today and i love it, and i have been converting pdfs to mobi all day so they are readable on my k3. the books are very readable, but there are tons of gaps that don't need to be there: extra line spacing between sentences, and sentences stopping and restarting on the next line for no reason. the books are still readable but it's a tremendous waste of space. are there any conversion settings i need to know about? is there a better format than mobi to convert my pdfs to? some info on this would be much appreciated. i already converted about a hundred books and would really like some clarification before i do any more. thanks in advance for any help.
|
10-06-2010, 12:30 AM | #2 |
Wizzard
Posts: 11,517
Karma: 33048258
Join Date: Mar 2010
Location: Roundworld
Device: Kindle 2 International, Sony PRS-T1, BlackBerry PlayBook, Acer Iconia
|
Sounds like your PDF converter is leaving in the breaks between the "lines" of the PDF instead of joining them up.
This can be fixed by unpacking the converted mobi file to html, editing the html in a plain text editor, and doing a simple search/replace to get rid of redundant spaces/linebreaks (keep replacing each double space or double linebreak with a single one until only singles are left), then converting the html back into mobi.
Unfortunately, this will happen with just about anything you convert your PDFs to (short of an image format, which kind of defeats the purpose). It's just one of the drawbacks of that format.
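For anyone who would rather not do the repeated search/replace by hand, here is a minimal sketch of the same idea as a shell filter. It assumes the redundancy shows up as literal runs of spaces and blank lines in the unpacked text; repeated HTML tags such as line-break elements would need their own pattern. The function name `squeeze_whitespace` is made up for illustration.

```shell
#!/bin/sh
# squeeze_whitespace: hypothetical helper, reads text on stdin and
# writes the cleaned text to stdout.
squeeze_whitespace() {
    # tr -s ' ' turns every run of spaces into a single space.
    # The awk program prints every non-blank line, and only the first
    # line of each run of blank (or whitespace-only) lines.
    tr -s ' ' | awk 'NF { blank = 0; print; next } !blank++ { print }'
}
```

Typical use would be something like `squeeze_whitespace < book.html > book.clean.html`, writing to a new file so the original stays untouched until you have checked the result.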
|
10-06-2010, 12:59 AM | #3 |
Enthusiast
Posts: 43
Karma: 10
Join Date: Jun 2010
Device: Kindle 3
|
aww man that sounds unbearably annoying, i don't know if i can do that for every one of my thousands of books. is there a better converter i can use?
|
10-06-2010, 01:16 AM | #4 |
Enthusiast
Posts: 43
Karma: 10
Join Date: Jun 2010
Device: Kindle 3
|
i was given a great solution that mostly fixed my problem. my line unwrapping factor was set at 0 by default. i was advised to set it to 0.45, and that did the trick. now i gotta redo like 100 books. thanks for the advice tho.
|
10-06-2010, 01:16 AM | #5 |
Wizzard
Posts: 11,517
Karma: 33048258
Join Date: Mar 2010
Location: Roundworld
Device: Kindle 2 International, Sony PRS-T1, BlackBerry PlayBook, Acer Iconia
|
Well, I've gotten surprisingly decent results using Amazon's @free.kindle.com conversion service. But they only let you use it if you've got an actual Kindle and not just one of the apps.
You can also try tweaking some of the Structure Detection and PDF input settings in Calibre, though I've heard that of the freely available apps, Mobipocket Creator does the best job (I wouldn't know; it doesn't work on the Mac). Or find a regular PDF to text/html converter which does a good job and convert to Mobi from *that* conversion.
I think Project Gutenberg's Distributed Proofreaders have a macro up on their wiki that will auto-clean extra linebreaks and such from text and html files. Or maybe we have one here at the MR wiki; I forget which.
Also, you could probably automate the process for your already-converted books using a couple of command line scripts so all you'd have to do is point them at a containing folder and walk away for a snack. But that may require a greater level of computer fiddling than you may be comfortable with.
|
10-06-2010, 01:28 AM | #6 |
Enthusiast
Posts: 43
Karma: 10
Join Date: Jun 2010
Device: Kindle 3
|
hmmmm....i'd love to know more about the scripts. it would save me a ton of time.
|
10-06-2010, 01:43 AM | #7 |
Wizzard
Posts: 11,517
Karma: 33048258
Join Date: Mar 2010
Location: Roundworld
Device: Kindle 2 International, Sony PRS-T1, BlackBerry PlayBook, Acer Iconia
|
Er. I'm afraid the scripts would have to be of the DIY variety, as I don't think anyone's formally written up and distributed anything yet.
Personally I just type 'for m in *.mobi; do mkdir "${m/.mobi/}"; mobiunpack "$m" "${m/.mobi/}"; done' directly into the Terminal whenever I want to do a batch unpack, and I'd probably run sed or something similar the same way for the batch search and replace before regenerating the .mobi. Sorry for getting your hopes up.
Calibre does have a "bulk convert" mode if that's of any help.
Last edited by ATDrake; 10-06-2010 at 01:49 AM. Reason: Remembered a possibly useful feature.
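As a rough sketch of how that one-liner and the sed pass might be combined into a single function: the `mobiunpack` call is copied from the post above, but the function names, the assumption that the unpacked output lands as `*.html` files directly inside the target folder, and the choice to collapse only runs of spaces are all illustrative guesses. Regenerating the .mobi afterwards is left out, since the thread does not settle on a tool for that step.

```shell
#!/bin/sh
# strip_mobi_ext: hypothetical helper, drops a trailing ".mobi" from a path.
strip_mobi_ext() {
    printf '%s\n' "${1%.mobi}"
}

# batch_unpack_and_clean DIR: hypothetical helper. Unpacks every .mobi in
# DIR into a folder of the same name, then collapses runs of spaces in the
# unpacked html files.
batch_unpack_and_clean() {
    dir=$1
    for m in "$dir"/*.mobi; do
        [ -e "$m" ] || continue          # no .mobi files: skip the literal glob
        out=$(strip_mobi_ext "$m")
        mkdir -p "$out"
        mobiunpack "$m" "$out"           # same invocation as in the post above
        for h in "$out"/*.html; do       # assumption: html sits at the top level
            [ -e "$h" ] || continue
            # Replace each run of spaces with one space. A temp file is used
            # because in-place sed flags differ between GNU and BSD/Mac sed.
            sed 's/  */ /g' "$h" > "$h.tmp" && mv "$h.tmp" "$h"
        done
    done
}
```

Running `batch_unpack_and_clean .` from the folder holding the books would process all of them in one go. Trying it on a copy of two or three books first seems wise.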
10-06-2010, 02:37 AM | #8 |
Enthusiast
Posts: 43
Karma: 10
Join Date: Jun 2010
Device: Kindle 3
|
the bulk convert does not have the most important thing i require, which is the pdf input section that only shows up in the individual conversion. the pdf input section of the conversion program has the line-unwrapping factor. in the bulk conversion there is no line-unwrapping field. for some reason the default line unwrapping for me is 0.0, and i need it to be 0.45 for my pages to look right.
also, in a couple of my books i have noticed something new. every few pages this pops up: Y Y Y er Y er B 2 B 2 B .0 B .0 A A Click here to buy Click here to buy w w w w w. ABBYY.com .ABBYY.com
i copied that exactly. it pops up every couple of pages and wastes a page and a half, which changes the locations number from around 9,000 to 23,066. that's a huge waste of space. i have no idea what to do about that.
10-06-2010, 04:27 AM | #9 |
Wizzard
Posts: 11,517
Karma: 33048258
Join Date: Mar 2010
Location: Roundworld
Device: Kindle 2 International, Sony PRS-T1, BlackBerry PlayBook, Acer Iconia
|
Looks like the affected books were generated using some sort of trial/unregistered version of ABBYY's FineReader OCR or their PDF-production app, and it includes some promo text to try and get people to buy a copy.
If that's in the actual PDFs, you can try editing the affected pages out before you do the PDF convert. Sorry I can't point you to any Windows utilities for that, but I'm sure someone here knows of some.
As for Calibre's bulk convert, you can set mass-settings for all PDF conversions, including the line-unwrapping, via the Preferences button -> Conversions, which should then apply as the default across the board. I don't know if you'd then have to uncheck "Use saved conversion across individual books" to get it to do so, but you can experiment in little batches of two or three until you've got it all figured out.
Hope this helps.
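If the command line is an option, Calibre also ships an `ebook-convert` tool whose PDF input accepts an `--unwrap-factor` option, so the same setting can be applied in a loop. The sketch below is only that, a sketch: the function name `convert_all` and the `EBOOK_CONVERT` override variable are made up for illustration, and the 0.45 default is simply the value reported as working earlier in this thread.

```shell
#!/bin/sh
# convert_all DIR [FACTOR]: hypothetical helper. Converts every PDF in DIR
# to a .mobi next to it, passing the line-unwrapping factor explicitly.
convert_all() {
    dir=$1
    factor=${2:-0.45}                    # value reported as working in this thread
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue        # no PDFs: skip the literal glob
        # EBOOK_CONVERT is an illustrative override (e.g. set it to "echo"
        # for a dry run); by default Calibre's ebook-convert is called.
        "${EBOOK_CONVERT:-ebook-convert}" "$pdf" "${pdf%.pdf}.mobi" \
            --unwrap-factor "$factor"
    done
}
```

Setting `EBOOK_CONVERT=echo` before calling `convert_all .` prints the commands instead of running them, which is a cheap way to check the file names before committing to a long batch.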
10-06-2010, 05:03 PM | #10 |
Enthusiast
Posts: 43
Karma: 10
Join Date: Jun 2010
Device: Kindle 3
|
hey thanks for the bulk convert mass settings idea under the preferences button. i was looking for that. that did it.
|