#1 |
Connoisseur
Posts: 55
Karma: 76
Join Date: Sep 2010
Location: Australia
Device: Kindle 3
|
Converting pdf for Kindle with Calibre
So far I have been lucky and found everything I want to read in .mobi format. But one book I can only find as a PDF, and I can't get Calibre to convert it properly. (I have the latest update.)
The resulting .mobi book is interspersed with unwanted text that appears frequently, like this:
54 9781416585855TEXT.indd 54 25/11/09 3:31:56 PM
Changing the various options in Calibre doesn't eradicate it; perhaps I am missing something. The 54 is the page number. Can anyone suggest a way to get rid of this, please? |
#2 | |
Enthusiast
Posts: 32
Karma: 10
Join Date: Jan 2011
Device: Kindle 3 WiFi, Onyx M92
|
Quote:
\d+ \d+TEXT\.indd \d+ \d+\/\d+\/\d+ \d+:\d+:\d+ PM
Since this isn't Perl (the regexp variant I usually use), you may not have to put a "\" in front of each "/" as I have done above. Experiment with these strings and, if Calibre supports it, put "^" in front of the expression to denote the beginning of a line and "\s*$" at the end to denote the end of a line with possible trailing white space. If the date and time strings are the same in all instances of the unwanted strings, you can use the actual numbers rather than "\d+" (which denotes one or more digits). Experimentation is the key here, and you will learn how to do this. Regexps are great stuff, though they look like Greek to the uninitiated (except for the Greek uninitiated).
-- bob_tm
Last edited by bob_tm; 03-05-2011 at 03:23 PM. |
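For anyone who wants to try the idea outside Calibre first, here is a minimal sketch in plain Python. As far as I know Calibre's search-and-replace uses Python-style regular expressions, but treat that as an assumption and test inside Calibre too. The sample text and the variable names are made up for illustration; only the junk string comes from the first post.

```python
import re

# The unwanted run-on string quoted in the first post.
junk = "54 9781416585855TEXT.indd 54 25/11/09 3:31:56 PM"

# In Python's re, "/" and ":" are ordinary characters, so no backslash is needed.
# ^ and \s*$ pin the match to a whole line; re.MULTILINE applies them per line.
pattern = re.compile(
    r"^\d+ \d+TEXT\.indd \d+ \d+/\d+/\d+ \d+:\d+:\d+ PM\s*$",
    re.MULTILINE,
)

# Hypothetical surrounding text, with trailing spaces after the junk line.
text = "some story text\n" + junk + "  \nmore story text on page 55\n"
cleaned = pattern.sub("", text)
print(repr(cleaned))
```

The junk line is emptied while the "55" inside the ordinary sentence is left alone, because the pattern only matches when the whole line has the expected shape.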
|
#3 |
Wizard
Posts: 1,105
Karma: 1025784
Join Date: Oct 2010
Device: WiFi Kindle3
|
I just tried converting some guide books, and it was almost a complete disaster.
They use a layout with a basic-details box containing a list in the outer third of each page and a more detailed narrative text taking up the other two thirds. In conversion it all gets mashed together into an unreadable mess. I was curious and converted to EPUB as well: just as much of a mess. So it is not only the Kindle that suffers. PDFs that are not just linear text simply don't convert very well. (Readers other than the Kindle may do a better job of showing the PDF natively, with reflow.) |
#4 |
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2011
Device: Kindle 3
|
I have also been trying to convert PDF files to Kindle format. I emailed them to freekindle.com with "convert" in the subject line, but I have yet to receive a reply. I downloaded Calibre, and it just transferred the PDF book to my Kindle as a PDF, with no Kindle formatting.
What am I doing wrong? |
#5 | |
Enthusiast
Posts: 32
Karma: 10
Join Date: Jan 2011
Device: Kindle 3 WiFi, Onyx M92
|
Quote:
bob_tm |
|
#6 | |
Wizard
Posts: 2,251
Karma: 3720310
Join Date: Jan 2009
Location: USA
Device: Kindle, iPad (not used much for reading)
|
Quote:
I don't know what steps you took in Calibre to convert, so I can't help with that. Did you set the output format to .mobi? You have to tell it to convert, not just send, in case that's the problem. |
|
#7 |
Feral Underclass
Posts: 3,622
Karma: 26821535
Join Date: Jan 2010
Location: Yorkshire, tha noz
Device: 2nd hand paperback
|
PDFtoEpub works better than Calibre; you can crop off things like page numbers before you start.
|
#8 | |
Connoisseur
Posts: 55
Karma: 76
Join Date: Sep 2010
Location: Australia
Device: Kindle 3
|
Quote:
I am slowly getting through the tutorial; I just hope I have the patience.
Edit: just worked it out: \d\d\d for all the page numbers.
Last edited by pietro99; 03-05-2011 at 04:36 PM. Reason: update |
|
#9 |
Connoisseur
Posts: 55
Karma: 76
Join Date: Sep 2010
Location: Australia
Device: Kindle 3
|
I've been spending a couple of hours with Calibre, and although I got the first line with the page number to work, I just can't get the other two, so I am hoping bob_tm might help.
For the 2nd line:
9781416585855TEXT.indd 57<br>
I come up with:
\d+TEXT\.indd \s\ d+<br>
For the 3rd line:
25/11/09 3:31:56 PM<br>
I come up with:
25/11/09\s\d+\:d+\:d+ PM<br>
(The date stays the same each time.) Neither of these will work for me. |
#10 |
Enthusiast
Posts: 32
Karma: 10
Join Date: Jan 2011
Device: Kindle 3 WiFi, Onyx M92
|
I recommend
\d+
which means "one or more digits". That will cover all page numbers regardless of the number of digits. As it stands, however, it will also remove all numbers in the whole book, so it should be restricted using markers that make the page numbers unique, like
^\s*\d+\s*$
which means a series of digits alone on a line, with possible white space before and after. Also remember to add any HTML tags that may surround the page number.
bob_tm |
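To see the difference between the bare and the anchored expression, here is a small Python sketch. The sample text is invented, and the optional `<i>`, `</i>` and `<br>` parts are my guess at the kind of HTML tags that might surround a page number. I have also used `[ \t]*` rather than `\s*` so the match cannot spill over a line break; that is a variation on the expression above, not the same thing.

```python
import re

# Hypothetical converted text: one number inside a sentence, two page numbers.
page = "He was 54 years old.\n  54  \n<i>71</i><br>\nChapter 3\n"

# Bare \d+ removes every number, including the one inside the sentence.
too_greedy = re.sub(r"\d+", "", page)

# Anchored: digits alone on a line, optionally wrapped in <i>...</i>
# and optionally followed by <br>. re.MULTILINE makes ^ and $ work per line.
page_number = re.compile(
    r"^[ \t]*(?:<i>)?\d+(?:</i>)?(?:<br>)?[ \t]*$",
    re.MULTILINE,
)
anchored = page_number.sub("", page)

print(repr(too_greedy))
print(repr(anchored))
```

With the anchored version, "54 years old" and "Chapter 3" survive, and only the lines consisting of nothing but a page number are emptied.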
#11 | |
Enthusiast
Posts: 32
Karma: 10
Join Date: Jan 2011
Device: Kindle 3 WiFi, Onyx M92
|
Quote:
bob_tm |
|
#12 | |
Connoisseur
Posts: 55
Karma: 76
Join Date: Sep 2010
Location: Australia
Device: Kindle 3
|
Quote:
being purposely obtuse?” she said.<br>
<i>71</i><br>
9781416585855TEXT.indd 71<br>
25/11/09 3:31:58 PM<br>
<hr>
<A name=79></a>“Obtuse is purposeful by defi nition,” Bernie said.<br> |
|
#13 | |
Enthusiast
Posts: 32
Karma: 10
Join Date: Jan 2011
Device: Kindle 3 WiFi, Onyx M92
|
Quote:
For:
9781416585855TEXT.indd 57<br>
You suggested:
\d+TEXT\.indd \s\ d+<br>
I suggest:
\d+TEXT\.indd\s+\d+<br>
For:
25/11/09 3:31:56 PM<br>
You suggested:
25/11/09\s\d+\:d+\:d+ PM<br>
I suggest (note the misplaced "\" above, which originated from my typos):
25/11/09\s\d+:\d+:\d+\s+PM<br>
Sorry about this. Hopefully the regexps make more sense as written here (though they could be wrong too). All you should need here is normal text, \s and \d for white space and digits, and the "+" suffix on these to denote "one or more occurrences".
bob_tm |
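A quick way to see why the first attempt at the ".indd" line fails is to run both versions in Python. This is only a sketch under the assumption that Calibre treats the expressions the same way Python's re module does; the surrounding sample text is made up.

```python
import re

line = "9781416585855TEXT.indd 57<br>"

# "\s\ d+" reads as: one whitespace character, an escaped literal space,
# then one or more of the letter "d". No digits are asked for at all.
broken = re.compile(r"\d+TEXT\.indd \s\ d+<br>")

# "\s+\d+" reads as: one or more whitespace characters, then one or more digits.
fixed = re.compile(r"\d+TEXT\.indd\s+\d+<br>")

broken_match = broken.search(line)
fixed_match = fixed.search(line)

# subn returns the cleaned text and the number of replacements made,
# much like the instance count reported when searching in Calibre.
sample = "she said.<br>\n9781416585855TEXT.indd 71<br>\n<hr>\n"
cleaned, count = fixed.subn("", sample)
print(broken_match, fixed_match.group(), count)
```

The stray space between the backslash and the "d" is what changes the meaning: the backslash ends up escaping the space, and the "d" is left as an ordinary letter.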
|
#14 | |
Connoisseur
Posts: 55
Karma: 76
Join Date: Sep 2010
Location: Australia
Device: Kindle 3
|
Quote:
We are getting there. The 2nd one found 311 instances, but the 3rd one still doesn't find any.
EDIT: Got it! The 3rd line that works is:
25/11/09\s+\d+:\d+:\d+\s+PM<br>
The \s becomes \s+, as I think there are 2 spaces after the date. That has been a most edifying experience. Thanks for all your input.
Last edited by pietro99; 03-06-2011 at 04:37 PM. Reason: Update |
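The effect of that one extra "+" can be checked with a short Python sketch. The two-space sample string is an assumption based on the guess in the post above; the real file may use some other whitespace between the date and the time.

```python
import re

# Assumption: two spaces between the date and the time, as guessed in the post.
stamp_two_spaces = "25/11/09  3:31:56 PM<br>"
stamp_one_space = "25/11/09 3:31:56 PM<br>"

# \s matches exactly one whitespace character.
single = re.compile(r"25/11/09\s\d+:\d+:\d+\s+PM<br>")
# \s+ matches one or more, so it copes with either spacing.
one_or_more = re.compile(r"25/11/09\s+\d+:\d+:\d+\s+PM<br>")

for stamp in (stamp_two_spaces, stamp_one_space):
    print(repr(stamp), bool(single.search(stamp)), bool(one_or_more.search(stamp)))
```

Using `\s+` wherever the exact amount of white space is unknown is the safer default, since it still matches the single-space case.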
|