|  09-14-2014, 01:02 AM | #1 | 
| Enthusiast  Posts: 30 Karma: 10 Join Date: Jul 2012 Device: Kindle | 
				
				Converting pdf to mobi question
			 
			
			calibre did a very effective job of converting a pdf to mobi, to read on a Kindle.  A question: is there any way to edit out the text and image(?) between pages, as seen in the screenshot (the "Yuan Ban..." comes from using the translation option)?  The number on the upper left - Di 2 - is the page number, but it's not important.  Thanks,
		 Last edited by highstream; 09-14-2014 at 01:04 AM. | 
|   |   | 
|  09-14-2014, 01:13 AM | #2 | |
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			It will require a bit of work. See the sticky: Read this before Posting PDF Questions
 | |
|   |   | 
|  09-14-2014, 02:23 AM | #3 | 
| Enthusiast  Posts: 30 Karma: 10 Join Date: Jul 2012 Device: Kindle | 
			
			Thanks. My oversight. I had looked at the Search and Replace tab under Convert Books, but misunderstood the phrase "Regular Expressions," since it's a technical term for what in plain language means repeating expressions. That FAQ on PDFs helped, to a point. I tried Mobipocket Creator, but it doesn't appear to recognize PDFs.

			OK, one of the new screenshots below shows how far I've come from the one in the OP; the other shows the underlying code in the PDF. The larger number is the chapter, which stays.

			The two pieces I've yet to figure out how to code as regular expressions are the small Chinese character and the number that follows it (the page number), and the large graphic (below it in the code). For the first, getting rid of the Chinese character is no problem, but my attempts to get rid of the page numbers with brackets, e.g. [0-9], have failed, and I'm afraid of that messing with the chapter numbers. For the graphic, img src="index-1_1.jpg gets rid of one, but I'm not sure how to code an expression for all of them.

			Suggestions welcome. Thanks,
		 Last edited by highstream; 09-14-2014 at 02:27 AM. | 
|   |   | 
|  09-14-2014, 02:31 AM | #4 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			Perhaps a regex tutorial can help you get more of an idea how to code them. I've always found this site to be extremely helpful: http://www.regular-expressions.info/
			Code: img src="index-\d+_\d+\.jpg
			\d matches any single digit, the same as the set [0-9]. The plus repeats it one or more times, so \d+ matches a whole number of any length (digits only, no decimal point). The backslash before the dot makes it match a literal period rather than any character. | 
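A minimal Python sketch of how that image pattern behaves, for anyone who wants to try it outside the converter. It assumes the search and replace field accepts Python-style regular expressions, and the `sample` string is made-up markup for illustration, not the actual code from the book. The dot before `jpg` is escaped here so it only matches a literal period.

```python
import re

# Hypothetical fragment of converted HTML; the real markup may differ.
sample = (
    '<p>Some text</p><img src="index-1_1.jpg"/>'
    '<p>More text</p><img src="index-12_3.jpg"/>'
)

# \d+ matches one or more digits, so this covers index-1_1, index-12_3, etc.
image_pattern = re.compile(r'img src="index-\d+_\d+\.jpg')

# Every place the pattern matches in the sample.
matches = image_pattern.findall(sample)
print(matches)

# A file name without digits in those positions is left alone.
no_match = image_pattern.search('img src="index-a_b.jpg')
print(no_match)
```

Testing a pattern on a short sample like this before running a full conversion can save a few round trips.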
|   |   | 
|  09-14-2014, 02:34 AM | #5 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			If you copy-paste the whole header-footer code here, in [CODE][/CODE] tags, I can help you arrive at a suitable regex. I am not sure where the page numbers with brackets are; I don't see any in the screenshot.
		 Last edited by eschwartz; 09-14-2014 at 02:45 AM. | 
|   |   | 
|  09-14-2014, 02:43 AM | #6 | 
| null operator (he/him)            Posts: 22,007 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | 
			
			@highstream - FWIW - With Mobipocket Creator, don't use the File menu. Start the program and drag and drop the PDF onto the first screen, or click Import From Existing File->Adobe PDF.   BR | 
|   |   | 
|  09-14-2014, 02:45 AM | #7 | 
| Enthusiast  Posts: 30 Karma: 10 Join Date: Jul 2012 Device: Kindle | 
			
			Thanks for the examples! The brackets are used in examples in both of the regular expression FAQs, before they get to more general expressions using \d and such. Your image coding got rid of all those except the one on the cover page - no big deal. For the page numbers, following your example I tried "第 \d+" and that worked! It'd be nice to get chapters coded so that they are recognized by the forward and back buttons on the Kindle, but I imagine that's asking too much given the current state of PDF to MOBI conversion. Many thanks,
		 Last edited by highstream; 09-14-2014 at 02:58 AM. | 
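A short Python sketch of why "第 \d+" removes the page markers without touching the chapter numbers: the pattern only matches digits that directly follow the 第 character and a space. The `sample` text is invented for illustration and assumes the chapter number appears without that prefix, as described in the thread.

```python
import re

# Hypothetical extracted text: a bare chapter number, then page markers
# of the form "第 N" scattered between paragraphs.
sample = "3\n第 12\nSome text with the number 7 in it.\n第 13\n"

# Digits are only matched when they follow the 第 character and a space.
page_marker = re.compile(r"第 \d+")

# Replace every page marker with nothing.
cleaned = page_marker.sub("", sample)
print(repr(cleaned))
```

The bare chapter number and the digit inside the sentence both survive, because neither is preceded by 第.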
|   |   | 
|  09-14-2014, 02:57 AM | #8 | 
| Enthusiast  Posts: 30 Karma: 10 Join Date: Jul 2012 Device: Kindle | 
			
			BetterRed, Thanks, I missed that.  Gave Mobicreator Adobe PDF conversion a try.  Unless I'm missing something, in this case the result left a lot to be desired compared to a direct conversion with calibre.
		 | 
|   |   | 
|  09-14-2014, 03:06 AM | #9 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			You can manually build a ToC with the ToC editor. Once you have a ToC, the back/forth buttons will work.
		 | 
|   |   | 
|  09-14-2014, 03:15 AM | #10 | |
| null operator (he/him)            Posts: 22,007 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | Quote: 
  I suspect Kovid has done quite a lot of work on the PDF Input plugin since that "Read this before Posting PDF Questions" sticky was written. BR | |
|   |   | 
|  09-25-2014, 10:18 PM | #11 | 
| Fanatic            Posts: 515 Karma: 1470724 Join Date: Jul 2013 Location: Quebec CA Device: android 4 (samsung tablet and asus tablet) | 
			
			I have found Mobipocket Creator to be fairly good at conversion and removal of headers and footers. Due to the variety of ways PDF files are created, it is very likely that any PDF conversion will need to be "tweaked". The conversion program used is limited by the quality of the PDF that it is trying to convert. | 
|   |   | 