04-25-2014, 09:32 AM | #1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Apr 2014
Device: htc rsvp
|
Repeated text pdf to epub conversion
Hey everyone,
I have been trying to use rsvp with an app called glance. The issue I am running into is that my conversion of a pdf to epub keeps repeating various information. I tried to use regular expressions, but it did not seem to work. Is there an option I am missing that can delete this information from the converted epub? Thanks for the help.
Example:
Copyright © 2011 by The McGraw-Hill Companies 01-ch01.indd 51 3/23/2011 6:18:37 AM CertPrs8/RHCE Red Hat Certifi ed Engineer Linux Study Guide (Exam N0-201)/Jang/176565-7/Chapter 1
Here is the exact markup:
<br>
Copyright © 2011 by The McGraw-Hill Companies<br>
01-ch01.indd 3<br>
3/23/2011 6:18:31 AM<br>
<hr/>
<a name=4></a><b>CertPrs8</b>/RHCE Red Hat Certifi ed Engineer Linux Study Guide (Exam N0-201)/Jang/176565-7/Chapter 1<br>
<b>4</b> Chapter 1: Prepare for Red Hat Hands-on Certifi cations<br> |
04-25-2014, 01:40 PM | #2 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
I don't see any repetition in your example.
In any event, PDF is a horrible format to convert from, although there are a few tricks you can use. It mostly boils down to regex though. See the sticky here: Read this before Posting PDF Questions. |
04-25-2014, 01:53 PM | #3 |
Well trained by Cats
Posts: 29,806
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
What you see are 'page' footers.
They are NOT identical, but similar: each one has a 'page number' and a page (generated?) timestamp encoded in it. IMHO, use the editor and some careful regex to clean up after a PDF conversion. |
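For illustration, here is a rough Python sketch of the kind of "careful regex" cleanup described above. It is built only from the markup sample in post #1, so the patterns are assumptions about what the rest of the book looks like, and the names `FOOTER`, `HEADER` and `strip_page_furniture` are made up for this example. Because the page number and timestamp change on every page, they are matched as patterns rather than as literal text.

```python
import re

# Footer block as it appears in the sample: a copyright line, then
# "NN-chNN.indd <page>", then a date/time stamp, each ending in <br>.
# The page number and timestamp differ per page, so they are wildcards.
FOOTER = re.compile(
    r"Copyright © \d{4} by The McGraw-Hill Companies<br>\s*"
    r"\d+-ch\d+\.indd \d+<br>\s*"
    r"\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M<br>\s*"
)

# Running header as it appears in the sample:
# "<b>CertPrs8</b>/...book title.../Chapter N<br>"
HEADER = re.compile(r"<b>CertPrs8</b>/[^<]*?/Chapter \d+<br>\s*")


def strip_page_furniture(html: str) -> str:
    """Remove the repeated page footers and running headers (hypothetical helper)."""
    html = FOOTER.sub("", html)
    return HEADER.sub("", html)


# The exact markup quoted in post #1.
sample = (
    "<br>\n"
    "Copyright © 2011 by The McGraw-Hill Companies<br>\n"
    "01-ch01.indd 3<br>\n"
    "3/23/2011 6:18:31 AM<br>\n"
    "<hr/>\n"
    "<a name=4></a><b>CertPrs8</b>/RHCE Red Hat Certifi ed Engineer Linux "
    "Study Guide (Exam N0-201)/Jang/176565-7/Chapter 1<br>\n"
    "<b>4</b> Chapter 1: Prepare for Red Hat Hands-on Certifi cations<br>\n"
)

print(strip_page_furniture(sample))
```

The same two patterns could probably be used in a regex search-and-replace inside an editor, with an empty replacement. Test them on a copy of the book first, since pages that deviate from the sample will not match.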
04-25-2014, 02:02 PM | #4 |
Junior Member
Posts: 2
Karma: 10
Join Date: Apr 2014
Device: htc rsvp
|
The book only came as a pdf, which is why I am stuck with the pdf format. I am currently reading the linked sticky on what to do with pdf conversion. I think @theducks is right, they look like footers.
|