10-31-2011, 07:17 AM   #1
pureambient
on twitter: @pureambient
Posts: 13
Karma: 10
Join Date: May 2011
Location: central scotland
Device: Kindle
PDF to MOBI Conversion Questions

Hello Forum

I've spent many hours so far trying to learn how to convert PDFs to MOBI effectively.

I've been using Microsoft Access since 1997, so I understand in principle what a regular expression is, and how it works within the FIND & REPLACE window of calibre.

I have learned some of the shortcuts and am learning some of the syntax needed to write effective expressions.


But - I am struggling.

I sat down to write a comprehensive process for myself, which starts with Metadata clean up and then moves onto Conversion.


I took a PDF with only a small number of irritants and, applying all the knowledge I've gained over the past weeks, wrote three expressions that produced a PERFECT book: completely clean, all rubbish removed. That was an EASY book, however.


Then I got a difficult book, one with horrific advertising, "click here to buy" links, logos, and tons of absolute rubbish STREWN through the PDF file.

I began the same process, trying to identify target strings, and run conversions. And this is where the trouble begins. I have yet to succeed with this book, for a number of reasons.

1) What are you supposed to do if you cannot "fix" all of a book's problems with JUST THREE expressions?

I ASSUMED that you would load the first 3 expressions and convert to MOBI.


Then, for the NEXT 3 expressions, you would select the converted MOBI (so you are STARTING with the book that you have PARTIALLY fixed, NOT the original PDF) and run them against that.


But I ran into problems: after set three (expressions 7, 8, 9) I noticed that the problems fixed by expressions 1, 2, 3 were BACK, so somehow I was NOT converting the converted MOBI, but maybe the original PDF?
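
To make the suspected cause concrete, here is a minimal Python sketch run outside calibre. It is an illustration of the idea only, not a description of what calibre does internally; the patterns, the sample text, and the function names are all made up for the example. It contrasts batches that each start from the previous output with batches that each start again from the original text.

```python
import re

# Hypothetical batches of (pattern, replacement) pairs, three per batch,
# mirroring the three-expression limit described above. The patterns are
# illustrative and not taken from any real book.
BATCHES = [
    [(r"(?i)click here to buy", ""), (r"(?m)^Page \d+ of \d+$", ""), (r"\[logo\]", "")],
    [(r"\n{3,}", "\n\n"), (r"[ \t]{2,}", " "), (r"(?m)[ \t]+$", "")],
]

def apply_batch(text, batch):
    """Apply every expression in one batch, in order, to the running text."""
    for pattern, replacement in batch:
        text = re.sub(pattern, replacement, text)
    return text

def chained(original, batches):
    """Each batch starts from the output of the previous batch."""
    text = original
    for batch in batches:
        text = apply_batch(text, batch)
    return text

def restarted(original, batches):
    """Each batch starts again from the original, so only the last batch's fixes survive."""
    text = original
    for batch in batches:
        text = apply_batch(original, batch)
    return text

sample = "Chapter 1\nCLICK HERE TO BUY\nPage 3 of 90\n\n\n\nIt was  a dark night."
print(repr(chained(sample, BATCHES)))
print(repr(restarted(sample, BATCHES)))
```

In the restarted version the advertising line is still present in the result, which matches the symptom of earlier fixes coming "back".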


So the question is: What do I do, what is the EXACT PROCESS, when I want to run more than 3 expressions against a PDF?




The other question is:

2) Can you "save" a set of expressions to run against other books?

The reason I want to do that is I want to work out what my 12 expressions are for one book, then find all books with similar problems, then BULK convert ALL of them using this one set of 12 master expressions, if you see what I mean.

Once developed (so far I have failed, but I will get there), I would like to store that set permanently, because I might run across the same advertisements or whatever in future.

My workaround so far is to store all expressions in a text document of known good expressions (and known bad ones, too, to learn from - what NOT to do).
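
As a rough sketch of that text-document workaround, the set of expressions could be kept in a JSON file and reloaded later. This uses only Python's standard library and runs outside calibre; the file name, the patterns, and the helper names are hypothetical, and it does not show any calibre feature.

```python
import json
import re
import tempfile
from pathlib import Path

def save_expressions(path, expressions):
    """Write the (pattern, replacement) pairs to a JSON text file."""
    Path(path).write_text(json.dumps(expressions, indent=2), encoding="utf-8")

def load_expressions(path):
    """Read the pairs back; JSON stores them as lists, so convert to tuples."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return [tuple(pair) for pair in data]

def clean(text, expressions):
    """Apply every stored expression, in order, to one book's text."""
    for pattern, replacement in expressions:
        text = re.sub(pattern, replacement, text)
    return text

# Hypothetical master set and sample texts, for illustration only.
MASTER = [(r"(?i)\s*click here to buy", ""), (r"[ \t]{2,}", " ")]
books = ["Intro  text CLICK HERE TO BUY", "Other  book"]

with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "master_expressions.json"
    save_expressions(store, MASTER)
    reloaded = load_expressions(store)
    results = [clean(book, reloaded) for book in books]

print(results)
```

Keeping the pairs in an ordered list matters, because later expressions may depend on what earlier ones have already removed.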


Please let me know what to do when you need MORE than the 3 expressions.


Thanks!

dave