Old 01-22-2011, 05:17 AM   #1
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
new search & replace - great, but couple of suggestions

I did a pdf to epub and was easily able to strip header & footer ( including page numbers) with the new feature & wizards.

but I'd like to be able to toggle S&R so that when I now do e.g. epub to mobi, the S&R is not reapplied. I can opt for reset to defaults on search replace but then I lose my carefully constructed expressions. I "should" not need them again but keeping them would preserve my "start over" path if I screw up the epub, later.

so an option to disable, but preserve, field contents would be good.

& keep the source (pdf) used by the S&R wizard in memory after constructing the 1st expression, so that there is no delay when it is needed again for the 2nd expression

AFAIK, conversion options are only stored once per book, not by source type within book ( which would be better ) ?

Last edited by cybmole; 01-22-2011 at 05:34 AM.
Old 01-22-2011, 10:23 AM   #2
kovidgoyal
creator of calibre
Posts: 43,843
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
your regexes are unlikely to match both pdf input and EPUB input, so just leave them in.
Old 01-22-2011, 11:23 AM   #3
cybmole
Wizard
Quote:
Originally Posted by kovidgoyal View Post
your regexes are unlikely to match both pdf input and EPUB input, so just leave them in.
can do - they do not seem to cause any slowdown.

a greater slowdown is waiting for the pdftohtml subprocess to run 3 times in order to set up 3 regex filters - if its output could be held in memory to reuse for filters 2 & 3 ... ?

I have to say again that in general this is very slick & MUCH better than the previous header / footer options. I now feel confident that I can filter pretty much any PDF header / footer setup.
Old 01-22-2011, 11:44 AM   #4
kovidgoyal
creator of calibre
I'll leave that one up to user_none, the regex wizard is his baby.
Old 01-22-2011, 04:08 PM   #5
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
In most cases the search and replace regex will not match between different inputs. Usually the regex used is also not complex enough to see any slow down by having it run without finding any matches.

pdftohtml only runs once per PDF input. Each search and replace expression is run over the html produced by pdftohtml. It actually runs over the html produced by any input plugin. Each expression is run once per page. In the case of PDF input only one page is generated, so it is only run once. In the case of an EPUB book (or an input plugin that creates multiple pages within the OEB) there is no way to avoid running the search and replace multiple times. It has to run over each independent page.

If you're seeing the search and replace run multiple times, it's because the document has multiple pages and it has to be run over each page. When I say page I'm talking about the independent xhtml files found within an OEB (EPUB).
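As a rough illustration of the per-page behaviour described above, here is a minimal Python sketch. It is not calibre's actual code: the function name `apply_search_replace`, the page dictionaries, and the sample "Page N" footer rule are all assumptions made up for the example. It only shows that every expression is applied once to each page, so a single-page PDF result gets one pass while a multi-file EPUB gets one pass per file.

```python
import re


def apply_search_replace(pages, rules):
    """Apply every (pattern, replacement) rule once to each page.

    `pages` maps a hypothetical file name to its html text;
    `rules` is a list of (regex pattern, replacement) pairs.
    """
    compiled = [(re.compile(pattern), repl) for pattern, repl in rules]
    result = {}
    for name, html in pages.items():
        for pattern, repl in compiled:
            html = pattern.sub(repl, html)
        result[name] = html
    return result


# Hypothetical rule: strip a "Page N" footer paragraph.
rules = [(r"<p>Page \d+</p>\s*", "")]

# PDF input: one page of html, so each rule runs once.
pdf_pages = {"index.html": "<p>Chapter 1</p>\n<p>Page 1</p>\n<p>Text</p>"}

# EPUB input: several independent xhtml files, so each rule runs per file.
epub_pages = {
    "ch1.xhtml": "<p>One</p>",
    "ch2.xhtml": "<p>Two</p><p>Page 2</p>",
}

cleaned_pdf = apply_search_replace(pdf_pages, rules)
cleaned_epub = apply_search_replace(epub_pages, rules)
```

A rule that finds nothing on a page simply leaves that page unchanged, which is why leaving unused expressions in place is usually harmless.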

Also, there is a bug I'm aware of with the regex input field. It should be saving all previously used regexes in the drop down. It's not, and I plan to look into why. Once that's fixed you can put your expression in, save, then remove it, and later find it in the list to re-enable it.
Old 01-23-2011, 03:13 AM   #6
cybmole
Wizard
thanks for response - maybe I was not totally clear about the pdftohtml bit.

I am at the stage of setting conversion fields for pdf to epub

1. I take a large pdf and ask the wizard to help me create S&R 1 - to do that it runs pdftohtml and takes a few seconds.
2. I am happy with that, so I ask the wizard to now help me create S&R 2. This is where it seems to go back to the beginning and spend many seconds preparing the PDF for wizard use (even though it's done it once already in step 1). In task manager I see that pdftohtml is running & using lots of CPU during these seconds.

what I am suggesting is to keep the file used for wizard / testing from step 1 so that it does not have to be re-created for step 2 of the setup test. I am not talking about the actual conversion to epub. does that make sense?

ps on saving previous S&R into a drop down - will that then work across different books - as some expressions may be re-usable elsewhere? Ideally it would be good to accumulate a user's mini library of expressions somewhere, or to have a way of sharing them on this forum

Last edited by cybmole; 01-23-2011 at 03:16 AM.
Old 01-23-2011, 01:16 PM   #7
user_none
Sigil & calibre developer
Quote:
Originally Posted by cybmole View Post
I am at the stage of setting conversion fields for pdf to epub

1. I take a large pdf and ask the wizard to help me create S&R 1 - to do that it runs pdftohtml and takes a few seconds.
2. I am happy with that, so I ask the wizard to now help me create S&R 2. This is where it seems to go back to the beginning and spend many seconds preparing the PDF for wizard use (even though it's done it once already in step 1). In task manager I see that pdftohtml is running & using lots of CPU during these seconds.

what I am suggesting is to keep the file used for wizard / testing from step 1 so that it does not have to be re-created for step 2 of the setup test. I am not talking about the actual conversion to epub. does that make sense?
I understand what you're asking for now. Unfortunately, the design of the components does not allow for this to be easily achieved. Open a ticket for this enhancement so it doesn't get lost, but I can't guarantee when, if ever, it gets implemented.

Quote:
Originally Posted by cybmole View Post
ps on saving previous S&R into a drop down - will that then work across different books - as some expressions may be re-usable elsewhere? Ideally it would be good to accumulate a user's mini library of expressions somewhere, or to have a way of sharing them on this forum
Settings are saved on a per book basis when set in a conversion of a specific book. If you set a setting in Preferences it's global. So you could create a few that work across books there and just disable them when necessary on a per book basis. The per book settings will remember you've disabled it for subsequent conversions.
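One way to picture the global versus per-book relationship described above is the following Python sketch. It is purely illustrative and does not reflect how calibre actually stores conversion settings: the function `effective_rules`, the `disabled` and `extra` keys, and the rule names are all hypothetical.

```python
def effective_rules(global_rules, book_settings):
    """Combine global S&R rules with one book's saved settings.

    Global rules the book has disabled are skipped; rules saved only
    for this book are appended. All field names are hypothetical.
    """
    disabled = set(book_settings.get("disabled", []))
    rules = [rule for rule in global_rules if rule["name"] not in disabled]
    rules.extend(book_settings.get("extra", []))
    return rules


# Hypothetical global rules, as if set once in Preferences.
global_rules = [
    {"name": "strip-page-numbers", "search": r"<p>\d+</p>", "replace": ""},
    {"name": "strip-site-footer", "search": r"<p>www\.example\.com</p>", "replace": ""},
]

# Hypothetical per-book settings: one global rule disabled, one local rule added.
book_settings = {
    "disabled": ["strip-site-footer"],
    "extra": [{"name": "strip-title-header", "search": r"<p>My Book</p>", "replace": ""}],
}

active = effective_rules(global_rules, book_settings)
active_names = [rule["name"] for rule in active]
```

Because the disabled list lives with the book, a later conversion of the same book would skip that rule again without touching the global list.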

However, it sounds more like you want a global settings scratch pad where you can see all stored settings and probably what book they were stored for... If you have some idea of how it would work open a ticket for it too. Someone might like the idea (or be bored) and implement it.
Old 01-23-2011, 04:07 PM   #8
cybmole
Wizard
I keep a personal scratch pad in Notepad++
once I get slicker with regex then I may not need it, but for now it is handy

I am not too fussed about the pdftohtml issue - I happened to notice the delay on one book but I have cleared my backlog of pdf converts for now & have no more planned.

so I don't need either feature badly enough to take up a slot on the to-do lists, unless others want them also.
Old 01-24-2011, 07:01 AM   #9
user_none
Sigil & calibre developer
I've added support for caching the document in the wizard. It converts once and uses the result across all regex wizards in search and replace. It wasn't as invasive a change as I thought it would be.
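The general idea of converting once and reusing the result can be sketched in Python as below. This is an assumption-laden illustration, not the change that was actually made in calibre: `WizardDocumentCache`, `fake_convert`, and the file name are invented for the example, and the fake converter stands in for the slow pdftohtml step.

```python
class WizardDocumentCache:
    """Cache converted documents so a slow conversion runs once per file."""

    def __init__(self, convert):
        self._convert = convert  # callable: path -> html text
        self._cache = {}

    def get(self, path):
        # Convert only on the first request; reuse the stored html afterwards.
        if path not in self._cache:
            self._cache[path] = self._convert(path)
        return self._cache[path]


conversion_calls = []


def fake_convert(path):
    """Stand-in for the expensive conversion; records each call."""
    conversion_calls.append(path)
    return "<html><body>contents of " + path + "</body></html>"


cache = WizardDocumentCache(fake_convert)

# Three wizard sessions (one per expression) ask for the same document.
first = cache.get("book.pdf")
second = cache.get("book.pdf")
third = cache.get("book.pdf")
```

With a cache like this, opening the wizard for the second and third expressions would return the stored html immediately instead of launching the conversion again.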
All times are GMT -4.