
Go Back   MobileRead Forums > E-Book Software > Calibre > Conversion

Old 01-27-2011, 10:00 AM   #1
user_none
Sigil & calibre developer
user_none ought to be getting tired of karma fortunes by now.
Posts: 2,436
Karma: 950001
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Increase search and replace?

I've noticed a lot of people using the new search and replace. I've also seen a lot of answers to questions being, "you can do this using search and replace."

Are 3 S&R fields enough, or should I add more?

Also, are there other changes to it you would like to see?
Old 01-27-2011, 10:38 AM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 25,924
Karma: 5036099
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Given that you can chain regexes with | I don't really see the need for more.
Old 01-27-2011, 05:12 PM   #3
duepixel
Junior Member
duepixel began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jan 2011
Device: kindle 3
Hello user_none. With search & replace it is now possible to have more control over the text.
I strongly suggest adding more fields, because three are not sufficient in some cases. It's true that searches can be chained with |, but you can't do the same with the replacement regexp!

Also, there is a bug in the regexp (maybe only in preview?): you can't use Start of string and end of string Anchors (http://www.regular-expressions.info/anchors.html)

It would be nice, in the near future, to have more control over the text by working directly on the HTML file produced from the converted PDF (as in Mobipocket Creator). Is that possible?

Thanks and congratulations.
Old 01-27-2011, 05:46 PM   #4
user_none
Sigil & calibre developer
user_none ought to be getting tired of karma fortunes by now.
Posts: 2,436
Karma: 950001
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by kovidgoyal
Given that you can chain regexes with | I don't really see the need for more.
You can chain the search regex but you can't chain the replacement text. Chaining would only be useful in a limited number of circumstances such as removing content entirely.

Quote:
Originally Posted by duepixel View Post
Also, there is a bug in the regexp (maybe only in preview?): you can't use Start of string and end of string Anchors (http://www.regular-expressions.info/anchors.html)
^ and $ work fine for start and end of string. Remember that the start of the string is the first character in the regex preview. You probably want ^ and $ to work on lines. Look at the (?iLmsux) flags section of the Python re syntax documentation to enable this behavior.
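
A minimal sketch of the difference in plain Python, assuming the S&R fields behave like the standard `re` module. The sample text is made up for illustration.

```python
import re

# Hypothetical sample: three lines, as they might appear in the regex preview.
text = "Chapter 1\nSome text\nChapter 2"

# Without flags, ^ matches only at the very start of the whole string.
plain = re.findall(r"^Chapter \d+", text)

# With the inline (?m) flag, ^ and $ also match at the start and end of each line.
multiline = re.findall(r"(?m)^Chapter \d+$", text)

print(plain)
print(multiline)
```

Putting `(?m)` at the start of the search expression is enough; no separate option needs to be set.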
Old 01-27-2011, 06:03 PM   #5
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 25,924
Karma: 5036099
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by user_none View Post
You can chain the search regex but you can't chain the replacement text. Chaining would only be useful in a limited number of circumstances such as removing content entirely.
Yeah but if you're doing large scale search replace you should really be using an editor.
Old 01-27-2011, 06:18 PM   #6
duepixel
Junior Member
duepixel began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jan 2011
Device: kindle 3
Quote:
Originally Posted by user_none View Post
^ and $ work fine for start and end of string. Remember that the start of the string is the first character in the regex preview. You probably want ^ and $ to work on lines. Look at the (?iLmsux) flags section of the Python re syntax documentation to enable this behavior.
ok, sorry.
Code:
(?m)^
works fine.

Please add more S&R items! At least 5 or 6, or a dynamic list. It's too good!

Quote:
Originally Posted by kovidgoyal View Post
Yeah but if you're doing large scale search replace you should really be using an editor.
If I have a PDF file, the only way to work on the text in HTML is S&R, I think.

Last edited by duepixel; 01-27-2011 at 06:27 PM.
Old 01-28-2011, 03:56 AM   #7
Manichean
Wizard
Manichean My eyes! My eyes! The light is just too bright!
Posts: 3,130
Karma: 80446
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by duepixel View Post
If I have a PDF file, the only way to work on the text in HTML is S&R, I think.
Convert to ePub and use Sigil.
Old 01-28-2011, 06:51 AM   #8
duepixel
Junior Member
duepixel began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jan 2011
Device: kindle 3
Quote:
Originally Posted by Manichean View Post
Convert to ePub and use Sigil.
I do not agree.
For me, more flexible management of S&R is necessary in calibre.

There are some situations where the conversion PDF -> HTML -> EPUB loses formatting.
Example:

I have a text like this (converted by calibre's pdftohtml engine):
__________________________
Code:
a bit 'of sticky stuff. I spent the index and I approached him on the nose. <br>
<hr>
<A name=39> </a> tomato sauce. <br>
calibre EPUB conversion:
__________________________
Code:
<p class="calibre2"> bit 'of sticky stuff. I spent the index and I approached him on the nose </p>
<p class="calibre2"/>
<p class="calibre2"> tomato sauce. </p>
In this case (when I load the EPUB's HTML in Sigil) I cannot tell whether the empty paragraph is an intended line break or a wrong interpretation of <hr>.

With S&R I can create a regex like this:
<br>\s<hr>\s<A name=\d+> </a>
and replace it with nothing.

Another example is un-wrapping:
Code:
The hottest summer of the century.<br>
Four homes lost in the corn. The major are plug-<br>
ged into the house. Six children on their bicycles<br>
epub:
Code:
<p class="calibre2">The hottest summer of the century.
Four homes lost in the corn. The major are plug-ged into the house. Six children on their bicycles</p>
I can't remove the character "-" in Sigil, because it is used legitimately in other places (e.g. mercedes-benz)...

With S&R I can create a regex:
([^\s]\-<br>)|([^\s]\-<br>\s*)
and replace with an empty string.

Is that wrong?
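
A quick check of that pattern in plain Python, assuming S&R follows the standard `re` module. The sample string is modelled on the wrapped lines above, and the lookbehind variant is only one possible alternative, not something calibre prescribes.

```python
import re

# Made-up sample modelled on the wrapped lines quoted above.
html = "The major are plug-<br>\nged into the house."

# Pattern from the post: [^\s] consumes the letter before the hyphen, and the
# first alternative always wins, so the trailing \s* is never used.
original = re.sub(r"([^\s]\-<br>)|([^\s]\-<br>\s*)", "", html)

# Variant: a lookbehind checks for a non-space character without consuming it,
# and \s* also swallows the line break that follows <br>.
variant = re.sub(r"(?<=\S)-<br>\s*", "", html)

print(repr(original))
print(repr(variant))
```

The variant joins the two halves of the word cleanly, but it still cannot tell a wrapped word from a genuine compound that happens to break at its hyphen.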
Old 01-28-2011, 07:35 AM   #9
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
 
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Hyphenated words are de-hyphenated by using your document as a dictionary. However if the word only occurs once in the book the hyphen won't be removed. At a guess, I think the word 'plugged' only occurs a single time in your book.

You can't write a reliable regex to fix that, as you noted with the Mercedes-Benz example - imagine if the line wrapped on Mercedes-<linebreak>Benz - you can't delete that hyphen. This sort of thing really is far easier to fix in Sigil. Use Calibre Search and Replace for repeating occurrences throughout the book, use Sigil to clean up the one-off items after conversion.
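
The idea of using the document itself as a dictionary can be sketched roughly as follows. This is an illustration in plain Python, not calibre's actual implementation; the function name `dehyphenate` and the sample text are made up.

```python
import re

def dehyphenate(text):
    """Join words split across a line break only if the joined form
    appears elsewhere in the same text (a rough sketch of the idea)."""
    # Build the "dictionary" from words that are not split across lines.
    unsplit = re.sub(r"\w+-\n\w+", " ", text)
    known = {w.lower() for w in re.findall(r"\w+", unsplit)}

    def join(match):
        head, tail = match.group(1), match.group(2)
        if (head + tail).lower() in known:
            return head + tail        # seen elsewhere: drop the hyphen
        return head + "-" + tail      # unknown: keep the hyphen, drop the break

    return re.sub(r"(\w+)-\n(\w+)", join, text)

sample = "They plugged it in. Later it was plug-\nged again, next to a Mercedes-\nBenz."
result = dehyphenate(sample)
print(result)
```

Because "plugged" appears elsewhere in the sample the hyphen is dropped, while "Mercedes-Benz" keeps its hyphen. A word that occurs only once, at a line break, would also keep its hyphen, which matches the limitation described above.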
Old 01-28-2011, 07:35 AM   #10
Manichean
Wizard
Manichean My eyes! My eyes! The light is just too bright!
Posts: 3,130
Karma: 80446
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
You need to keep in mind, though, that Calibre is not designed for "working on text", as you say- it's designed to convert and manage books.
That said, I'm a believer in the Zero-One-Infinity rule of software design, as long as the interface isn't too cluttered- so if you can figure out a way to make a potentially unlimited list, that'd be the best solution, if not, I don't think it'll be much difference if we have three fields as opposed to six or however many you'd put in- there'll always be people begging for more.
Old 01-28-2011, 08:16 AM   #11
cybmole
Wizard
cybmole ought to be getting tired of karma fortunes by now.
 
Posts: 2,793
Karma: 1089170
Join Date: Sep 2010
Device: Kobo aura HD, Kobo Arc, Kindle Fire HDX 8.9 , Kindle for PC
Quote:
Originally Posted by user_none View Post
I've noticed a lot of people using the new search and replace. I've also seen a lot of answers to questions being, "you can do this using search and replace."

Are 3 S&R fields enough, or should I add more?

Also, are there other changes to it you would like to see?
I had a need for 4 recently - it was a PDF anthology of 3 short stories, so there were 3 types of title+author to remove, plus the page numbering.

It is much easier to write several simple S&Rs than to attempt one complex fix-it-in-one.

So I'd favour a drop-down option to keep adding more, as needed.

PS I did not realise that searches could be chained - is that in the instructions?

Last edited by cybmole; 01-28-2011 at 08:33 AM.
Old 01-28-2011, 09:45 AM   #12
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
 
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by cybmole View Post
PS I did not realise that searches could be chained - is that in the instructions ???
Manichean covered this in his guide, you use () and | , e.g.:

(expression1|expression2|expression3|etc)
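
As a hedged illustration, here is how such a chained expression behaves in plain Python, assuming S&R follows the standard `re` module. The headers and page numbers are invented to resemble the anthology case mentioned earlier.

```python
import re

# Hypothetical running headers and page numbers from a multi-story anthology.
text = (
    "First Story by A. Author\n"
    "Some body text.\n"
    "Page 12\n"
    "Second Story by B. Writer\n"
    "More body text.\n"
)

# One search expression built from alternatives joined with |.
# (?m) makes ^ and $ match at line boundaries.
pattern = r"(?m)^(First Story by A\. Author|Second Story by B\. Writer|Page \d+)$\n?"

cleaned = re.sub(pattern, "", text)
print(cleaned)
```

All alternatives share the single replacement string, which is why chaining works well for removal but not when each match needs a different replacement.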
Old 01-28-2011, 09:58 AM   #13
duepixel
Junior Member
duepixel began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jan 2011
Device: kindle 3
Quote:
Originally Posted by Manichean View Post
You need to keep in mind, though, that Calibre is not designed for "working on text", as you say- it's designed to convert and manage books.
That said, I'm a believer in the Zero-One-Infinity rule of software design, as long as the interface isn't too cluttered- so if you can figure out a way to make a potentially unlimited list, that'd be the best solution, if not, I don't think it'll be much difference if we have three fields as opposed to six or however many you'd put in- there'll always be people begging for more.
It's true, calibre was not designed to work on text, at least before the introduction of S&R.

Until version 0.7.40 I was often forced to use Mobipocket to convert, and then re-import the result into calibre to optimize the text (unwrapping, paragraphs ...) and to manage it (metadata, cover, database, upload to device).

With version 0.7.42 (S&R) I dropped the other tools, and it takes less time to do this.
My goal is to read a fairly well-formatted e-book on my Kindle 3; I will not waste time rewriting the e-book with Sigil!
@ldolse: it does not matter if the text contains some small errors not caught by a regexp.
I think this is the goal of most people using calibre!

However, I think a solution with a dynamic list for managing S&R will come to the developers' minds. You can simply add an "Add" button that displays a new pair of Search / Replace fields...
Easy.
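
A rough sketch of what such a dynamic list amounts to underneath: an ordered list of search/replace pairs applied one after another. The names `Rule` and `apply_rules` and the sample rules are hypothetical and are not taken from calibre's code.

```python
import re
from typing import List, Tuple

# Hypothetical rule list: each entry is one Search / Replace pair.
Rule = Tuple[str, str]

def apply_rules(text: str, rules: List[Rule]) -> str:
    """Apply each (search, replace) pair in order to the whole text."""
    for search, replace in rules:
        text = re.sub(search, replace, text)
    return text

rules: List[Rule] = [
    (r"(?m)^Page \d+\n", ""),    # remove page-number lines
    (r"<br>\s*<hr>\s*", " "),    # replace a page-break marker with a space
    (r"\s{2,}", " "),            # collapse runs of whitespace
]

# "Adding a field" is then just appending another pair to the list.
rules.append((r"colour", "color"))

sample = "Page 3\nThe  colour of it.<br>\n<hr>\nNext page."
result = apply_rules(sample, rules)
print(result)
```

Unlike a single expression chained with |, each pair here carries its own replacement text, and the order of the list decides which rule sees the text first.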

Last edited by duepixel; 01-28-2011 at 10:01 AM.
Old 02-09-2011, 02:58 PM   #14
millhaus
Junior Member
millhaus began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Feb 2011
Device: sony 600
Hey,

I am also begging for more S&R fields.
My PRS-600 doesn't properly show all the Czech letters, so this would be the simplest way to go (I don't really want to flash my firmware).
Or is there any other way to replace ° ý Ŕ ¨ ´ ˛ with ř ě č ů ď ň, not only in PDB books but also in news downloaded from the internet?

Thanks
Old 02-09-2011, 03:08 PM   #15
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 25,924
Karma: 5036099
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Use the transliterate unicode characters option in the look and feel section under conversion settings.
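
For a rough idea of what transliteration to plain ASCII does to Czech letters, here is a sketch using only Python's standard library. It is an approximation for illustration, not calibre's implementation, and the function name `strip_diacritics` is made up.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

result = strip_diacritics("ř ě č ů ď ň")
print(result)
```

This only covers letters that decompose into a base letter plus an accent; a full transliteration also has to handle characters with no such decomposition.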

All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.