Old 02-01-2011, 01:25 AM   #61
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
These have been replaced by the Search and Replace panel in conversion. If you want to delete text just specify a search and leave the replace box empty.
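As a rough illustration of the point above: the search expressions are Python-style regular expressions, and leaving the replace box empty behaves like substituting an empty string. The sketch below uses Python's standard `re` module outside Calibre. The sample text and the footer pattern are made up for illustration and are not taken from any real book.

```python
import re

# Hypothetical text with a footer fused onto the end of a line (made-up sample).
text = "the last words of the page. Page 12 of 300\nThe next page starts here."

# Hypothetical search expression describing the unwanted footer.
search = r"\s*Page \d+ of \d+"

# An empty replacement string deletes every match, which mirrors leaving the
# replace box empty in the conversion panel.
cleaned = re.sub(search, "", text)
print(cleaned)
```

Anything the search expression does not describe is left untouched, so the expression should be as specific as the footer allows.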
Old 02-01-2011, 06:10 PM   #62
CazMar
Book Geek
Posts: 596
Karma: 1499085
Join Date: Aug 2010
Location: Adelaide, Australia
Device: Kobo Touch, Asus MemPad 7" tablet, Nexus 5, Asus 10" tablet
Quote:
Originally Posted by ldolse View Post
These have been replaced by the Search and Replace panel in conversion. If you want to delete text just specify a search and leave the replace box empty.
Ok - thanks for the help.
Old 02-21-2011, 10:40 AM   #63
Luiz Braga
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2011
Device: Kindle
PDF headers and footers

I'm really a novice with Calibre, just starting out with my first two book conversions.
I am trying to convert PDF books to MOBI. The only problem is that the footer and page number are merged into the last line of each page in the MOBI output. I have not tried, or even understood from the replies above, how to remove those footers. Any help?
I see that if I convert from PDF to RTF I can of course eliminate these footers manually, but that is troublesome.
Old 02-21-2011, 10:48 AM   #64
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by Luiz Braga View Post
I'm really a novice with Calibre, just starting out with my first two book conversions.
I am trying to convert PDF books to MOBI. The only problem is that the footer and page number are merged into the last line of each page in the MOBI output. I have not tried, or even understood from the replies above, how to remove those footers. Any help?
I see that if I convert from PDF to RTF I can of course eliminate these footers manually, but that is troublesome.
You'll have to use the search & replace feature in the conversion settings; there's a brief tutorial available. Search and replace uses regular expressions to describe the text to replace. If you're not comfortable using those, there's a tutorial available on them as well, which I'd suggest you start with if needed.
Old 04-16-2011, 03:33 PM   #65
HornGs
Junior Member
Posts: 1
Karma: 10
Join Date: Apr 2011
Device: kindle
I've read the tutorial as well as this thread and I've found the information very useful. I just can't find out how to extend my selection. For example, we have:

file://something0%01something0%0 (page 23 of 9000) [April 99, 1903]

I just want to match file://something0%0 and extend my match to the end of the line. Or I could match of 9000) and extend back to the previous end of line.

Is there a simple way to do that?

Edit: let me clarify... I mean, how do I match to the end of the line in a PDF when there are no end-of-line tags?

I use file://.+br> when it's an html document.

Last edited by HornGs; 04-16-2011 at 05:12 PM.
Old 04-16-2011, 06:20 PM   #66
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Try something a little more specific, for example:
Code:
file://something0%01something0%0\s+\(page\s+\d+\s+of\s+9000\)\s+\[April\s+99,\s+1903\]
Old 04-16-2011, 09:11 PM   #67
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
This should work too:
Code:
file://.*?\]
Note that PDF input does have end-of-line tags, and file:// is already built into the PDF processing as an internal pattern. Admittedly, I just checked the code and it seems that two slashes aren't as common as three or four, so I've just tweaked the number it looks for.
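To compare the two suggestions side by side, here is a minimal sketch using Python's standard `re` module outside Calibre. The footer is the sample line posted above; the words around it are invented for illustration.

```python
import re

# Footer line quoted earlier in the thread; the surrounding words are made up.
footer = "file://something0%01something0%0 (page 23 of 9000) [April 99, 1903]"
text = "end of one page " + footer + " start of the next"

# The more specific pattern: spells out the whole footer, any page number.
specific = (r"file://something0%01something0%0\s+\(page\s+\d+\s+of\s+9000\)"
            r"\s+\[April\s+99,\s+1903\]")

# The shorter pattern: from "file://" up to the first "]" (non-greedy).
lazy = r"file://.*?\]"

print(re.sub(specific, "", text))
print(re.sub(lazy, "", text))
```

On this sample both patterns remove the same span. The shorter one is easier to type but would also remove any other text that starts with file:// and is followed by a closing bracket on the same line. By default `.` does not match a newline in Python, so it also assumes the footer sits on a single line.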
Old 09-01-2012, 04:00 PM   #68
miquele
Connoisseur
Posts: 75
Karma: 498122
Join Date: May 2010
Location: Europe
Device: Bookeen Cybook Gen3, Kindle 3, Kindle PW, Kindle Voyage
tricky regex

hello,

Line spacing is off after conversion, so I would like to remove
 <br>\n
(read: blank, bracket, newline)
but only if there is no dot in front of the blank; otherwise it should remain.
The regex identifying the correct places is
[a-z] <br>\n
but now, obviously, one letter too many is replaced. Can I get this character back through a variable to put into the Replacement Text line?
Otherwise, how can I tell Calibre to replace only when there is no dot in front of the matching regex?
Thanks for your help,
miquele
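
Both ideas in the question exist in Python's regular expression syntax: a capture group that is put back with `\1`, and a negative lookbehind that checks for the dot without consuming a character. The sketch below shows them with Python's standard `re` module on a made-up sample. Whether Calibre's Replacement Text box accepts `\1` in exactly this form is an assumption worth checking in the test wizard. The replacement also keeps a single space so the joined words do not run together, which is a choice made here, not something stated in the question.

```python
import re

# Made-up sample: one line broken mid-sentence, one line that ends with a dot.
text = "a line broken mid <br>\nsentence ends here. <br>\nnext paragraph"

# Option 1: capture the letter in a group and put it back with \1.
with_group = re.sub(r"([a-z]) <br>\n", r"\1 ", text)

# Option 2: negative lookbehind, which only checks that no dot precedes
# the blank and does not consume the preceding character.
with_lookbehind = re.sub(r"(?<!\.) <br>\n", " ", text)

print(with_group)
print(with_lookbehind)
```

The two options differ slightly: the capture group only fires after a lowercase letter, while the lookbehind fires after anything that is not a dot (digits, commas, uppercase letters).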
Old 08-19-2013, 01:04 AM   #69
HeyPretty
Fantasy Junkie
Posts: 5
Karma: 6750
Join Date: Aug 2013
Location: Seattle, WA
Device: Kindle3
Thumbs up

Quote:
Originally Posted by Confuzzled View Post
Yeah, I did use the test wizard, sorry if I wasn't clear... That's the oddity: no yellow even when I just put in a simple string, which according to the user manual should come up.

The code I played with is a modification of this code:
<b.*?>\s*Generated\s+by\s+ABC\s+Amber\s+LIT.*?</b> which, as far as I'm aware, should match. I came up with something to this effect but using <p> (paragraph) instead of <b> (bold). I wasn't sure of my defining structure though, so I took this structure from Kovid and then, when that didn't work, tried to edit it until it did.

I also tried removing <a>, i.e. the HTML link, but that didn't work either. Delphi was always my preference to Python.

My problem is this repeating code:
<p class="calibre3"><b class="calibre1">Generated by ABC Amber LIT Conv<a href="http://www.processtext.com/abclit.html" class="calibre2">erter, http://www.processtext.com/abclit.html</a></b></p>

Thanks so much
Your regex worked perfectly for me! Thanks so much for posting it. I'm so glad I searched fervently before I started trying to make my own.
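For anyone else who lands here, this is a minimal sketch of what the quoted expression does to the quoted HTML, run with Python's standard `re` module outside Calibre. It assumes the expression in the quote is the one meant above.

```python
import re

# The repeating line from the quoted post, split only for readability.
html = ('<p class="calibre3"><b class="calibre1">Generated by ABC Amber LIT '
        'Conv<a href="http://www.processtext.com/abclit.html" '
        'class="calibre2">erter, http://www.processtext.com/abclit.html</a></b></p>')

# The expression from the quoted post: from the opening <b ...> tag through
# the first closing </b>, with flexible whitespace between the words.
pattern = r"<b.*?>\s*Generated\s+by\s+ABC\s+Amber\s+LIT.*?</b>"

cleaned = re.sub(pattern, "", html)
print(cleaned)
```

On this sample the bold element and the link inside it are removed, and an empty paragraph element is left behind.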
Old 11-09-2013, 12:21 PM   #70
esc7
Member
Posts: 10
Karma: 6738
Join Date: Dec 2011
Device: Kindle Paperwhite
Footerlike block of text

Hello everyone!
I'm downloading web pages with the Mozilla Print Pages 2 PDF add-on, and when converting to MOBI with Calibre I'd like to remove the parts after the main body text (usually related posts, links, ads). These are of course different for every post and every category. The only pattern I noticed that occurs on every single page starts with "inShare" followed by a digit (1-9) and ends with the word "Albumclose", like here:

Last paragraph of text <br>
inShare2<br>
Related post1 <br>
Related ad1 <br>
Related post2<br>
Related post3<br>
Related ad2 <br>
Albumclose <br>
Is there any chance I can delete this whole block of text automatically in every post with Calibre's Search & Replace feature, or is that impossible? I looked at this part of the manual but it didn't really work. Any help is appreciated.

Last edited by esc7; 11-09-2013 at 12:48 PM.
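
A block like this can in principle be described by one expression that starts at the first marker, runs lazily across the lines, and stops at the second marker. The sketch below uses Python's standard `re` module on the sample from the post. The inline flags are an assumption about what is needed: `(?s)` lets `.` match newlines, and `(?i)` ignores case. Whether the converted text inside Calibre really looks like this sample would need checking in the test wizard.

```python
import re

# Sample block from the post above, one line per string.
text = (
    "Last paragraph of text <br>\n"
    "inShare2<br>\n"
    "Related post1 <br>\n"
    "Related ad1 <br>\n"
    "Related post2<br>\n"
    "Related post3<br>\n"
    "Related ad2 <br>\n"
    "Albumclose <br>\n"
)

# (?s): "." also matches newlines, so the match can span several lines.
# (?i): ignore case for the two marker words.
# .*? is non-greedy, so the match stops at the first "Albumclose".
pattern = r"(?si)inShare\d.*?Albumclose\s*<br>\n?"

cleaned = re.sub(pattern, "", text)
print(cleaned)
```

On this sample only the last paragraph of the body text remains.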