Old 03-29-2010, 06:46 AM   #16
JayLaFunk
Connoisseur
Posts: 94
Karma: 538
Join Date: Nov 2009
Device: iPad
It would be good to have a sticky with all your regexes, detailing exactly what each one does, plus a simple guide on where to put them, something like what they have over at MP3Tag...

Now, can someone make one for removing all unwanted white space? I have a few ebooks where every line has a blank line beneath it. I have tried a few ways of getting rid of them to make the epub more compact, but no joy so far.

Jay
Old 03-29-2010, 07:34 PM   #17
charleski
Wizard
Posts: 1,196
Karma: 1281258
Join Date: Sep 2009
Device: PRS-505
Quote:
Originally Posted by JayLaFunk View Post
I have a few ebooks where every line has a blank line beneath it. I have tried a few ways of getting rid of them to make the epub more compact, but no joy so far.

Jay
That sounds like a problem with the CSS rather than the body text.
Old 03-30-2010, 06:54 AM   #18
JayLaFunk
Quote:
Originally Posted by charleski View Post
That sounds like a problem with the css rather than the body text.
Sorry for appearing dumb about these things, but do you mean the stylesheet.css?

I loaded a correct epub in Sigil 0.2 and copied its stylesheet.css into Notepad, then did the same with one of my epubs that has the white space. I compared the two, but they look similar.

Could someone post a stylesheet.css showing how it should look? Then I could learn from that.

Jay
Old 03-30-2010, 08:37 PM   #19
Sabardeyn
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
Jay,

The amount of white space between paragraphs is generally called "paragraph spacing" (leading, strictly speaking, is the spacing between lines). As such, it should be handled within the normal text definition of your ebook's CSS stylesheet. It has nothing to do with regex at all.

Your ebook should have a stylesheet included within it; alternatively, the styles sit at the top of any file that contains the body text of the book. In an epub, they could be in every chapter file. I don't know the exact property, but I believe this is one setting that accepts em, px or % measurements (among others). You might have to play with the values provided, changing the numbers enough to cause an obvious change to the text (add 20 to whatever number is there, see what the outcome is, etc.).
Old 03-31-2010, 01:34 AM   #20
paulpeer
Zealot
Posts: 147
Karma: 56
Join Date: Dec 2009
Location: Antwerpen
Device: iPhone, Sony PRS-505, EPUBreader
Quote:
Originally Posted by JayLaFunk View Post
Could someone post a stylesheet.css showing how it should look? Then I could learn from that.

Jay
Can you find something like
Code:
line-height: .3em;
in this CSS?

You can also look in the XHTML files. Is there a
Code:
<br />
after every paragraph?

If so, just remove them.
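
As a rough illustration of that cleanup outside Sigil, here is a minimal Python sketch. It assumes the stray breaks always come directly after a closing </p> tag; the function name `strip_breaks_after_paragraphs` and the sample markup are made up for this example and are not taken from any real book.

```python
import re

# Matches a closing </p>, then one or more <br>, <br/> or <br /> tags,
# each optionally preceded by whitespace (including newlines).
BREAKS_AFTER_P = re.compile(r"(</p>)(?:\s*<br\s*/?>)+", re.IGNORECASE)


def strip_breaks_after_paragraphs(xhtml: str) -> str:
    """Drop <br /> tags that directly follow a closing </p> tag."""
    return BREAKS_AFTER_P.sub(r"\1", xhtml)


# Hypothetical sample markup, only for demonstration.
sample = "<p>First paragraph.</p>\n<br />\n<p>Second paragraph.</p><br/>"
cleaned = strip_breaks_after_paragraphs(sample)
print(cleaned)
```

Breaks inside a paragraph are left alone, since the pattern only fires after </p>. Try it on a copy of one chapter first.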
Old 03-31-2010, 07:19 AM   #21
JayLaFunk
Quote:
Originally Posted by paulpeer View Post
Can you find something like
Code:
line-height: .3em;
in this CSS?

You can also look in the XHTML files. Is there a
Code:
<br />
after every paragraph?

If so, just remove them.
Hi

This is an epub with a blank line under each line. This is the stylesheet.css from Sigil 0.20:


@namespace h "http://www.w3.org/1999/xhtml";
.calibre {
display: block;
font-size: 1em;
margin-bottom: 0;
margin-left: 5pt;
margin-right: 5pt;
margin-top: 0;
page-break-before: always;
text-align: justify
}
.calibre1 {
border-bottom: 0;
border-top: 0;
display: block;
margin-bottom: 0;
margin-left: 0;
margin-right: 0;
margin-top: 0;
padding-bottom: 0;
padding-top: 0;
text-align: center;
text-indent: 1.5em
}
.calibre2 {
height: auto;
width: auto
}
.calibre3 {
display: block;
font-family: monospace;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
margin-top: 1em;
white-space: pre
}
a {
color: inherit;
text-decoration: inherit;
cursor: default
}
a[href] {
color: blue;
text-decoration: underline;
cursor: pointer
}

This is the code view of the same epub from Sigil 0.19:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<style type="text/css">
/*<![CDATA[*/
@namespace h "http://www.w3.org/1999/xhtml";
.calibre {
display: block;
font-size: 1em;
margin-bottom: 0;
margin-left: 5pt;
margin-right: 5pt;
margin-top: 0;
page-break-before: always;
text-align: justify
}
.calibre1 {
border-bottom: 0;
border-top: 0;
display: block;
margin-bottom: 0;
margin-left: 0;
margin-right: 0;
margin-top: 0;
padding-bottom: 0;
padding-top: 0;
text-align: center;
text-indent: 1.5em
}
.calibre2 {
height: auto;
width: auto
}
.calibre3 {
display: block;
font-family: monospace;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
margin-top: 1em;
white-space: pre
}
a {
color: inherit;
text-decoration: inherit;
cursor: default
}
a[href] {
color: blue;
text-decoration: underline;
cursor: pointer
}

img.sgc-1 {height: 100%}

/*SG DO NOT MODIFY.
This style is used by Sigil.
It will be removed on export
along with the "sigilChapterBreak" HR tags. SG*/
hr.sigilChapterBreak {
border: none 0;
border-top: 3px double #c00;
height: 3px;
clear: both;
}
/*]]>*/
</style>
</head>

<body>
<div><img alt="cover" class="sgc-1" src="../images/img0002.jpg" /></div>

<div>
<hr class="sigilChapterBreak" />
</div>

<p class="calibre1"><img class="calibre2" src="../images/img0001.jpg" /></p>

<div>
<hr class="sigilChapterBreak" />
</div>
<pre class="calibre3">


Jay
Old 03-31-2010, 07:59 AM   #22
paulpeer
Quote:
Originally Posted by JayLaFunk View Post

.calibre3 {
display: block;
font-family: monospace;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
margin-top: 1em;
white-space: pre
}
Remove the line "white-space: pre" and see what happens.
If this is not enough, change at least the first occurrences of
Code:
<pre class="calibre3">bla bla </pre>
into
Code:
<p>bla bla </p>

If you use "pre", all line breaks in the source code are rendered literally. Is that what you want?

Last edited by paulpeer; 03-31-2010 at 08:05 AM. Reason: some stuff added
Old 03-31-2010, 08:13 AM   #23
JayLaFunk
Quote:
Originally Posted by paulpeer View Post
Remove the line "white-space: pre" and see what happens. If you use "pre" all line breaks of the source code are literally followed. Is that what you want to occur?
Hi,

Really appreciate your help,

Using Sigil 0.19 in split view, I did as you said and took out the line "white-space: pre", but nothing changed. Still the same...

Could I send you the epub to look at in your Sigil?

Jay
Old 03-31-2010, 08:23 AM   #24
paulpeer
Quote:
Originally Posted by JayLaFunk View Post
Using Sigil 0.19 in split view, I did as you said and took out the line "white-space: pre", but nothing changed. Still the same...
That probably means there are <pre> tags in the XHTML as well. So change <pre class="calibre3"> to <p> and </pre> to </p> in the XHTML.

Changing one or two of them is enough to test. If it works, change all of them.
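
For anyone who prefers to script the change, the tag swap can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the blocks are written exactly as <pre class="calibre3">...</pre>, and the names `PRE_BLOCK` and `pre_to_p` are invented for this example.

```python
import re

# Pairs each opening <pre class="calibre3"> with its own closing </pre>.
# re.DOTALL lets the captured text span several lines.
PRE_BLOCK = re.compile(r'<pre class="calibre3">(.*?)</pre>', re.DOTALL)


def pre_to_p(xhtml: str) -> str:
    """Rewrite <pre class="calibre3"> blocks as plain <p> paragraphs."""
    return PRE_BLOCK.sub(r"<p>\1</p>", xhtml)


# Hypothetical sample: the second block has no class and is left untouched.
sample = '<pre class="calibre3">bla bla </pre>\n<pre>keep me</pre>'
converted = pre_to_p(sample)
print(converted)
```

Matching the open and close tags as a pair avoids touching any other </pre> in the file. Once the text sits in a <p>, the reader collapses the line breaks inside it, so a long block may need to be split into several paragraphs by hand.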
Old 03-31-2010, 09:36 AM   #25
JayLaFunk
Quote:
Originally Posted by paulpeer View Post
That probably means there are <pre> tags in the XHTML as well. So change <pre class="calibre3"> to <p> and </pre> to </p> in the XHTML.

Changing one or two of them is enough to test. If it works, change all of them.
Cheers Paulpeer,

That did the trick, will save that for future use

Many thanks for your help

Jay
Old 03-31-2010, 10:12 AM   #26
paulpeer
Quote:
Originally Posted by JayLaFunk View Post
Cheers Paulpeer,

That did the trick, will save that for future use

Many thanks for your help

Jay
I'm glad I could help, Jay. But I recommend that you read an XHTML tutorial. Self-made ePubs are full of such pitfalls...
Old 01-06-2013, 05:41 AM   #27
snipe2004
Junior Member
Posts: 8
Karma: 10
Join Date: Jan 2013
Device: Pc - Linux - Calibre
Hi guys,

I'm desperately looking for a regex that would automatically search my ebooks' titles for the string "2012" (for example) and put it in the pubdate field. Once that is done, I would also like to fill in some other fields the same way, but the pubdate is the hardest for me.

Let me show you an example:

The PDF file is: TitleOfTheSerie Volume1 Number1 (out of 3) (2012)
I would like to use the Search & Replace function of Calibre to extract this information and put it in the correct place.
So: 2012 -> pubdate (I'll use January every time)
TitleOfTheSerie -> Series
Number -> Series [X]

Quote:
Originally Posted by HarryT View Post
Try using a search string of:

<p>([0-9]*)</p>

and a replace string of:

<h3>\1</h3>

ie, with parentheses around the "search string" for the numbers.
So this seems very useful to me! I have already adapted it to this:

Search in: title
For: ([0-2][0-9][0-9][0-9])
And replace it with: janv. \1
In: pubdate

But I get an error:

Code:
unknown string format

calibre, version 0.9.11
ERREUR : Echec: unknown string format

Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/gui2/dialogs/metadata_bulk.py", line 125, in do_one_safe
    self.do_one(id)
  File "/usr/lib/calibre/calibre/gui2/dialogs/metadata_bulk.py", line 290, in do_one
    self.s_r_func(id)
  File "/usr/lib/calibre/calibre/gui2/dialogs/metadata_bulk.py", line 851, in do_search_replace
    setter(id, val, notify=False, commit=False)
  File "/usr/lib/calibre/calibre/library/database2.py", line 2613, in set_pubdate
    dt = parse_only_date(dt)
  File "/usr/lib/calibre/calibre/utils/date.py", line 94, in parse_only_date
    ans = parse_date(raw, default=default, assume_utc=assume_utc)
  File "/usr/lib/calibre/calibre/utils/date.py", line 80, in parse_date
    dt = parse(date_string, default=default, dayfirst=parse_date_day_first)
  File "/usr/lib/python2.7/dist-packages/dateutil/parser.py", line 697, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/usr/lib/python2.7/dist-packages/dateutil/parser.py", line 303, in parse
    raise ValueError, "unknown string format"
ValueError: unknown string format
I also tried many combinations, such as:
1-1-\1
01/01/\1
01 janv. \1
but none of them worked.
Could you help me?
Thanks in advance,
Snipe

Last edited by snipe2004; 01-06-2013 at 05:45 AM.
Old 01-06-2013, 08:53 AM   #28
Turtle91
A Hairy Wizard
Posts: 3,394
Karma: 20212733
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
If I understand correctly, you are only looking for "2012". If so, there is no need for regex, just a simple search and replace. Try:
Find: 2012
Replace: Janv. 2012

If you are looking for ANY year, then you can try this regex:
Find: ([0-9]{4})
Replace: Janv. \1

I'm not sure what flavor of regex Calibre uses. You may have better luck if you check in their forum. This one is for the Sigil software.
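
One possible explanation, offered only as a guess: the traceback shows Calibre running on Python, and if its Search & Replace works like Python's `re.sub` on the whole title and then copies the entire result into pubdate, a pattern that matches only the year leaves the rest of the title in place. That behaviour is an assumption, not something confirmed in this thread. The sketch below shows the difference in plain Python, using the example title from above.

```python
import re

title = "TitleOfTheSerie Volume1 Number1 (out of 3) (2012)"

# Pattern that matches only the year: everything else in the title
# survives the substitution, so the result is not a bare date.
partial = re.sub(r"([0-9]{4})", r"01/01/\1", title)

# Pattern that consumes the whole title and captures the year in the
# trailing parentheses: only the rebuilt date is left.
whole = re.sub(r"^.*\(([0-9]{4})\)\s*$", r"\1-01-01", title)

print(partial)  # TitleOfTheSerie Volume1 Number1 (out of 3) (01/01/2012)
print(whole)    # 2012-01-01
```

If the assumption holds, a search expression that matches the entire title and a purely numeric replacement such as \1-01-01 would be worth trying.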

Last edited by Turtle91; 01-06-2013 at 08:58 AM.
Old 01-06-2013, 09:57 AM   #29
snipe2004
Quote:
Originally Posted by Turtle91 View Post
If I understand correctly, you are only looking for "2012". If so, there is no need for regex, just a simple search and replace. Try:
Find: 2012
Replace: Janv. 2012

If you are looking for ANY year, then you can try this regex:
Find: ([0-9]{4})
Replace: Janv. \1

I'm not sure what flavor of regex Calibre uses. You may have better luck if you check in their forum. This one is for the Sigil software.
Unfortunately, it doesn't work:

Code:
calibre, version 0.9.11
ERREUR : Echec: unknown string format

Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/gui2/dialogs/metadata_bulk.py", line 125, in do_one_safe
    self.do_one(id)
  File "/usr/lib/calibre/calibre/gui2/dialogs/metadata_bulk.py", line 290, in do_one
    self.s_r_func(id)
  File "/usr/lib/calibre/calibre/gui2/dialogs/metadata_bulk.py", line 851, in do_search_replace
    setter(id, val, notify=False, commit=False)
  File "/usr/lib/calibre/calibre/library/database2.py", line 2613, in set_pubdate
    dt = parse_only_date(dt)
  File "/usr/lib/calibre/calibre/utils/date.py", line 94, in parse_only_date
    ans = parse_date(raw, default=default, assume_utc=assume_utc)
  File "/usr/lib/calibre/calibre/utils/date.py", line 80, in parse_date
    dt = parse(date_string, default=default, dayfirst=parse_date_day_first)
  File "/usr/lib/python2.7/dist-packages/dateutil/parser.py", line 697, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/usr/lib/python2.7/dist-packages/dateutil/parser.py", line 303, in parse
    raise ValueError, "unknown string format"
ValueError: unknown string format
But sorry for the mistake, I'm going back to Calibre's forum.