#1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jan 2020
Device: kobo aura
|
Sigil for epub curation
Does Sigil make sense for ePub curation, where the original ePubs are from various sources and have varied formatting? My main goal is to have an ePub library with uniform metadata and preferably uniform formatting (with a newline between all paragraphs).
I do not care much about pictures or other things; basically I just want to make sure all ePub books are the same. This is more of a personal quirk, but it is important (I do the same manual curation for my personal iTunes library, for example). Also, maybe this is better served by a script of some sort. I am still unfamiliar with working with ePub formats, but perhaps there is a way to simply extract the text from an original ePub and then dump that text into a new, curated ePub file? Perhaps I will have to know these things regardless of whether I work in Sigil or script it in Python or another language. Any help appreciated.
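For what it's worth, the "extract the text" half of that idea can be sketched with the Python standard library alone, since an ePub is a zip archive of (X)HTML files. This is only a rough sketch under some assumptions: the content files end in .xhtml, .html or .htm, they are UTF-8 encoded, and the order of files in the archive is good enough (a real tool would follow the reading order in the OPF spine). The names ParagraphText and extract_paragraphs are made up for this example.

```python
import zipfile
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text of each <p> element, ignoring all other markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._depth = 0      # greater than 0 while inside a <p>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace and drop empty paragraphs.
                text = " ".join("".join(self._chunks).split())
                if text:
                    self.paragraphs.append(text)
                self._chunks = []

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_paragraphs(epub_path):
    """Return paragraph text from every (X)HTML file, in archive order.

    Assumption: archive order is close enough to reading order; the OPF
    spine is not consulted here.
    """
    paragraphs = []
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = ParagraphText()
                parser.feed(book.read(name).decode("utf-8", errors="replace"))
                parser.close()
                paragraphs.extend(parser.paragraphs)
    return paragraphs
```

Note that this throws away headings, italics, images and everything else that is not plain paragraph text, so the "dump into a new ePub" half still needs decisions about structure.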
#2 |
A Hairy Wizard
Posts: 3,347
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
In short - no-ish.
There are scripts that can make everything look the same. But "the same" doesn't mean it looks good. I'm not talking about stylistic choices, per se, but the number of different styles and formats that are put out there from various sources means that it would be almost impossible to determine how their particular brand of styling was being applied. You would need an AI to be able to make those kinds of decisions. You will need to manually determine what each paragraph does. I've been doing this for a few years now and every book requires me to put my hands on it to determine what kind of styling needs to be applied. Once you have made that determination, it is fairly simple and straightforward to apply a consistent style via CSS style sheets to make all your titles, headers, paragraph openers, other special paragraphs, standard paragraphs, backmatter, and frontmatter pages look consistent between different books.
#3 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
The only possibility that comes to mind for uniform formatting is to first convert to text, or a rich text variant that removes pretty much everything, then add a minimal set of styles that you can live with, e.g. one for headers, one for body text... but that's going to destroy lots of nice features of a more complex book. Metadata, OTOH, is pretty easy to scrape in and edit; there are tools for that in calibre. Just forcing a blank-line space between paragraphs could be done using calibre's conversion and extra CSS features, but you do need some basic knowledge of syntax and of how CSS actually works. Go read a CSS/HTML primer or two, and think on...
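As a rough illustration of the calibre route, the command-line converter ebook-convert accepts an --extra-css option that takes either raw CSS or the path to a stylesheet. The sketch below only builds and runs that command from Python; the CSS rule, the file names and the helper names build_convert_command and convert are placeholders chosen for this example, and calibre's command-line tools are assumed to be on the PATH. Rules already in the book with higher specificity may still win over the extra CSS.

```python
import subprocess

# Assumed for illustration: roughly one blank line above each paragraph.
PARAGRAPH_CSS = "p { margin-top: 1em; margin-bottom: 0; }"


def build_convert_command(source, target, extra_css=PARAGRAPH_CSS):
    """Build the argument list for calibre's ebook-convert with extra CSS."""
    return ["ebook-convert", source, target, "--extra-css", extra_css]


def convert(source, target, extra_css=PARAGRAPH_CSS):
    """Run the conversion; needs calibre's command-line tools installed."""
    subprocess.run(build_convert_command(source, target, extra_css), check=True)
```

Calling convert("book.epub", "book_uniform.epub") would write a new file and leave the original untouched, which makes it easy to compare the two before committing to a library-wide change.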
#4 | ||
A Hairy Wizard
Posts: 3,347
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Quote:
Code:
p {margin-top:2em}
Quote:
#5 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
Agreed, I should have said "if you want to create the appearance of a blank line"... but I was trying to keep it simple, not knowing what terms the OP was familiar with.
margin-top: 2em; would be way too large for my tastes, but it's subjective. Setting margin-top or margin-bottom to as low as 0.3em will introduce a visible gap between paragraphs, but then there are also scene breaks to consider... There's a reason why some authors or publishers pay to have their books formatted by experts: it's not as simple as whacking in a blank line here and there.
#6 |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I agree (and that is what I do for all my PERSONAL touch-ups).
Code:
margin: .5em 0 0 0 ;
Code:
<p>Last line</p>
</body>
#7 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
Good point about the bottom margin. I had never thought that through.
On having all books be consistent: once upon a time I decided I wanted a 0.3em space between paragraphs and reworked a few hundred books. A couple of e-reader device changes later, I decided that I actually liked 0.1em better, hence a second redo. Move on a year and I am thinking that, actually, no space at all between paragraphs is what I like best. Whenever I dream of doing another global consistency project, I have to remind myself of that history.
I think the devices you choose to read on play a big part in the aesthetic of making a book look "just right" on screen. 0.3em looked OK on a basic Sony PRS, but as my devices got bigger and my eyesight got poorer, it became more about minimizing page turns, for comfort. So I began to shrink side margins, slash away top space above chapter headers, and lose the between-paragraph spaces, all of which could well horrify some other reader. But I tweak copies of books exclusively for my own reading, so no one else has to like my layout choices.
#8 |
Grand Sorcerer
Posts: 28,569
Karma: 204127028
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
On the other hand ... there's no reason someone couldn't write their own Output plugin. I can't imagine the amount of work it would entail to take just about any epub as input and "homogenize" it into one's own personal standard. But the option is certainly there for the motivated individual.
|
#9 |
Running with scissors
Posts: 1,586
Karma: 14328510
Join Date: Nov 2019
Device: none
|
The problem with a script is that not all books use semantic markup; for example, instead of using h tags for headers they'll use p tags and style it to look like a header. And divs get used in situations that seem odd to me.
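To make the non-semantic markup problem concrete, here is a small Python sketch that flags p elements whose class name hints that they are acting as headings. It is only a heuristic under loud assumptions: the hint words (chapter, title, head) are invented for this example, real books use arbitrary class names, and a paragraph styled as a header through a class called "c12" would slip straight past it. The names HEADING_HINT, PseudoHeadingFinder and find_pseudo_headings are hypothetical.

```python
import re
from html.parser import HTMLParser

# Hypothetical hint words; real books use arbitrary class names.
HEADING_HINT = re.compile(r"chapter|title|head", re.IGNORECASE)


class PseudoHeadingFinder(HTMLParser):
    """Flag <p> elements whose class name suggests they act as headings."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.suspects = []     # (class attribute, text) pairs
        self._current = None   # class of the flagged <p> being read, if any
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            css_class = dict(attrs).get("class") or ""
            self._current = css_class if HEADING_HINT.search(css_class) else None
            self._chunks = []

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            text = " ".join("".join(self._chunks).split())
            self.suspects.append((self._current, text))
            self._current = None


def find_pseudo_headings(xhtml):
    """Return (class, text) for each <p> that looks like a disguised heading."""
    finder = PseudoHeadingFinder()
    finder.feed(xhtml)
    finder.close()
    return finder.suspects
```

A list like this is best treated as a set of candidates to check by hand, which is really the point being made: the script can narrow things down, but someone still has to decide what each paragraph is.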
|
#10 | |
Connoisseur
Posts: 58
Karma: 438844
Join Date: Aug 2019
Device: PC, Linux Mint, Tablet, and Telephone
|
The answer to your question:
Quote:
Because an EPUB can be created in all kinds of ways and converted in all kinds of ways, a script is an endless path of constantly finding and fixing problems in the script. An EPUB conversion script therefore seems to me to be a dead-end route. Another option is to search and replace in the EPUB pages to replace or remove the formatting codes. This can be done in Sigil itself or in an external HTML editor via copy and paste. However, that is very labor intensive and probably takes several times as long as removing all formatting with a simple paste-as-text approach and then manually entering the layout you want. Because simple is the nicest and easiest solution, I advise you to put the layout you want in a CSS file. That lets you change the layout throughout the EPUB with a single modification to your CSS file. Small is beautiful, and that is achievable in Sigil.
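The "one CSS file carries the layout" idea can also be sketched outside Sigil. The Python below copies an EPUB and appends a house-style rule to every stylesheet it finds. It is a sketch under assumptions: the book already has .css files that its pages link to, the source archive lists its mimetype entry first (as a valid EPUB should), and the rule shown, borrowed from the margin value posted earlier in the thread, is only a placeholder. The names HOUSE_CSS and apply_house_css are made up for this example. An appended rule wins ties on order, but more specific selectors in the book can still override it.

```python
import zipfile

# Assumed house style for illustration; adjust to taste.
HOUSE_CSS = "\n/* house style */\np { margin: 0.5em 0 0 0; }\n"


def apply_house_css(source, target, css=HOUSE_CSS):
    """Copy an EPUB, appending `css` to every stylesheet it contains.

    Returns the names of the stylesheets that were changed.
    """
    changed = []
    with zipfile.ZipFile(source) as old, zipfile.ZipFile(target, "w") as new:
        # Entries are copied in source order, so mimetype stays first
        # if the source archive had it first.
        for item in old.infolist():
            data = old.read(item.filename)
            if item.filename == "mimetype":
                # The EPUB container expects this entry to be uncompressed.
                new.writestr(item.filename, data, compress_type=zipfile.ZIP_STORED)
                continue
            if item.filename.lower().endswith(".css"):
                data = data + css.encode("utf-8")
                changed.append(item.filename)
            new.writestr(item.filename, data, compress_type=zipfile.ZIP_DEFLATED)
    return changed
```

Writing to a separate target file keeps the original book intact, so a change of taste later only means rerunning with a different rule.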
|