Old 06-14-2022, 05:19 PM   #1
enuddleyarbl
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
What is "id=" For?

Sorry for this. It's been bothering me for a while, but, I just can't figure it out. What is an ID even for in an epub? According to things like:

https://www.w3.org/publishing/epub3/...tml#attrdef-id

it's a "shared attribute" and "The ID [XML] of the element, which MUST be unique within the document scope." So, I assumed it was just a name applied to part of the epub that could be referenced in other places to conveniently work with them.

But, for instance, right now, I'm looking at a book in the Editor and all the ID= elements are like:

<body id="BE6O0-76267ef1661c4ca4a716bfbfb65daab2" class="calibre7">
or
<p class="calibre8"><a id="c07" class="title"></a></p>

The body ones are understandable. They're for pointing at the chapters and are referenced in the text table of contents and in the toc.ncx file. But, AFAICS, the ids that follow after those body ones (they're simple strings looking like "c07" (which in this case stands for Chapter 7)) aren't referenced anywhere. They're not in the text table of contents, the toc.ncx file or anywhere else in the document except where they're first defined. What are they for? Are they just an artifact of Calibre's conversion process?

And, while I'm embarrassing myself here, what's with the <a ...></a> that hold those ids? I thought those were to reference http pages somewhere with an "href=".
Old 06-14-2022, 07:11 PM   #2
theducks
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
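'a' = Anchor
Need to be able to jump to someplace other than the start of a section? (The top of the file is assumed if no target is given.)
Make an anchor: a place to land on.

The other end (the calling link) says where to go.
A simple #c07 means the target is in the same section (file). Ids are case-sensitive, so the link has to match the id exactly.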


https://www.w3.org/publishing/epub3/...tml#attrdef-id
This is an off-page (off-site) anchor reference: the part after the # names an id on that page.
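To make the two ends concrete, here is a minimal Python sketch. It assumes a made-up one-file sample (the link text and the file name chapter07.xhtml are hypothetical; the anchor line is the one from the first post). It splits an href into its file part and its fragment, then looks for the element whose id matches the fragment.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Hypothetical one-file sample: a link (the calling end) and an anchor (the landing end).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p><a href="#c07">Go to Chapter 7</a></p>
<p class="calibre8"><a id="c07" class="title"></a></p>
</body>
</html>"""


def split_href(href):
    """Split an href into (file part, fragment); an empty file part means 'this file'."""
    parts = urlsplit(href)
    return parts.path, parts.fragment


def find_target(root, fragment):
    """Return the element whose id equals the fragment, or None if nothing matches."""
    for element in root.iter():
        if element.get("id") == fragment:
            return element
    return None


root = ET.fromstring(SAMPLE)
for href in ("#c07", "#C07"):
    file_part, fragment = split_href(href)
    found = find_target(root, fragment) is not None
    print(f"{href!r}: same file, id {fragment!r} found: {found}")

# A link from another file names the file first, then the id after the '#'.
print(split_href("chapter07.xhtml#c07"))
```

The `<a id="c07">` element has no href, so it is only a landing spot; the `<a href="#c07">` element is the link that jumps to it.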
Old 06-14-2022, 09:46 PM   #3
enuddleyarbl
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
Thanks for the reply. Your link to w3 got mangled in adding it here. But, I found another location to explain things:

https://www.w3schools.com/htmL/html_id.asp

However, I went through all the files in the epub I'm currently looking at and none of those simple ids are referenced anywhere except where they're defined. I even searched for all occurrences of # and they were always associated with the non-simple ids in the body statements.

The only thing I can think of is that those simple ids are from the original epub (since they closely resemble the chapter names) and were used in the original TOC. But, maybe in Calibre's conversion or my playing around with editing the TOC, they got superseded with those big honkin' ids in the body statement. But, even that is odd since the id's shouldn't be defined in the anchor statements. They should just be referenced.

Anyway, thanks again.
Old 06-14-2022, 09:52 PM   #4
theducks
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by DaveLessnau View Post
Thanks for the reply. Your link to w3 got mangled in adding it here. But, I found another location to explain things:

https://www.w3schools.com/htmL/html_id.asp

However, I went through all the files in the epub I'm currently looking at and none of those simple ids are referenced anywhere except where they're defined. I even searched for all occurrences of # and they were always associated with the non-simple ids in the body statements.

The only thing I can think of is that those simple ids are from the original epub (since they closely resemble the chapter names) and were used in the original TOC. But, maybe in Calibre's conversion or my playing around with editing the TOC, they got superseded with those big honkin' ids in the body statement. But, even that is odd since the id's shouldn't be defined in the anchor statements. They should just be referenced.

Anyway, thanks again.
IMHO you are better off using a full id. Deleting and then splitting files can orphan the simple reference (AKA break it).
Old 06-15-2022, 07:16 AM   #5
rjwse@aol.com
Addict
Posts: 320
Karma: 2228060
Join Date: Dec 2013
Location: LaVernia, Texas
Device: kindle epub readers on android
I think of ID and CLASS as follows: when starting from scratch, don't use ID at all. It is a 'one-shot' designation: you can only use a given id once per document. A CLASS can be used as many times as you want.

In the stylesheet a class has a dot in front of it and the name is made up, something like .whatever {property1: value; property2: value; property3: value;} In the XHTML there is no dot and the class goes inside a tag, such as <p class="whatever">text</p>

Total purists do not use classes at all and go to the extra effort of using only inline styles. This requires extra typing. It has the great advantage of letting you see errors from the get-go, whereas classes are real head scratchers sometimes. Classes let you do advanced things that you probably don't need unless you work on ultracomplex books. Nevertheless, I use them.

Best regards, Pop
Old 06-15-2022, 11:24 AM   #6
enuddleyarbl
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
If the formatting isn't so bad (i.e., if I can actually read the book on my Forma), I usually just leave everything alone. But, if they've done something silly enough that it bothers me while reading, I go into the Editor and clean things up. In general, I hate those class statements (I'm not smart enough to figure them out). So, I usually delete most of them and use my generic styles. I've already gotten rid of the class statements on the <div>s around every paragraph and converted them over to <p>s. Since those anchored id= things don't seem to be used, I'm going to rip out the whole line. Then I'll do the same for the formatting around the chapter headings and just use <h2>s there. I don't understand why these publishers put all these weird things into what should be a simple, consistent, easy-to-read set of formats for a book. If it were a web page, fine (I guess). But, it's a book. It was sold as a book. For a specific ereader. There's no reason for this kind of stuff.
Old 06-15-2022, 02:58 PM   #7
KevinH
Sigil Developer
Posts: 8,759
Karma: 5706256
Join Date: Nov 2009
Device: many
The ids can be referenced in css selectors, ncx (toc, pagelist), opf guide, nav (toc, landmarks, pagelist), external cfi's, javascripts (if epub3 using javascript), smil, and opf (internally) in general not to mention normal links, footnotes, endnotes, etc.

So before deleting them, check carefully.
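One way to do part of that check is a rough Python sketch like the one below. It is not a complete audit. The helper names are made up for illustration, and it only covers href/src fragments (links, ncx, nav, smil, opf guide) and `#id` selectors in .css files. It does not look at JavaScript, inline style blocks, or external CFIs. It matches on the id name alone and treats anything after a `#` in CSS (even a hex colour) as a possible reference, so it errs toward keeping ids.

```python
import re
import zipfile

CONTENT_EXT = (".xhtml", ".html", ".htm")
TEXT_EXT = CONTENT_EXT + (".ncx", ".opf", ".css", ".smil")

# id="..." definitions; \b keeps this from matching idref="...".
ID_DEF = re.compile(r'''\bid\s*=\s*["']([^"']+)["']''')
# The fragment part of href="file#frag" or src="file#frag".
LINK_FRAGMENT = re.compile(r'''\b(?:href|src)\s*=\s*["'][^"'#]*#([^"'#]+)["']''')
# Anything that looks like #name in a stylesheet (deliberately over-inclusive).
CSS_ID = re.compile(r'#(-?[A-Za-z_][\w-]*)')


def read_epub_text(path):
    """Return {member name: text} for the text-like files inside an EPUB (a zip)."""
    files = {}
    with zipfile.ZipFile(path) as epub:
        for name in epub.namelist():
            if name.lower().endswith(TEXT_EXT):
                files[name] = epub.read(name).decode("utf-8", errors="replace")
    return files


def unreferenced_ids(files):
    """Return ids defined in content documents that no scanned file points at."""
    defined, referenced = set(), set()
    for name, text in files.items():
        lower = name.lower()
        if lower.endswith(CONTENT_EXT):
            defined.update(ID_DEF.findall(text))
        if lower.endswith(".css"):
            referenced.update(CSS_ID.findall(text))
        else:
            referenced.update(LINK_FRAGMENT.findall(text))
    return defined - referenced
```

Usage would be something like `unreferenced_ids(read_epub_text("book.epub"))`, where book.epub is a placeholder path. Anything it reports is only a candidate for deletion, not a guarantee.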
Old 06-15-2022, 03:52 PM   #8
JSWolf
Resident Curmudgeon
Posts: 79,740
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
I've noticed that a lot of publisher ID's are just to say what the section is that you are reading and have no other reason to be there and can be deleted.
Old 06-22-2022, 07:21 AM   #9
Quoth
Still reading
Posts: 14,010
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Quote:
Originally Posted by JSWolf View Post
I've noticed that a lot of publisher ID's are just to say what the section is that you are reading and have no other reason to be there and can be deleted.
One ebook from a big publisher had a unique ID on EVERY paragraph. I only keep the ones used by the TOC, i.e. Chapter headings and similar.

Regex is your friend!
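As an illustration of the idea, here is a small sketch using Python's re module rather than the editor's own search. The function name and the sample ids are made up. It strips every id attribute except the ones in a keep list, such as the ids the TOC points at.

```python
import re

# An id attribute with its leading whitespace; \1 matches the same quote that opened it.
ID_ATTR = re.compile(r'''\s+id\s*=\s*(["'])(.*?)\1''')


def strip_ids(markup, keep):
    """Remove id attributes whose value is not in `keep`; leave everything else alone."""
    def replace(match):
        return match.group(0) if match.group(2) in keep else ""
    return ID_ATTR.sub(replace, markup)


sample = '<h2 id="c07">Chapter 7</h2>\n<p id="p0001" class="calibre8">Text</p>'
print(strip_ids(sample, keep={"c07"}))
```

This only drops the attribute. An empty `<a></a>` left behind after its id is removed would still need a separate clean-up pass.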
Old 06-22-2022, 09:40 AM   #10
enuddleyarbl
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
The book I just edited has ids for each page number. For instance:

<a id="page_4"/>

I wonder if that's used by things like the default epub reader on Kobo? It's sure not used anywhere in the document (at least after what I did to it).
Old 06-22-2022, 10:33 AM   #11
KevinH
Sigil Developer
Posts: 8,759
Karma: 5706256
Join Date: Nov 2009
Device: many
Most likely it is for a PageList for an ncx or nav section. They probably match a specific printed release. Useful if the book is academic and citations to pages or page ranges are needed. But oft-times they are just left over from OCR scans.
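For a sense of what such a page list looks like on the nav side, here is a hedged Python sketch. The function name and the file name in the usage line are hypothetical. It assumes the anchors look like the `<a id="page_4"/>` example above and that the files are passed in reading order. It collects them into an EPUB 3 style `<nav epub:type="page-list">` block.

```python
import re
from html import escape

# Matches <a id="page_N"/> or <a id="page_N"></a>; other anchor shapes are ignored.
PAGE_ANCHOR = re.compile(r'''<a\s+id=["']page_(\d+)["']\s*(?:/>|>\s*</a>)''')


def page_list_nav(files):
    """Build a page-list <nav> from page_N anchors in {file name: markup}."""
    items = []
    for name, markup in files.items():
        for number in PAGE_ANCHOR.findall(markup):
            href = escape(f"{name}#page_{number}", quote=True)
            items.append(f'    <li><a href="{href}">{number}</a></li>')
    body = "\n".join(items)
    return f'<nav epub:type="page-list">\n  <ol>\n{body}\n  </ol>\n</nav>'


print(page_list_nav({"ch01.xhtml": '<p><a id="page_4"/>Some text<a id="page_5"></a></p>'}))
```

If a book has anchors like these but no page list in its ncx or nav, nothing inside the book is using them, which matches what was observed above.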
Old 06-22-2022, 01:04 PM   #12
BobC
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
Quote:
Originally Posted by DaveLessnau View Post
The book I just edited has ids for each page number. For instance:

<a id="page_4"/>

I wonder if that's used by things like the default epub reader on Kobo? It's sure not used anywhere in the document (at least after what I did to it).
This sort of id can be useful when comparing a badly OCR'd EPUB with a PDF, when you are correcting the EPUB's spelling to match the words used in the PDF. It can make it easier to locate the offending word in the PDF if you know what page it is on.

I doubt if that is why the ids were generated in the first place but I've been very thankful for them on a couple of occasions.

BobC
Old 06-22-2022, 04:06 PM   #13
JSWolf
Resident Curmudgeon
Posts: 79,740
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by Quoth View Post
One ebook from a big publisher had a unique ID on EVERY paragraph. I only keep the ones used by the TOC, i.e. Chapter headings and similar.

Regex is your friend!
I've seen that ID per paragraph on a number of eBooks. Really stupid IMHO.
Old 06-22-2022, 05:18 PM   #14
theducks
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by JSWolf View Post
I've seen that ID per paragraph on a number of eBooks. Really stupid IMHO.
But the question is: 'Do they really HARM anything?' They do add to the file size, but so do 14 screens of raves, excerpts from other work...
On a modern device, we may fit one less book (that I probably won't have time to read this year anyway).

We should worry more about things that don't work correctly like dead or wrong landing links.
Old 06-22-2022, 11:02 PM   #15
enuddleyarbl
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
In my case, I'm mostly worried about what happens if I delete those ids. So far, I've had no problem deleting every id= thing I've found. Usually, those are just for TOC types of things. But, after ripping everything out of the files and putting the proper <h1> and <h2> tags where I need them, I have the Calibre Editor recreate the toc.ncx file and then have it create an inline TOC from that. I then replace the book's inline TOC with the Calibre generated one. Again, no problems yet.

Of course, some of the silly HTML I think I'm seeing does bother me and I do wish publishers would be a bit more reasonable in what they put in there. But, then again, I also wish they'd read and correct the resulting books after they OCR scan them to a digital format.

Last edited by enuddleyarbl; 06-22-2022 at 11:04 PM.
All times are GMT -4.