#1 |
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
|
What is "id=" For?
Sorry for this. It's been bothering me for a while, but I just can't figure it out. What is an ID even for in an epub? According to things like:
https://www.w3.org/publishing/epub3/...tml#attrdef-id
it's a "shared attribute" and "The ID [XML] of the element, which MUST be unique within the document scope." So, I assumed it was just a name applied to part of the epub that could be referenced in other places to conveniently work with it.

But, for instance, right now I'm looking at a book in the Editor and all the id= attributes are like:
<body id="BE6O0-76267ef1661c4ca4a716bfbfb65daab2" class="calibre7">
or
<p class="calibre8"><a id="c07" class="title"></a></p>

The body ones are understandable. They're for pointing at the chapters and are referenced in the text table of contents and in the toc.ncx file. But, AFAICS, the ids that follow after those body ones (they're simple strings looking like "c07" (which in this case stands for Chapter 7)) aren't referenced anywhere. They're not in the text table of contents, the toc.ncx file or anywhere else in the document except where they're first defined.

What are they for? Are they just an artifact of Calibre's conversion process? And, while I'm embarrassing myself here, what's with the <a ...></a> that hold those ids? I thought those were to reference http pages somewhere with an "href=".
#2 |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
'a' = Anchor
Need to be able to jump to someplace other than the start of a section (the top of the file is assumed by default)? Make an anchor as a place to land on. The other end (the calling link) says where to go. A simple #c07 means the target is in the same section (file).
https://www.w3.org/publishing/epub3/...tml#attrdef-id
This is an off-page (off-site) anchor reference.
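As a rough sketch of the two ends of a jump: the link's href splits into a file part and a fragment, and the fragment is matched against an id. The file name ch07.xhtml, the sample markup around the anchor, and the helper name find_target are all made up for illustration; a real reading system does much more than this.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical chapter file: the empty <a> is the landing spot.
CHAPTER = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p class="calibre8"><a id="c07" class="title"></a></p>
<p>Chapter text.</p>
</body></html>"""


def find_target(href, current_file, files):
    """Return the element that href lands on, or None if no id matches."""
    path, fragment = urldefrag(href)
    # A bare "#c07" has no file part, so it points into the current file.
    root = ET.fromstring(files[path or current_file])
    if not fragment:
        return root  # no fragment: land at the top of the file
    for element in root.iter():
        if element.get("id") == fragment:  # ids are case-sensitive
            return element
    return None


files = {"ch07.xhtml": CHAPTER}
target = find_target("#c07", "ch07.xhtml", files)
print(target.tag.replace(XHTML_NS, ""), target.attrib)
```

Note that the empty <a> carries no href at all. It is only a named place to land, which is why it can look unused when nothing links to it.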
#3 |
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
|
Thanks for the reply. Your link to w3 got mangled in adding it here. But, I found another location to explain things:
https://www.w3schools.com/htmL/html_id.asp

However, I went through all the files in the epub I'm currently looking at and none of those simple ids are referenced anywhere except where they're defined. I even searched for all occurrences of # and they were always associated with the non-simple ids in the body statements.

The only thing I can think of is that those simple ids are from the original epub (since they closely resemble the chapter names) and were used in the original TOC. But, maybe in Calibre's conversion or my playing around with editing the TOC, they got superseded by those big honkin' ids in the body statement. But, even that is odd since, as I understood it, the ids shouldn't be defined in the anchor statements. They should just be referenced there.

Anyway, thanks again.
#4 | |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
|
|
#5 |
Addict
Posts: 320
Karma: 2228060
Join Date: Dec 2013
Location: LaVernia, Texas
Device: kindle epub readers on android
|
I think of ID and CLASS as follows: when starting from scratch, don't use ID at all. It is a 'one-shot' designation. You can only use a given ID once per file. A CLASS definition can be used as many times as you want. In the stylesheet it has a dot in front of it and the word is made up, something like
.whatever { property1: value1; property2: value2; property3: value3; }
In the xhtml there is no dot and the CLASS is inside of a tag, such as
<p class="whatever">text</p>

Total purists do not use CLASSES at all and go to the extra effort of using only styles. This requires extra typing. It has the great advantage of letting you see errors from the get-go, whereas classes are real head scratchers sometimes. Using classes lets you use advanced stuff that you probably don't need unless you goof with ultracomplex stuff. Nevertheless, I do.

Best regards, Pop
|
#6 |
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
|
If the formatting isn't so bad (i.e., if I can actually read the book on my Forma), I usually just leave everything alone. But, if they've done something silly enough that it bothers me while reading, I go into the Editor and clean things up. In general, I hate those class statements (I'm not smart enough to figure them out). So, I usually delete most of them and use my generic styles.

I've already gotten rid of the class statements on the <div>s around every paragraph and converted them over to <p>s. Since those anchored id= things don't seem to be used, I'm going to rip out the whole line. Then I'll do the same for the formatting around the chapter headings and just use <h2>s there.

I don't understand why these publishers put all these weird things into what should be a simple, consistent, easy-to-read set of formats for a book. If it were a web page, fine (I guess). But, it's a book. It was sold as a book. For a specific ereader. There's no reason for this kind of stuff.
|
#7 |
Sigil Developer
Posts: 8,759
Karma: 5706256
Join Date: Nov 2009
Device: many
|
The ids can be referenced in CSS selectors, the ncx (toc, pagelist), the opf guide, the nav (toc, landmarks, pagelist), external CFIs, JavaScript (if it is an epub3 using JavaScript), smil, and the opf (internally) in general, not to mention normal links, footnotes, endnotes, etc.
So before deleting them, check carefully.
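One possible way to do a first pass of that check, as a sketch only: gather the text of every file in the unzipped epub (xhtml, css, ncx, opf, nav) into a dict and look for a "#id" mention of each defined id. The function name unreferenced_ids and the sample file contents are invented for illustration. It deliberately errs on the side of treating an id as referenced, and it cannot see references that do not use a "#" (for example JavaScript lookups by id, CFIs, or idref attributes in the opf), so an id it reports still deserves a manual look.

```python
import re

# Matches id="..." or id='...' as a standalone attribute
# (the lookbehind skips things like xml:id, data-id, or valid=).
ID_DEF = re.compile(r"""(?<![\w:-])id\s*=\s*["']([^"']+)["']""")


def unreferenced_ids(files):
    """Map each file name to the ids defined there that no '#id' mentions."""
    everything = "\n".join(files.values())
    result = {}
    for name, text in files.items():
        unused = []
        for ident in ID_DEF.findall(text):
            # '#c07' covers href/src fragments and CSS id selectors alike;
            # the lookahead stops '#c07' from matching inside '#c070'.
            pattern = "#" + re.escape(ident) + r"(?![\w-])"
            if not re.search(pattern, everything):
                unused.append(ident)
        if unused:
            result[name] = unused
    return result


# Hypothetical contents standing in for an unzipped epub.
sample = {
    "ch07.xhtml": '<body id="BE6O0"><p><a id="c07" class="title"></a></p>'
                  '<a id="page_4"/></body>',
    "toc.ncx": '<content src="ch07.xhtml#BE6O0"/>',
}
print(unreferenced_ids(sample))
```

With this sample, the body id is found in toc.ncx and is left alone, while c07 and page_4 are reported as candidates for a closer look.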
#8 |
Resident Curmudgeon
Posts: 79,740
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
I've noticed that a lot of publisher ids are only there to label the section you are reading; they serve no other purpose and can be deleted.
|
#9 | |
Still reading
Posts: 14,010
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|
Quote:
Regex is your friend!
|
#10 |
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
|
The book I just edited has ids for each page number. For instance:
<a id="page_4"/>
I wonder if that's used by things like the default epub reader on Kobo? It's sure not used anywhere in the document (at least after what I did to it).
#11 |
Sigil Developer
Posts: 8,759
Karma: 5706256
Join Date: Nov 2009
Device: many
|
Most likely it is for a PageList in an ncx or nav section. The page numbers probably match a specific printed release, which is useful if the book is academic and citations to pages or page ranges are needed. But oftentimes they are just left over from OCR scans.
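To illustrate what such a page list would point at, here is a small sketch that collects page anchors and writes out an EPUB 3 page-list nav for them. It assumes the page_N naming seen in this thread, assumes the file names are given relative to the nav document, and uses a made-up helper name (build_page_list) and made-up sample markup; it is not how any particular tool generates its page list.

```python
import re
from html import escape

# Assumes the convention seen in the thread: <a id="page_4"/>.
PAGE_ANCHOR = re.compile(r"""<a\s+(?:[^>]*\s)?id=["']page_(\w+)["']""")


def build_page_list(files):
    """Build a page-list nav from page_N anchors, keeping file order."""
    items = []
    for name, text in files.items():
        for label in PAGE_ANCHOR.findall(text):
            href = escape(f"{name}#page_{label}")
            items.append(f'    <li><a href="{href}">{escape(label)}</a></li>')
    # The epub: prefix must be declared on the nav document's root element.
    lines = ['<nav epub:type="page-list">', "  <ol>", *items, "  </ol>", "</nav>"]
    return "\n".join(lines)


# Hypothetical chapter content with two page anchors.
sample = {"ch01.xhtml": '<p><a id="page_4"/>Text<a class="pg" id="page_5"></a></p>'}
print(build_page_list(sample))
```

If no nav or ncx in the book contains entries like these, the page anchors are probably leftovers, as described above.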
|
#12 | |
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
|
Quote:
I doubt if that is why the ids were generated in the first place, but I've been very thankful for them on a couple of occasions.
BobC
|
#13 |
Resident Curmudgeon
Posts: 79,740
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
#14 | |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
On a modern device, we may fit one less book (that I probably won't have time to read this year). We should worry more about things that don't work correctly, like dead or wrong landing links.
|
#15 |
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
|
In my case, I'm mostly worried about what happens if I delete those ids. So far, I've had no problem deleting every id= thing I've found. Usually, those are just for TOC types of things. But, after ripping everything out of the files and putting the proper <h1> and <h2> tags where I need them, I have the Calibre Editor recreate the toc.ncx file and then have it create an inline TOC from that. I then replace the book's inline TOC with the Calibre generated one. Again, no problems yet.
Of course, some of the silly HTML I think I'm seeing does bother me and I do wish publishers would be a bit more reasonable in what they put in there. But, then again, I also wish they'd read and correct the resulting books after they OCR scan them to a digital format.
Last edited by enuddleyarbl; 06-22-2022 at 11:04 PM.