03-25-2011, 06:53 PM | #1 |
Guru
Posts: 932
Karma: 15752887
Join Date: Mar 2011
Location: Norway
Device: Ipad, kindle paperwhite
|
Graphical layout vs. semantically "correct" XHTML code
I am quite new to ebooks, and am working on a project to make an ePub out of an old book. I have scanned it and used OCR to extract the plain text out of it, but now remains the work of formatting this text. I write all the HTML code myself to have 100% control over the semantics and use CSS to do layout, but I have run into some problems concerning the table of contents.
I know that an ePub has a special file (the NCX, toc.ncx) to take care of this, but since I am scanning an old printed book, I want the electronic version to be as close to the original as possible. I would therefore like to have a table of contents in my ePub (as part of the spine and of the linear reading order) so I can read the digital edition just as if it were printed on paper. The table of contents should look something like this:
_______________Section1
_1__Title1_____________________________12
_2__Title2_____________________________20
_3__Title3_____________________________28
....
_______________Section2
10__Title10____________________________128
11__Title11____________________________140
All the underscores are supposed to be spaces, but I could not do that when posting. Note that the chapter titles are aligned, the chapter numbers are right-aligned, and the page numbers are left-aligned. (I know that the concept of pages no longer makes sense, but in this case I choose to keep them since this is an old and rare book and since pages can be retained by the pagemap element.) The titles should be clickable (as hyperlinks). I am also thinking of making the page numbers clickable. The main goal is that the table of contents shall look exactly like the printed version, but I want to use semantically correct coding as well. I can easily create this layout in a table, but I have my doubts that a table is the best way, semantically speaking, to represent the contents of a book. I have tried an ordered list (<ol>), but ran into problems with the chapter numbers beginning at 1 again in the second list (I need one list of chapters for each section), and I have not been able to set the start value of an <ol>.
I have also tried to use spans for each part (the chapter number in one span, the chapter title in another and the page number in a third), with different CSS rules (adjusting float, display:block, display:inline-block, margins and so on) to make something that looks correct in a browser. I have not yet tried packing it as .epub. From experience I know that each reading device will lay things out differently, that most readers will render the layout correctly if I use a table, and that many will have trouble with floating paragraphs and the various display and visibility properties. Nevertheless I think that a table is the least semantically correct of my options, but I would like to hear what views you experts have on this before I decide. What is being used "out there" to represent contents, and are there any smart ways I have not thought of yet? Any tips and/or comments are most welcome. |
03-25-2011, 09:46 PM | #2 |
Wizzard
Posts: 11,517
Karma: 33048258
Join Date: Mar 2010
Location: Roundworld
Device: Kindle 2 International, Sony PRS-T1, BlackBerry PlayBook, Acer Iconia
|
Personally, I'd say that a table is well-suited both graphically and semantically for a detailed table of contents such as you have there.
After all, it's not like you're using it to line up unrelated text, but yours actually functions to give 1:1:1 data on what #/name/page location the chapters have. If it helps any, Project Gutenberg lays out their print-equivalent-with-page-numbers reproduction TOCs in tables, too. |
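For illustration, a table-based contents page along these lines could look roughly like the sketch below. It is only a minimal example and not taken from Project Gutenberg: the file names (chapter01.xhtml etc.), the #page12-style anchors and the class names are assumptions made up for this example, and the alignment rules simply follow the layout described in the first post.

```html
<!-- Sketch only: file names, anchors and class names are hypothetical. -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Contents</title>
  <style type="text/css">
    table.toc    { border-collapse: collapse; }
    table.toc th { text-align: center; padding-top: 1em; }   /* section headings */
    td.num       { text-align: right; padding-right: 1em; }  /* chapter numbers, right-aligned */
    td.title     { text-align: left; padding-right: 3em; }   /* chapter titles */
    td.page      { text-align: left; }                       /* page numbers, left-aligned */
  </style>
</head>
<body>
  <table class="toc">
    <tr><th colspan="3">Section1</th></tr>
    <tr>
      <td class="num">1</td>
      <td class="title"><a href="chapter01.xhtml">Title1</a></td>
      <td class="page"><a href="chapter01.xhtml#page12">12</a></td>
    </tr>
    <tr>
      <td class="num">2</td>
      <td class="title"><a href="chapter02.xhtml">Title2</a></td>
      <td class="page"><a href="chapter02.xhtml#page20">20</a></td>
    </tr>
    <tr><th colspan="3">Section2</th></tr>
    <tr>
      <td class="num">10</td>
      <td class="title"><a href="chapter10.xhtml">Title10</a></td>
      <td class="page"><a href="chapter10.xhtml#page128">128</a></td>
    </tr>
  </table>
</body>
</html>
```

Because both sections live in one table, the number, title and page columns line up across sections without any fixed widths.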
03-26-2011, 04:51 AM | #3 | |||
frumious Bandersnatch
Posts: 7,534
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Quote:
Otherwise, I agree with ATDrake that a table seems OK here Quote:
Quote:
What you could do is use the table-* display values for your divs and spans. This should give a result equivalent to a table, but you are not coding a table directly, and you could change the layout just by altering the CSS. |
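A minimal sketch of that idea, assuming the reading system honours the CSS 2.1 table display values (which is not guaranteed everywhere). The class names and file names are invented for the example.

```html
<!-- Sketch only: class names and file names are hypothetical. -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Contents</title>
  <style type="text/css">
    div.toc          { display: table; }
    div.toc-row      { display: table-row; }
    div.toc-row span { display: table-cell; }
    span.num         { text-align: right; padding-right: 1em; }
    span.title       { text-align: left; padding-right: 3em; }
    span.page        { text-align: left; }
  </style>
</head>
<body>
  <div class="toc">
    <!-- The whitespace between the spans keeps the entries readable
         if the table-* display values are ignored. -->
    <div class="toc-row">
      <span class="num">1</span>
      <span class="title"><a href="chapter01.xhtml">Title1</a></span>
      <span class="page">12</span>
    </div>
    <div class="toc-row">
      <span class="num">2</span>
      <span class="title"><a href="chapter02.xhtml">Title2</a></span>
      <span class="page">20</span>
    </div>
  </div>
</body>
</html>
```

Switching to another layout later would then only mean changing the three display rules, not the markup.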
|||
03-26-2011, 03:12 PM | #4 | |||
Guru
Posts: 932
Karma: 15752887
Join Date: Mar 2011
Location: Norway
Device: Ipad, kindle paperwhite
|
Quote:
Nice to hear that, because it will be much easier to use tables for a column layout, since that's what they're best at. I've always heard that tables should be avoided because they reflow poorly and that it is wrong to choose tables just to straighten out layout issues. Quote:
Quote:
chapter1Title1Page1
chapter2Title2Page2
If the display:table property is not properly supported on most major platforms, I might go for a <table> just to keep things compatible, but if display:table is supported, I feel more comfortable with that because e.g. a text-to-speech tool will "see" ordinary spans instead of a table. By the way: thanks for the quick and useful feedback. I now have a few more ideas about how to use things. |
|||
03-27-2011, 04:10 AM | #5 | |
frumious Bandersnatch
Posts: 7,534
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Quote:
chapter1 Title1 Page1
chapter2 Title2 Page2
which is not perfect, but it's not too bad either. |
|
04-02-2011, 04:57 PM | #6 |
Guru
Posts: 932
Karma: 15752887
Join Date: Mar 2011
Location: Norway
Device: Ipad, kindle paperwhite
|
Thank you, Jellby, that's a great idea. I have implemented your solution, and it is looking pretty good. Is there any other way to increase the space between Title1 and Page1 than using an em space or several spaces of type &nbsp;?
I have also come across another small problem with the title page: I would like to have the title and the author at the top of the title page. This I can manage without problems with ordinary <h> tags with margins. But I have not found a good way to get the publisher's name at the bottom of the page. The layout I'm hoping to achieve is something like this:
_______________
|****TITLE
|***AUTHOR
|
|
|
|
|
|
|
|**PUBLISHER
|
|_______________
The lines are supposed to indicate the page/screen. The stars (*) are white space. What I would like is that when I change the font size, the author name and publisher move towards each other (so that the content stays on one page as long as possible):
___________
|**TITLE
|*AUTHOR
|
|
|PUBLISHER
|
|__________
The aspect ratio is the same, but the fonts are bigger and the text lines closer to each other. When the font size is so big that all the content cannot fit on one page, I want a page break between author and publisher:
_______________
|****TITLE
|***AUTHOR
|
|
|
|
|_______________
_______________
|
|
|**PUBLISHER
|
|_______________
I have tried several ways to do this:
1) #publisher{ position:absolute;bottom:1em;}
2) #publisher{ position:relative;top:60%; }
3) body,html{ height:100%;} #publisher{ display:table-row; } and insert a span with display:table-cell and vertical-align:bottom and put a div around all this with { display:table;height:100%; },
but none of these has worked as expected. The first two give problems with overlap, plus they don't work in ADE and are discouraged in ePub as far as I can read from the forum. The third works in ADE, except that I can't get the publisher to the bottom of the page. Any tips on how to solve this? Is there anything I have not thought of, or done wrong? |
04-03-2011, 06:16 AM | #7 |
frumious Bandersnatch
Posts: 7,534
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
There's no way to have things at the bottom of the page, because there is no "page" concept in ePUB (other than for the @page margins). I'd just add some vertical space, and let it be.
|
04-03-2011, 02:36 PM | #8 |
Guru
Posts: 932
Karma: 15752887
Join Date: Mar 2011
Location: Norway
Device: Ipad, kindle paperwhite
|
That was what I was afraid of. I have also tried another approach:
1) CSS
html,body { height:95%;margin-top:0em;margin-bottom:0em;overflow:visible; }
.titlecell1 { padding-top:1em;padding-bottom:1em;min-height:60%; }
2) HTML
<div class="titlecell1" style="background-color:red;">
<h1 class="title1">TITLE</h1>
<h2 class="title2">AUTHOR</h2>
</div>
<h3 class="title3">PUBLISHER</h3>
And this works - almost. I get the title and the author inside the top div. This div has a min-height of 60%, so that it occupies at least 60% of the screen while still being allowed to grow when the font size increases and the content requires more than 60% of the screen. For small font sizes the last <h3> will sit nicely and quietly below the 60% top div. Thus it is possible to have the publisher in the bottom 40% of the screen. It will only be moved when the contents of the top div expand and require more than the top 60% of the page. The problem is that when the font size is so large (or the screen is so small) that all the content cannot fit on one page, a page break is inevitable. When that happens, the content of the last page is not visible. ADE shows TITLE and AUTHOR on one page, but on the next page I find the table of contents rather than the publisher. My guess is that the publisher disappears because I have set body to 100%, i.e. that the publisher is "below" the reading screen instead of being placed on the next screen. Could there be a workaround for this? |
04-04-2011, 02:47 AM | #9 | |
speaking for myself
Posts: 139
Karma: 2166
Join Date: Feb 2008
Location: San Francisco Bay Area
Device: PRS-505
|
Quote:
Personally, I do this sort of page (covers, title pages, etc.) with SVG scaled to fit exactly one page. Otherwise you have to invent yet another CSS layout strategy for every new layout. |
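As a rough illustration of the SVG approach, a title page could be sketched like this. It is not the poster's actual file: the 600x800 viewBox, the coordinates and the font sizes are arbitrary assumptions, and how well inline SVG is rendered depends on the reading system.

```html
<!-- Sketch only: the viewBox size, coordinates and font sizes are arbitrary. -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Title page</title>
</head>
<body>
  <div>
    <svg xmlns="http://www.w3.org/2000/svg" version="1.1"
         width="100%" height="100%" viewBox="0 0 600 800"
         preserveAspectRatio="xMidYMid meet">
      <!-- Title and author near the top of the drawing area -->
      <text x="300" y="120" text-anchor="middle" font-size="56">TITLE</text>
      <text x="300" y="200" text-anchor="middle" font-size="36">AUTHOR</text>
      <!-- Publisher near the bottom of the drawing area -->
      <text x="300" y="740" text-anchor="middle" font-size="28">PUBLISHER</text>
    </svg>
  </div>
</body>
</html>
```

The drawing is scaled as a whole to the available area, so the text keeps its relative position but no longer follows the reader's font size setting.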
|
04-04-2011, 04:01 AM | #10 | |
The Grand Mouse 高貴的老鼠
Posts: 72,278
Karma: 309002296
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
Quote:
Instead, you get text overlapping when the font gets too big for the page size. I don't know of a way to get the title page to fit nicely if it can, but to wrap to a new page if necessary. |
|
04-04-2011, 06:55 AM | #11 | ||
Guru
Posts: 932
Karma: 15752887
Join Date: Mar 2011
Location: Norway
Device: Ipad, kindle paperwhite
|
Quote:
You say that I could use XPGT and that you use SVG. What is the difference between SVG and XPGT? I know that there is an Adobe-specific stylesheet called page-template.xpgt, but I have never used it because I like to specify my own stylesheets in CSS. I have tried to read the file, but didn't understand much of it. Any pointers to sites where I can find documentation and syntax for the page template? As for SVG, if I understand correctly, SVG content will only be visible in Adobe and invisible in all other viewers? Quote:
Last edited by Iznogood; 04-04-2011 at 06:58 AM. |
||
04-04-2011, 07:35 AM | #12 | |
frumious Bandersnatch
Posts: 7,534
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Quote:
Note that using percent in heights (or vertical margins) will often not give you what you expect, because, unless you fix the height of some container block, there is no vertical length the percent can refer to. You'd expect it to refer to the screen height, but as far as I know, this is not in the spec, and there's no guarantee any reading software behaves like that. For the moment I'd advise you not to try to do any "clever" things with page layout and to use simple blocks with sensible margins instead; it's much less prone to breaking and can look relatively fine in most circumstances. |
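A minimal sketch of what "simple blocks with sensible margins" might mean for a title page. The class names and the margin values are only assumptions for the example; the margins are given in em so that they scale with the font size.

```html
<!-- Sketch only: class names and margin values are hypothetical. -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Title page</title>
  <style type="text/css">
    h1.title    { text-align: center; margin-top: 2em; margin-bottom: 0.5em; }
    h2.author   { text-align: center; margin-top: 0; margin-bottom: 0; }
    /* Plain vertical space instead of bottom positioning; no percent heights. */
    p.publisher { text-align: center; margin-top: 8em; }
  </style>
</head>
<body>
  <h1 class="title">TITLE</h1>
  <h2 class="author">AUTHOR</h2>
  <p class="publisher">PUBLISHER</p>
</body>
</html>
```

The publisher will not sit exactly at the bottom of the screen, but nothing overlaps or disappears when the content flows onto a second page.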
|
04-05-2011, 01:10 AM | #13 |
Guru
Posts: 932
Karma: 15752887
Join Date: Mar 2011
Location: Norway
Device: Ipad, kindle paperwhite
|
I think I'll follow that advice, Jellby. At least until there are more universal and cross-platform means to solve this problem. I'm looking forward to the ePub 3 mentioned by Peter Sorotokin and hope that this problem can be solved then. When can I expect ePub 3 to be released?
I have just a few more questions and some minor decisions to make, and then I'm finished with converting my first book to ePub.
-<i> or <em>? I have some words in my book that are emphasized with italics, and I am considering changing the markup for these to <em> instead of <i> as I have used until now. My little problem is: should I replace all i tags with em, and where is it semantically correct to use <i>? So far I have used <i> to emphasize dialog, mark up citations in text (a letter), mark references to book titles (he has also written <i>title</i> (...)) and style times (at <i style="font-size:0.8em;">11:49</i> we were finished (...)).
-<strong> Where would it be appropriate and semantically correct to use strong? Strong is a somewhat heavier emphasis than em, if I understand correctly?
-Ellipsis Is it best to use three periods, or … (the Unicode character behind &hellip;)?
I'm sorry to trouble the forum with trifles like these, but I have read quite a bit on the internet, and have found so many contradictory statements that I decided to ask the experts about this matter. Last edited by Iznogood; 04-05-2011 at 01:19 AM. |
04-05-2011, 01:29 AM | #14 |
Wizzard
Posts: 11,517
Karma: 33048258
Join Date: Mar 2010
Location: Roundworld
Device: Kindle 2 International, Sony PRS-T1, BlackBerry PlayBook, Acer Iconia
|
From what I've read (mostly @ W3C and various related usage evangelists), it would be semantically correct to use <i> in situations where you're reproducing portions of text which are for some unobviously related reason (such as purposes of style or visual separation) italicized.
I've seen anthologies and poetry books where brief intro text and/or comments on the work were italicized, while the actual story/poem was in regular font style, to make it clear they were separately written. Also epigraphs and the like. If the dialogue/narrative seems to be emphasized for reasons related to contextual meaning (foreign language phrases, someone being emphatic in their speech/observations, etc.), then <em> or <span> with style seems to be the way to go. (I freely admit that I'm lazy and rarely bother using either.) <cite> for citations will give the same stylistic effect but greater semantic usefulness if one day they make better search engines which can take advantage of that. Yes, <strong> goes where you would normally use bold, as being one step even more emphatic than italics. Hope this helps. ETA: I should add that a lot of the evangelism was rather tilted towards deprecating <i>, <b>,<u> etc. entirely in favour of completely semantic markup. But a vague consensus seems to be that if you don't know/can't tell why the text was italicized/bolded/underlined in the first place, just leave it as is and don't try to second-guess the original typesetter's intentions; just maybe add a few class attributes so you can easily group and distinguish the italicized blocks from the italicized single lines, or whatever. Last edited by ATDrake; 04-05-2011 at 01:36 AM. |
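To make the distinctions above a bit more concrete, here is a small sketch of how the different cases might be marked up. The sentences, class names and the lang value are invented for the example and are not from the book being converted; the <span> case follows the "span with style" option mentioned above.

```html
<!-- Sketch only: sample sentences and class names are hypothetical. -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Inline markup examples</title>
  <style type="text/css">
    cite         { font-style: italic; }   /* book titles */
    span.foreign { font-style: italic; }   /* foreign phrases, styled via a class */
    i.intro      { font-style: italic; }   /* italics kept from the print edition */
  </style>
</head>
<body>
  <!-- Italics in print with no clear semantic reason: keep <i>, add a class -->
  <p><i class="intro">A brief introductory note by the editor.</i></p>

  <!-- Emphatic speech -->
  <p>"I <em>told</em> you it would rain," she said.</p>

  <!-- Reference to a book title -->
  <p>He has also written <cite>Title</cite>.</p>

  <!-- Foreign-language phrase -->
  <p>They parted with a quiet <span class="foreign" xml:lang="fr">au revoir</span>.</p>

  <!-- Stronger emphasis, usually rendered bold -->
  <p><strong>Do not</strong> open the door.</p>
</body>
</html>
```

Keeping a class on every <i> makes it easy to restyle or re-tag one group of italics later without touching the others.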
04-05-2011, 01:59 AM | #15 | ||||
speaking for myself
Posts: 139
Karma: 2166
Join Date: Feb 2008
Location: San Francisco Bay Area
Device: PRS-505
|
Quote:
Quote:
Quote:
If SVG is not supported, you will still most likely see the text - but without formatting. What other viewers do you have in mind? Quote:
|
||||