03-25-2011, 06:53 PM   #1
Iznogood
Guru
Join Date: Mar 2011
Location: Norway
Device: Ipad, kindle paperwhite
Graphical layout vs. semantically "correct" XHTML code

I am quite new to ebooks and am working on a project to make an ePub out of an old book. I have scanned it and used OCR to extract the plain text, but the work of formatting this text remains. I write all the HTML code myself to have 100% control over the semantics, and I use CSS for the layout, but I have run into some problems concerning the table of contents.

I know that an ePub has a special file (the NCX, usually named toc.ncx) to take care of this, but since I am scanning an old printed book, I want the electronic version to be as close to the original as possible. I would therefore like to have a table of contents inside my ePub (as part of the spine and of the linear reading order) so that I can read the digital edition just as if it were printed on paper.

The table of contents should look something like this:

_______________Section1
_1__Title1_____________________________12
_2__Title2_____________________________20
_3__Title3_____________________________28
....
_______________Section2
10__Title10____________________________128
11__Title11____________________________140

All the underscores are supposed to be spaces, but I could not do that when posting. Note that the chapter titles are aligned with each other, the chapter numbers are right-aligned, and the page numbers are left-aligned. (I know that the concept of pages no longer makes sense, but I choose to keep them here since this is an old and rare book and since pages can be retained by the pagemap element.) The titles should be clickable (as hyperlinks). I am also thinking of making the page numbers clickable.

The main goal is that the table of contents should look exactly like the printed version, but I want to use semantically correct markup as well. I can easily create this layout with a table, but I doubt that a table is the best way, semantically speaking, to represent the contents of a book.
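
For reference, the table variant would look roughly like the sketch below. The class names and file names are placeholders made up for illustration, not my actual markup:

```html
<style type="text/css">
  /* Hypothetical class names; widths would need tuning to match the print layout */
  table.toc td.num   { text-align: right; padding-right: 1em; }
  table.toc td.title { text-align: left; }
  table.toc td.page  { text-align: left; padding-left: 2em; }
  table.toc th       { text-align: center; }
</style>

<table class="toc">
  <!-- Section heading spans all three columns -->
  <tr><th colspan="3">Section1</th></tr>
  <tr>
    <td class="num">1</td>
    <td class="title"><a href="chapter01.xhtml">Title1</a></td>
    <td class="page">12</td>
  </tr>
  <tr>
    <td class="num">2</td>
    <td class="title"><a href="chapter02.xhtml">Title2</a></td>
    <td class="page">20</td>
  </tr>
</table>
```

This gives the three aligned columns almost for free, which is why it is tempting despite the semantic doubts.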

I have tried an ordered list (<ol>), but since I need one list of chapters for each section, the chapter numbers begin at 1 again in the second list, and I have not been able to set the start value of an <ol>.
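
A minimal sketch of the numbering problem, with placeholder headings and file names (assumed for illustration only):

```html
<h2>Section1</h2>
<ol>
  <li><a href="chapter01.xhtml">Title1</a></li>
  <li><a href="chapter02.xhtml">Title2</a></li>
  <li><a href="chapter03.xhtml">Title3</a></li>
</ol>

<h2>Section2</h2>
<ol>
  <!-- This item is rendered as "1.", but the printed book numbers it 10 -->
  <li><a href="chapter10.xhtml">Title10</a></li>
  <li><a href="chapter11.xhtml">Title11</a></li>
</ol>
```

Each <ol> keeps its own counter, so the second list has no knowledge of where the first one stopped.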

I have also tried using spans (the chapter number in one span, the chapter title in another, and the page number in a third) and styling each part with different CSS rules (adjusting float, display: block, display: inline-block, margins and so on) to make something that looks correct in a browser. I have not yet tried packaging it as an .epub.
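
Roughly, the span variant looks like the sketch below. The class names, widths and link targets are assumptions chosen for illustration, and I have only checked this kind of markup in a browser, not on a reading device:

```html
<style type="text/css">
  /* Hypothetical class names; fixed widths emulate the columns of the printed page */
  p.toc-entry    { margin: 0; }
  span.toc-num   { display: inline-block; width: 2em; text-align: right; }
  span.toc-title { display: inline-block; width: 20em; margin-left: 1em; }
  span.toc-page  { display: inline-block; text-align: left; }
</style>

<p class="toc-entry">
  <span class="toc-num">1</span>
  <span class="toc-title"><a href="chapter01.xhtml">Title1</a></span>
  <span class="toc-page"><a href="chapter01.xhtml#page12">12</a></span>
</p>
<p class="toc-entry">
  <span class="toc-num">10</span>
  <span class="toc-title"><a href="chapter10.xhtml">Title10</a></span>
  <span class="toc-page"><a href="chapter10.xhtml#page128">128</a></span>
</p>
```

The column alignment here depends entirely on the reader honouring inline-block and the fixed widths, which is exactly the part I am unsure about.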

From experience I know that each reading device renders layout differently. Most readers will display the contents graphically correctly if I use a table, but many will have trouble with floated paragraphs and with different display and visibility values. Nevertheless, I think that a table is the least semantically correct of my options, and I would like to hear what views you experts have on this before I decide. What is being used "out there" to represent contents, and are there any smart ways I have not thought of yet? Any tips and/or comments are most welcome.