#46 | |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
Quote:
Dale |
|
#47 |
Connoisseur
Posts: 54
Karma: 29
Join Date: Oct 2006
|
Those pages I added were time-consuming, but mainly because I was figuring out the layout. I do plan on working through the whole book, but I haven't found a plain-text version available, so I am OCRing the PDF from archive.org. This is currently the slowest part, as I am proofing the text and converting the quotes and dashes over.
Right now it's more the challenge of seeing how it could be done and figuring out any quirks that may crop up. For example, if you increase the display font size in your browser, the pages expand lengthwise to accommodate it. It just runs into problems with items that are specifically positioned, such as the table of contents. I think I'll continue playing with this and see what I can come up with.
Last edited by sartori; 11-06-2007 at 08:32 PM. |
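The quote and dash conversion mentioned above could be roughed out in a few lines of Python. This is only a sketch under simple assumptions, not the poster's actual workflow: the function name `smarten` is made up for illustration, a quote counts as "opening" only at the start of the text or after whitespace, an opening bracket, or a dash, and every double hyphen is treated as an em dash. Leading apostrophes (as in 'tis) will come out wrong, so the result still needs proofing.

```python
import re


def smarten(text: str) -> str:
    """Convert straight quotes and double hyphens to typographic forms."""
    # Double hyphens, common in OCR output and plain-text sources, become em dashes.
    text = text.replace("--", "\u2014")
    # A double quote at the start, or after whitespace, "(", "[" or an em dash, opens.
    text = re.sub(r'(^|[\s(\[\u2014])"', "\\1\u201c", text)
    # Every remaining double quote closes.
    text = text.replace('"', "\u201d")
    # Same rule for single quotes; an opening double quote may also precede them.
    text = re.sub(r"(^|[\s(\[\u2014\u201c])'", "\\1\u2018", text)
    # Every remaining single quote is a closing quote or an apostrophe.
    text = text.replace("'", "\u2019")
    return text


if __name__ == "__main__":
    print(smarten('"It\'s done," he said--finally.'))
```

Running the dashes first matters, because the quote rules then treat a quote that follows a dash as an opening quote.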
#48 | |
creator of calibre
Posts: 45,229
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
|
|
#49 | |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
Quote:
http://books.google.com/books?id=j-s...est+literature They have apparently OCRed the text, as you can "view text" for each individual page. Sadly, the downloadable PDF doesn't include the OCRed text. That would have saved you some effort. |
|
#50 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
|
#51 |
Connoisseur
Posts: 54
Karma: 29
Join Date: Oct 2006
|
kovidgoyal,
So if I were to create a secondary CSS file that hides all the page breaks and page numbers and just displays the text with simple formatting (i.e., justified, centered, different sizes), would html2lrf be able to create a decent-looking LRF from the file? |
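For comparison, the same "plain" result could also be approached by pre-processing the HTML rather than hiding things with CSS. The sketch below is an alternative technique, not what the question proposes and not anything html2lrf does: it uses Python's standard `html.parser` to drop elements carrying certain classes. The class names `pagenum` and `pagebreak` and the names `HiddenElementFilter` and `strip_hidden` are hypothetical, it assumes reasonably well-formed markup (hidden elements are properly closed), and it silently drops comments and the doctype.

```python
from html.parser import HTMLParser

# Hypothetical class names; substitute whatever the page markup really uses.
HIDDEN_CLASSES = {"pagenum", "pagebreak"}
# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenElementFilter(HTMLParser):
    """Copy HTML through, dropping elements whose class marks them as hidden."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a hidden element

    def _is_hidden(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return any(c in HIDDEN_CLASSES for c in classes)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth or self._is_hidden(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a nesting level.
        if not self.skip_depth and not self._is_hidden(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_hidden(html: str) -> str:
    """Return the markup with hidden-class elements and their contents removed."""
    parser = HiddenElementFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, `strip_hidden('<p>Some text<span class="pagenum">12</span> continues.</p>')` returns `<p>Some text continues.</p>`. Keeping a single source file and a second stylesheet, as the question suggests, avoids maintaining two copies of the text; a filter like this only makes sense if the converter ignores the stylesheet.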
#52 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
Hey, did you check Gutenberg? I just saw that they have six volumes.
http://www.gutenberg.org/browse/authors/w#a993 |
#53 | |
Connoisseur
Posts: 54
Karma: 29
Join Date: Oct 2006
|
Quote:
|
|
#54 | |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
Quote:
Well, good luck with the project. What you have so far looks very nice. |
|
#55 |
Enthusiast
Posts: 34
Karma: 336
Join Date: Dec 2006
Location: Texas
Device: Sony Reader
|
I'm rather surprised that my (admittedly minor) point has generated such a discussion, so allow me to make one or two more:
Scholarly citation is meant to serve two main purposes:
1. establish the authority for a reference so that if someone cares to check your accuracy or honesty, the location of the quotation or reference can be pinpointed and verified;
2. provide a context for a quotation or reference so that the reader can understand the total argument or occasion to which it belongs.
I am convinced that electronic forms of delivery will ultimately prevail; if future readers can locate the exact source with ease (perhaps even greater ease than was possible in the print world--hyperlinks, search engines, whatever works), then we don't need page numbers. We do need to know how closely the electronic version resembles its print source.
However, there is sometimes more information in a print or handwritten source than can be easily captured in its digitized version. Medieval manuscripts, an English scholar realized recently, can sometimes be dated and associated more precisely by using DNA information from their parchment (a.k.a. sheepskin) and ink media. Yet, as the digitization of the Beowulf manuscript also showed, high-resolution and other scanning techniques can also reveal aspects of the original that would otherwise be impossible to recognize. When you've got only one copy (like the Beowulf manuscript), you need all the help you can get. So the original is irreplaceable for the scholar, in many cases, because its verbal content is only part of the information it contains.
Perhaps in the future we will find a way to capture all the information we are likely to need for the foreseeable future, but then there are always surprises, as the identification of parchment provenance using DNA analysis illustrates. At some point we'll simply have to draw the line and admit that we can't do everything; some information will have to be lost. The goal of the user of a particular document will determine if that loss is critical, incidental, or trivial. For most of us, it won't matter.
But for archeologists of the text, it will. |
#56 |
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
|
panurge said:
> then we don't need page numbers.

we still need them, because prior aspects of the record use them. we cannot forfeit all those earlier pointers...

> We do need to know how closely
> the electronic version resembles its print source.

and, for that, we need to sync the two. by page number. (because, realistically, what else are we going to use?)

> there is sometimes more information
> in a print or handwritten source
> than can be easily captured in its digitized version.

that's a different problem. but we always had that one. there's no substitute for access to the original, at least for some things. still, for a good many _other_ things, access to a digital copy is better than nothing, _much_ better than we used to have (i.e., which was nothing...)

if you have feedback on the numerous examples i gave, i'd love to hear it. if not, that's fine too...

-bowerbird |
#57 |
creator of calibre
Posts: 45,229
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It won't display the hidden elements. Whether the resulting LRF will look good or not depends on the kind of HTML you use. But I'm always willing to add support for more esoteric HTML to html2lrf, within reason :-)
|
#58 |
Connoisseur
Posts: 54
Karma: 29
Join Date: Oct 2006
|
OK, thanks. I think I'll play around with this tomorrow and see if I can come up with a 'plain' CSS version of the same page.
|
#59 |
Enthusiast
Posts: 34
Karma: 336
Join Date: Dec 2006
Location: Texas
Device: Sony Reader
|
[> then we don't need page numbers.
we still need them, because prior aspects of the record use them. we cannot forfeit all those earlier pointers...
> We do need to know how closely
> the electronic version resembles its print source.
and, for that, we need to sync the two. by page number. (because, realistically, what else are we going to use?)]

Page numbers are simply a way of keeping track of pages. The earliest printed books don't have them. For incunabula, the books published in the second half of the 15th century, there were numbers, not of pages but of groups of pages, so that when the book was put together for binding the sections would not be out of order. Manuscripts may or may not have page numbers. Sometimes the first word of the following page was printed (or written) at the bottom of the preceding page to establish sequence.

What really counts, for the most part, is textual accuracy--that is, identity of the two texts. For routine purposes, one wouldn't have to refer to the original if the electronic copy were certifiably accurate. But there's the rub, perhaps. When I edit an older text, say an unprinted manuscript, I'm not usually obliged to give its original page numbers. I just need to identify the original source and signal each time I depart from its authority (for example, to correct an obvious error in spelling or printing).

The scholarly world has had many ways of ensuring synchronization between two texts; page numbers are one but not the only one. Of course they are helpful, but historically printers have sometimes ignored them. In the case of Greek and Latin texts, individual passages were identified by paragraph and sentence numbering, and that is still used among classicists today, as was observed above. So, yes, I agree that page numbers are useful for synchronizing two versions of a text; in the case of verse, however, we go by line numbers and larger divisions or sections of the poem. So the physical page isn't always what matters.
My only intention in bringing up this matter was to point out that digitization of books in the future may not be as simple a matter as we would like and that there is no one solution that will fit some of these odd cases. Nor will past practice always be a reliable guide to what will work in the future. At some point electronic texts will be recognized as the accepted authority, and page numbers will no longer matter; for us, in a time of transition, they still do on occasion, depending on our relationship to what we're reading.

Let me say that as someone who guards, keeps track of, and preserves books from harm, I'm delighted to see such a vigorous discussion about how to address the problem and find solutions. We are in a time of tremendous change that will have at least as much impact on the distribution of information as resulted from the invention of moveable type, and groups like this one are at the forefront because they include not simply programmers and designers but regular readers and enthusiasts who understand the users' needs. More power and glory to them. |
#60 |
Enthusiast
Posts: 34
Karma: 336
Join Date: Dec 2006
Location: Texas
Device: Sony Reader
|
Perhaps I should have also said "because they include not simply regular readers and enthusiasts but also programmers and designers." I'm looking forward to examining all the examples that have been posted in this thread as soon as I can get the time to do so.
|