Old 04-20-2018, 07:21 AM   #16
BobC
Fanatic
Posts: 571
Karma: 1156144
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
I had a look at the Wikisource version and I see it shows that it is derived from a version at Gutenberg.

That could account for how the Wikisource version has been split, though Wikisource spreads it across 23 files whereas Gutenberg manages with 10 (but then Gutenberg's automated approach seems to split without reference to chapters etc., probably simply by size).

There is no reason to assume the Wikisource version is split according to the original publication. However, looking at some of the files, they do seem to be split logically (no obvious split in the middle of a conversation, for instance) and would make a good guide to where to split.

As mentioned earlier in the thread, the Adelaide version doesn't have the italics that can be found in the Wikisource (or Gutenberg) versions. You takes your choice as to which is the best to use as a basis; they all have pros and cons. Gutenberg would probably need all the files merging into one and then re-splitting, but then you get the italics; Adelaide needs splitting and has no italics; and with Wikisource you need to download each section separately and then build them into a single EPUB. To what degree Wikisource and Adelaide have been proofed and corrected against a paper copy isn't clear, so how far they diverge from the Gutenberg text might be worth looking at before choosing.


BobC
Old 04-20-2018, 06:05 PM   #17
theducks
Well trained by Cats
Posts: 22,140
Karma: 22087764
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: K4NT, Galaxy Tab A
I agree, it seems the device is doing this.
Odd; it is not near the 260K (safe) break point at which a split is needed.

IMHO, it might be better to find a logical break in the story (a scene break) and split the section into 2 parts there.
Old 04-20-2018, 08:53 PM   #18
AlexBell
Wizard
Posts: 3,326
Karma: 10272564
Join Date: May 2008
Location: Launceston, Tasmania
Device: Sony PRS T3, Kobo Glo, Kindle Touch, iPad, Samsung SB 2 tablet
Quote:
Originally Posted by Tex2002ans

And Alex, like JSWolf said, once you split those files into separate HTML files, there just isn't much you can do. Each HTML file starts on its own page, you don't have control over that. It just so happens to be one of the quirks of reading systems we have to live with.
Thanks for all your other comments. But I'm afraid I don't agree with the above part of your response. There are no chapters in Cousin Pons. The Project Gutenberg version opens with the first line, and one can read to the end of the book without starting a new file. I've opened it up, and there are several HTML files within it. And Crutledge's version of A Distinguished Provincial in Paris in the MR library also does not have any chapters, and does include several HTML files containing successive parts of the story. But the text as read on my Sony is continuous, as it was in Balzac's original book.

I think the answer is in the content.opf file, and I am trying to decipher the structure of Crutledge's ebook. But it was done in Sigil (which I don't use), so it's hard to know what makes the difference.

And thanks to everyone else who responded to my post yesterday. So far as italics and diacritics are concerned I think it's part of my job to check with the original, and insert them where necessary.

Last edited by AlexBell; 04-20-2018 at 09:03 PM. Reason: Added more text
Old 04-20-2018, 09:30 PM   #19
Tex2002ans
Wizard
Posts: 1,058
Karma: 6000097
Join Date: Jul 2012
Device: Nook
Could you link over to the files? It's hard to make judgements on this stuff without seeing the actual code.
Old 04-21-2018, 04:51 AM   #20
BobC
Quote:
Originally Posted by AlexBell
Thanks for all your other comments. But I'm afraid I don't agree with the above part of your response. There are no chapters in Cousin Pons. The Project Gutenberg version opens with the first line, and one can read to the end of the book without starting a new file. I've opened it up, and there are several HTML files within it.
What are you using to read the Gutenberg version? I have a copy, and whatever viewer I use I get a new display page when I move between individual HTML files within the EPUB.

So, for instance,
Quote:
Well, I myself believe that there is an intelligence in works of art; they know art-lovers, they call to them—‘Cht-tt!’”
which is at the end of the first HTML file (OEBPS/@public@vhost@g@gutenberg@html@files@1856@1856-h@1856-h-0.htm.html), is on one page, then
Quote:
Mme. de Marville shrugged her shoulders and looked at her daughter; Pons did not notice the rapid pantomime.
which is at the beginning of the next HTML file (OEBPS/@public@vhost@g@gutenberg@html@files@1856@1856-h@1856-h-1.htm.html), starts on a new display page.

Of course, it is possible that by happenstance, depending on the font and the other display settings on your device, the end of one HTML file coincides with the end of a display page, giving the illusion of continuity between the separate HTML files that make up the EPUB. What is certain, however, is that each HTML file will start on a new display page. That is the nature of the viewer and has nothing to do with the OPF, which, amongst other things, controls the order in which the HTML files follow each other for display.
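
To make the OPF's role concrete, here is a minimal Python sketch (standard library only) that lists the content files of an EPUB in the order the spine puts them. The names spine_order and demo_epub and the tiny in-memory book are illustrative assumptions, not anything taken from the Gutenberg file. The sketch returns the manifest hrefs as written in the OPF, and it says nothing about page breaks, which the viewer decides.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container file and the OPF package file.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def spine_order(epub):
    """Return manifest hrefs in spine (reading) order for an EPUB path or file object."""
    with zipfile.ZipFile(epub) as zf:
        # META-INF/container.xml points at the OPF package file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        package = ET.fromstring(zf.read(opf_path))
    # The manifest maps ids to files; the spine lists ids in reading order.
    manifest = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", NS)
    }
    return [
        manifest[ref.get("idref")]
        for ref in package.findall("opf:spine/opf:itemref", NS)
    ]


def demo_epub():
    """Build a tiny, hypothetical EPUB skeleton in memory (no content files)."""
    container = (
        '<?xml version="1.0"?>'
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles></container>'
    )
    # The manifest deliberately lists part2 first; the spine sets the real order.
    opf = (
        '<?xml version="1.0"?>'
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0">'
        '<manifest>'
        '<item id="p2" href="part2.html" media-type="application/xhtml+xml"/>'
        '<item id="p1" href="part1.html" media-type="application/xhtml+xml"/>'
        '</manifest>'
        '<spine><itemref idref="p1"/><itemref idref="p2"/></spine>'
        '</package>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/container.xml", container)
        zf.writestr("OEBPS/content.opf", opf)
    buf.seek(0)
    return buf


print(spine_order(demo_epub()))
```

Pointing spine_order at a real EPUB file path instead of demo_epub() should list its HTML files in reading order, which is all the OPF contributes here.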

Are you aware that there is a copy of Cousin Pons at Archive.org? It is in the form of page images of the book and, as you say, it is just one long text without any obvious scene changes or chapters. I've used the DJVU versions of Archive.org texts previously when proofreading; the hidden text layer can be useful for locating where certain words occur so you can examine the original image of the page.

BobC
Old 04-21-2018, 06:59 PM   #21
AlexBell
Thanks Bob.

I use a Sony PRS T3. I'll go back again and check, and get back to you.

Yes, you were right and I was wrong. I'll have to crawl back into my little hole and start over.

Last edited by AlexBell; 04-21-2018 at 11:03 PM.
Old 04-22-2018, 05:56 AM   #22
BobC
The "new file = new page" behaviour is an intrinsic part of EPUB. If you have ever played with a monolithic format such as FB2, where the book is just one long file, you would understand why large files are a problem and why EPUB breaks them up into segments that a small handheld device can handle.

With FB2 and a long book such as the King James Bible or Seven Pillars of Wisdom, paging through the book becomes increasingly slow, and even opening it in the first place can be glacially slow or impossible.

While it might be possible to deal with a mega-sized file in FB2 or a single-segment EPUB on a desktop computer with gigabytes of RAM and the ability to use a swapfile, in reality that's not where you would want to read most ebooks.

The Terry Pratchett "Discworld" books tend to be, just like "Cousin Pons", unbroken by chapters and too large for a single file. These commercially produced books also have to decide artificially where to split the files, with the consequent new page.

As @theducks mentioned before, it's a matter of where to split your file and into how many parts. Personally I'd want a minimum of four to ensure maximum speed/performance; after that it's a matter of locating suitable places to do the splitting. The places Wikisource has chosen might give some hints.
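
As a rough illustration of choosing split points by size, the Python sketch below groups paragraphs into parts that each stay under a limit, breaking only between paragraphs. The function name split_paragraphs is hypothetical, and the default of 260,000 simply borrows the 260K figure quoted earlier in the thread; treating it as bytes of UTF-8 text is an assumption. It cannot find a scene break for you; it only guarantees that no split lands in the middle of a paragraph.

```python
def split_paragraphs(paragraphs, max_bytes=260_000):
    """Greedily group paragraphs into parts of at most max_bytes each.

    A single paragraph larger than max_bytes still gets a part of its own,
    because paragraphs are never cut in two.
    """
    parts, current, size = [], [], 0
    for para in paragraphs:
        n = len(para.encode("utf-8"))
        # Start a new part when adding this paragraph would exceed the limit.
        if current and size + n > max_bytes:
            parts.append(current)
            current, size = [], 0
        current.append(para)
        size += n
    if current:
        parts.append(current)
    return parts
```

The boundaries it returns are only candidates: moving each one to the nearest natural pause in the story is still a job for the person making the book.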

BobC
Old 04-24-2018, 02:15 AM   #23
Sarmat89
Zealot
Posts: 146
Karma: 2100000
Join Date: Nov 2015
Device: none
Quote:
Originally Posted by BobC
If you have ever played with a monolithic format such as FB2 where the book is just one long file you would understand why large files are a problem
If you use an event-based parser, you don't need excessive memory. All you need is to scan the file once and build the lookup table. Formats actually optimized for portable devices, like MOBI or PDB, have those tables pre-built.
The actual reason EPUB needs segmentation is a bad container format which doesn't allow random access, and the atrociously complex HTML+CSS+JS content format, which needs a DOM tree created just to make heads or tails of it.
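
A minimal sketch of the single-pass idea, assuming a well-formed XML book in which each part sits in a <section> element (as in FB2): Python's expat bindings report the byte offset of every start tag while streaming through the file, so a lookup table can be built without creating a DOM tree. The function name build_section_index is hypothetical, and a real reader would need more than section offsets.

```python
import io
import xml.parsers.expat


def build_section_index(fileobj, tag="section"):
    """Scan a binary XML file once and return byte offsets of each <tag> start."""
    offsets = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        # Inside a callback, CurrentByteIndex is the offset of the event's first byte.
        if name == tag:
            offsets.append(parser.CurrentByteIndex)

    parser.StartElementHandler = on_start
    # ParseFile reads the input in chunks instead of loading it whole.
    parser.ParseFile(fileobj)
    return offsets


sample = b"<book><section>one</section><section>two</section></book>"
index = build_section_index(io.BytesIO(sample))
# Each offset points at the "<" that opens a <section> start tag.
for offset in index:
    print(offset, sample[offset:offset + 9])
```

With such a table a reader could seek to a section's offset in an uncompressed file, though the table alone does not restore whatever parser state applies at that point.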
Old 04-24-2018, 04:55 AM   #24
AlexBell
Thanks again.

I've decided that I am going to learn how to design ePub3 ebooks, if I can, before ePub4 rears its head.

I have 'EPUB 3 Best Practices' by Matt Garrish and Markus Gylling on order, but it will take 2-3 weeks to get to Launceston. Can anyone recommend any other teaching resources for learning ePub3?

When I began to design ePub2 ebooks I made myself templates for chapter files and for the content.opf and toc.ncx files. I'd like to do the same for ePub3 ebooks if I can. Can anyone recommend a public domain ePub3 ebook, perhaps in the MR library, that I can download and open up to see what the files look like?
Old 04-24-2018, 08:07 AM   #25
Doitsu
Wizard
Posts: 4,278
Karma: 14242649
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by AlexBell
Can anyone recommend any other teaching resources for learning ePub3?
You could simply convert one of your epub2 books to an epub3 book with Sigil and the epub3 output plugin.

Quote:
Originally Posted by AlexBell
Can anyone recommend a public domain ePub3 ebook, perhaps in the MR library, that I can download and open up to see what the files look like?
O'Reilly offers some free epub3 books for download. For example:

The Little Book of HTML/CSS Coding Guidelines

The IDPF also offers several epub3 sample books for download:

EPUB 3 Samples Project
Old 04-24-2018, 09:28 PM   #26
AlexBell
Quote:
Originally Posted by Doitsu
You could simply convert one of your epub2 books to an epub3 book with Sigil and the epub3 output plugin.


O'Reilly offers some free epub3 books for download. For example:

The Little Book of HTML/CSS Coding Guidelines

The IDPF also offers several epub3 sample books for download:

EPUB 3 Samples Project
Thanks, that's most helpful.

I'm afraid I'm old and set in my ways, so I'd rather continue designing my ebooks myself than have Sigil do it for me. I'll download the IDPF sample books and open them up.


All times are GMT -4. The time now is 05:15 PM.

