Old 12-30-2008, 05:02 PM   #1
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Creating multiple ebook formats from same source files!

I want to improve the way I make .imp ebooks from (rich) .html and up to now my MO has been to perfect what I would like displayed on my hardware reader ebook-wise (pardon the pun) and then make other ebook formats for uploading here. Other formats include .prc/.mobi (Mobipocket) and .lrf/.epub (Sony).

Currently, if I want to produce a .imp, .prc and .lrf (and in the future .epub) ebook for uploading here, then I will "nail it" using eBook Publisher for the .imp ebook, then use a copy of the .opf with Mobipocket Creator and finally a command-line Calibre program, either html2lrf.exe or opf2html.exe.

However, I see from using the (software) Mobipocket Reader and Calibre lrfviewer/ebook-viewer that my other ebook formats suffer from shortcomings in the .html source files I used.

As an example, I converted a PG offering entitled Little Stories for Little Children by Anonymous using the HTML .zip (22896-h.zip). I produced a .imp/.epub (using a beta eBook Publisher), .prc (using Mobipocket Creator) and .lrf/.epub (using Calibre v0.4.121). I attach as Little Stories for Little Children.zip my source files as revised by me including the original 22896-h.htm as well as a simple "diffs" .txt file to see what I changed.

Now to the problems:
  • what appears centered in a .imp ebook is right-aligned sometimes when using Mobipocket Reader,
  • the <pre> tag is not well supported in Mobipocket ebooks,
  • font sizes seem larger in Sony ebooks (though I think the user can control this; just the software viewer doesn't)
  • <hr>/paragraph breaks/<h1> are exaggerated, etc...
  • Conversely, when I want to signify a page-break, my .imp reader doesn't understand <p style="page-break-after: always"> (which is theoretically better) and only recognizes <p style="page-break-before: always">.
  • Yuk!
What I would like to find out is what works best for ALL ebook formats. A basic "hit list" of html constructs to use and/or avoid.

I would appreciate any comments from those that prepare .prc/.mobi ebooks from .html on how they would change my source .html files to better make a Mobipocket ebook. The same goes for those who make Sony ebooks from .html.

In the end, I hope we can standardize the creation of ebooks from .html so that the best possible ebooks can be created from a single (multi-purpose) source.

Please upload any better Mobipocket or Sony (or eBookwise) ebook you can make using my source below (please upload source/diffs as well).

Any thoughts in this regard?

Last edited by nrapallo; 12-30-2008 at 06:18 PM. Reason: typo
Old 01-04-2009, 12:25 AM   #2
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Perhaps not too many people use .html as a source; BD seems so popular amongst the veteran uploaders here.

For those that do work with .html as a source, be sure to read the Project Gutenberg guidelines on producing their .html ebooks here.

Especially useful for those old (the format, not the ebook creators) .txt die-hards is Section H.13, "How can I make a HTML version from my plain text file?", therein!
Old 01-04-2009, 09:21 AM   #3
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,495
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
I've taken a quick look at your source and compared it to the original to see what you've changed, and what you've left.

For the few Mobipocket ebooks I've created, I tend to work from HTML too.

I'd advise simplifying the HTML even more - getting rid of the page numbers, for example, as Mobipocket doesn't seem to obey the invisible attribute.
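
A rough sketch of stripping the page numbers, in Python. It assumes they are marked up as <span class="pagenum">...</span> with no nested spans, which is an assumption about the source file and should be checked first; the function name is made up for illustration. Note that it also drops any anchors sitting inside those spans, so links that point at page anchors would need fixing afterwards.

```python
import re

# Assumption: page numbers live in <span class="pagenum">...</span> with no nested <span>.
_PAGENUM = re.compile(r'<span\s+class="pagenum"[^>]*>.*?</span>',
                      re.IGNORECASE | re.DOTALL)

def strip_page_numbers(html: str) -> str:
    """Remove every page-number span, contents included; other spans are untouched."""
    return _PAGENUM.sub("", html)
```

For anything less regular than that, an HTML parser would be a safer tool than a regular expression.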

More specifically, in the CSS I'd eliminate any body text font size specification, and the justification of the body text, and any general margins on the body text. Those choices ought to be left up to the reader through the software.

For identifying links in the document, Mobipocket prefers the use of id rather than name.
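
A minimal sketch of that name-to-id change, in Python. It assumes anchors are written as <a name="..."> with double-quoted values and that the tag has no id already; the function name is hypothetical, not part of any tool mentioned here.

```python
import re

# Matches a double-quoted name attribute inside an opening <a ...> tag.
_ANCHOR_NAME = re.compile(r'(<a\b[^>]*?\s)name="([^"]*)"', re.IGNORECASE)

def anchors_name_to_id(html: str) -> str:
    """Rewrite <a name="x"> as <a id="x">, leaving the other attributes alone."""
    return _ANCHOR_NAME.sub(r'\1id="\2"', html)
```

Links of the form href="#x" keep working unchanged, since the fragment matches either attribute.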

I always use page-break-before because I specify it for certain headings (e.g. chapter headings), which seems to make more sense than adding it to the last paragraph of a chapter.

On a stylistic note, I prefer to specify no space between paragraphs (by setting margin-top and margin-bottom to 0em), but to have a text indent on the first line of each paragraph, except the first paragraph of each chapter. But that is a personal preference.

I do like the idea of coming up with a 'generic' HTML format that works with the software to create multiple formats. I've avoided creating LRF or other files, because I can't test how they'll look.

For Mobipocket, you need an extra item in the opf file in the metadata part of the manifest:

<x-metadata><EmbeddedCover>images\illus-0001-1.jpg</EmbeddedCover></x-metadata>

and also a guide item to the table of contents is a good idea, just before the end of the package

<guide><reference type="toc" title="Table of Contents" href="Little%20Stories%20for%20Little%20Children.htm%23contents"></reference></guide>
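
If the opf is being produced by a script, the guide item can be added programmatically. The following is only a sketch using Python's standard xml.etree.ElementTree; the function name add_toc_guide and its parameters are made up for illustration, and the href is passed through exactly as given.

```python
import xml.etree.ElementTree as ET

def add_toc_guide(opf_xml: str, href: str, title: str = "Table of Contents") -> str:
    """Add a toc <reference> to the package's <guide>, creating the <guide> if needed."""
    root = ET.fromstring(opf_xml)
    # Reuse the namespace of the root <package> element, if it has one.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    guide = root.find(f"{ns}guide")
    if guide is None:
        # SubElement appends, so the new <guide> lands just before </package>.
        guide = ET.SubElement(root, f"{ns}guide")
    ET.SubElement(guide, f"{ns}reference",
                  {"type": "toc", "title": title, "href": href})
    return ET.tostring(root, encoding="unicode")
```

The serialized output may differ cosmetically from the input (attribute quoting, namespace prefixes), so it is worth checking the result in Mobipocket Creator before relying on it.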


Anyway, I attach a zip of my html,opf & images (for which I corrected the white point), along with my prc.

Paul

Quote:
Originally Posted by nrapallo View Post
I want to improve the way I make .imp ebooks from (rich) .html

[...]

Now to the problems:
  • what appears centered in a .imp ebook is right-aligned sometimes when using Mobipocket Reader,
  • the <pre> tag is not well supported in Mobipocket ebooks,
  • font sizes seem larger in Sony ebooks (though I think the user can control this; just the software viewer doesn't)
  • <hr>/paragraph breaks/<h1> are exaggerated, etc...
  • Conversely, when I want to signify a page-break, my .imp reader doesn't understand <p style="page-break-after: always"> (which is theoretically better) and only recognizes <p style="page-break-before: always">.
  • Yuk!
What I would like to find out is what works best for ALL ebook formats. A basic "hit list" of html constructs to use and/or avoid.

I would appreciate any comments from those that prepare .prc/.mobi ebooks from .html on how they would change my source .html files to better make a Mobipocket ebook. The same goes for those who make Sony ebooks from .html.

In the end, I hope we can standardize the creation of ebooks from .html so that the best possible ebooks can be created from a single (multi-purpose) source.

Please upload any better Mobipocket or Sony (or eBookwise) ebook you can make using my source below (please upload source/diffs as well).

Any thoughts in this regard?
Attached Files
File Type: zip Little_Mobipocket.zip (653.7 KB, 465 views)
File Type: prc Little Stories for Little Children.prc (664.4 KB, 419 views)

Last edited by pdurrant; 07-20-2009 at 12:29 PM. Reason: yes, id not is
Old 01-04-2009, 09:51 AM   #4
ProDigit
Karmaniac
Posts: 2,553
Karma: 11499146
Join Date: Oct 2008
Location: Miami FL
Device: PRS-505, Jetbook, + Mini, +Color, Astak Ez Reader Pro, PPW1, Aura H2O
I generally work with MS Word, and save as a doc file, or occasionally as HTML from there.
This is good enough for standard books.

Books with a lot of references (like the Bible, or an encyclopedia) are better edited in HTML: it takes longer to edit and finalize, but you're better able to find formatting errors and to custom-tailor the HTML.

BD generally strips the HTML of all excess info apart from the body text, and puts that into 'header1', 'header2', or paragraph text.
So there's not really a reason not to use a doc instead of html (for a normal book). I mean, there's really not that much HTML in a normal book.

BD is not capable of calling a "header2" (<H2>) a subtitle. But generally Header1 will become a chapter title.

The last thing I've learned BD recognizes is references and bookmarks.
"a href" and "a name".

MS Word creates a lot of overload on HTML files, in case you plan on creating the HTML from there, and it often takes a lot of pruning.
Often you can save about 15-25% of space, just removing unnecessary data from MS Word HTML's.
I prefer creating them in openoffice writer, since it tends to leave less of a mess behind.

I've also been thinking of publishing the html sources, since apart from a Sony Reader I have no device to compare my ebooks on, and I generally only release the LRF file.

Besides, probably like others on the forums, I find the (hand) creation of an LRF file already takes time enough, and there are probably many out there who won't mind sharing the original sources for others to convert.
If it were as simple as just running it through a converter it would be ok; however, I see many books posted by people on the forum with very lousy formatting!

I'm not talking about people just starting out posting books who don't have it 100% together yet, like those with some border issues or font size issues.

I mean those uploading files that look almost as if a text file with a few added pictures was put into an LRF jacket and published.
Sadly, some of the best uploaders also have some of the worst formatting in their books. It may be because of automatic conversions.
I mean: titles are not aligned and start mid-page, pagebreaks are missing, text has lousy formatting (the last word of a line always appears on the next line), the typeface is just TOO BIG to read comfortably in medium or large (on a Sony PRS-505, e.g., only 10-15 words fit a screen in Large view), etc...
Maybe conversion tools have improved in the last months, and can faithfully convert one format into another without loss of formatting quality. After all, most of the formats are for 800x600 screen resolutions, so formatting, fonts and sizes should not differ much from one ebook to another.

I'd honestly prefer a lot of the files uploaded to this site, to be removed and reformatted by hand.
Because apart from the text, which you can read, the formatting as well as the covers are just done horribly on many books...

I mean, one of the posting guidelines is to not post a book if it only took you a few minutes of work to create it. Then there would be no benefit in uploading the files, and you might as well just read the txt or html file on your reader, downloaded directly from the Gutenberg (or similar) website(s).

Last edited by ProDigit; 01-04-2009 at 10:18 AM.
Old 01-04-2009, 11:05 AM   #5
RWood
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
I seldom use HTML to produce ebooks. I have used BD for a long time and I have had more problems with HTML sources than any other format. DOC or RTF seem to work better, and even a well-prepared TXT file seems better suited than HTML. I use the BD TOC functions rather than importing any TOC from the outside.

That said, I reviewed the LRF output and there are several things I would have liked to see that I did not, and several things I saw that I would have liked not to see.

While I did see page breaks for the main stories, I did not see page breaks for the TOC, title, and other front material. This produced a run-on situation with the title of the book split between the bottom of one page and the top of the next.

I saw all of the original page references. Many of the PG HTML sources place these on the side, away from the body of the text, and that is fine. Given the narrow column width of most readers, though, this is not viable on a device. Some PG HTML sources put them inline (as in this LRF output); for long pages of text that is not too bad, but here, with very short pages, it is a major intrusion. Some PG texts have even put the page number within a word when it is split over two pages.

I wish you all the possible success with the project Nick. If anyone can pull it off, I believe you can.
Old 01-04-2009, 11:13 AM   #6
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
@pdurrant
Thanks for the feedback for creating Mobipocket ebooks from HTML. I gather there are two areas of concern, the underlying .html and .opf coding. I will look at your changes and see what I can use/compromise/discard when I re-make a .imp ebook from it. It will better help find that "common ground".

@ProDigit
HTML (even filtered HTML) from Word can be a "nightmare". I usually try running the Word HTML through TidyGUI and select the Configuration setting to handle that the source file is from Word 2000+. I then search and replace large spans of my default formatted paragraphs to strip the in-line style <p style="">. It does a decent job, but not as good as initially starting from .html.
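
The search-and-replace step can also be scripted. Here is a blunt sketch in Python: it strips every double-quoted style attribute from <p> tags, which is cruder than replacing only the default-formatted paragraphs, so keep a copy of the original. The function name is made up for illustration.

```python
import re

# Matches a double-quoted style attribute inside an opening <p ...> tag.
_P_STYLE = re.compile(r'(<p\b[^>]*?)\s+style="[^"]*"', re.IGNORECASE)

def strip_p_styles(html: str) -> str:
    """Drop inline style attributes from <p> tags; other tags and attributes are kept."""
    return _P_STYLE.sub(r'\1', html)
```

Other elements such as <pre> or <span> are deliberately left alone, since their inline styles often carry formatting you actually want.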

One trick I've used in the past with great success is when starting with a webpage or similar .html, copy the displayed text and paste it into a blank email created by MS Outlook Express 6. Then click the Source tab at the bottom of your email message and Select All and Copy that HTML 4.0 code into a new text file opened by your favorite text editor. Then save that file as your starting HTML base for the ebook. It avoids using Word, but requires some quick search and replaces to get rid of http:// references to images and requires you to manually copy images to the source directory. Just food for thought.

Oh, by the way, any chance of showing me what you would/could change to make my above .lrf/.epub version better suited to your own preferences? Can you post a .lrf, with source, to see the results?
Old 01-04-2009, 11:24 AM   #7
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by RWood View Post
I seldom use HTML to produce ebooks. I have used BD for a long time and I have had more problems with HTML sources than any other format. DOC or RTF seem to work better, and even a well-prepared TXT file seems better suited than HTML. I use the BD TOC functions rather than importing any TOC from the outside.
It is clear that HTML and BD do NOT intermix!!! I've never had my HTML processed properly by BD, thus my avoidance of same.

Quote:
That said, I reviewed the LRF output and there are several things I would have liked to see that I did not, and several things I saw that I would have liked not to see.

While I did see page breaks for the main stories, I did not see page breaks for the TOC, title, and other front material. This produced a run-on situation with the title of the book split between the bottom of one page and the top of the next.
Yeah, I kind of "ignored/forgot" those page-breaks as I got them for free when I resized the large images to display on my reader and they "filled" up the page so that the next line DID appear on the next page. I noticed it too on the .lrf/prc ebook version that they were "missing". Good point!

Quote:
I saw all of the original page references. Many of the PG HTML sources place these on the side, away from the body of the text, and that is fine. Given the narrow column width of most readers, though, this is not viable on a device. Some PG HTML sources put them inline (as in this LRF output); for long pages of text that is not too bad, but here, with very short pages, it is a major intrusion. Some PG texts have even put the page number within a word when it is split over two pages.
Page references are a personal preference. I don't mind them, but it does take some effort to remove them properly, which I wasn't ready to expend for this simple test. But I do see that it would be one of the items that should be addressed when converting HTML to ebooks. I don't mind removing them either, but it's a shame to have to undo all the hard work done by the PGDP to put them there in the first place.

In the end, I do agree that they do not belong in the ebook version though.

Quote:
I wish you all the possible success with the project Nick. If anyone can pull it off, I believe you can.
Thanks, we've already got a great start, and I think it will improve as more and more people chime in!
Old 01-04-2009, 11:46 AM   #8
ProDigit
Karmaniac
Posts: 2,553
Karma: 11499146
Join Date: Oct 2008
Location: Miami FL
Device: PRS-505, Jetbook, + Mini, +Color, Astak Ez Reader Pro, PPW1, Aura H2O
Quote:
Originally Posted by RWood View Post
...

That said, I reviewed the LRF output and there are several things I would have liked to see that I did not, and several things I saw that I would have liked not to see.

While I did see page breaks for the main stories, I did not see page breaks for the TOC, title, and other front material. This produced a run-on situation with the title of the book split between the bottom of one page and the top of the next.
...
When I create an LRF in BD, it automatically adds pagebreaks on the TOC and Title.
Even if I remove the lines (<HR> in html), the TOC and Title/Author always appear on a separate page regardless.

So I generally remove them, hoping to save some space.
Old 01-04-2009, 11:47 AM   #9
mtravellerh
book creator
Posts: 9,635
Karma: 3856660
Join Date: Oct 2008
Location: Luxembourg
Device: PB360°
I always use Html files as source files myself. But I am coming from a different point of view as I do (or rather did) first and foremost Mobi books. And you are perfectly right: <pre> is not very well supported at all by mobi. I make my htmls as simple as possible, using mostly only header and paragraph texts. If I need special formatting, I use breaks, aligns, bigger and smaller fonts. These 3 work fine with all formats and keep it simple.

I avoid BD for the same reason Nick does. BD tends to play around with my perfectly good HTML code.

I used to create a mobi file first and then LRF and IMP by importing that PRC file into Calibre and Mobi2IMP. I have changed that method somewhat now by using my HTML source file with Calibre and thus creating LRF and Epub, because epubs are easier and better to create that way.
Old 01-04-2009, 12:48 PM   #10
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by nrapallo View Post
I want to improve the way I make .imp ebooks from (rich) .html and up to now my MO has been to perfect what I would like displayed on my hardware reader ebook-wise (pardon the pun) and then make other ebook formats for uploading here. Other formats include .prc/.mobi (Mobipocket) and .lrf/.epub (Sony).
I'm working on an "oeb2mobi" for Calibre which will allow the generation of Mobipocket books from OEB XHTML+CSS content. After that, I want to try to shore up differences among the outputs of Calibre's various target format conversions. The goal would be to allow book creators to use one standards-compliant HTML source and produce books which look "the same as possible" in all the various output formats. No IMP support yet, but perhaps one day...
Old 01-04-2009, 12:54 PM   #11
zelda_pinwheel
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
Quote:
Originally Posted by llasram View Post
I'm working on an "oeb2mobi" for Calibre which will allow the generation of Mobipocket books from OEB XHTML+CSS content. After that, I want to try to shore up differences among the outputs of Calibre's various target format conversions. The goal would be to allow book creators to use one standards-compliant HTML source and produce books which look "the same as possible" in all the various output formats. No IMP support yet, but perhaps one day...
hallelujah !!!! that is the dream. please let us know how it is coming along. karma for that, my friend.

and as for imp support, it would be nice, but as long as we can use nick's mobi2imp, to me it is a secondary priority.
Old 01-04-2009, 01:31 PM   #12
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by zelda_pinwheel View Post
hallelujah !!!! that is the dream. please let us know how it is coming along. karma for that, my friend.
Thank you!

The Mobipocket "container" support is pretty much all done, including a few features I don't think other open source Mobipocket generators have (UTF-8 encoded content and "uncrossable" boundaries for "non-linear" content like footnotes and the table of contents).

Rendering HTML+CSS into Mobipocket mark-up is... going. Not to rag on Mobipocket unnecessarily, but the Mobipocket HTML rendering engine is more limited and quirkier than I would have thought possible. My basic strategy is to emulate full CSS-based rendering for what Mobipocket can support, and emulate CSS-less rendering for what it can't. So the idea is that authors would just write markup which degrades cleanly and not worry about what features are or are not supported. For example, Mobipocket doesn't support floated blocks, so for any floats I also ignore explicit CSS 'display's.

The tricky bits are tables and lists. I was considering ignoring Mobipocket's built-in list support and just rendering them explicitly -- generating sequences based on 'list-style-type' etc. But for that I was trying to use Mobipocket's table support, which is reeeeally quirky. For example: if the specified width of a cell is too small to contain its content, then it just disappears -- poof!, isn't rendered at all.

So anyway, it's getting there.
Old 01-04-2009, 01:34 PM   #13
kovidgoyal
creator of calibre
Posts: 43,842
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
An alternative is to render complex markup as images (see for example the --render-tables-as-images option in html2lrf)
Old 01-04-2009, 01:37 PM   #14
Jellby
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
The biggest problem with Mobipocket, format-wise, is that (as far as I know) it only lets you set vertical spacing above elements, so if you want some space below a chapter heading, a piece of poetry, or an indented block, you have to set that space (with "height") on the element that follows it (a paragraph, a div, or whatever).

I create Mobipocket books with html2mobi, from very simple HTML. Basically, I use only <P>, <DIV>, <I>, <B>, <Hx>, <A>, <IMG>... (sometimes <BR>, <SUP>, <HR>... if needed), and the only properties are "align", "height" (for vertical space) and "width" (for first-line paragraph indent), plus "href" for <A> and "src" for <IMG>. I add the page breaks with <mbp:pagebreak/>, and the guide items are defined in the <head>. If anyone is interested in seeing any of my source files, just ask by PM.
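
To make the "space goes on the next element" point concrete, here is a small sketch of how a script could do the shifting automatically. The `Block` class and `to_mobi_html` function are hypothetical names of mine, not part of html2mobi, and the "2em"/"1em" values are only illustrative. It assumes the text of each block is already valid, escaped markup.

```python
from dataclasses import dataclass

@dataclass
class Block:
    tag: str               # e.g. "h2", "p", "div"
    text: str              # already-escaped content
    space_below: str = ""  # gap wanted after this block, e.g. "2em"
    indent: str = ""       # first-line indent, written as width="..."

def to_mobi_html(blocks):
    """Write each block, moving 'space below' onto the following block's height."""
    lines = []
    carry = ""  # space requested below the previous block
    for block in blocks:
        attrs = ""
        if carry:
            attrs += f' height="{carry}"'
        if block.indent:
            attrs += f' width="{block.indent}"'
        lines.append(f"<{block.tag}{attrs}>{block.text}</{block.tag}>")
        carry = block.space_below
    # Space requested below the very last block has nowhere to go and is dropped.
    return "\n".join(lines)

print(to_mobi_html([
    Block("h2", "Chapter 1", space_below="2em"),
    Block("p", "First paragraph.", indent="1em"),
]))
```

The heading itself comes out with no spacing attribute, and the paragraph after it carries both the gap (as "height") and its own indent (as "width").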

EDIT: Oh, and I encode everything in ASCII, so I write &mdash;, &eacute;, etc. (or rather let a program write them).
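
The kind of program meant here could look something like the following sketch. It uses Python's standard `html.entities.codepoint2name` table (the HTML 4 named entities) and falls back to numeric references for everything else. The function name `to_ascii_entities` is hypothetical, and whether a given reader understands every named entity is an assumption I have not checked.

```python
from html.entities import codepoint2name

def to_ascii_entities(text: str) -> str:
    """Replace every non-ASCII character with a named or numeric entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                        # plain ASCII passes through
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # e.g. &eacute; or &mdash;
        else:
            out.append(f"&#{cp};")                 # no HTML 4 name: numeric reference
    return "".join(out)

print(to_ascii_entities("caf\u00e9 \u2014 ok"))
```

ASCII characters such as "&" and "<" are left alone, so this is meant to run on markup that is already valid, not on raw text.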

Last edited by Jellby; 01-04-2009 at 01:40 PM.
Old 01-04-2009, 01:45 PM   #15
Jellby
frumious Bandersnatch
Quote:
Originally Posted by llasram
The Mobipocket "container" support is pretty much all done, including a few features I don't think other open source Mobipocket generators have (UTF-8 encoded content and "uncrossable" boundaries for "non-linear" content like footnotes and the table of contents).
Good! I tried using <mbp:pagebreak crossable="no">, as suggested in the documentation, with html2mobi, but it didn't work in the Cybook or in the desktop reader. Do you think this is due to a missing feature in html2mobi?

Quote:
I was considering ignoring Mobipocket's built-in list support and just rendering them explicitly -- generating sequences based on 'list-style-type' etc. But for that I was trying to use Mobipocket's table support, which is reeeeally quirky. For example: if the specified width of a cell is too small to contain its content, then it just disappears -- poof!, isn't rendered at all.
And if a table doesn't fit in the page, it's just cut in the middle of a line (vertical or horizontal), at least with the few tables I've used.

Something a bit unrelated: is there a way (in CSS) to set the punctuation after the labels in a list, i.e., to have "1:" or "1.-" instead of "1."?