Mobipocket output - Page 18

kovidgoyal · 12-30-2009, 11:16 PM

CALIBRE_DEVELOP_FROM will affect ebook-convert as well. If you want to understand the sequence of operations involved in a conversion, look at the run method in the file plumber.py.

chorpler · 12-30-2009, 11:26 PM

Quote:

Originally Posted by kovidgoyal

CALIBRE_DEVELOP_FROM will affect ebook-convert as well. If you want to understand the sequence of operations involved in a conversion, look at the run method in the file plumber.py.

Thank you VERY much Kovid, that's exactly what I was looking for.

jimad · 01-01-2010, 05:32 PM

>I have a 636 page pdf google book that I added to the calibre library, and now am trying to add to my Kindle, converted to mobi. The job seems to be stuck, but I'm not sure how long the conversion should take. I'm not sure whether to stop the process and start over, or whether I've done something wrong.

The original Google PDF book consists of 636 bitmap scanned pages of the original book -- that is what you are looking at if you open the PDF in Adobe Reader, for example. The file also contains the OCR's attempted conversion of the words on each page, so that you can search the contents of the book. If you ask calibre to convert this Google PDF to MOBI, calibre tries to convert it ALL -- meaning that it has 636 huge bitmap images of pages to convert to 636 huge bitmap images in the MOBI file. Is this really what you intended to do?

The other option, which may be more sensible depending on your needs, is to download the Google EPUB version of the book instead of the PDF version. The EPUB contains just the Google OCR results, and calibre can quickly and easily convert it to MOBI. However, if the OCR contains scan errors -- which it will -- the results may be usable for your needs, or not. Recently I've found the Google OCR efforts to be pretty good.

I have a Kindle DX, so I can read the Google PDF files directly, looking at the original bitmap page scans, or I can use calibre to convert the EPUB format to MOBI and read that. For my purposes, both approaches work pretty well, and both have their advantages and disadvantages. Reading the PDF retains nuances of the original text -- including the occasional scanner's thumb and decades of students' scribble marks -- but the scan process tends to render text a little on the heavy and blurry side. Converting the EPUB results in a "real" e-book, where the fonts are clear and the text can be resized, reflowed, etc. -- but it now contains some scannos which one must "read around".

Also, even if you were successful in converting those 636 bitmap page images to MOBI, if you read on a smaller Kindle such as the International, you may find the pages have shrunk to a size small enough to make reading uncomfortable -- depending on the strength of your eyes and/or your reading glasses. Cheers!

chorpler · 01-03-2010, 11:40 PM

OK, thanks to Kovid's help above, I was able to figure out why converting a book from OEB to Mobipocket strips out all of the <reference> links in the <guide> section of the OPF file except one Table of Contents link and a cover link. It turns out the OEB input contains a module named guide.py, in the directory:

src/calibre/ebooks/oeb/transform

which actually has the specific function of stripping everything but one cover and one TOC link out of the <guide> section of the input OPF file. Commenting out the last three lines of the guide.py file (and of course setting the CALIBRE_DEVELOP_FROM variable to the c:\calibre\src directory, if that's where the calibre source is) fixes the problem:

Code:

 
            #if x.lower() not in ('cover', 'titlepage', 'masthead', 'toc',
            #        'title-page', 'copyright-page', 'start'):
                #self.oeb.guide.remove(x)

I presume this must have some use -- apparently it was designed to choose which cover image to use, if there are multiple cover images specified? But it's having unintended consequences with Mobipocket output...
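For readers who want to see what that check does in isolation, here is a minimal sketch. It models only the whitelist test from the three lines quoted above; the plain dict standing in for the guide and the function name filter_guide are assumptions made for illustration, not calibre's actual classes or API.

```python
# Reference types that survive the check, copied from the commented-out lines above.
KEPT_TYPES = ('cover', 'titlepage', 'masthead', 'toc',
              'title-page', 'copyright-page', 'start')


def filter_guide(guide):
    """Return a copy of `guide` without the reference types that are not whitelisted.

    `guide` is a hypothetical {reference type: href} dict, not calibre's guide object.
    """
    return {ref_type: href for ref_type, href in guide.items()
            if ref_type.lower() in KEPT_TYPES}


# Hypothetical guide entries; 'dedication' is not in the whitelist and is dropped.
guide = {
    'cover': 'cover.html',
    'toc': 'toc.html',
    'dedication': 'dedication.html',
    'Start': 'chapter1.html',
}
print(filter_guide(guide))
```

Commenting out the check, as described above, corresponds to returning the guide unchanged.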

Strether · 01-08-2010, 10:24 PM

I've read all 18 pages of this thread and the user manual and haven't found a discussion of the problem I'm having converting prc documents to mobi. If a paragraph begins with an italicized word, it's not indented. This applies to any poetry that's quoted that is also italicized, that would normally be indented a few spaces from the left margin. Any way of getting around this?

Jim

nickredding · 01-18-2010, 07:35 PM

I'm seeing a TOC problem with MOBI output from a news feed. If an article consists of a headline, author, picture (jpeg) and then the article text, the MOBI TOC entry for that article takes you to a MOBI page that starts with the picture (this is true when viewing in Calibre, with MOBIpocket reader and on Kindle). However, if you advance to this article from the previous article via "next page" you get the headline, subhead and byline followed by the picture. If you advance to this article via "next article" you get the same behaviour as from the TOC. So, for some reason, the top of this article is set to the picture, not the headline.

Articles that don't start with a picture don't have this issue. Accessing them from the TOC or via "next article" gives you the headline, etc.

Interestingly, when I get HTML output from the news feed, it is structured correctly. Any ideas?

kovidgoyal · 01-18-2010, 11:50 PM

The TOC and next article links are generated from the .ncx file. Use the --debug-pipeline option and check if the NCX file is linking to the correct place.
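One way to do that check is to list every navPoint in the NCX together with the file and fragment it points at. The sketch below uses only the Python standard library and the standard NCX namespace; the sample NCX text and the function name list_ncx_targets are made up for illustration, so substitute the contents of the .ncx file from your own debug directory.

```python
import xml.etree.ElementTree as ET

NCX_URI = 'http://www.daisy.org/z3986/2005/ncx/'
NCX_NS = {'ncx': NCX_URI}


def list_ncx_targets(ncx_text):
    """Return (label, src) pairs for every navPoint in an NCX document."""
    root = ET.fromstring(ncx_text)
    targets = []
    # iter() walks nested navPoints too, in document order.
    for point in root.iter('{%s}navPoint' % NCX_URI):
        label = point.findtext('ncx:navLabel/ncx:text', default='', namespaces=NCX_NS)
        content = point.find('ncx:content', NCX_NS)
        src = content.get('src') if content is not None else None
        targets.append((label.strip(), src))
    return targets


# Hypothetical NCX content, for illustration only.
SAMPLE = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="a0" playOrder="1">
      <navLabel><text>Article 0</text></navLabel>
      <content src="feed_0/article_0/index.html"/>
    </navPoint>
    <navPoint id="a1" playOrder="2">
      <navLabel><text>Article 1</text></navLabel>
      <content src="feed_0/article_1/index.html#storyheader"/>
    </navPoint>
  </navMap>
</ncx>"""

for label, src in list_ncx_targets(SAMPLE):
    print(label, '->', src)
```

If an entry's src carries a fragment (the part after #), that fragment is the element the TOC entry should land on.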

nickredding · 01-19-2010, 02:13 AM

Quote:

The TOC and next article links are generated from the .ncx file. Use the --debug-pipeline option and check if the NCX file is linking to the correct place.

I've run it with debug-pipeline and the ncx and html files in the processed directory look ok. The problem must be in the MOBI output plugin. Can you suggest a next step to try to run this down?

kovidgoyal · 01-19-2010, 03:01 AM

The code to generate the MOBI TOC is in the file calibre/ebooks/mobi/writer.py. Have a look and ask if you have more questions.

nickredding · 01-19-2010, 04:39 PM

I have determined that in the case of an article that is indexed correctly, the MOBI file has a bookmark placed in a <DIV id="filepos...."> tag before the navigation bar and article contents, whereas for an article with a picture between the headline and body, the bookmark is placed in an <a id="filepos..."> tag immediately before the <p class=...><img> tags representing the picture. That is why the TOC links to the picture, not the nav bar and headline. I'm not having any luck figuring out from the source of writer.py why this is. Any suggestions?

kovidgoyal · 01-19-2010, 04:55 PM

Look at the HTML in the input subdirectory produced by --debug-pipeline. Is there a difference between the two cases that corresponds to the difference in the MOBI files?

nickredding · 01-19-2010, 09:07 PM

There is no difference except for the photo. Here is the input HTML for the article that is not indexed properly:

Code:

<div class="navbar calibre_rescale_70" style="text-align: center;">
	| <a href="../article_1/index.html">Next</a> |
	<a href="../index.html#article_0">Section menu</a> |
	<a href="../../index.html#feed_0">Main menu</a> | <hr /></div>
<div id="storyheader">
	<div class="headline">
		<h1>Symphony Splash seeks sponsor for Victoria's most popular public event</h1>
	</div>
	<div class="subheadline">
		<h2>$75,000 needed for free outdoor concert; players to take salary cut</h2>
	</div>
	<div class="byline">
		<span class="name">By Jim Gibson, Times Colonist</span><span class="timestamp">January 
		19, 2010</span></div>
</div>
<div id="storycontent" class="para18">
	<div id="imageBox">
		<div class="wrapper_0_10_0_0">
			<div class="storyimage" id="">
				<a href="javascript:void(0);" onclick="tabClick(' - Photos Tab',false,'storypage','story_photo_content',true,true);">
				<img id="storyphoto" class="thumbnail" border="0" alt="The 2009 event of Symphony Splash drew an estimated 40,000 people to the Inner Harbour on Aug. 2." src="images/img2.bin.jpg" /></a></div>
			<div class="imagetext">
				<h1 id="photocaption">The 2009 event of Symphony Splash drew an 
				estimated 40,000 people to the Inner Harbour on Aug. 2.</h1>
				<h2 id="photocredit"><b>Photograph by: </b>Adrian Lam, Times Colonist</h2>
			</div>
		</div>
	</div>
	<div id="page1">
		<p>Symphony Splash, Victoria's most popular public event, is looking for 
		a new sponsor.</p>

Here is the corresponding code in the MOBI output (transformed back into HTML by ebook-convert mobi->HTML). Notice the bookmark <a id="filepos970"></a> about halfway down, after the headline, subhead and byline:

Code:

      <hr class="calibre5"/>
      <p class="calibre6">
        <span class="calibre3">
          <span class="bold">Symphony Splash seeks sponsor for Victoria's most popular public event</span>
        </span>
      </p>
      <p class="calibre6">
        <span class="calibre3">
          <span class="bold">$75,000 needed for free outdoor concert; players to take salary cut</span>
        </span>
      </p>
      <p class="calibre6">By Jim Gibson, Times Colonist</p>
      <p class="calibre7">January 19, 2010</p>
      <a></a>
      <a id="filepos970"></a>
      <p class="calibre7">
        <img src="images/00006.jpg" class="calibre8"/>
      </p>
      <p class="calibre9">
        <span class="calibre3">
          <span class="bold">The 2009 event of Symphony Splash drew an estimated 40,000 people to the Inner Harbour on Aug. 2.</span>
        </span>
      </p>
      <a></a>
      <p class="calibre10">
        <span class="calibre3">
          <span class="bold">Photograph by: Adrian Lam, Times Colonist</span>
        </span>
      </p>
      <a></a>
      <p class="calibre11">Symphony Splash, Victoria's most popular public event, is looking for a new sponsor.</p>
      <p class="calibre11">The Victoria Symphony's free outdoor concert, which drew an estimated 40,000 people to the Inner Harbour last Aug. 2, needs a replacement for Bayview Residences, the title sponsor for the last three years.</p>
      <p class="calibre11">Bayview says it will continue to make "a significant contribution" to Splash, but not as title sponsor.</p>

Here is the input HTML for the next article, which is indexed properly:

Code:

<div class="navbar calibre_rescale_70" style="text-align: center;">
	| <a href="../article_2/index.html">Next</a> |
	<a href="../index.html#article_1">Section menu</a> |
	<a href="../../index.html#feed_0">Main menu</a> |
	<a href="../article_0/index.html">Previous</a> | <hr /></div>
<div id="storyheader">
	<div class="headline">
		<h1>Handling of domestic violence overhauled</h1>
	</div>
	<div class="subheadline">
		<h2>B.C. pressured to act after gaps in services cited in murder-suicide</h2>
	</div>
	<div class="byline">
		<span class="name">By Rob Shaw and Lindsay Kines, Times Colonist</span><span class="timestamp">January 
		19, 2010</span></div>
</div>
<div id="storycontent" class="para18">
	<div id="page1">
		<p>The B.C. government unveiled changes yesterday to the way police and 
		Crown prosecutors handle domestic violence cases, but critics say it's not 
		enough to plug holes in the system.</p>
		<p>The province will help pay for a Greater Victoria regional domestic violence 
		unit, launch a B.C. Coroners Service panel to review domestic violence homicides 
		and try to better co-ordinate policies between Crown and police in the wake

And here is the corresponding code in the MOBI output. Notice the bookmark <div id="filepos6685" ...> at the beginning, where it should be.

Code:

    <div id="filepos6685" class="calibre1">
      <p class="calibre2">
        <span class="calibre3">
          <tt class="calibre4"> | </tt>
        </span>
        <a href="#filepos13231">
          <span class="calibre3">
            <tt class="calibre4">Next</tt>
          </span>
        </a>
        <span class="calibre3">
          <tt class="calibre4"> | </tt>
        </span>
        <a href="../index.html#article_1">
          <span class="calibre3">
            <tt class="calibre4">Section menu</tt>
          </span>
        </a>
        <span class="calibre3">
          <tt class="calibre4"> | </tt>
        </span>
        <a href="../../index.html#feed_0">
          <span class="calibre3">
            <tt class="calibre4">Main menu</tt>
          </span>
        </a>
        <span class="calibre3">
          <tt class="calibre4"> | </tt>
        </span>
        <a href="#filepos970">
          <span class="calibre3">
            <tt class="calibre4">Previous</tt>
          </span>
        </a>
        <span class="calibre3">
          <tt class="calibre4"> | </tt>
        </span>
      </p>
      <hr class="calibre5"/>
      <p class="calibre6">
        <span class="calibre3">
          <span class="bold">Handling of domestic violence overhauled</span>
        </span>
      </p>
      <p class="calibre6">
        <span class="calibre3">
          <span class="bold">B.C. pressured to act after gaps in services cited in murder-suicide</span>
        </span>
      </p>
      <p class="calibre6">By Rob Shaw and Lindsay Kines, Times Colonist</p>
      <p class="calibre7">January 19, 2010</p>
      <a></a>
      <p class="calibre11">The B.C. government unveiled changes yesterday to the way police and Crown prosecutors handle domestic violence cases, but critics say it's not enough to plug holes in the system.</p>
      <p class="calibre11">The province will help pay for a Greater Victoria regional domestic violence unit, launch a B.C. Coroners Service panel to review domestic violence homicides and try to better co-ordinate policies between Crown and police in the wake of a tragic 2007 murder-suicide in Oak Bay, said B.C. Solicitor General Kash Heed.</p>

There is no difference (except for the inclusion of the image file in the incorrectly indexed article) in the "processed" directory either.
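To make this kind of comparison less manual, a small script can report which element carries each filepos bookmark in the HTML converted back from the MOBI. This is only a diagnostic sketch using the standard-library HTML parser; the class and function names are made up, and the sample is trimmed from the two MOBI excerpts above.

```python
from html.parser import HTMLParser


class FileposFinder(HTMLParser):
    """Collect (tag, id, line number) for every element whose id starts with 'filepos'."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get('id') or ''
        if element_id.startswith('filepos'):
            self.anchors.append((tag, element_id, self.getpos()[0]))


def find_filepos_anchors(html):
    finder = FileposFinder()
    finder.feed(html)
    finder.close()
    return finder.anchors


# Trimmed from the MOBI excerpts above.
SAMPLE = '''<p class="calibre7">January 19, 2010</p>
<a></a>
<a id="filepos970"></a>
<p class="calibre7"><img src="images/00006.jpg" class="calibre8"/></p>
<div id="filepos6685" class="calibre1">
<p class="calibre6">Handling of domestic violence overhauled</p>
</div>'''

for tag, element_id, line in find_filepos_anchors(SAMPLE):
    print(line, tag, element_id)
```

An article whose bookmark sits on an <a> in the middle of the content rather than on the enclosing <div> is one that will open at the wrong place from the TOC.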

nickredding · 01-20-2010, 07:45 PM

I have discovered what causes this problem with MOBI output from news feeds. If there is a DIV tag with an empty string for an id, i.e.

Code:

<DIV id=""> ... </DIV>

in a downloaded article, the Calibre code responsible for setting bookmarks for the TOC decides to use that DIV instead of the beginning of the displayed content of the article.

The workaround is to use preprocess_html to find DIVs with id="" and delete the id attribute.

I have been unable to figure out where in the Calibre code this is happening and unfortunately, I'm giving up on this. I've looked in all of the obvious places, and short of exhaustively going through every single source file in Calibre I don't see any way for me to track it down. Perhaps someone who is familiar with the deep-down internals can take this cause and find the error in the code.
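The workaround can be sketched on a plain HTML string as follows. This is a rough, regex-based illustration with an invented function name (strip_empty_div_ids), assuming no stray angle brackets inside attribute values; in an actual recipe the same idea would be applied to the parsed document handed to preprocess_html rather than to raw text.

```python
import re

# An opening <div ...> tag; assumes attribute values contain no < or >.
DIV_TAG = re.compile(r'<div\b[^<>]*>', re.IGNORECASE)
# An id attribute whose value is the empty string, with its leading whitespace.
EMPTY_ID = re.compile(r'''\s+id\s*=\s*(?:""|'')''', re.IGNORECASE)


def strip_empty_div_ids(html):
    """Remove id="" from every opening div tag, leaving all other attributes alone."""
    return DIV_TAG.sub(lambda m: EMPTY_ID.sub('', m.group(0)), html)


# Based on the storyimage DIV in the input HTML posted earlier in the thread.
sample = '<div class="storyimage" id=""><img id="storyphoto" src="images/img2.bin.jpg" /></div>'
print(strip_empty_div_ids(sample))
```

Non-empty ids such as id="page1" and ids on other elements are left untouched, so existing links into the article keep working.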

kovidgoyal · 01-21-2010, 12:13 AM

GRiker (who wrote the MOBI toc code) will take a look at it when he has time.

ficbot · 01-21-2010, 12:27 AM

I am having a lot of trouble with my mobi conversions. The conversion to LRF was much easier and more reliable. Specifically:

1) On books downloaded from here which are already in mobi, it is not always loading to the Kindle with correct metadata. So if I download it and it says something like 'author unknown' and I use the 'edit metadata' command to fix this and add in the author, it won't use that as the author when transferred to the Kindle. This has happened with maybe 4 out of 50 books.

2) It cannot justify the text when converting an LRF file. I badly want the text to be justified. I can't stand reading it with the ragged edges. I have tried ticking and unticking every box in there to no avail. I wish there was a checkbox to over-ride whatever the file says and force it to always justify the text when it converts.

3) On my 'liberated' eReader files, all of which have been converted to HTML using the exact same process: some of them convert with no glitches. Some have the 'do not justify' box pre-checked when I go into the options and some don't. Some will not justify at all no matter what I do. I am baffled. These files were all created the same way, so why should they have different options, and why will some convert properly and some not? What I am doing is running the decoder script, taking the resulting HTML file and opening it in a web browser. I select all, copy and paste into a Neo Office document. Then I save that as HTML, open THAT in a web browser and select all/copy again (to get HTML that is free from the funky coding you get from an Office suite), then paste that into Kompozer, an HTML program. I save the file, run Text Wrangler to search for extra line breaks and remove them, then do a last save. These all are clean HTML files with no extra frills and all converted to LRF beautifully. But to mobi, it is hit or miss and I am just baffled as to why.

Can anyone help me? I just want plain, simple book files where everything is justified. Why is this so hard? What am I doing wrong? I am on a mac fwiw.



