Old 05-03-2022, 04:39 PM   #1
Snekguy
Junior Member
Snekguy began at the beginning.
 
Posts: 5
Karma: 10
Join Date: May 2022
Device: none
Calibre and XHTML files

I hope this is the right place for this question.

So, I've been publishing via Smashwords for several years now, and they seem to have made some changes to their file requirements recently that are preventing me from passing their EPUBcheck validation.

I've always used Calibre, and it's great, I have a good workflow set up and I have a decent understanding of the program. I publish on Amazon too, and I have no issues over there.

One of the errors that EPUBcheck is throwing out is: XHTML Content Document file name "01.html" should have the extension ".xhtml".

All of my chapters are HTML files, and that is now apparently an issue for Smashwords. I tried renaming the file extensions, which just produced blank pages on some readers (but not others?) and I tried converting the files using automated tools to no avail.

I noticed that some of the files within the EPUB that Calibre generates are XHTML by default, and I was wondering if Calibre has any built-in tools or plugins that might help me out? Is there some simple option I'm missing?
Old 05-03-2022, 05:28 PM   #2
jackie_w
Grand Sorcerer
jackie_w ought to be getting tired of karma fortunes by now.
 
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
If your epub is epub3, I think epubcheck expects all your text files to have the .xhtml extension. I don't think it has the same requirement for epub2.

The calibre Editor does have a built-in option called 'Change the file extension for the selected files'. To access it, in the File browser pane, select all the text files you want to change then right-click and select the option from the pop-up menu. I believe the option automatically makes any related changes required in other files in the epub (e.g. OPF, links etc) to keep everything consistent.
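
For anyone curious what those "related changes" amount to, here is a minimal Python sketch of the same idea applied to an unpacked EPUB folder. It is not how calibre does it internally; the function name `rename_html_to_xhtml` and the assumption that the book has already been unzipped to a directory are mine. It renames every `.html` file and then patches quoted references to the old names in the OPF, NCX, CSS and content files. It will also rewrite an external URL that happens to end in one of the renamed file names, so treat it as an illustration and check the result with EPUBcheck.

```python
import re
from pathlib import Path


def rename_html_to_xhtml(epub_root: str) -> list[str]:
    """Rename *.html to *.xhtml under an unpacked EPUB and patch references.

    Returns the old file names that were renamed.
    """
    root = Path(epub_root)
    renamed = []
    # sorted() materialises the list before any file is renamed
    for path in sorted(root.rglob("*.html")):
        path.rename(path.with_suffix(".xhtml"))
        renamed.append(path.name)
    if not renamed:
        return renamed

    # Match "<name>.html" only when it sits inside a quoted reference:
    # preceded by a quote or '/', followed by a quote, '#' or '?'.
    stems = "|".join(re.escape(name[:-5]) for name in renamed)
    pattern = re.compile('(?<=["\'/])(' + stems + ')\\.html(?=["\'#?])')

    text_suffixes = {".opf", ".ncx", ".xhtml", ".css"}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in text_suffixes:
            text = path.read_text(encoding="utf-8")
            patched = pattern.sub(r"\1.xhtml", text)
            if patched != text:
                path.write_text(patched, encoding="utf-8")
    return renamed
```

The folder still has to be zipped back into a valid EPUB afterwards, which is one more reason the editor's built-in option is the easier route.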

Last edited by jackie_w; 05-03-2022 at 05:33 PM.
Old 05-05-2022, 01:27 PM   #3
Snekguy
Junior Member
Snekguy began at the beginning.
 
Posts: 5
Karma: 10
Join Date: May 2022
Device: none
Quote:
Originally Posted by jackie_w View Post
If your epub is epub3, I think epubcheck expects all your text files to have the .xhtml extension. I don't think it has the same requirement for epub2.

The calibre Editor does have a built-in option called 'Change the file extension for the selected files'. To access it, in the File browser pane, select all the text files you want to change then right-click and select the option from the pop-up menu. I believe the option automatically makes any related changes required in other files in the epub (e.g. OPF, links etc) to keep everything consistent.
You are entirely correct, and have saved me a huge headache, thank you. It does indeed seem that only EPUB 3 files require XHTML, and my usual HTML files passed the check when I uploaded an EPUB 2 file. Top tier quality control from Smashwords.

I am however stuck with one more issue, and I wonder if you could advise?

Each chapter of my book now throws out the same error:
Error while parsing file: attribute "link" not allowed here; expected attribute "class", "dir", "id", "style", "title" or "xml:lang"

I've located the lines and positions of each of these errors, but I don't know enough about HTML to know what I can remove. Each one looks like this:

<body lang="en-US" link="#000080" vlink="#800000" dir="ltr" class="calibre">

Does this require a larger edit or can I simply chop out the offending parts?

Thanks again!
Old 05-06-2022, 09:23 AM   #4
jackie_w
Grand Sorcerer
jackie_w ought to be getting tired of karma fortunes by now.
 
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
Quote:
Originally Posted by Snekguy View Post
I am however stuck with one more issue, and I wonder if you could advise?

Each chapter of my book now throws out the same error:
Error while parsing file: attribute "link" not allowed here; expected attribute "class", "dir", "id", "style", "title" or "xml:lang"

I've located the lines and positions of each of these errors, but I don't know enough about HTML to know what I can remove. Each one looks like this:

<body lang="en-US" link="#000080" vlink="#800000" dir="ltr" class="calibre">

Does this require a larger edit or can I simply chop out the offending parts?
I don't know what the following attributes are supposed to do or how they got there. Perhaps artifacts from a word-processor (???) if that's where you create your original source documents.
Code:
link="#000080" vlink="#800000"
If you don't know either then I'd think it's safe to just delete them.
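
If there are a lot of chapters, the deletion can be scripted. Below is a minimal Python sketch, assuming the only offenders are the `link` and `vlink` attributes on the opening `<body>` tag as in the line quoted above. The function name `strip_legacy_body_attrs` is made up for illustration, and the regex is a rough tool for this one tag, not a general HTML parser.

```python
import re

# Attributes to drop from <body>; only the two reported in this thread.
LEGACY_BODY_ATTRS = ("link", "vlink")

_ATTR_RE = re.compile(
    r"\s+(?:" + "|".join(LEGACY_BODY_ATTRS) + r")\s*=\s*(?:\"[^\"]*\"|'[^']*')",
    re.IGNORECASE,
)


def strip_legacy_body_attrs(markup: str) -> str:
    """Remove link/vlink attributes from the first <body ...> tag only."""

    def clean(match: re.Match) -> str:
        # Operate on the body tag text alone, leaving the rest untouched
        return _ATTR_RE.sub("", match.group(0))

    return re.sub(r"<body\b[^>]*>", clean, markup, count=1, flags=re.IGNORECASE)


tag = '<body lang="en-US" link="#000080" vlink="#800000" dir="ltr" class="calibre">'
print(strip_legacy_body_attrs(tag))
# prints: <body lang="en-US" dir="ltr" class="calibre">
```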

For the lang="en-US" attribute I'm not totally sure because I'm no expert on these finer points. I'm more used to seeing language attributes in the <html> tag rather than the <body> tag, e.g. for epub3 and epub2 respectively:
Code:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" xml:lang="en" lang="en">
Code:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
If you still get errors after removing link and vlink consider moving the lang attribute. But if you don't get errors leave it where it is. However, you'll probably get best advice about the language attribute if you ask in the MR Epub or Workshop sub-forums. That's where the expert ebook creators hang out.
Old 05-06-2022, 01:22 PM   #5
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 31,057
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
link is the color of any hyperlink that has not yet been visited from that browser.
vlink is the color to use after it has been visited.

Normally these are left to the browser's defaults (typically blue for unvisited links and purple for visited ones), so you only specify them if you want to override the defaults.
Old 05-07-2022, 10:29 PM   #6
Snekguy
Junior Member
Snekguy began at the beginning.
 
Posts: 5
Karma: 10
Join Date: May 2022
Device: none
Quote:
Originally Posted by jackie_w View Post
I don't know what the following attributes are supposed to do or how they got there. Perhaps artifacts from a word-processor (???) if that's where you create your original source documents.
Code:
link="#000080" vlink="#800000"
If you don't know either then I'd think it's safe to just delete them.
On the money again, these must have been left over from when I converted the ODT file to HTML as the basis for the EPUB. I can confirm that it's perfectly fine to just delete the mentioned entries and it doesn't impact the book at all.
After making the necessary tweaks, the book passed Smashwords' EPUBcheck.
Why they have these requirements I can't fathom, but I have a way to work around them now. Thank you very much for your help, I was starting to think I might have to stop distributing there.


Just in case anyone else comes across the same issue, here's a summary of what went wrong and how I was able to resolve it.

Problem: EPUBcheck looks for XHTML file extensions and doesn't find them.
Solution: Export the EPUB as version 2 instead of version 3, bypassing the XHTML requirement.

Problem: EPUBcheck gives the error: Error while parsing file: attribute "link" not allowed here; expected attribute "class", "dir", "id", "style", "title" or "xml:lang"
Solution: The link and vlink attributes are artifacts left over from the document conversion process, and the offending entries can simply be deleted.
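
Both problems in the summary above can be spotted before uploading with a quick look inside the EPUB, which is an ordinary zip archive. The following Python sketch is only a rough pre-flight check and no substitute for running EPUBcheck itself; the function name `preflight` and the report keys are made up for illustration.

```python
import re
import zipfile

# A <body ...> tag that still carries a link= or vlink= attribute
BODY_ATTR_RE = re.compile(r"<body\b[^>]*\s(?:link|vlink)\s*=", re.IGNORECASE)


def preflight(epub_path: str) -> dict[str, list[str]]:
    """List content files with an .html/.htm name or legacy body attributes."""
    report = {"html_extension": [], "legacy_body_attrs": []}
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            lower = name.lower()
            if lower.endswith((".html", ".htm")):
                report["html_extension"].append(name)
            if lower.endswith((".html", ".htm", ".xhtml")):
                text = book.read(name).decode("utf-8", errors="replace")
                if BODY_ATTR_RE.search(text):
                    report["legacy_body_attrs"].append(name)
    return report
```

Calling `preflight("mybook.epub")` (a placeholder file name) returns two lists: files that an EPUB 3 check would flag for their extension, and files whose `<body>` tag still has `link` or `vlink`.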
Old 05-07-2022, 11:52 PM   #7
DNSB
Bibliophagist
DNSB ought to be getting tired of karma fortunes by now.
 
Posts: 46,210
Karma: 168983734
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by Snekguy View Post
Just in case anyone else comes across the same issue, here's a summary of what went wrong and how I was able to resolve it.

Problem: EPUBcheck looks for XHTML file extensions and doesn't find them.
Solution: Export the EPUB as version 2 instead of version 3, bypassing the XHTML requirement.
Better solution: Rename the files to xhtml from html. With either Sigil or calibre's e-book editor, it's pretty trivial. This allows you to use epub3 enhancements especially the accessibility related ones.
Old 05-09-2022, 01:13 AM   #8
Snekguy
Junior Member
Snekguy began at the beginning.
 
Posts: 5
Karma: 10
Join Date: May 2022
Device: none
Quote:
Originally Posted by DNSB View Post
Better solution: Rename the files to xhtml from html. With either Sigil or calibre's e-book editor, it's pretty trivial. This allows you to use epub3 enhancements especially the accessibility related ones.
This was what I tried doing first, but it didn't work, and resulted in blank pages appearing on some of the e-reader apps I used to test the file.
Old 05-09-2022, 01:28 PM   #9
DNSB
Bibliophagist
DNSB ought to be getting tired of karma fortunes by now.
 
Posts: 46,210
Karma: 168983734
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by Snekguy View Post
This was what I tried doing first, but it didn't work, and resulted in blank pages appearing on some of the e-reader apps I used to test the file.
Did you manually rename them or did you use the editor to change the extension?
Old 05-10-2022, 11:54 AM   #10
Snekguy
Junior Member
Snekguy began at the beginning.
 
Posts: 5
Karma: 10
Join Date: May 2022
Device: none
Quote:
Originally Posted by DNSB View Post
Did you manually rename them or did you use the editor to change the extension?
I used the editor to change the file extensions in the Calibre 'edit book' menu.
Old 05-10-2022, 01:49 PM   #11
jackie_w
Grand Sorcerer
jackie_w ought to be getting tired of karma fortunes by now.
 
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
Quote:
Originally Posted by Snekguy View Post
I used the editor to change the file extensions in the Calibre 'edit book' menu.
You can't manually change a file extension in isolation without "breaking" other things. Did you also make the necessary related changes to OPF, TOC, NCX/NAV plus any hyperlinks (e.g. footnote and index)?

Using the calibre Editor utility (mentioned in post #2) to do the extension rename would have automatically cleaned up the collateral damage.
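
One rough way to tell whether a rename left the OPF pointing at old file names is to compare the manifest against what is actually in the archive. This Python sketch makes a few assumptions: the EPUB has the standard `META-INF/container.xml`, the function name `missing_manifest_items` is made up, and only manifest entries are checked, so broken TOC entries or in-text hyperlinks would not be caught.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from urllib.parse import unquote

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def missing_manifest_items(epub_path: str) -> list[str]:
    """Return manifest hrefs (as archive paths) that are not in the zip."""
    with zipfile.ZipFile(epub_path) as book:
        names = set(book.namelist())
        # container.xml tells us where the OPF package file lives
        container = ET.fromstring(book.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", CONTAINER_NS).get("full-path")
        opf = ET.fromstring(book.read(opf_path))
        base = posixpath.dirname(opf_path)
        missing = []
        for item in opf.iterfind(".//opf:manifest/opf:item", OPF_NS):
            # Manifest hrefs are relative to the OPF file's folder
            href = unquote(item.get("href"))
            target = posixpath.normpath(posixpath.join(base, href))
            if target not in names:
                missing.append(target)
        return missing
```

An empty list means every manifest entry resolves to a file in the archive; anything listed is a reference that no longer matches a file, which is the kind of breakage that could plausibly lead to blank pages in some readers.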