#1
Enthusiast
Posts: 27
Karma: 10
Join Date: May 2012
Device: iPad
ebook-convert creates extra split.htm files, with random text, inserts at hard breaks
Hi,
I am using calibre 2.74.0 on Fedora 25, recently updated from Fedora 23 (previous calibre version unknown), so this may or may not be version related. It did not happen previously.

I am creating EPUBs from HTML using ebook-convert in a bash script. I create a cover JPG externally and pass it with --cover xx.jpg on the command line. The HTML starts with a frontispiece, followed by the actual text, which normally begins with a Table of Contents that is original to the text (but cleaned up by me). The frontispiece itself ends with a <div style="page-break-before:always;"></div> line (a 'hard' page break), followed by some title headers which become the top of the main text.

I have been getting extra pages of random text between the cover and the frontispiece, and again after the frontispiece, before the main body. This occurs on conversion. I unzipped the EPUB and found a ...split_000.htm file containing some random extra text, a ...split_002.htm file with the *same* text, and that text repeated at the bottom of the ...split_001.htm file, inserted *after* the <div style="page-break-before:always;"></div> line. In one file I unzipped, the extra text came from deep in the book; in another, it came from near the top.

The ...split_000.htm file is inserted right after the cover, followed by a hard break. Then comes ...split_001.htm (with its copy of the extra text at the bottom, below the proper page-break point), then ...split_002.htm followed by a hard break, and finally ...split_003.htm with the correct lines that were originally at the bottom of the frontispiece, below the hard break point in that text.

The HTML file is a concatenation of the frontispiece and body files. It looks fine in a browser; the problem is with the conversion. Deleting the 000 and 002 .htm files and re-zipping produces an EPUB which *looks* correct, but internal references to the Table of Contents (which is actually in 003.htm) fail because there is no 000.htm file. (And 000.htm has no ToC.)
So is convert rewriting the links to point to the wrong place? If I instead remove the text content of the 'bad' files, the resulting EPUB has no cover and has blank pages where the text was, and internal links in the body point to the blank page, which sits ahead of the frontispiece and after the (now missing) cover.

I am lost indeed and need help. Direct email and lots of examples are available on request, of course!

This is the convert command line, slightly cleaned of real data:

/usr/bin/ebook-convert $oldfile ../epub/$newfile --base-font-size 7 --linearize-tables --input-profile default --output-profile ipad --input-encoding utf-8 --cover ../../cover/${g}.jpg --no-svg-cover --remove-paragraph-spacing --remove-paragraph-spacing-indent-size -1 --unsmarten-punctuation --disable-delete-blank-paragraphs --max-toc-links 0 --no-chapters-in-toc --toc-threshold 0 --author-sort "" --authors "" --publisher "" --title "${title}" > /dev/null 2>&1

As I said, this is in a for loop in a script, and it runs against 4000-odd files on some runs. It used to work fine.
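For anyone wanting to check a converted EPUB for the symptom described above without unzipping it by hand, here is a minimal Python sketch. It assumes the stray pages carry the same visible text (markup may differ), and the names `visible_text` and `find_repeated_text` are made up for this illustration; they are not part of calibre. It will not catch the copy that is appended to the bottom of an otherwise legitimate file such as the 001 split.

```python
import hashlib
import zipfile
from collections import defaultdict
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the character data of a document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(markup):
    """Return the text of an HTML string with whitespace runs collapsed."""
    parser = _TextCollector()
    parser.feed(markup)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def find_repeated_text(epub_path, suffixes=(".htm", ".html", ".xhtml")):
    """Return groups of HTML members in the EPUB that share the same text."""
    groups = defaultdict(list)
    with zipfile.ZipFile(epub_path) as epub:
        for name in epub.namelist():
            if not name.lower().endswith(suffixes):
                continue
            markup = epub.read(name).decode("utf-8", errors="replace")
            text = visible_text(markup)
            if text:  # skip files with no text at all
                digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
                groups[digest].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)
```

Calling `find_repeated_text("book.epub")` (a placeholder file name) returns a list of groups; a group containing both a 000 and a 002 split file would match what I saw.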
#2
Enthusiast
Posts: 27
Karma: 10
Join Date: May 2012
Device: iPad
False alarm!
The only thing I had changed in the convert command in some time was to add --linearize-tables and --disable-delete-blank-paragraphs. Reverting those fixed the problem. Why? I have NO idea... One of those caused weird artifacts. But I can go to bed, happy! Thanks for all the work David!
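I never worked out which of the two options was responsible. A rough way to narrow it down would be to convert one problem file twice, each time dropping just one of the suspect flags, and compare the results. The sketch below is only an illustration under that assumption: the helper names, the output naming scheme and the paths are hypothetical, and it only assumes the usual `ebook-convert input output [options]` calling form.

```python
import subprocess
from pathlib import Path

# The two flags added shortly before the problem appeared.
SUSPECT_FLAGS = ("--linearize-tables", "--disable-delete-blank-paragraphs")


def command_variants(source, out_dir, options, suspects=SUSPECT_FLAGS):
    """Yield (dropped_flag, command) pairs, each omitting one suspect flag."""
    stem = Path(source).stem
    for flag in suspects:
        if flag not in options:
            continue
        kept = [opt for opt in options if opt != flag]
        label = flag.lstrip("-")
        output = str(Path(out_dir) / f"{stem}.without-{label}.epub")
        yield flag, ["ebook-convert", str(source), output, *kept]


def run_variants(source, out_dir, options):
    """Run each variant; print its exit status and, on failure, its stderr."""
    for flag, cmd in command_variants(source, out_dir, options):
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(f"without {flag}: exit status {result.returncode}")
        if result.returncode != 0:
            print(result.stderr)
```

Unlike my script, which sends everything to /dev/null, this keeps the converter's error output available when a run fails. Each resulting EPUB can then be unzipped and checked for the extra split files.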