03-22-2013, 10:23 PM | #1 |
Connoisseur
Posts: 57
Karma: 1186
Join Date: Jun 2012
Device: none
|
books with one html file for each page - how to convert?
Hi all -
I have some books in a very annoying format, one HTML file for each page. At the top and bottom of every page are navigation links to the next/previous page. I'm trying to figure out how to convert them with decent formatting to read on my Kindle, or at least maybe as a PDF. I was thinking about using Calibre's search and replace to get rid of the navigation links and tables. But first I'm just trying to get the pages all together...
At first I tried to merge all the HTML together, and spent a long time trying to fix it up in a text editor, to get rid of all the page breaks and have a better flow. But it was pretty tedious: it's all marked up in tables and so on, it took forever, and the footnotes didn't go so well... Anyway, I suppose I wouldn't really mind keeping the page structure, because there's an index at the end that refers to the correct pages.
I read in the manual that I could make a table of contents HTML file, with a link to every page, and add that to Calibre. That seemed to work, but then I end up with this huge list at the beginning of the book, dozens of pages with hundreds of links, one to every single page in the book. Is there some way to avoid that, or to edit that afterwards so there are only links to the relevant chapter beginnings?
Then I read somewhere else that I could just add the first file, cover.html (which links to the next page, which links to the next, and so on) and Calibre will follow all the links, and the links inside the pages it links to, etc. - but is there some limit to the depth of this? When I tried, only the first eight pages or so showed up inside the zip file it created and in the .mobi I made, which had a table of contents with only two entries, "next page" and "previous page". The next page after page_ix.html is page_1.html; the link is there and it works in my browser, but page_1 and all subsequent pages are missing.
I also tried renaming cover.html to index.html (and editing the navigation link in the second page), making a zip of the folder, and adding that. So then obviously all the pages were in the zip file, but when I tried to make an EPUB out of it, again only the first few pages were there. Well, aside from that... if anyone has any suggestions of a better way to go about this, I'd appreciate it! |
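One way to check where the next-page chain described above actually breaks is to walk it outside Calibre. The following is a minimal sketch, not a description of how Calibre follows links. It assumes every page has an anchor whose visible text starts with "next" (as in the "next page" entries mentioned above) and that each href is a plain filename in the same folder. The names `NextLinkFinder`, `find_next`, and `walk_chain` are hypothetical.

```python
from html.parser import HTMLParser


class NextLinkFinder(HTMLParser):
    """Remember the href of the first <a> whose text starts with 'next'."""

    def __init__(self):
        super().__init__()
        self.next_href = None
        self._current_href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            # Collapse whitespace so "Next   Page" and "next page" compare equal.
            label = " ".join("".join(self._text).split()).lower()
            if self.next_href is None and label.startswith("next"):
                self.next_href = self._current_href
            self._current_href = None


def find_next(html):
    """Return the href of the 'next' link in one page, or None if there is none."""
    finder = NextLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.next_href


def walk_chain(pages, start="cover.html"):
    """Follow 'next' links through pages (a dict of filename -> HTML text).

    Stops at a missing file, a page without a 'next' link, or a loop.
    """
    order, seen, current = [], set(), start
    while current in pages and current not in seen:
        seen.add(current)
        order.append(current)
        current = find_next(pages[current])
    return order
```

Loading every file in the folder into the `pages` dict and calling `walk_chain(pages)` gives the list of pages reachable from cover.html. If that list stops at page_ix.html, the link to page_1.html is written in a way this simple matcher does not recognise (a different label, a subfolder in the href, an image link), which is worth comparing against what the converter picked up.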
03-23-2013, 07:28 AM | #2 |
Wizard
Posts: 1,759
Karma: 30063305
Join Date: Dec 2006
Location: Singapore
Device: Boyue
|
Use something like this to merge the HTML files:
http://www.iterati.org/ebookTools/vHtmlMerger/ Then maybe use Sigil to clean it up and make a clean EPUB. |
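For anyone who prefers a script to a merge tool, the same merge step can be sketched in a few lines of Python. This is only an illustration and not how vHtmlMerger works. It assumes the navigation anchors have the visible text "next page" or "previous page" and contain no nested tags; the names `page_body` and `merge_pages` are hypothetical. The table markup around the links is left alone, so cleanup in an editor would still be needed.

```python
import re

# Assumption: navigation links look like <a href="...">Next Page</a>.
NAV_LINK = re.compile(
    r"<a\b[^>]*>\s*(?:next|previous)\s+page\s*</a>",
    re.IGNORECASE | re.DOTALL,
)
BODY = re.compile(r"<body\b[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def page_body(html):
    """Return the content inside <body> with the navigation links removed."""
    match = BODY.search(html)
    inner = match.group(1) if match else html
    return NAV_LINK.sub("", inner).strip()


def merge_pages(pages, title="Merged book"):
    """Join pages into one HTML document.

    pages is a list of (name, html) tuples, already in reading order.
    Each page is wrapped in a <div> whose id is the page name, so that
    references to individual pages still have something to point at.
    """
    parts = [
        f'<div id="{name}">\n{page_body(html)}\n</div>' for name, html in pages
    ]
    return (
        f"<html><head><title>{title}</title></head><body>\n"
        + "\n".join(parts)
        + "\n</body></html>\n"
    )
```

The caller reads each file, passes the (name, html) pairs in the right order, and writes the returned string to a single file that can then be opened in Sigil or added to Calibre.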
03-23-2013, 08:00 AM | #3 |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Follow the info in this section of the FAQ and you'll be fine.
|
03-23-2013, 11:28 AM | #4 | |
Well trained by Cats
Posts: 29,804
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Idea: convert the book with chapter detection on (assuming real chapter markers exist); that should concatenate the individual pages. |
|
03-23-2013, 09:06 PM | #5 | |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
If he did this correctly he wouldn't be asking. The link provides an example.
Quote:
Idea: create the index correctly and try again. |
|
03-24-2013, 01:17 AM | #6 |
Connoisseur
Posts: 57
Karma: 1186
Join Date: Jun 2012
Device: none
|
Hi all, thanks for your replies...
Dopedangel, at first I used SoftSnow Merger to merge the HTML files, but I still had all the previous/next page navigation links etc., and it was really a pain to try to get rid of them and have the pages join up to each other, mainly because everything is inside a complicated structure of HTML tables. Anyway, I'll give vHtmlMerger a try and see if it makes it any easier. I've never used Sigil; I was kind of hoping to be able to do it in Calibre without having to learn a whole 'nother app...
DoctorOhh, theducks is right, I did follow the info in that section of the FAQ correctly. I created "another HTML file that contains links to all the other files in the desired order". Usually that's for when you have a book with each chapter in a separate HTML file. But in this case, the original book I have is a folder with 297 HTML files, one for every page. I created an HTML table of contents with a link to each file, just as in the example. So I ended up with this table of contents, with 297 entries in it, at the beginning of the book. That's exactly what I expected to happen. It just isn't a very good solution, to have this 15-page table of contents with an entry for every page in it. I mean, it does work. I was just wondering if there's a way I could then edit that table of contents HTML after it had been imported into Calibre. I tried unzipping the zip, editing it, and zipping it again, but then it was unhappy and wouldn't convert to MOBI anymore... Well, if you have a suggestion of some other way I should be making this index, I'm happy to hear it.
I did try making an HTML table of contents for only the pages at the beginnings of chapters. Calibre did bring in a bunch of other pages (not sure if it was all of them) but they were all mixed up, in the wrong order. I also tried the "breadth first" setting; it didn't help.
theducks, I'm not sure how to go about trying your idea. I'm assuming you mean the chapter detection in the heuristic processing.
But I'm still stuck at how to get all the HTML pages into Calibre in the first place, in the right order, without making this huge table of contents by hand... Well, maybe I'll try some more experiments if I have time tomorrow, thanks again! Last edited by sumguy; 03-24-2013 at 01:21 AM. |
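The hand-typing part, at least, can be scripted. Below is a minimal sketch that builds the "HTML file that contains links to all the other files in the desired order" automatically. It assumes the files are named cover.html, page_<roman numeral>.html for the front matter, and page_<number>.html for the main text, which is a guess based on the three filenames mentioned in this thread. The names `roman_to_int`, `page_sort_key`, and `build_index` are hypothetical. It does not shorten the resulting table of contents; it only saves writing the 297 links by hand.

```python
import re
from pathlib import Path

ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(text):
    """Convert a well-formed roman numeral such as 'ix' to an integer."""
    total, prev = 0, 0
    for ch in reversed(text.lower()):
        value = ROMAN[ch]
        # A smaller numeral in front of a larger one is subtracted (ix = 9).
        total += -value if value < prev else value
        prev = value
    return total


def page_sort_key(filename):
    """Order: cover, roman-numbered pages, arabic-numbered pages, anything else."""
    stem = Path(filename).stem.lower()
    if stem == "cover":
        return (0, 0)
    match = re.fullmatch(r"page_(\d+)", stem)
    if match:
        return (2, int(match.group(1)))
    match = re.fullmatch(r"page_([ivxlcdm]+)", stem)
    if match:
        return (1, roman_to_int(match.group(1)))
    return (3, 0)  # unrecognised names go last, in their original order


def build_index(filenames, title="Book"):
    """Return the text of an index page linking to every file in reading order."""
    links = "\n".join(
        f'<p><a href="{name}">{Path(name).stem}</a></p>'
        for name in sorted(filenames, key=page_sort_key)
    )
    return f"<html><head><title>{title}</title></head><body>\n{links}\n</body></html>\n"
```

Passing the list of filenames from the book folder to `build_index` and saving the result next to the pages gives a single file to add to Calibre. The numeric sort matters because a plain alphabetical sort would put page_10.html before page_2.html.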
03-24-2013, 01:37 AM | #7 | |
null operator (he/him)
Posts: 20,572
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
BR |
|
03-24-2013, 02:06 AM | #8 | |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Quote:
Yes, use the heuristic processing. |
|
|