Old 03-22-2013, 10:23 PM   #1
sumguy
Connoisseur
Posts: 52
Karma: 1000
Join Date: Jun 2012
Device: none
books with one html file for each page - how to convert?

Hi all -

I have some books in a very annoying format, one html file for each page. At the top and bottom of every page are navigation links to next/previous page. I'm trying to figure out how to convert them with decent formatting to read on my kindle, or at least maybe as a pdf. I was thinking about using Calibre's search and replace to get rid of the navigation links and tables. But first I'm just trying to get the pages all together...
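For the search-and-replace idea, here is a minimal Python sketch of the kind of regular expression that could strip the navigation links. It assumes the links look like `<a href="page_2.html">next page</a>`; the real markup may differ, so the pattern is only a starting point. Calibre's search and replace uses Python-style regular expressions, so a pattern that works here should carry over.

```python
import re

# Assumption: navigation links are plain anchors whose text is
# "next page" or "previous page" (any capitalisation).
NAV_LINK = re.compile(
    r'<a\b[^>]*>\s*(?:next|previous)\s+page\s*</a>',
    re.IGNORECASE,
)

def strip_nav_links(markup: str) -> str:
    """Remove next/previous-page anchors and leave all other markup alone."""
    return NAV_LINK.sub("", markup)

sample = (
    '<td><a href="page_1.html">Previous Page</a></td>'
    '<p>Text</p>'
    '<a href="page_3.html">next page</a>'
)
print(strip_nav_links(sample))  # prints <td></td><p>Text</p>
```

This leaves the emptied table cells behind; removing the surrounding tables would need a second pattern written against the actual page source.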

At first I tried to merge all the html together, and spent a long time trying to fix it up in a text editor, to get rid of all the page breaks and have a better flow. But it was pretty tedious, it's all marked up in tables and so on, it took forever, and the footnotes didn't go so well... Anyway I suppose I wouldn't really mind keeping the page structure, because there's an index at the end that refers to the correct pages.

I read in the manual that I could make a table of contents html file, with a link to every page, and add that to Calibre. That seemed to work, but then I end up with this huge list at the beginning of the book, dozens of pages with hundreds of links, one to every single page in the book. Is there some way to avoid that, or to edit that afterwards so there are only links to the relevant chapter beginnings?

Then I read somewhere else that I could just add the first file, cover.html (which links to the next page, which links to the next, and so on) and Calibre will follow all the links, and the links inside the pages it links to, etc. - but is there some limit to the depth of this? When I tried, only the first eight pages or so showed up inside the zip file it created, and in the .mobi I made, which had a table of contents with only two entries, "next page" and "previous page". The next page after page_ix.html is page_1.html; the link is there and it works in my browser, but page_1 and all subsequent pages are missing.
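On the depth question: as far as I know, Calibre's HTML input only follows links to a limited depth, set by the `--max-levels` option of `ebook-convert` (default 5, I believe). In a book where every page links only to the next one, each page is one level deeper, which may be why the import stops after a handful of pages. A sketch of a command that raises the limit follows; the value 500 and the output name `book.epub` are placeholders, not anything from the original book.

```python
import subprocess

def build_convert_command(first_page: str, output: str, max_levels: int) -> list[str]:
    """Build an ebook-convert command that follows links up to max_levels deep."""
    return ["ebook-convert", first_page, output, "--max-levels", str(max_levels)]

# Placeholder values: pick a limit larger than the number of pages in the book.
cmd = build_convert_command("cover.html", "book.epub", 500)
print(" ".join(cmd))

# To actually run it (requires Calibre's command-line tools on the PATH):
# subprocess.run(cmd, check=True)
```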

I also tried renaming cover.html to index.html (and editing the navigation link in the second page) and making a zip of the folder, and adding that. So then obviously all the pages were in the zip file, but when I tried to make an epub out of it, again only the first few pages were there.

Well, aside from that... if anyone has any suggestions of a better way to go about this, I'd appreciate it!
Attached Files
File Type: txt calibre-log.txt (6.3 KB, 69 views)
Old 03-23-2013, 07:28 AM   #2
Dopedangel
Wizard
Posts: 1,088
Karma: 8499999
Join Date: Dec 2006
Location: Singapore
Device: Coolreader(Nexus 5)\Coolreader(Nook Touch)
Use something like this to merge the HTML files:
http://www.iterati.org/ebookTools/vHtmlMerger/

Then maybe use Sigil to clean up and make a clean EPUB.
Old 03-23-2013, 08:00 AM   #3
DoctorOhh
US Navy, Retired
Posts: 8,799
Karma: 12528001
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Follow the info in this section of the FAQ and you'll be fine.
Old 03-23-2013, 11:28 AM   #4
theducks
Grand Sorcerer
Posts: 14,430
Karma: 5560777
Join Date: Aug 2009
Location: The (original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by DoctorOhh
Follow the info in this section of the FAQ and you'll be fine.

He did that, he got a book with a TOC for every page.

Idea: Convert the Book with Chapter detection on (assume real chapter markers exist), that should concatenate the individual pages.
Old 03-23-2013, 09:06 PM   #5
DoctorOhh
US Navy, Retired
Posts: 8,799
Karma: 12528001
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Quote:
Originally Posted by theducks
He did that, he got a book with a TOC for every page.

If he did this correctly he wouldn't be asking. The link provides an example.

Quote:
Originally Posted by theducks
Idea: Convert the Book with Chapter detection on (assume real chapter markers exist), that should concatenate the individual pages.

That might work...

Idea: create the index correctly and try again.
Old 03-24-2013, 01:17 AM   #6
sumguy
Connoisseur
Posts: 52
Karma: 1000
Join Date: Jun 2012
Device: none
hi all, thanks for your replies...

Dopedangel, at first I used SoftSnow Merger to merge the html files, but I still had all the previous/next page navigation links etc., and it was really a pain to try to get rid of them and have the pages join up to each other, mainly because everything is inside a complicated structure of html tables. Anyway I'll give vHtmlMerger a try and see if it makes it any easier. I've never used Sigil, was kind of hoping to be able to do it in Calibre without having to learn a whole 'nother app...

DoctorOhh, theducks is right, I did follow the info in that section of the faq correctly. I created "another HTML file that contains links to all the other files in the desired order". Usually that's for when you have a book with each chapter in a separate html file. But in this case, the original book I have is a folder with 297 html files, one for every page. I created an html table of contents with a link to each file, just as in the example. So I ended up with this table of contents, with 297 entries in it, at the beginning of the book.

That's exactly what I expected to happen. It just isn't a very optimal solution, to have this 15-page table of contents with an entry for every page in it. I mean it does work. I was just wondering if there's a way I could then edit that table of contents html, after it had been imported into calibre. I tried unzipping the zip and editing it, and zipping it again, but then it was unhappy and wouldn't convert to mobi anymore... well, if you have a suggestion of some other way I should be making this index, I'm happy to hear it.

I did try making an html table of contents for only the pages of the beginnings of chapters. Calibre did bring in a bunch of other pages (not sure if it was all of them) but they were all mixed up, in the wrong order. I also tried the "breadth first" setting, didn't help.

theducks, I'm not sure how to go about trying your idea. I'm assuming you mean the chapter detection in the heuristic processing. But I'm still stuck at how to get all the html pages into Calibre in the first place, in the right order, without making this huge table of contents by hand...
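One way to avoid building the big index by hand might be to let a script follow the "next page" links and write the index file. The sketch below is only that, under assumptions: every page has a link whose text contains "next", the link target is a plain file name in the same folder, and the output name `all_pages.html` is made up for illustration.

```python
import html
from html.parser import HTMLParser
from pathlib import Path

class NextLinkFinder(HTMLParser):
    """Record the href of the first <a> whose text contains 'next'."""

    def __init__(self):
        super().__init__()
        self.next_href = None
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.next_href is None and "next" in "".join(self._text).lower():
                self.next_href = self._href
            self._href = None

def page_order(folder: Path, first: str) -> list[str]:
    """Follow 'next' links from the first page and return file names in reading order."""
    order, seen, current = [], set(), first
    while current and current not in seen and (folder / current).is_file():
        seen.add(current)
        order.append(current)
        finder = NextLinkFinder()
        finder.feed((folder / current).read_text(encoding="utf-8", errors="replace"))
        current = finder.next_href
    return order

def write_index(folder: Path, first: str, out_name: str = "all_pages.html") -> Path:
    """Write an HTML file with one link per page, in reading order."""
    links = "\n".join(
        f'<p><a href="{html.escape(name, quote=True)}">{html.escape(name)}</a></p>'
        for name in page_order(folder, first)
    )
    out = folder / out_name
    out.write_text(f"<html><body>\n{links}\n</body></html>\n", encoding="utf-8")
    return out
```

Calling `write_index(Path("book_folder"), "cover.html")` (folder name hypothetical) would produce the index to add to Calibre. It still has one entry per page, so it only saves the typing, not the long list.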

Well, maybe I'll try some more experiments if I have time tomorrow, thanks again!

Last edited by sumguy; 03-24-2013 at 01:21 AM.
Old 03-24-2013, 01:37 AM   #7
BetterRed
null operator
Posts: 3,019
Karma: 1682890
Join Date: Mar 2012
Location: NSW Australia
Device: none
Quote:
Originally Posted by sumguy
....I was just wondering if there's a way I could then edit that table of contents html, after it had been imported into calibre.

Did you try Calibre's new TOC Edit tool? See http://www.mobileread.com/forums/sho...d.php?t=208299

BR
Old 03-24-2013, 02:06 AM   #8
DoctorOhh
US Navy, Retired
Posts: 8,799
Karma: 12528001
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Quote:
Originally Posted by sumguy
DoctorOhh, theducks is right, I did follow the info in that section of the faq correctly. I created "another HTML file that contains links to all the other files in the desired order". Usually that's for when you have a book with each chapter in a separate html file. But in this case, the original book I have is a folder with 297 html files, one for every page. I created an html table of contents with a link to each file, just as in the example. So I ended up with this table of contents, with 297 entries in it, at the beginning of the book.

I guess I was a little slow on the uptake. Try converting the book with the 297-entry TOC to HTMLZ, which should give you one large HTML file. Edit that file to remove the TOC, then convert to EPUB or MOBI as theducks suggested. If the chapters are clearly identified, this may do the trick.

Quote:
Originally Posted by sumguy
theducks, I'm not sure how to go about trying your idea. I'm assuming you mean the chapter detection in the heuristic processing.

Yes, use the heuristic processing.
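A rough sketch of the HTMLZ round trip in Python, under some assumptions: an HTMLZ file is an ordinary zip archive, the merged page inside it is named `index.html` (worth checking after unpacking), and the file names used here are placeholders. `--enable-heuristics` is the `ebook-convert` switch for heuristic processing.

```python
import zipfile
from pathlib import Path

def unpack_htmlz(htmlz_path: Path, dest: Path) -> Path:
    """Extract an HTMLZ archive and return the expected path of its main HTML file."""
    with zipfile.ZipFile(htmlz_path) as archive:
        archive.extractall(dest)
    # Assumption: Calibre names the merged file index.html inside the archive.
    return dest / "index.html"

def heuristic_convert_command(html_file: Path, output: str) -> list[str]:
    """Build an ebook-convert command with heuristic processing switched on."""
    return ["ebook-convert", str(html_file), output, "--enable-heuristics"]
```

The idea would be to call `unpack_htmlz(Path("book.htmlz"), Path("unpacked"))`, delete the long TOC from the extracted HTML in a text editor, and then run the command returned by `heuristic_convert_command` on the edited file.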
All times are GMT -4.