01-17-2011, 05:55 PM | #1 |
Zealot
Posts: 110
Karma: 5176
Join Date: Dec 2010
Device: Mac OSX, iPad, iPod, & Nook
|
Multi Slide HTML files
Hey all,
Occasionally I run into a book in HTML format that has each chapter as a separate slide file. For example:
arryn.html asoiaf.css barath.html greyjoy.html im_map-north.png im_map-south.png lannis.html martell.html slide2.html slide3.html slide4.html slide5.html slide6.html slide7.html slide8.html slide9.html slide10.html slide11.html slide12.html slide13.html slide14.html slide15.html slide16.html slide17.html slide18.html slide19.html slide20.html slide21.html slide22.html slide23.html slide24.html slide25.html slide26.html slide27.html slide28.html slide29.html slide30.html slide31.html slide32.html slide33.html slide34.html slide35.html slide36.html slide37.html slide38.html slide39.html slide40.html slide41.html slide42.html slide43.html slide44.html slide45.html slide46.html slide47.html slide48.html slide49.html slide50.html slide51.html slide52.html slide53.html slide54.html slide55.html slide56.html slide57.html slide58.html slide59.html slide60.html slide61.html slide62.html slide63.html slide64.html slide65.html slide66.html slide67.html slide68.html slide69.html slide70.html slide71.html slide72.html slide73.html slide74.html stark.html targ.html toc.html tully.html tyrell.html
I am on a Mac and have been using textutil in the Terminal to concatenate the slides and then find and replace the unwanted text. Is there an easier way to import these files into Calibre?
TIA
Happy Monday
Archon |
01-17-2011, 06:06 PM | #2 |
Book Geek
Posts: 596
Karma: 1499085
Join Date: Aug 2010
Location: Adelaide, Australia
Device: Kobo Touch, Asus MemPad 7" tablet, Nexus 5, Asus 10" tablet
|
This is probably the result of converting a scanned book from PDF to EPUB. Unfortunately, if OCR (optical character recognition) software is not used, the PDF file simply becomes a series of "pictures" of pages, much like scanning a series of photographs. PDF documents produced from a word processor treat the content as text, so they convert to EPUB quite easily. A lot of old books are scanned as "page images" rather than text, and I don't know if there is a solution for those. You could try searching Google Books or archive.org for the original PDF of the book and, if you have access to a good OCR program, try the conversion to text, but don't expect miracles!
|
01-17-2011, 06:54 PM | #3 |
Guru
Posts: 695
Karma: 822675
Join Date: May 2010
Device: Kobo Aura, Nokia Lumia 920 (Freda)
|
You can create a new HTML page that links to each of those pages in order (kinda like a table of contents), and then import that HTML page. Calibre should find all of the rest that are linked and put them into the same zip, and when you convert to other formats like epub it should just do the right thing.
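A minimal sketch of building such a linking page, in Python. It assumes the chapter files follow the slideN.html pattern from the first post and should be read in numeric order; the function names and the output name book_index.html are made up for illustration. Files such as arryn.html or stark.html are left out because their place in the reading order is not known from the listing.

```python
import html
import re
from pathlib import Path

# Assumed naming pattern, taken from the file listing in the first post.
SLIDE_RE = re.compile(r"^slide(\d+)\.html$")


def ordered_slides(filenames):
    """Return slideN.html names sorted by N, so slide2 comes before slide10."""
    numbered = []
    for name in filenames:
        match = SLIDE_RE.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def build_index(filenames, title="Book"):
    """Build a minimal HTML page that links to each slide in reading order."""
    links = "\n".join(
        f'<p><a href="{html.escape(name, quote=True)}">{html.escape(name)}</a></p>'
        for name in ordered_slides(filenames)
    )
    return (
        "<html><head><title>"
        + html.escape(title)
        + "</title></head>\n<body>\n"
        + links
        + "\n</body></html>\n"
    )


def write_index(folder, out_name="book_index.html", title="Book"):
    """Write the linking page into the folder that holds the slides."""
    folder = Path(folder)
    names = [p.name for p in folder.iterdir() if p.is_file()]
    out_path = folder / out_name
    out_path.write_text(build_index(names, title), encoding="utf-8")
    return out_path
```

Calling write_index("/path/to/book") would leave a book_index.html next to the slides, and that single file is the one to add to Calibre. The numeric sort matters because a plain alphabetical sort would put slide10.html ahead of slide2.html.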
|
01-18-2011, 05:02 AM | #4 |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Generally speaking, all you have to do is add the TOC.html or index.html file to calibre, and it will gather the other linked HTML files into one zip file ready for conversion.
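For anyone who wants to script this, here is a rough Python sketch that looks for the entry page in a folder and builds a command line for calibre's ebook-convert tool. The helper names are hypothetical, and it assumes calibre's command-line tools are installed and on the PATH. The sketch only builds the command and does not run it.

```python
from pathlib import Path

# Entry-page names mentioned in the post above, checked in this order.
ENTRY_CANDIDATES = ("toc.html", "index.html")


def find_entry_page(folder):
    """Return the first toc.html/index.html in folder (case-insensitive), or None."""
    by_lower = {p.name.lower(): p for p in Path(folder).iterdir() if p.is_file()}
    for candidate in ENTRY_CANDIDATES:
        if candidate in by_lower:
            return by_lower[candidate]
    return None


def conversion_command(entry_page, output_file):
    """Build (but do not run) an ebook-convert command for the entry page."""
    return ["ebook-convert", str(entry_page), str(output_file)]
```

The resulting list can be passed to subprocess.run(cmd, check=True). Whether the output contains every chapter depends on the entry page actually linking to all of the slide files.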
|
01-18-2011, 07:55 AM | #5 |
Zealot
Posts: 110
Karma: 5176
Join Date: Dec 2010
Device: Mac OSX, iPad, iPod, & Nook
|
Thanks, toddos and dwanthny. That will save me a lot of time with these files.
And thanks to Kovid and all the developers for making it so easy to import a split file. Archon |