MobileRead Forums > E-Book Software > Calibre

Old 10-12-2010, 11:18 AM   #1
schizopolis
Junior Member
Posts: 4
Karma: 10
Join Date: Oct 2010
Device: kindle 3
Remove <previous next> from html

I've noticed that a lot of older 'ebooks' in html format have an ugly header and footer with <previous [page number] next> at the top and bottom in an appalling mint green slab, and converting them to mobi for my kindle retains the ugly, useless format and renders it largely unreadable.

Is there any quick way (that doesn't involve opening each html individually in notepad and deleting them individually) to sort this out?

I've searched the forum for help on this topic, but having 'html' as a search phrase brings up every single forum post with an internet link in it.
Old 10-12-2010, 11:55 AM   #2
Christina
Junior Member
Posts: 8
Karma: 34
Join Date: Dec 2009
Location: South of England
Device: Sony PRS-600
I'm far from an expert, but this is a problem I have ahead of me (I promised myself I'd convert a lot of old .html ebooks for my Aunt for Christmas). Could you convert it to an epub and edit it in Sigil, using the find and replace function or the CSS code? I've not tried it with an .html file.

If not, I am sure one of the lovely experts here will have a solution :-)
Old 10-12-2010, 12:13 PM   #3
Manichean
Wizard
Posts: 3,130
Karma: 80446
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
You should be able to use header/footer removal in the structure detection part of the conversion settings to do that.
Old 10-12-2010, 12:30 PM   #4
theducks
Grand Sorcerer
Posts: 14,428
Karma: 5560777
Join Date: Aug 2009
Location: The (original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by schizopolis View Post
I've noticed that a lot of older 'ebooks' in html format have an ugly header and footer with <previous [page number] next> at the top and bottom in an appalling mint green slab, and converting them to mobi for my kindle retains the ugly, useless format and renders it largely unreadable.

Is there any quick way (that doesn't involve opening each html individually in notepad and deleting them individually) to sort this out?

I've searched the forum for help on this topic, but having 'html' as a search phrase brings up every single forum post with an internet link in it.
Be aware there are really 3 versions of this "header".
The first and last pages have only 2 links each (TOC and Next on the first, Previous and TOC on the last); the pages in between have all 3.
I like Sigil because I get to see what is going to be included in the replace before I Replace.
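
One way to cope with the three header versions is to match the wrapper around the links rather than the links themselves. The following is only a minimal sketch: the thread never shows the real markup, so the `<div class="nav">` wrapper, the link text, and the file names are all assumptions made for illustration.

```python
import re

# ASSUMPTION: each navigation slab is a single <div class="nav">...</div>
# with no nested <div> inside it. The real files may use a table or a
# different class name, in which case the pattern must be adapted.
NAV_RE = re.compile(r'<div class="nav">.*?</div>\s*', re.DOTALL | re.IGNORECASE)


def strip_nav(html):
    """Remove every navigation slab, whichever links it contains."""
    return NAV_RE.sub("", html)


# Hypothetical samples of the three versions: first, middle and last page.
first = ('<div class="nav"><a href="toc.html">TOC</a> '
         '<a href="page_2.html">Next</a></div><p>Start</p>')
middle = ('<div class="nav"><a href="page_1.html">Previous</a> '
          '<a href="toc.html">TOC</a> '
          '<a href="page_3.html">Next</a></div><p>Middle</p>')
last = ('<div class="nav"><a href="page_94.html">Previous</a> '
        '<a href="toc.html">TOC</a></div><p>End</p>')

for sample in (first, middle, last):
    print(strip_nav(sample))
```

Because the pattern never mentions Previous, TOC or Next, the same expression covers all three versions. The non-greedy `.*?` stops at the first closing `</div>`, which is why the sketch only holds if the slab contains no nested `<div>`.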
Old 10-12-2010, 01:30 PM   #5
schizopolis
Junior Member
Thanks for the replies. Header/Footer removal doesn't work in calibre. At all.

I've had a look at Sigil, but it looks like I would have to go into each page individually and delete the header and footer by hand, which would take just as long as editing the html in notepad.
Old 10-12-2010, 01:35 PM   #6
Manichean
Wizard
Quote:
Originally Posted by schizopolis View Post
Thanks for the replies. Header/Footer removal doesn't work in calibre. At all.
Yes, it does... two possibilities: Either you're using it wrong or you've found a bug. What regex did you try and what's the text you want to match?
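
A quick way to answer the "what regex, what text" question is to try the expression on one page outside the converter and look at what it would remove. This is a rough sketch, not calibre's own code: `preview_matches`, the sample page, and the `<div class="nav">` markup are hypothetical, and the flags calibre applies to its removal expressions are not confirmed here.

```python
import re


def preview_matches(pattern, html, width=60):
    """Return a shortened, whitespace-normalised copy of each matched span."""
    regex = re.compile(pattern, re.DOTALL | re.IGNORECASE)
    return [" ".join(m.group(0).split())[:width] for m in regex.finditer(html)]


# Hypothetical page: a nav slab at the top and bottom with body text between.
page = ('<div class="nav">Previous 2 Next</div>'
        '<p>Body text</p>'
        '<div class="nav">Previous 2 Next</div>')

lazy = r'<div class="nav">.*?</div>'    # stops at the first closing tag
greedy = r'<div class="nav">.*</div>'   # runs on to the last closing tag

for name, pattern in (("lazy", lazy), ("greedy", greedy)):
    hits = preview_matches(pattern, page)
    print(name, len(hits), hits)
```

On this sample the lazy pattern finds the header and the footer separately, while the greedy one produces a single match that swallows the body text as well. An expression that matches nothing, or matches far too much, is one common way of "using it wrong".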
Old 10-12-2010, 01:52 PM   #7
theducks
Grand Sorcerer
Quote:
Originally Posted by schizopolis View Post
Thanks for the replies. Header/Footer removal doesn't work in calibre. At all.

I've had a look at Sigil, but it looks like I would have to go into each page individually and delete the header and footer by hand, which would take just as long as editing the html in notepad.
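In Sigil:
Switch to Code View and set the search direction to Down only.
Test the S&R on the current file first (use a page with all 3 links), then change the scope from current file to All HTML.
Open the first and last files and fix them in CV. Time: 2 minutes or less.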
Old 10-12-2010, 11:27 PM   #8
schizopolis
Junior Member
That would work if the ebook were a single html file. As it is a series of individual files (page_1.html, etc.), and you can't open multiple files in Sigil, it doesn't work.
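
For a book split across many files, a short script can apply one pattern to every page without opening any of them by hand. The sketch below rests on assumptions: `clean_folder`, the folder names, and the `<div class="nav">` wrapper are hypothetical, since the actual markup of the slab is not shown in this thread.

```python
import re
from pathlib import Path

# ASSUMPTION: the slab is a <div class="nav">...</div> with no nested <div>.
# A bytes pattern is used so the files' text encoding never has to be guessed.
NAV_RE = re.compile(rb'<div class="nav">.*?</div>\s*', re.DOTALL | re.IGNORECASE)


def clean_folder(src, dst):
    """Write nav-free copies of every .html file in src into dst.

    Returns the number of files in which at least one slab was removed.
    The originals in src are left untouched.
    """
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    changed = 0
    for page in sorted(src.glob("*.html")):
        original = page.read_bytes()
        cleaned = NAV_RE.sub(b"", original)
        (dst / page.name).write_bytes(cleaned)
        if cleaned != original:
            changed += 1
    return changed
```

Calling `clean_folder("book", "book_clean")` (both folder names are placeholders) would leave the source pages as they are and put the stripped copies in the second folder, which could then be added to calibre for conversion. A returned count lower than the number of pages suggests the pattern does not fit some of the files.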
Old 10-12-2010, 11:50 PM   #9
DoctorOhh
US Navy, Retired
Posts: 8,799
Karma: 12528001
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Quote:
Originally Posted by schizopolis View Post
That would work if the ebook were a single html file. As it is a series of individual files (page_1.html, etc.), and you can't open multiple files in Sigil, it doesn't work.
If, with help here, you can't work out the regex needed to remove these during the conversion to mobi, you might want to consider converting it to epub and then using Sigil to Search & Replace as needed.

Last edited by DoctorOhh; 10-13-2010 at 12:07 AM.
Old 10-13-2010, 05:41 AM   #10
theducks
Grand Sorcerer
Quote:
Originally Posted by schizopolis View Post
That would work if the ebook were a single html file. As it is a series of individual files (page_1.html, etc.), and you can't open multiple files in Sigil, it doesn't work.
You missed the step:
switch to Code View (CV)
Now you can select and use "All HTML files" in S & R

You only need to open the first file
(the other 95 are processed in the background. Yes, I have done a 95+ file S&R a few times).
As I noted, File1 and File95 are different from each other and from the other 93 in between.
Old 10-13-2010, 05:57 AM   #11
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by theducks View Post
You missed the step:
switch to Code View (CV)
Now you can select and use "All HTML files" in S & R
I did not know that Sigil will open a single html file and automatically load all other associated html files. I thought the "All HTML files" in S & R was just for every other html inside the epub file.
Old 10-13-2010, 06:06 AM   #12
theducks
Grand Sorcerer
Quote:
Originally Posted by dwanthny View Post
I did not know that Sigil will open a single html file and automatically load all other associated html files. I thought the "All HTML files" in S & R was just for every other html inside the epub file.
You are correct.
Did I miss that these were separate EPUB files?
Old 10-13-2010, 06:14 AM   #13
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by theducks View Post
You are correct.
Did I miss that these were separate EPUB files?
In the original post he talks about the source files being html and wanting to convert them to mobi without the next - previous stuff included.
Old 10-13-2010, 10:36 AM   #14
theducks
Grand Sorcerer
Quote:
Originally Posted by dwanthny View Post
In the original post he talks about the source files being html and wanting to convert them to mobi without the next - previous stuff included.
And I assumed that Next Previous referred to the HTML pages in an EPUB, where they would be useful as there is some sort of relation between pages. Hence Next and Previous.
Old 11-18-2010, 12:17 AM   #15
schizopolis
Junior Member
Well, it's obvious nobody here has had to deal with this. Thanks for trying, though.