#1 |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
html problem
I'm posting a set of files that have me stumped.
The ZIP file contains 6 HTML files from the CIA World Fact Book. One file does not display the same as the other 5, and I cannot figure out why. The files contain only a minimal number of tags. Eternal gratitude and karma to the first person who figures it out. Thanks.
#2 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
OK, the first five .html files have a problem with the heading tags surrounding the Appendix title. There is no closing </h2>; where it should be, there is a second opening <h2> (a typo) in every file except the last one. The last (6th) .html file has no <link> in the <head> section, and it also has a spurious ">" near the bottom of the file. Hope this helps!
Last edited by nrapallo; 03-12-2009 at 10:31 PM. Reason: typo
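For anyone who wants to spot this kind of heading typo without reading the markup by eye, here is a minimal sketch in Python. It only counts opening and closing heading tags with the standard-library `html.parser`; the class name, function name, and sample string are made up for illustration and are not taken from the posted files.

```python
from collections import Counter
from html.parser import HTMLParser


class HeadingCounter(HTMLParser):
    """Count opening and closing h1-h6 tags separately."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in self.HEADINGS:
            self.opened[tag] += 1

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self.closed[tag] += 1


def unbalanced_headings(html_text):
    """Return {tag: (opened, closed)} for heading tags whose counts differ."""
    parser = HeadingCounter()
    parser.feed(html_text)
    parser.close()
    tags = set(parser.opened) | set(parser.closed)
    return {
        tag: (parser.opened[tag], parser.closed[tag])
        for tag in sorted(tags)
        if parser.opened[tag] != parser.closed[tag]
    }


# Hypothetical sample that mimics the typo: <h2> where </h2> should be.
sample = "<html><body><h2>Appendix A<h2><p>Text</p></body></html>"
print(unbalanced_headings(sample))
```

A file with matching heading tags gives an empty dict. This is only a count, so it will not say where in the file the mismatch is.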
#3 | |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Quote:
Thank you.
|
#4 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
You're welcome.
Also, the second file, appendix-b.html, has a mismatched number of "<" and ">" characters. You should change the 56 occurrences of "<br<I>" to "<br><I>".
Looks like you're progressing along quite well. I'm still hand editing/fixing appendix-b.html for my REB1200 version and will have to figure out a better method when I get to the rankorder/fields pages!
Cheers,
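The appendix-b.html fix described above could be scripted rather than done by hand. The following is a rough Python sketch, assuming the broken sequence is always spelled exactly "<br<I>"; the function names and the sample string are hypothetical. The bracket count is only a crude sanity check, since a literal "<" or ">" in the text would skew it.

```python
def fix_broken_br(html_text):
    """Replace every '<br<I>' with '<br><I>'.

    Returns (fixed_text, number_of_replacements).
    """
    broken = "<br<I>"
    count = html_text.count(broken)
    return html_text.replace(broken, "<br><I>"), count


def bracket_counts(html_text):
    """Return (number of '<', number of '>') as a rough balance check."""
    return html_text.count("<"), html_text.count(">")


# Hypothetical one-line sample, not taken from the real file.
sample = "<p>Term<br<I>definition</I></p>"
fixed, replaced = fix_broken_br(sample)
print(bracket_counts(sample), bracket_counts(fixed), replaced)
```

On the real file, the replacement count should come out as 56 if the diagnosis above is complete.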
#5 |
Grand Sorcerer
Posts: 9,707
Karma: 32763414
Join Date: Dec 2008
Location: Krewerd
Device: Pocketbook Inkpad 4 Color; Samsung Galaxy Tab S6
|
One thing I often do if I run into a problem like that is to open the file in an XML editor.
Or lately, I just create an ebook out of it and then check it; the check will give you all the closing errors (and other HTML mistakes).
(I've had a lot of books where most of the text was center-aligned, italic, bold, or underlined, just because I hadn't closed a tag correctly!)
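The XML-editor trick can be approximated in a few lines of Python with the standard-library `xml.etree.ElementTree` parser. This is only a sketch under a strong assumption: the markup has to be XHTML-like, because plain HTML habits such as an unclosed `<br>` or an entity like `&nbsp;` will also be reported as errors. The function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def first_xml_error(markup):
    """Return None if markup is well-formed XML, else (line, column, message).

    Only the first error is reported, because the parser stops there.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


good = "<html><body><h2>Appendix</h2></body></html>"
bad = "<html><body><h2>Appendix<h2></body></html>"
print(first_xml_error(good))
print(first_xml_error(bad))
```

Fixing the reported spot and running it again walks through the closing errors one at a time.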
#6 |
You kids get off my lawn!
Posts: 4,220
Karma: 73492664
Join Date: Aug 2007
Location: Columbus, Ohio
Device: Oasis 2 and Libra H2O and half a dozen older models I can't let go of
|
|
#7 | |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
Quote:
|
|
#8 |
Grand Sorcerer
Posts: 9,707
Karma: 32763414
Join Date: Dec 2008
Location: Krewerd
Device: Pocketbook Inkpad 4 Color; Samsung Galaxy Tab S6
|
|
#9 |
book creator
Posts: 9,657
Karma: 3856660
Join Date: Oct 2008
Location: Luxembourg
Device: Kindle Scribe
|
You know what's a real bummer that no XML editor will help you prevent? An anchor within a header, say
Code:
<h2><a name=""></a>whatever</h2>
What happens? Well, the link jumps right to the anchor, ignoring the header tag and displaying the "whatever" as simple text without formatting. Took me a while to figure that one out.
#10 | |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Quote:
Code:
<a href="4.html"><a name="0000000699"></a><h3>Four Horsemen of Climate Apocalypse Rev Up their Fossil-Fueled Engines</h3></a>
|
#11 | |
Grand Sorcerer
Posts: 9,707
Karma: 32763414
Join Date: Dec 2008
Location: Krewerd
Device: Pocketbook Inkpad 4 Color; Samsung Galaxy Tab S6
|
Quote:
this:
<h2><a name=""></a>whatever</h2>
will be this:
<h2 id="name">whatever</h2>
And it gives the same functionality and is even epub valid!
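The rewrite described above (empty named anchor inside a header becomes an id on the header) can be sketched as a regular-expression substitution in Python. This is a simplified illustration, not a full HTML transformation: it assumes the header tag has no attributes of its own and that the empty anchor comes first inside it. The pattern and function names are hypothetical.

```python
import re

# Matches e.g. <h2><a name="intro"></a> at the start of a header.
ANCHOR_IN_HEADING = re.compile(
    r'<(h[1-6])>\s*<a\s+name="([^"]+)"\s*>\s*</a>',
    re.IGNORECASE,
)


def anchors_to_ids(html_text):
    """Turn <hN><a name="x"></a>text</hN> into <hN id="x">text</hN>."""
    return ANCHOR_IN_HEADING.sub(r'<\1 id="\2">', html_text)


before = '<h2><a name="intro"></a>whatever</h2>'
print(anchors_to_ids(before))
```

Links that point at `#intro` keep working, because a fragment can target an `id` just as it can a named anchor.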
|
#12 |
Feedbooks.com Co-Founder
Posts: 2,263
Karma: 145123
Join Date: Nov 2006
Location: Paris, France
Device: Sony PRS-t-1/350/300/500/505/600/700, Nexus S, iPad
|
We don't do this, Wired does. Cleaning up RSS feeds is incredibly annoying, believe me: messy XHTML, wrong character encoding, entities encoded twice, etc.
|
#13 | |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Quote:
I'm currently updating Mobi2IMP to properly convert your Feedbooks.com feeds (stored in mobipocket format) and I think I can say I'm winning the battle. Most of the time, the resulting conversion does work as it's supposed to!
BTW, Hadrien, there's one quirk that you may try and fix. I did notice (though I can't remember offhand where I saw this) in some exploded .mobi RSS feeds that the HTML tag <br \> was used. I needed to substitute <br /> instead. Here's my solution, used to properly convert your RSS feeds, written as Perl RegEx:
Code:
# fix up feedbooks.com news feeds quirks
# 1) replace the malformed <br \> (backslash) with <br />
$html =~ s/<br(\s)*\\>/<br \/>/gi;
# 2) move an empty named anchor out of the link that wraps it
$html =~ s/<a href([^>]*)><a name([^>]*)><\/a>/<a name$2><\/a><a href$1>/gi;
Last edited by nrapallo; 03-17-2009 at 10:24 AM. Reason: typo
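For readers who don't use Perl, the two substitutions above translate fairly directly to Python's `re` module. This is a hedged port for illustration only, not part of Mobi2IMP; the function name is made up, and the sample strings are based on the tags quoted earlier in the thread.

```python
import re


def fix_feed_quirks(html_text):
    """Apply the two feed clean-ups from the Perl snippet above."""
    # 1) <br \> (with a backslash) becomes <br />.
    html_text = re.sub(r"<br\s*\\>", "<br />", html_text, flags=re.IGNORECASE)
    # 2) An empty named anchor nested inside a link is moved in front of it.
    html_text = re.sub(
        r"<a href([^>]*)><a name([^>]*)></a>",
        r"<a name\2></a><a href\1>",
        html_text,
        flags=re.IGNORECASE,
    )
    return html_text


nested = '<a href="4.html"><a name="0000000699"></a><h3>Title</h3></a>'
print(fix_feed_quirks(nested))
print(fix_feed_quirks("line one<br \\>line two"))
```

As in the Perl version, both patterns are case-insensitive and the replacement always writes the tag names in lowercase.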
|