#1 |
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2010
Device: none
|
This has me mystified. I have a clean HTML document (which, as per the instructions, I produced from my original Word document with "Save As") with nothing tricky in it. It displays perfectly on its own. When I open it in Sigil, the code window shows some file data (author, etc.), but the Book View is totally blank: one single blank page.
I can't see any way I could have done this incorrectly, but I am stumped and can't get any further, nor can I find any threads or help that even mention this kind of problem. Any ideas out there? All help deeply appreciated! -walter
#2 |
Zealot
Posts: 147
Karma: 56
Join Date: Dec 2009
Location: Antwerpen
Device: iPhone, Sony PRS-505, EPUBreader
|
Can you upload your file here, Walter? It's hard to give advice without seeing the document.
#3 |
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2010
Device: none
|
file sample
I can email the file to you if you can send me an email address. My email is: walter2@sphere.bc.ca
regards, walter |
#4 |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Don't upload it to the forum, attach it to your issue on the tracker.
Read the reporting issues wiki page. There is a correct way to report problems, and then there's every other way.
#5 |
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2010
Device: none
|
It's pretty clear to me that using "Save as HTML" is NOT the answer for importing Word docs. Studying the output HTML files reveals the usual MS forest of unneeded tags and a lot of JavaScript, even for the simplest things. As was pointed out to me here, this can't be read by any ebook reader.
My file has some spacing, drop caps, and underlines, all of which turned into literally pages of unwanted tags. Fine. I hand-stripped everything out, saved as HTML, and made sure there was no JavaScript in the resulting code. But I STILL got the empty white page of death in Sigil. No problem, I tried saving as RTF. Nope: again, the white page of death in Sigil. Now this was pretty frustrating, as I just couldn't see where the problem could possibly be hiding. Fine, I saved as plain text. This worked, and Sigil imported the text fine, but I then had to go back in and fix the style issues in HTML and make some on-screen edits. So far so good. At least I have a working document.
However, I think it is fair to make these observations. Sigil should NOT recommend Word's internal HTML conversion as the "best way" to make an import file for Sigil. I tried many different simple text snippets; all crash when I attempt to load them into Sigil as HTML. Since RTF also didn't work for me, I think you should change the suggested Word export technique to plain text, as that would have saved me hours of work and many inexplicable problems. There are no doubt Word files that can somehow work in a higher-level export format, but there are so many issues with even simple files that I just can't see it as the "recommended" way, especially since there is no guidance at all as to what can go wrong or why it does so in Sigil.
Other than that time-wasting input format nightmare, I have to say Sigil worked pretty well, although two problems are still making me crazy:
1. How do I get paragraphs to indent automatically? The default is left-aligned blocks of text, which is not very attractive. I see no way to fix it. I tried altering a P tag in the CSS area, but I could only get the inter-paragraph spaces to go away, not get a leading indent.
2. Why on earth does the entire document reload at the very start whenever you change anything in the code window? Talk about irritating, especially in a 249-page document. There's no quick way to return.
I also noted that when saving, the program automatically appends .sgf to the file name, which makes saving as an epub file impossible. You have to go in and edit the file name to get rid of this quirk before saving as an epub file.
One last thing that remains a mystery to me: does the TOC ever appear anywhere in the document? I have my entries in it, but within Sigil I can't see it or use it for navigation at all. How on earth do you make it actually appear? The wiki tutorial says zip on this topic. Many thanks, walter
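For anyone who wants to check a Word export before importing it, here is a minimal sketch that counts the kinds of leftovers described above: script blocks, namespaced tags such as `<o:p>`, and `mso-` properties in inline styles. The class name, function name, and the choice of these three markers are assumptions made for illustration; this is not how Sigil inspects a file, and a count of zero does not guarantee a clean import.

```python
from html.parser import HTMLParser


class WordMarkupScanner(HTMLParser):
    """Count markup that a plain Word HTML export tends to leave behind."""

    def __init__(self):
        super().__init__()
        # Hypothetical categories chosen for this sketch.
        self.counts = {"script": 0, "namespaced_tag": 0, "mso_style": 0}

    def handle_starttag(self, tag, attrs):
        # Embedded JavaScript blocks.
        if tag == "script":
            self.counts["script"] += 1
        # Office-namespaced tags such as <o:p> or <v:shape>.
        if ":" in tag:
            self.counts["namespaced_tag"] += 1
        # Inline styles that carry Word-only "mso-" properties.
        for name, value in attrs:
            if name == "style" and value and "mso-" in value.lower():
                self.counts["mso_style"] += 1


def scan_word_html(markup):
    """Return a dict of leftover counts for the given HTML string."""
    scanner = WordMarkupScanner()
    scanner.feed(markup)
    scanner.close()
    return scanner.counts


sample = (
    '<html><body><p class="MsoNormal" style="mso-pagination:none">Hi'
    "<o:p></o:p></p><script>var x = 1;</script></body></html>"
)
print(scan_word_html(sample))
```

If the counts are high, a cleanup pass or a different export route is probably worth trying before opening the file in Sigil.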
#6 | |||||||
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Oh boy...
You should never see a white page after importing.
Code:
p { text-indent: 30px; }
People reading your epub book will be able to access the TOC through an always-available menu entry. This is "the epub way". Of course, you can also make an inline TOC with links by hand, but I personally suggest you don't. The NCX TOC is there for a reason, and it is more usable than an inline one, which you have to scroll to manually, etc. It also displays according to the UX of the Reading System: the Sony PRS-505 shows it as a menu, ADE shows it in a tab on the left of the screen, etc.
Last edited by Valloric; 03-24-2010 at 02:01 PM. Reason: typo
#7 |
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2010
Device: none
|
For anybody who would like to test this problem for themselves, I have the three versions of the file: HTML (rar'd to fit the upload limits), RTF and TXT.
I am using the current release, 0.1.9. I just downloaded the new beta of Sigil to test. Do I have to remove my old version first, or can I just install the beta on top of it?
Just out of curiosity: in the end, does the epub format somehow bundle the images used, or do they travel along as individual files, as with HTML?
Many thanks for all the excellent help and suggestions. I am getting very close to a fully working file!
best regards, walter
sphere research corp. http://www.sphere.bc.ca
#8 | |
What Title ?
Posts: 1,325
Karma: 1856232
Join Date: Jan 2009
Location: Bavaria Germany
Device: Sony Experia Z Ultra
|
FWIW, the cleaned-up file looks fine in Sigil except for some extra-large spacing between paragraphs, but then I am not the author, so I do not really know what to expect. There are obviously a lot of things that I could have missed in my quick look. In any case, the experience was amusing, so thanks for the sample to play with, and I hope my experience in trying out your file may help point you toward a solution.
#9 | ||
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
But that's some awful HTML. I don't even think you can call that HTML.
I then opened the file in Word 2007, saved it as "Web page, filtered", and opened that file just fine with Sigil. The layout is the same in Sigil and Word (as far as I can tell from a quick glance). You should always use the filtered HTML option when saving HTML from Word, no matter what application you want to use to open the resulting file.
The way you phrased that question, I'd answer "yes" to both. The images are stored as individual files inside the epub archive. An epub is just a ZIP archive with specific contents. Word 2007's new DOCX format works in a similar way (it's also a ZIP archive).
Last edited by Valloric; 03-24-2010 at 02:18 PM.
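Since an epub is a ZIP archive, any ZIP tool can show how the images travel inside it. The sketch below uses Python's standard `zipfile` module; it builds a tiny epub-like archive in memory and lists its entries. The file names (`OEBPS/chapter1.xhtml`, `OEBPS/images/cover.png`) are made up for illustration, and the sample archive is not a complete, valid epub. It only shows the bundling idea.

```python
import io
import zipfile


def list_epub_entries(source):
    """Return the names of the files stored in an epub (path or file-like object)."""
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


def build_sample_epub():
    """Build a tiny epub-like archive in memory; the file names are hypothetical."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The epub format expects an uncompressed "mimetype" entry first.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        # A content document that refers to an image by relative path.
        archive.writestr(
            "OEBPS/chapter1.xhtml",
            "<html><body><img src='images/cover.png'/></body></html>",
        )
        # The image is a separate entry inside the same archive.
        archive.writestr("OEBPS/images/cover.png", b"placeholder bytes, not a real PNG")
    buffer.seek(0)
    return buffer


for name in list_epub_entries(build_sample_epub()):
    print(name)
```

Passing the path of a real `.epub` file to `list_epub_entries` should list its images next to the XHTML files in the same way.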
#10 |
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2010
Device: none
|
Well, sad to report that I have Word 2000, not 2007, and it has only one sad flavor of HTML export (grossly overdone and incomprehensible). I do have OpenOffice, however, and the idea of rinsing it through there has some appeal for other docs.
What is this mysterious HTML Tidy application? Many thanks, walter
#11 |
What Title ?
Posts: 1,325
Karma: 1856232
Join Date: Jan 2009
Location: Bavaria Germany
Device: Sony Experia Z Ultra
|
It checks HTML documents for correctness, and tries to clean up what it can. It is a command line application, but there is also a GUI for it if needed.
http://tidy.sourceforge.net/
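As a rough illustration of the command-line use, here is a sketch that wraps a Tidy call in Python. It assumes a `tidy` executable is on the PATH and that your build accepts the `-q`, `-asxhtml`, `-o` and `--word-2000` options; option support can differ between Tidy versions, so check `tidy -help` first. The function names are made up for this example.

```python
import shutil
import subprocess


def tidy_command(source, destination):
    """Build an HTML Tidy command line for cleaning a Word-exported HTML file."""
    return [
        "tidy",
        "-q",                  # quiet: suppress nonessential output
        "-asxhtml",            # convert the input to well-formed XHTML
        "--word-2000", "yes",  # ask Tidy to strip surplus Word 2000 markup
        "-o", destination,     # write the cleaned markup to this file
        source,
    ]


def run_tidy(source, destination):
    """Run tidy if it is installed; return its exit code, or None if it is missing."""
    if shutil.which("tidy") is None:
        return None
    completed = subprocess.run(tidy_command(source, destination))
    return completed.returncode


# Print the command that would be run for a pair of hypothetical file names.
print(" ".join(tidy_command("book.html", "book-clean.xhtml")))
```

Writing to a separate output file keeps the original export untouched, so the two versions can be compared before importing the cleaned one.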
#12 | |
Guru
Posts: 697
Karma: 150000
Join Date: Feb 2010
Device: none
|
But in the long run, loading the original .doc or .rtf or whatever into OpenOffice, then saving using the Writer2xhtml plugin, is probably the best way to go.
#13 | |
What the Dog Saw
Posts: 311
Karma: 981684
Join Date: Jul 2008
Location: Dunn Loring
Device: Sony PRS-650, Surface3
|
http://support.microsoft.com/?kbid=236967
Last edited by yekim54; 03-24-2010 at 11:47 PM.
#14 |
Wizard
Posts: 2,623
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
About OpenOffice and Writer2xhtml, you may want to have a look here:
https://www.mobileread.com/forums/sho...77&postcount=7
#15 |
Zealot
Posts: 147
Karma: 56
Join Date: Dec 2009
Location: Antwerpen
Device: iPhone, Sony PRS-505, EPUBreader
|
In my opinion, it is very good practice to open a DOC file in OpenOffice and save it as an ODT file before exporting to HTML. As you might have seen, a DOC file is normally three or more times as big as an ODT file, so OpenOffice does a lot of cleaning work.
Tags |
html conversion is blank, html problems, input doc failure |