[Plugin] HTMLgen Output Plugin (https://www.mobileread.com/forums/showthread.php?t=295059)

PeTrDu 02-24-2018 03:53 AM

[Plugin] HTMLgen Output Plugin
 
Hi!
Here's a small addon: HTMLgen.
I'm not sure if it will be of any help to anyone.

Description
Small output addon for Sigil to export the epub contents to one html file...

My purpose was to view the text of the epub in LibreOffice with a language checker and spot any typos there, then fix them in the epub with Sigil...
but it can be used for whatever you like :)

Note that it's not a converter (if you want a converter use calibre or another tool).

What does it do exactly?
- it copies only the <body> part of all html/xhtml files, and all the css files.
- it copies all fonts/images/audio/video files to a folder with the same name as the html file (+ "_files").
- it makes a few changes to make the internal links work.
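
The first and third steps above can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the function names (extract_body, localize_links, merge) are hypothetical, and it assumes that internal links look like href="file.xhtml#anchor" and that ids do not collide between files.

```python
import re

# Hypothetical helpers; the real plugin may work quite differently.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)
HREF_RE = re.compile(r'href="([^"#]*)#([^"]*)"')


def extract_body(xhtml):
    """Return the text between <body ...> and </body>, or '' if none."""
    match = BODY_RE.search(xhtml)
    return match.group(1).strip() if match else ""


def localize_links(fragment, known_files):
    """Turn href="chapter2.xhtml#note1" into href="#note1" when the target
    file is one of the files being merged; leave other links untouched."""
    def repl(match):
        target, anchor = match.group(1), match.group(2)
        name = target.rsplit("/", 1)[-1]
        if target == "" or name in known_files:
            return 'href="#%s"' % anchor
        return match.group(0)
    return HREF_RE.sub(repl, fragment)


def merge(files):
    """files: list of (filename, xhtml_text) pairs in reading order."""
    known = set(name for name, _ in files)
    parts = [localize_links(extract_body(text), known) for _, text in files]
    return "<html><body>\n%s\n</body></html>" % "\n".join(parts)
```

Calling merge() with the epub's text files in spine order would give a single html string in which cross-file links point at anchors inside the same page.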

Requirements:
Any Sigil version with Python 2.7/3.4 (and the tkinter* module), any OS.
* If you have Python, you probably already have the tkinter module, but the plugin will warn you if you don't.
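
A check like the one described could look something like this. It is a minimal sketch under assumptions: load_tkinter and check_requirements are made-up names, and the only certain facts used are that the Tk module is called tkinter in Python 3 and Tkinter in Python 2.

```python
import sys


def load_tkinter():
    """Return the Tk module, or None if it is not installed."""
    try:
        if sys.version_info[0] >= 3:
            import tkinter as tk   # Python 3 name
        else:
            import Tkinter as tk   # Python 2 name
        return tk
    except ImportError:
        return None


def check_requirements():
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if load_tkinter() is None:
        # Hypothetical message text, modelled on the plugin's strings.
        problems.append("[ERROR] 'tkinter' python module not found.")
    return problems
```

Returning the problems as a list, instead of exiting at once, lets the caller decide whether to print a warning and stop or carry on without a file dialog.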

Install:
Like any other addon for Sigil, go to "Plugins" menu >>> Manage plugins >>> Add plugin ... and select the HTMLgen_vX.X.X.zip file.
If everything is OK, the option should appear under Plugins >>> Output >>> HTMLgen
If you click it, it will only ask you where to save the HTML, and that's all.

Languages: Catalan, English, French (thanks to Arios), Galician, German (thanks to brucewelch), Italian (thanks to Auramazda), Polish (thanks to bravosx), and Spanish.
If you want to, translate the text in the spoiler to your language and send me a PM, or post it here, so I can add it.
Spoiler:
[en]
htmlfile=HTML file
saveas=Save as HTML...
copied=Copied file: %s
w_nocopy=[WARNING] Cannot copy the file: %s
langload=Language loaded: English
notkinkr=[ERROR] 'tkinker' python module not found.
pyvers=Using Python %s
readmeta=Reading epub metadata...
askfn=Asking for output filename...
cancel=Operation canceled.
openfn=Opening output filename: %s
e_openfn=[ERROR] Couldn't open output filename.
header=Getting ready to write the html header...
reading=Reading %s...
htmsaved=HTML file saved.
newfoldr=Folder created: %s
w_mkdir=[WARNING] Couldn't create output directory: %s
w_nodir=[WARNING] Output directory not found: %s
timespnt=Time spent: %s sec.
htm_warn=HTML file saved with warnings or errors.
htm_ok=HTML file saved successfully.
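
For translators wondering how strings in this shape might be used, here is a rough sketch. It is an assumption about the format, not the plugin's real loader: parse_strings and translate are hypothetical names, and the fallback to English is my own choice.

```python
def parse_strings(text):
    """Parse '[lang]' headers and 'key=value' lines into {lang: {key: value}}."""
    tables, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # A section header such as [en] or [fr] starts a new table.
            current = tables.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            # Split on the first '=' only, so values may contain '='.
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return tables


def translate(tables, lang, key, *args):
    """Look up key in lang, falling back to English, then to the key itself."""
    template = tables.get(lang, {}).get(key) or tables.get("en", {}).get(key, key)
    # %s placeholders are filled with old-style string formatting.
    return template % args if args else template
```

With a loader like this, each %s in a translated string has to stay intact (no space between % and s is the safest form), because the file name or version is substituted there.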


Thank you:
Thanks to the translators (the ones in the list, and others that aren't there).
Thanks to BeckyEbook for the icon.
And special thanks to someone who has helped me since the first private version 0.0.1, giving me tips and advice.
Thank you all! :)

License: GNU General Public License v3 (GPL-v3)

Download:
See the attached file.

Regards.

BetterRed 02-24-2018 04:20 AM

FWIW - there's a Sigil plugin for Language Tools Grammar Checker ==>> [LanguageTool]: Grammar check.

BR

PeTrDu 02-24-2018 05:29 AM

@BetterRed: I know, and that's a great addon.

But using the Sigil results dialog is not the same as using LibreOffice's interface,
especially when there are a lot of messages.
Just consider it as another alternative.

I just wanted to try to do this and to share it, even if I'm not a good programmer.

Regards

roger64 02-24-2018 05:54 AM

Hi

Welcome here and thank you for this plugin!! What a great entrance.:)

BTW it can be useful too in French where we can use Grammalecte to check odt files in LO. I'll try it.

BetterRed 02-24-2018 06:37 AM

Quote:

Originally Posted by PeTrDu (Post 3661691)
@BetterRed: I know, and that's a great addon.

But using the Sigil results dialog is not the same as using LibreOffice's interface,
especially when there are a lot of messages.
Just consider it as another alternative.

I just wanted to try to do this and to share it, even if I'm not a good programmer.

:thumbsup: - I suspected your motivation might be the better UI in LO's implementation.

I agree, alternatives are good to have. I sometimes pick up spelling inconsistencies in Sigil that I missed in Word/Writer, most especially with proper nouns.

BR

DiapDealer 02-24-2018 10:34 AM

Thanks for your contribution! I added this to the Sigil plugin index.

Can I ask what license you've chosen to share this under? We like to know in case plugins ever get abandoned.

odamizu 02-24-2018 03:37 PM

Quote:

Originally Posted by PeTrDu (Post 3661664)
... Small output addon for Sigil to export the epub contents to one html file...

Great plug-in! I give you Karma as thanks and welcome! :D

roger64 02-24-2018 09:21 PM

I did try it on an ePub of mine. I've got a working and very usable file I could import in LO.

- All paragraph styles were merged into the Text Body style. Line breaks were correctly maintained.
- The hierarchical structure of the file is maintained (h2 titles preserved).
- One thing that maybe could be improved is that the endnotes links disappeared.

It fits its initial and declared purpose: to allow checking the text in LO.
It also can be used to quickly reformat a standard but "strangely" formatted ePub without fiddling and correcting inline styles or other quirks.

YMMV, but it's worth a try.

Thank you.:)

PeTrDu 02-25-2018 09:04 AM

Thanks, everybody, for all the replies and for the welcome :)

Quote:

Originally Posted by DiapDealer (Post 3661740)
Can I ask what license you've chosen to share this under?

@DiapDealer: It doesn't really matter to me, but since you ask, I think that the same License as Sigil's will be OK.
In short: GPL-v3
Added to the first post.

Quote:

Originally Posted by odamizu (Post 3661825)
Great plug-in! I give you Karma as thanks and welcome! :D

@odamizu: Thank you! I hope you find it useful somehow :)

Quote:

Originally Posted by roger64 (Post 3661931)
I did try it on an ePub of mine.

Thank you for testing it :) :thanks:

Quote:

Originally Posted by roger64 (Post 3661931)
- One thing that maybe could be improved is that the endnotes links disappeared.

@roger64: Can you give me a sample (an epub or a link to another example), to see if I can do something about that?

Regards.

roger64 02-25-2018 09:52 AM

I will PM you the link.

Arios 02-25-2018 07:36 PM

French translation
 
PeTrDu

Thanks for this plugin.

I'm posting here the French translation you asked for, so that everyone can edit it if needed.
For me your plugin works well, even with the notes.

Spoiler:
[fr]
htmlfile=Fichier HTML
saveas=Enregistrer au format HTML...
copied=Fichier %s copié.
w_nocopy=[ATTENTION] Impossible de copier le fichier %s
langload=Langue chargée: Français
notkinkr=[ERREUR] le module python 'tkinker' est introuvable.
pyvers=Utilisation de Python %s
readmeta=Lecture des métadonnées du epub...
askfn=Demande d'un nom pour l'enregistrement du fichier...
cancel=Opération annulée.
openfn=Ouverture du fichier de sortie %s
e_openfn=[ERREUR] Impossible d'ouvrir ce nom de fichier.
header=Préparation à l'écriture de l'entête html...
reading=Lecture de %s...
htmsaved=Fichier HTML enregistré.
newfoldr=Le dossier %s a été créé.
w_mkdir=[ATTENTION] Impossible de créer le dossier de sortie %s
w_nodir=[ATTENTION] Le dossier de sortie %s est introuvable.
timespnt=Temps écoulé: %s sec.
htm_warn=Fichier HTML enregistré avec des mises en garde ou des erreurs.
htm_ok=Fichier HTML enregistré avec succès.

bravosx 02-26-2018 03:07 PM

Polish translation

PeTrDu please add to the plugin.

[pl]
htmlfile = Plik HTML
saveas = Zapisz jako HTML ...
copied = Kopiuj plik: %s
w_nocopy = [OSTRZEŻENIE] Nie można skopiować pliku: %s
langload = Załadowany język: polski
notkinkr = [BŁĄD] Nie znaleziono modułu python "tkinker".
pyvers = Używanie Pythona %s
readmeta = Czytanie metadanych epub ...
askfn = Pytanie o wyjściową nazwę pliku ...
cancel = Operacja anulowana.
openfn = Otwieranie pliku wyjściowego: %s
e_openfn = [BŁĄD] Nie można otworzyć pliku wyjściowego.
header = Przygotowanie do napisania nagłówka html ...
reading = Czytanie %s ...
htmsaved = Zapisano plik HTML.
newfoldr = Folder został utworzony: %s
w_mkdir = [OSTRZEŻENIE] Nie można utworzyć folderu wyjściowego: %s
w_nodir = [OSTRZEŻENIE] Nie znaleziono folderu wyjściowego: %s
timespnt = Upływający czas: %s sek.
htm_warn = Plik HTML zapisany z ostrzeżeniami lub błędami.
htm_ok = Plik HTML został pomyślnie zapisany.

Best regards and thank you
bravosx

Hitch 02-26-2018 07:05 PM

Quote:

Originally Posted by PeTrDu (Post 3661664)
Here's this small addon: HTMLgen
... Small output addon for Sigil to export the epub contents to one html file...


Thanks for this. It saves time over merging them all! ;-)

Hitch

PeTrDu 02-27-2018 03:09 PM

Quote:

Originally Posted by roger64 (Post 3662081)
I will PM you the link.

@roger64:
Well, after checking the epub,
the only thing I see disappearing is the last <nav> section, and that's because it has the "hidden" attribute:
Code:

<nav ... hidden="" ...>
  ...
</nav>

That hides it in any web browser, and also in Sigil's Preview and Book View (but not in Code View, of course).
LibreOffice doesn't hide it, probably to be able to edit it.
I can remove the hidden="" part, but I think it's better not to do it. If you hide something with attributes or with css styles it's for some reason, isn't it?
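
For anyone who wants to see which elements of a file carry the attribute before deciding what to do with them, a small check along these lines may help. It is only a sketch using Python 3's standard html.parser module; HiddenFinder and find_hidden are hypothetical names and are not part of the plugin.

```python
from html.parser import HTMLParser


class HiddenFinder(HTMLParser):
    """Collect the tag names of elements that carry a 'hidden' attribute."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.hidden_tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; hidden="" has value "",
        # and a bare 'hidden' attribute has value None. Both count.
        if any(name == "hidden" for name, _ in attrs):
            self.hidden_tags.append(tag)


def find_hidden(markup):
    """Return the tag names of all hidden elements, in document order."""
    finder = HiddenFinder()
    finder.feed(markup)
    finder.close()
    return finder.hidden_tags
```

This only reports the elements; it leaves the markup alone, in line with the point that hidden content is usually hidden on purpose.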

The links work OK in any web browser, but in LibreOffice, when you Ctrl+click a link, some links work and others don't... I suppose it's a weird LibreOffice bug when it loads html files.

In the end, I'm afraid there's nothing much I can do about it.
I'm sorry about that.

On the other hand, the test revealed a bug: the font files weren't properly linked. It will be fixed in the next version.

Quote:

Originally Posted by Arios (Post 3662274)
I'm posting here the French translation you asked for, so that everyone can edit it if needed.

@Arios: Merci, Arios!!! I'll add it.


Quote:

Originally Posted by bravosx (Post 3662601)
Polish translation

PeTrDu please add to the plugin.

@bravosx: Dziękuję!!! It will be added in the next version :2thumbsup


----------------
to everyone:
I'll wait a little bit to see if someone else wants to give me another translation.
I'll release it before Saturday, so have a little patience :)

Regards.

roger64 02-27-2018 07:52 PM

Quote:

Originally Posted by PeTrDu (Post 3663109)
@roger64:
Well, after checking the epub,
the only thing I see disappearing is the last <nav> section, and that's because it has the "hidden" attribute:
Code:

<nav ... hidden="" ...>
  ...
</nav>

That hides it in any web browser, and also in Sigil's Preview and Book View (but not in Code View, of course).
LibreOffice doesn't hide it, probably to be able to edit it.
I can remove the hidden="" part, but I think it's better not to do it. If you hide something with attributes or with css styles it's for some reason, isn't it?

The links work OK in any web browser, but in LibreOffice, when you Ctrl+click a link, some links work and others don't... I suppose it's a weird LibreOffice bug when it loads html files.

In the end, I'm afraid there's nothing much I can do about it.
I'm sorry about that.

On the other hand, the test revealed a bug: the font files weren't properly linked. It will be fixed in the next version.

Thanks for your findings. :)

Good to know it's a LO bug. I did not try the links in a web browser. I also tried a commercial book where only the return links were working in the odt file. For information, there is a very fine Sigil plugin, FootnoteLinker, which was designed precisely to quickly re-establish broken links in an ePub. I used it once for a book with one thousand links...

The hidden="" attribute in the nav file is produced by the new Access-Aide accessibility helper plugin. This "landmarks" part of the nav is meant to be machine-readable only. Don't bother with it; it can be re-established later very easily in the new ePub if need be.

I found that some books use a double chapter title, I mean two successive h1 (which I don't like but I can't help). This is translated by two <h1> tags with a pagebreak between (which is a normal "clean" behaviour). It's not that easy to batch glue them together. Any advice?
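
One possible way to batch-join such pairs, offered only as a sketch: a regular expression that merges two <h1> elements when nothing but whitespace sits between them. The name merge_double_h1 and the <br/> joiner are my own assumptions, and a regex like this will not cope with every kind of markup (nested or commented-out headings, for instance).

```python
import re

# Two <h1> elements separated only by whitespace. The (?!</h1>) guard keeps
# each captured group from running past the end of its own heading.
DOUBLE_H1 = re.compile(
    r"<h1([^>]*)>((?:(?!</h1>).)*)</h1>\s*<h1[^>]*>((?:(?!</h1>).)*)</h1>",
    re.IGNORECASE | re.DOTALL,
)


def merge_double_h1(markup, joiner="<br/>"):
    """Join two directly adjacent <h1> elements into one, keeping the
    attributes of the first and separating the two texts with joiner."""
    return DOUBLE_H1.sub(
        lambda m: "<h1%s>%s%s%s</h1>" % (m.group(1), m.group(2), joiner, m.group(3)),
        markup,
    )
```

Headings with other content between them are left alone, and in a run of three adjacent headings only the first two are merged per pass, so it is worth checking the result on a copy of the file first.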

Edit: after further testing, I can confirm that the main issue is that LO converts this amazingly precise file to odt in a somewhat shaky way. And this part (html > odt) is 100% out of the plugin's control. To put it clearly, I was wrong to draw hasty conclusions about this plugin from the look of the odt file.


All times are GMT -4.
