Old 10-09-2010, 07:48 PM   #1
sergio blum
Member
sergio blum began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Aug 2009
Device: SONY READER
<pre> tags and no text reflow in EPUB

Since the previous post was my very first one, I supposed I was not entitled — or it would not be appropriate — to create a new thread myself. I thought this was for advanced users of Mobileread.

Thanks to dwanthny's recommendation I have now done this, and here I am. I hope the previous respondents, both dwanthny and ldolse, will see me here.

After people become cognizant they forget how baffled they were when ignorant. Please be patient. The day before yesterday I still did not know what CSS is. Now I already know the meaning of karma, ticket, <pre>, white-space (and what characters are collectively called whitespace) etc. Also, I have made so many (thirty... forty) round trips between calibre and Sigil that I am now relatively familiar with calibre in its basic aspects. So I believe I have the minimum requirements to implement suggestions from people willing to help me.

Let me begin by quoting my own first post in the old thread.

= = =
I'm not familiar with this XHTML/CSS stuff, but have learnt some in the last 30 hours or so of trial and horrors.
Could a kind soul pleeease (please) explain step by step how to manually remove pre tags.
I'd rather have my code text reflow like body text -- courier or no courier.
Have Sony Reader PRS-700. Been using Calibre and Sigil.
Thanks in advance
= = =

I failed to say that the conversions I am trying to do are from CHM to EPUB.
The CHM contains body text and examples of ActionScript code.
Below, this is what I mean by code text.

ldolse, you said: “By the way, if someone has a file with multiple <pre> tags, please attach it to the linked bug”. I'm sorry, I don't understand what you mean by “linked bug”, but since the sentence contains “please attach”, I am attaching a small sample EPUB file. Maybe, just maybe, having a look at the XHTML and the CSS will immediately convey more information than my trying to describe the issues. So, please have a look at the file.


dwanthny also wrote:
/QUOTE
Did you try selecting preprocess input file during conversion (see attached)? In this thread, ldolse stated the following about preprocess and lit files in reference to a lit to epub conversion utilizing preprocessing that worked out for me.
Quote:
Originally Posted by ldolse
Some lit files are more or less text files wrapped in html with <pre> tags - sounds like this may be one of those. These come out exactly as you describe with Calibre's default lit conversion pipeline. Preprocessing looks for those as a special case and runs them through the text input process before applying normal preprocessing.
Give it a try, let us know if it helps
/UNQUOTE

I only now found where the preprocessing option is in calibre's conversion options... but am unsure how to proceed, for ldolse also said:
/QUOTE
Preprocess won't work for epub, but if you rename the epub from .epub to .zip and add the zip version back to the book record Calibre treats it identically to compressed html, which means preprocessing will work. You shouldn't have to go from epub to rtf and back.
/UNQUOTE

ldolse, what do you mean by “book record”? Are you instructing me to change the extension to zip and then bring it into Calibre by clicking on the “Add books” icon and then going on to convert?

If so, I will try this, but I think it was kovidgoyal himself who said that this kind of problem was NOT due to bad conversion. In effect, it was said somewhere, the conversion was CORRECTLY replacing whatever flags/tags exist in the CHM that purposefully tell text in a block not to reflow.

Oh, I will wait until you people have a look into my askingforHELP.epub file.

But again, is it not possible to remove <pre> tags manually? Why is that?
I am very familiar with writing macros in Word. Could anything useful be done in Word? Although I have never used them before, I have also installed both Dreamweaver and Adobe Acrobat Pro on my PC. Can they be useful in this context?

I also read somewhere a suggestion to convert <pre> tags into <p> tags. Does this mean just substituting <p for <pre in the XHTML? I did this and the individual lines of code text all bunched up in a single paragraph. Or must it be done in tandem with other changes?

Am I expressing myself clearly? Is there any info I left out?
I just want, at the very least, to have the code text converted into body text. The best would be to be able to specify different font sizes for body text and code text.

I cannot contemplate going to bed this one more day without learning how to solve this issue.

Please forgive me for not being able to be more concise. It is a steep learning curve in the beginning. Thanks indeed in advance.

SB
Attached Files
File Type: epub askingforHELP.epub (7.8 KB, 311 views)
Old 10-09-2010, 08:36 PM   #2
DoctorOhh
US Navy, Retired
DoctorOhh ought to be getting tired of karma fortunes by now.
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by sergio blum
I failed to say that the conversions I am trying to do are from CHM to EPUB. The CHM contains body text and examples of ActionScript code.
Any book of code often uses <pre> tags to ensure that the code is viewable exactly as required for a given language.

Quote:
Originally Posted by sergio blum
But again, is it not possible to remove <pre> tags manually? Why is that?
It is possible, even easy, but your code examples will be jumbled by the epub's attempt to reflow the text.

Quote:
Originally Posted by sergio blum
I also read somewhere a suggestion to convert <pre> tags into <p> tags. Does this mean just substituting <p for <pre in the XHTML? I did this and the individual lines of code text all bunched up in a single paragraph. Or must it be done in tandem with other changes?
That is why there are <pre> tags so the code remains correctly laid out.

Quote:
Originally Posted by sergio blum
Am I expressing myself clearly? Is there any info I left out? I just want, at the very least, to have the code text converted into body text. The best would be to be able to specify different font sizes for body text and code text.
Yes, you are expressing yourself clearly; the only problem is the type of book. ldolse and I were both referring to recreational novels and not a book of code.

Any attempt to remove the <pre> tags for this book may compromise the correctness of your code examples. You could try enclosing each line of code in <p> </p> tags, then removing the <pre> tags. This will minimize the jumble but not prevent it entirely.

I have limited experience, but if you want this book as a reflowable epub then you should buy it in that format. Otherwise resign yourself to reading it with <pre> tags in place and using either landscape mode on your reader to gain more space to view it or read it on a computer with the ability to have a wide enough (or scrollable) viewing area to read the code as written.

Last edited by DoctorOhh; 10-10-2010 at 03:34 AM.
Old 10-09-2010, 08:51 PM   #3
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
Posts: 29,768
Karma: 54401244
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Another observation (not a specific answer to your <pre> question):
You use color to differentiate the code.
Many readers are grayscale.
You might change your <pre> to a <div> with a border set in the class to make the code examples obvious.
Old 10-09-2010, 09:40 PM   #4
sergio blum
dwanthny, thanks for the reply.

Just seeing a reply lifts my spirits, really.

But I will insist. I still have some hope.

Like I said, I know very very little of HTML/CSS

But wait, I did something just now that may solve the issue.

In Sigil I CTRL-C selected a chunk of body text and code text as shown in the Book View (meaning: NOT the XHTML in the Code View, but pure text).

Then CTRL-V in Word.

With trepidation, I see that the text pasted in Word maintained all the font styles. Arial is still Arial and Courier New is still Courier New, etc!

I am thinking as I write.

Like I said, with Word macros I can do anything!

For example, I could format all non-Courier with one standard style

And reformat Courier as I wished, say, bold, size 12pt, whatever. I could even change the font to Verdana. I mean, in Word I can do everything imaginable.

I don't know if you are familiar with these Word macros. They allow one, for instance, to detect every point where text changes from Arial to Courier.

And in these places the macro could easily insert tags for begin/end of body text and also for code text.

The beauty -- I think, or am I still deluding myself? -- with this workflow is that once text is pasted into Word, goodbye ALL current tags! (including, of course, bye bye <pre>).

From here I think it is trivial, although it is where my ignorance in HTML leaves me unsure as to how to proceed.

Let me go to Dreamweaver for a moment.

OK I am back. In Dreamweaver I clicked in New > HTML

This is what I got:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Documento sem título</title>
</head>

<body>
</body>
</html>

Now it is your turn -- please, because of my pathetic ignorance of these matters.

Suppose the output of my macros in Word were this:


<myTagBodyText>
Create a URLRequest instance containing the XML data to send. Use flash.net.sendToURL( ) to send the data and ignore the server response, use flash.net.navigateToURL( ) to send the data and open the server response in a specific browser window, or use URLLoader.load( ) to both send the data and download the response into the .swf file.
Xxxx told someone to: "Open a ticket to request a feature to "reflow" <pre> tags".
</myTagBodyText>
<ActionScriptCode>
package {
import flash.display.*;
import flash.text.*;
import flash.filters.*;
import flash.events.*;
import flash.net.*;
</ActionScriptCode>

And a CSS of only two styles -- arial and courier new, of course.

Now, what do you think of this? I am thinking and writing in "real time".

Having gone so far, what do I simply do, now? Believe me: I don't know.

Where do I insert the whole block of lines beginning with

<myTagBodyText>



and ending with

</ActionScriptCode>

in the template I got from dreamweaver?

Is this what I should do?


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Documento sem título</title>
</head>

<myTagBodyText>
Create a URLRequest instance containing the XML data to send. Use flash.net.sendToURL( ) to send the data and ignore the server response, use flash.net.navigateToURL( ) to send the data and open the server response in a specific browser window, or use URLLoader.load( ) to both send the data and download the response into the .swf file.
Xxxx told someone to: "Open a ticket to request a feature to "reflow" <pre> tags".
</myTagBodyText>
<ActionScriptCode>
package {
import flash.display.*;
import flash.text.*;
import flash.filters.*;
import flash.events.*;
import flash.net.*;
</ActionScriptCode>

</html>

then I would paste this back in Sigil.


And by now I already know how to write such a simple CSS.

Granted, it is quite convoluted, but tell me: will it work?

For reference books I may be using for quite a long time, I think it would definitely be worthwhile. If all this makes sense... Does it? I reckon I could do this whole "conversion" in a couple of hours.

Are you still awake/online?

Thanks AGAIN for your patience, really.

Blum

PS- Plus the bonus of now knowing it is possible to quote selected text of other people's posts.
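
On the question of where the tagged blocks go: in an XHTML page, visible content belongs inside <body>, and made-up element names such as <myTagBodyText> are not valid XHTML, so a class attribute on a standard element is the usual substitute. The following is a minimal sketch in Python, not something posted in the thread. The function name build_page, the class names bodytext and code, and the styles.css path are all hypothetical; line breaks inside code blocks are kept with <br /> so the text can still reflow.

```python
from html import escape

# Skeleton page; everything the reader sees goes between <body> and </body>.
PAGE_TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>{title}</title>
<link rel="stylesheet" type="text/css" href="{css_href}" />
</head>
<body>
{body}
</body>
</html>
"""


def build_page(blocks, title="Untitled", css_href="styles.css"):
    """Render (kind, text) pairs into one XHTML page.

    kind is "body" or "code" (hypothetical labels standing in for the
    custom tags a Word macro might emit).
    """
    parts = []
    for kind, text in blocks:
        # Escape &, < and > so text such as "<pre>" shows up literally.
        safe = escape(text, quote=False)
        if kind == "code":
            # One <br /> per source line keeps code lines separate.
            lines = safe.splitlines()
            parts.append('<p class="code">' + "<br />\n".join(lines) + "</p>")
        else:
            parts.append('<p class="bodytext">' + safe + "</p>")
    return PAGE_TEMPLATE.format(
        title=escape(title, quote=False),
        css_href=css_href,
        body="\n".join(parts),
    )
```

The stylesheet then only needs two rules, one for p.bodytext and one for p.code, which matches the "CSS of only two styles" idea above.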
Old 10-09-2010, 11:00 PM   #5
DoctorOhh
Quote:
Originally Posted by sergio blum
Granted, it is quite convoluted, but tell me: will it work?
You're beyond my skill set. Good luck in your attempts, I'm afraid I can't help you further. Hopefully someone else might step up with the knowledge you need to be successful.

Quote:
Originally Posted by sergio blum
PS- Plus the bonus of now knowing it is possible to quote selected text of other people's posts.
Judging from your first post, this still might need work.

Good Luck!

Old 10-09-2010, 11:55 PM   #6
sergio blum
Quote:
Originally Posted by theducks
Another observation (not a specific answer to your Pre)
You use color to differentiate the code.
Many readers are grey scale.
No, I don't use colors. It was red because in Sigil I was beginning to see how CSS worked, so I changed the color to see if it was reflected in the Book View. Color, like you said, is unrelated to the problem.

Quote:
Originally Posted by theducks
You might change your Pre to a Div with a Border line set in the class to make the code examples obvious
I do not know enough to do what you suggested.
Would it be too much to ask for an example I could imitate?

Many Thanks
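
For anyone wanting to imitate theducks's suggestion, here is a minimal sketch in Python that swaps every <pre> element for a <div> carrying a class, plus a CSS rule that draws a border around it. The names pre_to_code_div, CODE_BOX_CSS and the class name code are hypothetical. The rule uses white-space: pre-wrap so that line breaks survive while long lines may wrap; whether a particular reader honors that value is an assumption to test on the device.

```python
import re

# CSS rule to paste into the book's stylesheet (class name is hypothetical).
CODE_BOX_CSS = """div.code {
  border: 1px solid black;
  padding: 0.3em;
  font-family: monospace;
  white-space: pre-wrap;
}
"""

# Opening <pre ...> with any attributes, and the closing </pre>.
_PRE_OPEN = re.compile(r"<pre\b[^>]*>", re.IGNORECASE)
_PRE_CLOSE = re.compile(r"</pre\s*>", re.IGNORECASE)


def pre_to_code_div(xhtml):
    """Replace each <pre>...</pre> pair with <div class="code">...</div>."""
    xhtml = _PRE_OPEN.sub('<div class="code">', xhtml)
    return _PRE_CLOSE.sub("</div>", xhtml)
```

The border makes code blocks stand out on a grayscale screen without relying on color.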
Old 10-10-2010, 12:08 AM   #7
sergio blum
ldolse

ldolse

I would certainly welcome any additional comments after so much typing.

I am still hopeful.

I mean, even if it partially implied using a "manual" process outside Calibre and/or Sigil -- despite my not knowing "operationally" how to do it -- it seems strange to me that one would not be able to remove/change attributes of a humble tag.

Blum

I am still in the dark as to where to insert my tagged blocks of text in the HTML page, as tentatively mentioned in my post at 01:40 AM. You might call it kindergarten level, but I will have to find it by myself...
Old 10-10-2010, 03:15 AM   #8
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
 
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
@sergio, as dwanthny mentioned, my work with <pre> tags had to do with recreational novels, not with books full of code examples. Books of code need to retain the formatting of the code within the <pre> tags. If it doesn't look good on your reader it means the author of the book was expecting you to use it on a larger screen.

You could try playing with the css for <pre> and make the font size smaller for the <pre> tags so that it all fits on the screen. Maybe use your reader in landscape mode as well, if it has that feature.
Old 10-10-2010, 08:13 AM   #9
jackie_w
Grand Sorcerer
jackie_w ought to be getting tired of karma fortunes by now.
 
Posts: 6,205
Karma: 16228558
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
@sergio, Since I started with all this ebook stuff, I've found the need to learn a bit about both HTML/CSS and MSWord macros. The HTML/CSS was much easier to pick up. If you are expert in Word macros then learning HTML & CSS should be a walk in the park. Any time you spend learning HTML/CSS will not be wasted. The more you know the better your epubs will look.

Despite what others may say, with care you can get MSWord to produce excellent raw HTML. It's not so good (a.k.a. terrible) at creating great CSS but I find it best to strip out the generated CSS and insert a link to my own standard ebook CSS file.

To get started take a Word doc and SaveAs Webpage-filtered, then look at the HTML file in a good text editor (e.g. Notepad++). This should give you a good idea of how the styles you apply in Word translate to HTML tags and classes.

For instance, if you apply the MSWord built-in style 'HTML Preformatted' to a paragraph then the output HTML will contain that text wrapped in <pre>...</pre> tags. Style 'Normal(Web)' outputs <p>...</p> tags. There are a few 'magic' styles.

P.S. I admire your ambition at starting with a technical book full of code. I started small with a Project Gutenberg short story.
Old 10-10-2010, 01:49 PM   #10
sergio blum
@jackie_w, I despair of not making myself understood, although of course I think I am being utterly clear -- the problem here not being that English is not my mother tongue. For sure, if we could "speak face to face" you would save my sanity in no more than 45 seconds.
Look, I fear that in trying to re-explain what I need I will write more and throw in more noise/confusion. I read carefully what you wrote.
I quote you: "found the need to learn a bit about both HTML/CSS and MSWord macros. The HTML/CSS was much easier to pick up. If you are expert in Word macros then learning HTML & CSS should be a walk in the park. Any time you spend learning HTML/CSS will not be wasted. The more you know the better your epubs will look."
I understand and agree but -- there I go again not being understood -- if I try to learn about HTML/CSS now I will spend way toooo much time till I -- this is precisely where words fail me -- have a comprehensive view of the territory as a whole and NO NO

I cease and desist to go again this way

For now, I will only BEG (as in beginner): Please send one XHTML page:

stating that it will use a stylesheet.css

and containing two tagged (h1 and h2) sentences like so:

I am a sentence tagged with h1 .
I am a sentence tagged with h2.

just this, PLUS the corresponding stylesheet.css
with the definitions of h1 and h2.

(I was going to write: I dont know how to write this simplest, lest we veer again off course)

Then after I test my ideas and they do work OK, I will write back to you detailing what I did, sending you the Word macro, and only then, yes, we might compare notes and see if my proposed workflow can be enhanced.

Sorry. One's tone can be so easily misunderstood in writing. I thank you very much indeed for having posted your reply. You came nearest to fulfilling my wish list. I read attentively what you wrote more than once and indeed agree with each and every thing you said. Alas, at this point I JUST need this shortcut to build upon. Please reply ASAP. (In two days' time I must go back to work from my vacation...)

Blum
Old 10-10-2010, 02:14 PM   #11
jackie_w
@sergio, Here is some test HTML and a CSS file to go with it. I frequently use it to test things out on my readers. There's all sorts of tags in it. Nothing too complicated. I've used all items in real books at some time in the last 18 months.
Attached Files
File Type: zip sergio.zip (5.2 KB, 247 views)
Old 10-10-2010, 03:42 PM   #12
sergio blum
@jackie_w,

With trepidation, I unpacked what you sent me.
I cannot express how happy I am.

You have been preempted by only a few minutes. Don't be sad, though!

You see, I managed to get someone on the phone here in São Paulo city in Brazil and after much explaining (again) this person sent me an even more barebones template equivalent to what you sent. Thanks so much to you, it goes WITH saying. Your template, being more rich and varied, will enhance my understanding.

Below I am showing the page and the css I got here in SP and then WOW I am sure I have the workflow worked out. See if you agree.

the page:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" type="text/css" href="../Styles/styles.css" />

<title></title>
</head>

<body>
<h1 id="heading_id_2">I am tagged as h1. Hallelujah!</h1>

<h2 id="heading_id_3">I am tagged as h2. Hallelujah!</h2>
</body>
</html>

and the css:
body{
font-family: "Courier";
}

h1{
font-family: "Courier";
}

h2{
font-family: "Verdana";
}

...and now, here comes the sun.

Like I said, a Word macro can recognize text by font, size, color, whatever -- and wrap such text within the required tags.
In the case immediately above, the tags would be h1 and h2.

But there is more.
Previously (approx forty five minutes ago) I knew that even with this Word macro solution I would have to painstakingly (1) open each html file in Word (2) run the macro (3) paste the resulting tagged text in the Code Window of Sigil.
In my current case -- an actionscript cookbook -- this would involve about a hundred files.

BUT BUT I tested an idea.. and it works!

A- Open EPUB in the winrar program
B- Shift select the relevant html files -- leaving untouched jpgs and subdirectories
C- Drag (unrar) the files selected to an empty folder
D- Write a macro that in turn opens each file in the folder, identifies the fonts, applies the tags, and saves the file back.

E- Once this is completed it is just a matter of reincluding the modified set of html files in the EPUB.

That's all, isn't it? Doesn't it make sense?
Then just Add File in Calibre and download it to the Reader.
If you're not satisfied with font sizes, just fine tune them and do the process again.

Again, first convert CHM into EPUB in Calibre
Open in Sigil to fine tune fonts as desired
Then proceed with the Word batch workflow listed above
Finally AddBook in Calibre and download to reader device.
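
The unpack, modify and repack steps above can also be scripted, since an EPUB is a ZIP archive. What follows is a minimal sketch in Python rather than the Word macro described here. The function name rewrite_epub and the transform callback are hypothetical, and the sketch assumes the HTML files are UTF-8 encoded. One detail worth keeping in mind when rebuilding the archive: the EPUB container format expects the mimetype entry to come first and to be stored uncompressed.

```python
import zipfile

HTML_SUFFIXES = (".html", ".xhtml", ".htm")


def rewrite_epub(src_path, dst_path, transform):
    """Copy an EPUB, passing each HTML file's text through transform().

    transform is any function taking a str and returning a str.
    Images, stylesheets and other entries are copied unchanged.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        names = src.namelist()
        # Write mimetype first and without compression.
        if "mimetype" in names:
            dst.writestr("mimetype", src.read("mimetype"),
                         compress_type=zipfile.ZIP_STORED)
        for name in names:
            if name == "mimetype":
                continue
            data = src.read(name)
            if name.lower().endswith(HTML_SUFFIXES):
                # Assumption: the book's HTML files are UTF-8.
                data = transform(data.decode("utf-8")).encode("utf-8")
            dst.writestr(name, data)
```

A call such as rewrite_epub("book.epub", "book_fixed.epub", my_fix), with my_fix being whatever tag rewriting is wanted, would then cover all hundred files in one pass; the file names here are placeholders.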

I think this is it.
I am soooo tired. It's 04:20 PM here and I still have not eaten today.

I reckon I can write this macro in two hours (it will therefore take me about five).
I intend to do it tonight. As soon as I implement the whole workflow I will post it in this thread so we can come up with more ideas on this. If it may be useful to more users, great. That's the whole point of the community, isn't it?

As a last comment: Once the macro is written, its behaviour could be extended so as not to be limited to two tags. It could deal with any number of desired styles.

I must also thank you for letting me know about the Save HTML-filtered option in Word. I was in the dark as to which of the htm, html Save options would be appropriate for my needs.

I don't know how to express my gratitude for your patience and attention. I only hope once this workflow -- or variations around the theme -- is all done and tested I may also be useful to you.

Blum

(I am an English-Portuguese translator with a daily business newspaper in SP. That's why I am an "expert" with Word macros).
I expect I will be posting again tomorrow - Monday.
Old 10-10-2010, 05:09 PM   #13
jackie_w
I don't think I want to get in the way of you learning and achieving what you want in the way that best suits you.
However, I feel the need to say a couple more things in case it saves you work.
  1. I reiterate that MSWord will automatically wrap certain tags around your text if you apply the correct built-in style. e.g. if you apply style 'Heading 1' to your heading text then the output HTML will be <h1>My heading text</h1>

  2. Regarding the id attributes in your <h1> and <h2> tags. If you've put these in to help create a TOC at some point, you may like to know that it isn't actually necessary. If you tell calibre that your Chapters are defined by <h1> and/or <h2> then it will do all the required work to create a perfect TOC.

  3. You may find it easier to get the fonts you want on your reader by using more generic specification of font-family in your CSS. e.g.
    Code:
    h1 {font-family: monospace;} /* rather than Courier */
    h2 {font-family: sans-serif;} /* rather than Verdana */
    and, although I notice your example specified Courier as its default text, if you wanted it to be a serif font then
    Code:
    body {font-family: serif;}
    Your reader will then use its default monospace, sans-serif, serif fonts to display your text.

    If you want to customise your reader's default epub fonts you can change them, but that's a story for another day.

As for actionscript cookbooks - you're way beyond me there
Old 10-12-2010, 08:58 PM   #14
sergio blum
epub <pre> brain drain </pre>

@jackie_w
(Word 2003 would not open the XHTML pages because it found "&nbsp", and still wouldn't even after I removed the "&nbsp". A friend told me Word 2007 will.)


Regardless of the above, the solution now indeed seems straightforward enough:
(and can be later generalised for any number of files and CSS styles)

Convert CHM into EPUB in Calibre

In Sigil, fine tune fonts as desired

Open EPUB in the winrar program

Shift select the relevant html files

Drag (unrar) the files selected to an empty folder created in Windows Explorer

Change all html extensions to txt

Run Word macro that for each txt file (Word will open TXTs in any circumstances) will insert <br /> at the end of each sentence enclosed "between" <pre> and </pre> (to prevent individual paragraphs [programming lines of code in the book] from collapsing into sentences belonging to only one paragraph when substituting p for pre).

Macro will then replace all pre tags with p tags

In Sigil, tweak the CSS manually (a) removing any references to white-space so lines of code will reflow (b) fine tuning line-height as desired, and create the styles that will define h1 {font-family: monospace;} and
h2 {font-family: sans-serif;}.

Change txt extensions back into html

From Windows Explorer, drag the modified set of html files back into the EPUB.

Visually check results in Sigil.

In Calibre, just Add File and download it to Reader.

Read the book!
= = =

Crucially, I have tested manually the insertion of <br /> tags and replacement of pre with p.

= = =
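
The two macro steps in the list above (a <br /> at the end of each line inside <pre>, then pre replaced by p) can be sketched in a few lines of Python as an alternative to a Word macro. The function name pre_to_paragraph and the class name code are hypothetical, and this is only a regex-based approximation that assumes <pre> blocks are not nested. Once the text is in a <p>, leading indentation will normally collapse, which is the price of letting the lines reflow.

```python
import re

# A whole <pre ...>...</pre> block, including line breaks inside it.
_PRE_BLOCK = re.compile(r"<pre\b[^>]*>(.*?)</pre\s*>",
                        re.IGNORECASE | re.DOTALL)


def pre_to_paragraph(xhtml, css_class="code"):
    """Turn each <pre> block into a <p> with one <br /> per line break."""
    def convert(match):
        # Drop the blank line that often follows <pre> or precedes </pre>.
        lines = match.group(1).strip("\r\n").splitlines()
        return '<p class="%s">%s</p>' % (css_class, "<br />\n".join(lines))
    return _PRE_BLOCK.sub(convert, xhtml)
```

Giving the new paragraphs their own class means the stylesheet can set a separate font family and size for code text without touching the body text.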
Old 10-12-2010, 10:34 PM   #15
jackie_w
Quote:
Originally Posted by sergio blum
@jackie_w
(Word 2003 would not open XHTML pages because it found "&nbsp" and wouldn't even after removal of "&nbsp". A friend told me Word 2007 will.)
My HTML file was mostly created from a Word XP (ver 10, very old) .doc file saved as Webpage-filtered. I have had no problems with non-breaking spaces ( &nbsp; )

It's designed to be viewed using a browser or edited via a text editor. However, you should be able to use Word as the text editor if you rename the .htm to .txt - I just tried it and it worked for me.

If you want to read the .htm back into Word as an HTML file try renaming the test.css file beforehand. You will get a 'css file missing' error message but it should open OK, minus the styles. I think there's something in the CSS Word doesn't like. This doesn't altogether surprise me as I was able to control the styling much better with manually created CSS than with the Word auto-generated stuff. If I have time tomorrow I'll try and track down which CSS it's complaining about (maybe the DropCaps?).

If I have to do this myself, I delete the CSS file and do a File-Insert of the .htm file into a blank Word doc based on a special Word .dot template which has all the matching style names. Although it's easier to reopen the original .doc rather than the .htm

Anyway, I'm glad you're making progress.