Old 10-12-2015, 02:45 PM   #271
Leonatus
Wizard
Posts: 1,056
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Thank you, Toxaris! Now the settings area works fine!
Old 11-16-2015, 02:00 AM   #272
BetterRed
null operator (he/him)
Posts: 21,767
Karma: 30237628
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Actually changes to Rules Document are NOT being processed

I've noticed this a couple of times - sometimes when I make a change to a rules document the XML file doesn't get rebuilt the next time I use the rules document. Until now I resolved it by restarting Word, sometimes several times.

But that's not working now - the only thing of significance that's changed is that I upgraded to Windows 10 Version 1511 build 10586 - but I can't imagine that's germane.

Personally I would rather do something explicit to rebuild the xml file from the .doc when I change the doc file, rather than hoping it gets rebuilt when I use it, which may not be immediately after I change it.

The attached screenshot contains what I think is the relevant information. Note the dates on my rules document (2015-11-16 (today)) and the xml file (2015-11-12 (last Thursday))

I'll try editing my changes directly into the XML file, looks easy enough.

BR
Attached Thumbnails: Clipboard01.jpg (234.8 KB)

Last edited by BetterRed; 11-16-2015 at 02:05 AM.
Old 11-16-2015, 02:16 AM   #273
BetterRed
null operator (he/him)
Posts: 21,767
Karma: 30237628
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by BetterRed View Post
I'll try editing my changes directly into the XML file, looks easy enough.
Indeed it is easier, by a country mile what's more

BR
Old 11-16-2015, 05:09 AM   #274
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by BetterRed View Post
I've noticed this a couple of times - sometimes when I make a change to a rules document the XML file doesn't get rebuilt the next time I use the rules document. Until now I resolved it by restarting Word, sometimes several times.

But that's not working now - the only thing of significance that's changed is that I upgraded to Windows 10 Version 1511 build 10586 - but I can't imagine that's germane.

Personally I would rather do something explicit to rebuild the xml file from the .doc when I change the doc file, rather than hoping it gets rebuilt when I use it, which may not be immediately after I change it.

The attached screenshot contains what I think is the relevant information. Note the dates on my rules document (2015-11-16 (today)) and the xml file (2015-11-12 (last Thursday))

I'll try editing my changes directly into the XML file, looks easy enough.

BR
There are several procedures, as follows:
1. When Word starts, a hash of the SR document specified in the settings is calculated. If the hash differs from the stored hash (or there is no stored hash), the SR XML file is created. This is done in the background. While this procedure runs, a flag is set (this gives the message you saw earlier).
2. If you start the SR process, the same check is done as in 1, to ensure that no changes were made in the meantime.
3. If you change the SR document in the settings, it will do the same as 1. To be honest, without looking at the code I cannot tell whether this is done when the document is selected or after saving the settings. I think it is the former.
4. If the SR document is edited with the editor, it will do the same as 1 again, but not in the background. After editing, the SR document will be adapted and stored. The temporary XML file used for the editor becomes the new XML file (saves a lot of time...).

So, if the XML is not recreated, the hash has probably not changed according to the hash procedure in .NET. I chose a hash because it should be more effective than size and/or date.
Therefore I am a bit baffled that it doesn't work for you.
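
For readers following along, the check in procedure 1 can be sketched roughly as below. This is a minimal illustration in Python, not the add-in's actual .NET code: the function names, the use of SHA-256, and the idea of keeping the stored hash in a small text file are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_rebuild(rules_doc: Path, hash_store: Path) -> bool:
    """True when there is no stored hash or it differs from the current one."""
    if not hash_store.exists():
        return True
    return hash_store.read_text().strip() != file_hash(rules_doc)


def rebuild_if_changed(rules_doc: Path, hash_store: Path, convert) -> bool:
    """Call convert(rules_doc) and store the new hash only if the document changed.

    `convert` is a hypothetical callback standing in for the document-to-XML step.
    """
    if not needs_rebuild(rules_doc, hash_store):
        return False
    convert(rules_doc)
    hash_store.write_text(file_hash(rules_doc))
    return True
```

Because the hash is taken over the raw bytes of the saved file, any change that actually reaches the disk should trigger a rebuild in a scheme like this. A rebuild that never fires suggests the file being hashed, or the stored hash, is not the one expected.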

Quote:
Originally Posted by BetterRed View Post
Indeed it is easier, by a country mile what's more

BR
Yes, but the chances are that they will be overwritten when a change in hash is detected in the Word file.

What I will do is build in an option to choose between an SR document and an SR XML file.
Old 11-16-2015, 06:11 AM   #275
BetterRed
null operator (he/him)
Posts: 21,767
Karma: 30237628
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Toxaris - I've always found it much easier to edit the Word document directly, rather than via the Settings forms. After changing the document, I see a vertical maze-looking progress thingy when I run S&R, as it's rebuilding the XML file.

The last thing I did was to relocate some lines in the tables. If the hash algorithm doesn't take the relative position of text elements into account, that would explain why it's 'not working' - (2 + 3 + 4) = 9 and (3 + 4 + 2) = 9. But they are not the same 9
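
To make the "not the same 9" point concrete, here is a small Python sketch comparing an order-blind additive checksum with a cryptographic hash over the same rows in a different order. The row contents and function names are made up for illustration; nothing here is taken from the add-in.

```python
import hashlib

rows_before = [b"rule A", b"rule B", b"rule C"]
rows_after = [b"rule B", b"rule C", b"rule A"]  # same rows, relocated


def additive_checksum(rows):
    """Order-blind: sums every byte value, like (2 + 3 + 4) == (3 + 4 + 2)."""
    return sum(sum(row) for row in rows)


def sha256_digest(rows):
    """Order-sensitive: hashes the rows as one continuous byte stream."""
    return hashlib.sha256(b"\n".join(rows)).hexdigest()


same_sum = additive_checksum(rows_before) == additive_checksum(rows_after)
same_sha = sha256_digest(rows_before) == sha256_digest(rows_after)
print(same_sum, same_sha)  # True False
```

So a plain sum would miss relocated lines, but a standard hash such as SHA-256 (or MD5, SHA-1) is sensitive to position and would not.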

In my opinion it would be better to give the user a button to click, labelled something like "Build/Compile/Generate Run Time Rules Database Now".

But all of that said I think I'd prefer to maintain the XML file directly - whilst it may appear to be more verbose than a table in a Word document or a Word Form, assuming you have a text editor that has a snippet library then it's easy to maintain - I use Notepad++ and the SnippetsPlus PI.

BR

Last edited by BetterRed; 11-16-2015 at 06:17 AM.
Old 11-16-2015, 08:47 AM   #276
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Hmm, I can imagine that in that case the hash could be the same. Still strange, since a docx is a zipfile. One would assume that the hash would change.
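
As a rough illustration of the zip point, the Python sketch below builds two tiny archives whose only member is `word/document.xml` (the part where a docx keeps its main text) with the same rows in a different order, then hashes both the whole archive and the member. The XML content and helper names are invented for the example, and a fixed timestamp is used so that only the content varies.

```python
import hashlib
import io
import zipfile

MEMBER = "word/document.xml"


def make_archive(document_xml: bytes) -> bytes:
    """Build a one-member zip with a fixed timestamp so only content varies."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        info = zipfile.ZipInfo(MEMBER, date_time=(2015, 11, 16, 0, 0, 0))
        info.compress_type = zipfile.ZIP_DEFLATED
        archive.writestr(info, document_xml)
    return buffer.getvalue()


def archive_hash(archive_bytes: bytes) -> str:
    """Hash the zip file as stored on disk."""
    return hashlib.sha256(archive_bytes).hexdigest()


def member_hash(archive_bytes: bytes, name: str = MEMBER) -> str:
    """Hash the uncompressed bytes of one member inside the zip."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return hashlib.sha256(archive.read(name)).hexdigest()


original = make_archive(b"<row>A</row><row>B</row>")
reordered = make_archive(b"<row>B</row><row>A</row>")
```

In this sketch both `archive_hash` and `member_hash` differ between `original` and `reordered`, which matches the expectation that moving rows in a saved docx should change its hash.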

What I will do is a bit more. In the settings screen there will be an option to indicate whether you want to work with the XML or the Word document. There will also be a button for converting the document to XML. If the XML is to be used, there will be no more conversions in the background. Also, the editor will use either the Word document or the XML.
It will be some work though to do it right.
Old 11-16-2015, 03:19 PM   #277
BetterRed
null operator (he/him)
Posts: 21,767
Karma: 30237628
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by BetterRed View Post
Indeed it is easier, by a country mile what's more
Quote:
Originally Posted by Toxaris View Post
Yes, but the changes are that they will be overwritten when a change in hash is detected in the Word file.
For the immediate future I won't be changing the Word file - I've changed its disposition to Read Only to make sure of that!

For me at least there's no urgency on this, I'm happy editing the XML directly

BR

I had an instance of software relying on a file checksum changing to indicate a 'new version' was available (MD5 or SHA-1, I forget which).

Unfortunately a 'new version' had the same checksum as the 'old version', even though I knew for a fact they were not the same. This was borne out by the fact that the file date, and size had changed, and Beyond Compare reported differences.

Interestingly the CRC32 checksums on the two files were different.
Old 12-04-2015, 09:23 AM   #278
jangell2
Addict
Posts: 296
Karma: 1618384
Join Date: Aug 2010
Device: Kindle
Question

Quote:
Originally Posted by Toxaris View Post
Welcome to my Word add-in. This add-in has all kinds of procedures to help with creating e-Books.

Amongst others, the following procedures are available:
  • Preparation/Prework
  • Search & Replace
  • Check dialogues
  • Check accented Words
  • Generate clean HTML
  • Create ePUB

The preparation/prework procedure is the first procedure, which roughly cleans up the output from ABBYY (or another OCR program) to prepare for the other work. The second is a Search and Replace procedure. That procedure follows a Word document with tables (example available on the site) that contains a large number of S&R rules. Your own rules can be added to that. The Dutch version has many more standard rules available than the English, but if you send me additions (not via this forum please, to keep the thread clean) I will add them to the base document. Some S&R rules are language independent.
The third procedure checks whether all dialogues are correct. By correct I mean having an opening and a closing quote. I check for three kinds of quotes: single, double and guillemets (« »).
The fourth procedure is, in my opinion, useful for languages that do not use accented words frequently. It checks all accents to verify that they are correct. Words can be added to a temporary list to skip them for the remainder of the document.
The fifth procedure is basically my HTML export macro (with some changes). The sixth procedure is taking the HTML export and creates a base ePUB ready to be loaded into Sigil or the Calibre Editor. You can make additions like adding a title, author, description and cover.
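
As an aside on the dialogue check, the idea of "an opening and a closing quote" can be sketched per paragraph as below. This is only an illustrative Python approximation, assuming typographic (curly) quotes and a simple count comparison; it is not how the add-in is implemented, and the names are hypothetical.

```python
# Opening/closing characters for the three quote kinds mentioned above.
QUOTE_PAIRS = {
    "single": ("‘", "’"),
    "double": ("“", "”"),
    "guillemets": ("«", "»"),
}


def unbalanced_quotes(paragraph: str) -> list[str]:
    """Return the quote kinds whose opening and closing counts differ."""
    problems = []
    for kind, (opening, closing) in QUOTE_PAIRS.items():
        if paragraph.count(opening) != paragraph.count(closing):
            problems.append(kind)
    return problems
```

One known weakness of a count-based check: the closing single quote is the same character as a typographic apostrophe, so a word like "don’t" would be flagged as an unbalanced single quote and needs a human decision.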

There are many more features and tools. For more information see my site or the online help.

The add-in, manual and supporting document can be downloaded here.

Enjoy.
I clicked on the link in your Word add-in and while the headlines on the page are in English, the rest of the page is not. Is there an English only page?

Does your Word add-in work on an iMac? I can run windows but prefer OS.
Old 12-04-2015, 09:31 AM   #279
jangell2
Addict
Posts: 296
Karma: 1618384
Join Date: Aug 2010
Device: Kindle
Ok, found the English page.
Old 12-04-2015, 01:01 PM   #280
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by jangell2 View Post
Does your Word add-in work on an iMac? I can run windows but prefer OS.
Nope, it does not work on an iMac. There is nothing I can do about that, since the required libraries are not released by Microsoft. Also, even if they were released, it would be best effort only since I don't own an iMac.
Old 12-04-2015, 03:56 PM   #281
jangell2
Addict
Posts: 296
Karma: 1618384
Join Date: Aug 2010
Device: Kindle
Thanks for the reply. I understand how hard it is to write software, much less software for different platforms. I'll have to check to see if I've got Word on my PC side of the Mac.
Old 12-05-2015, 04:30 PM   #282
jangell2
Addict
Posts: 296
Karma: 1618384
Join Date: Aug 2010
Device: Kindle
I'm going to send off a pocket book to one of the scan services and have it returned as a Word doc. Can anyone give me an idea of the level of effort required to proof it? What are the typical errors that will be found? Do I have to read the whole book? Will the Word add-in catch most of the problems?
Old 12-05-2015, 05:31 PM   #283
Hitch
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by jangell2 View Post
I'm going to send off a pocket book to one of the scan services and have it returned as a Word doc. Can anyone give me an idea of the level of effort required to proof it? What are the typical errors that will be found? Do I have to read the whole book? Will the Word add-in catch most of the problems?
I'll let Tox address what the Word add-in will catch, but as a commercial formatter of eBooks, I can tell you that the effort to proof a scan--particularly if you're using one of the cheaper scanning services--is fairly significant. In some ways, it's far harder than editing/proofing the book the first time around, because you have to proof the Word file against the PDF, to ensure that you don't have scanning errata (typical is, for example, "hat" for "fiat," and other errors like that, particularly surrounding ligatures).

You'll have a lot of text-sizing errors, which are represented in the output HTML as spans. Lots, and lots, and lots of spans. (DUH, corrected, thank you, Peter!).

Then you'll have the fairly endless broken paragraph and page-ending errors; those are ubiquitous. The Word add-in does a pretty good job at finding possible broken paragraphs, particularly those broken mid-sentence.

What it can't do--and no automated system can--is find those paragraphs that have the end of one sentence at or near the right-hand-margin on one page, and continue, flush-left, with a capitalized first letter at the top of the next page. Only human reading and decision-making can handle those.
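
To illustrate the difference between the two cases, here is a small Python sketch of the kind of heuristic an automated check might use. It is an assumption-laden example, not the add-in's logic: the terminator list and the function name are made up, and it flags a paragraph only when it ends without closing punctuation and the next one starts with a lowercase letter.

```python
# Characters that plausibly end a complete paragraph (an assumed list).
TERMINATORS = (".", "!", "?", "…", ":", "”", "’", "»", '"')


def possible_broken_paragraphs(paragraphs: list[str]) -> list[int]:
    """Return indexes of paragraphs that look cut off mid-sentence."""
    suspects = []
    for index in range(len(paragraphs) - 1):
        current = paragraphs[index].rstrip()
        following = paragraphs[index + 1].lstrip()
        if not current or not following:
            continue  # skip empty paragraphs
        ends_open = not current.endswith(TERMINATORS)
        starts_lower = following[0].islower()
        if ends_open and starts_lower:
            suspects.append(index)
    return suspects


scan = ["He walked to the", "door and opened it.", "The rain had stopped.", "Nobody noticed."]
print(possible_broken_paragraphs(scan))  # [0]
```

The break after "He walked to the" is caught, but the pair "The rain had stopped." / "Nobody noticed." looks identical whether it is one paragraph split across a page or two real paragraphs, so nothing is flagged and only a human reader can decide.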

This--this very thing, proofing post-scan Word files--is the single biggest obstacle that we have with authors/publishers doing their backlists into eBooks. NONE of them want to do this step. ALL of them think that either a) this should be the scanning company's job, or b) this should be the formatting company's job. None of them are willing to accept that as publishers, it's THEIR job.

The single biggest glitch is that they have zero interest in learning the underpinnings of Word. Most of them don't know how to see pilcrows, much less figure out page breaks versus section breaks, etc. The scanning companies don't want to proof for that type of errata, (broken paragraphs and the like) and we certainly don't; we're not editors or proofreaders.

(Sorry, I digress). Anyway, that's what you should expect. Tox, who has done thousands of scans, will likely have more feedback, but that's at the top of what I see. Broken paragraphs; section/page breaks that have to be removed; paragraphs that may or may not be breaking across pages; typical scan OCR errors; and font/text-sizing errors.

Oh, yes: you absolutely have to proofread the whole thing.

Offered FWIW.

Hitch

Last edited by Hitch; 12-05-2015 at 06:43 PM. Reason: Idiotic scan-span typos.
Old 12-05-2015, 06:02 PM   #284
PeterT
Grand Sorcerer
Posts: 13,545
Karma: 79436716
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
Hitch.... SPANS not SCANS
Old 12-05-2015, 06:42 PM   #285
Hitch
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by PeterT View Post
Hitch.... SPANS not SCANS
THANKS, Peter!

Will go fix that stupid typo right now.

Hitch