09-12-2018, 06:49 PM | #646 | |
null operator (he/him)
Posts: 20,620
Karma: 26960534
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
I'm still puzzled as to how I had it working for so long without tags. BR |
|
09-15-2018, 02:37 PM | #647 |
BioReader
Posts: 292
Karma: 42568
Join Date: Apr 2009
Location: Germany
Device: Various
|
Hi
After a longer pause from epub editing (2 months) I started some new projects last Monday. Most of them are PDF ebooks, which I load directly into Word (MS Office 365 V1808) for conversion. After some basic corrections I usually start Postprocess OCR. Currently this module gets as far as step 3/15 (tag markup) and then crashes Word, repeatedly. I tried different PDF files and also plain docx files, with the same result.
Nothing has changed during the last 2 months except for some Word updates; I did not touch the ePubTools. All other functions (e.g. search/replace or epub import) are working perfectly!
Is anyone else here seeing such problems?
Klaus
Last edited by kbaerwald; 09-15-2018 at 02:46 PM. Reason: Attached snap shot |
09-16-2018, 03:37 AM | #648 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Well, yesterday I did a book myself and had issues in that department as well. I also got a report from someone else. I am not quite sure why yet, as this does not happen on older versions of Word. Microsoft did make a change, but it is not clear what it is, and the behavior is not consistent either. I am looking into it though.
|
09-16-2018, 05:42 AM | #649 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
I can confirm that something is broken in Word's own search and replace engine, at least in the latest version of Word (Office 365). I can reproduce the observed behavior manually. Unfortunately, that means I cannot fix it myself. I did think of a way to circumvent it, but it will have a performance impact and I need time to program it as well.
I will post updates every now and then. |
09-16-2018, 06:20 AM | #650 |
BioReader
Posts: 292
Karma: 42568
Join Date: Apr 2009
Location: Germany
Device: Various
|
Can you specify what is broken in the search/replace engine? Does it appear directly in the user interface or just in the programmatic interface that you use?
If it is not a permanent change by MS, it would be better to wait for the next update. It would be too much of a burden for you to build an interim fix for that problem. Anyway - thank you for your help! Klaus |
09-16-2018, 06:59 AM | #651 | |
null operator (he/him)
Posts: 20,620
Karma: 26960534
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
BR |
|
09-16-2018, 07:16 AM | #652 | |
null operator (he/him)
Posts: 20,620
Karma: 26960534
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
I've not encountered any problems in Word 2016, using it manually, or via EPubTools Search and Replace. BR |
|
09-16-2018, 07:46 AM | #653 |
BioReader
Posts: 292
Karma: 42568
Join Date: Apr 2009
Location: Germany
Device: Various
|
BR
most of the time I use Postprocess OCR for its very good scene detection feature (variable 'detection distance'). The other features are more of a bonus for me, though they may be important for others. If Postprocess OCR remains unusable I will have to look for other ways. Unfortunately, the standard Word Search/Replace UI is not very flexible regarding regex. Klaus |
09-16-2018, 09:27 AM | #654 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Sure, I can explain more. Take, for example, the following search:
Find: (empty)
Find formatting: italic
Replace: [dummy]^&[/dummy]
Replace formatting: non italic
If you now do a S&R, it should just place the dummy tags before and after the italic part and also remove the italic formatting. That is how it should be and always was.
However, now the tags are placed at the correct position, but not all of the replaced text is set to non-italic (the end tag, for one, stays italic!). That causes the same part of the block (including the tag) to be selected again, and again, and again. That is not how it should work, and it was introduced recently. I do expect it to be solved soon, as this is clearly an issue in Word itself: I can reproduce it manually. |
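For readers who want to see why a partly cleared replacement gets picked up again, here is a toy Python model of the search described above. It is only a sketch and does not call Word: the character-level document model, the function names (`make_text`, `find_italic`, `replace_all`) and the `cleared` parameter are all assumptions made for illustration, and the exact amount of text that Word leaves italic is not known from this thread. The `max_passes` cap exists only so the example stops.

```python
from typing import List, Optional, Tuple

Char = Tuple[str, bool]  # (character, is_italic)


def make_text(text: str, italic: bool) -> List[Char]:
    """Build a list of characters that all share one italic flag."""
    return [(c, italic) for c in text]


def find_italic(doc: List[Char]) -> Optional[Tuple[int, int]]:
    """Return (begin, end) of the first italic span, or None if there is none."""
    begin = 0
    while begin < len(doc) and not doc[begin][1]:
        begin += 1
    if begin == len(doc):
        return None
    end = begin
    while end < len(doc) and doc[end][1]:
        end += 1
    return begin, end


def replace_all(doc: List[Char], cleared: Optional[int] = None,
                max_passes: int = 10) -> Tuple[str, int]:
    """Wrap italic spans in [dummy] tags, mimicking Find: (italic) / Replace: [dummy]^&[/dummy].

    cleared=None models the expected behaviour: the whole replacement is non-italic.
    cleared=n is a hypothetical model of the fault: only the first n characters
    of the replacement lose their italic formatting.
    """
    passes = 0
    while passes < max_passes:
        span = find_italic(doc)
        if span is None:
            break
        begin, end = span
        matched = "".join(c for c, _ in doc[begin:end])
        replacement = "[dummy]" + matched + "[/dummy]"
        n = len(replacement) if cleared is None else min(cleared, len(replacement))
        # Characters at index >= n keep the italic flag (the modelled fault).
        new_chars = [(c, i >= n) for i, c in enumerate(replacement)]
        doc = doc[:begin] + new_chars + doc[end:]
        passes += 1
    return "".join(c for c, _ in doc), passes


doc = make_text("A ", False) + make_text("Schönheit", True) + make_text(" B", False)
print(replace_all(doc))                            # expected behaviour: one pass
print(replace_all(doc, cleared=10, max_passes=3))  # modelled fault: tags pile up
```

In the faulty case every pass leaves an italic tail behind, so the next search matches that tail (end tag included) and wraps it again. That is the repeated selection described above.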
09-16-2018, 04:05 PM | #655 | |
BioReader
Posts: 292
Karma: 42568
Join Date: Apr 2009
Location: Germany
Device: Various
|
Quote:
True, I can confirm this (see the snapshot below: it replaced only the "Sch" of Schönheit). Funny. Well, we will have to wait until MS takes notice of the problem and solves it. Thanks for the demonstration. Klaus |
|
09-16-2018, 09:29 PM | #656 |
null operator (he/him)
Posts: 20,620
Karma: 26960534
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Yep - me too.
IMO an empty Find shouldn't do anything; there should be an Any String code (^@ or something).
Curious: does this relate directly to Klaus' OCR PP problem?
BR |
09-17-2018, 02:33 PM | #657 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Yes, an empty Find sounds strange, but it works fine. Basically you are saying "find anything". There is no Any String command, unfortunately.
It is absolutely related to Klaus' OCR PP problem. That step uses exactly this search to tag the formatting. As not all of the text is de-formatted, it is selected and tagged again, and again, and so on. So it appears to hang. If you wait long enough it will end, but you will have a big mess with half tags all over. I had the same problem myself, as did someone else I know.
I am not too optimistic that MS will solve this soon. They have the most atrocious bug reporting system I know, and it seems you actually need votes to get your bug handled sooner. Hence, I am thinking of an alternative method. I have designed one, but it will require some time to program and test. I hope to start tomorrow evening. |
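As a rough illustration of why re-searching is the weak point, the Python sketch below tags italic text in a single forward pass over formatting runs, so nothing is ever searched a second time. This is not the alternative method mentioned above, which the thread does not describe, and it does not use Word's API; the run representation and the name `tag_italic_runs` are assumptions made for the example.

```python
from itertools import groupby
from typing import List, Tuple

Run = Tuple[str, bool]  # (text, is_italic)


def tag_italic_runs(runs: List[Run], open_tag: str = "[dummy]",
                    close_tag: str = "[/dummy]") -> str:
    """Emit the text once, wrapping each stretch of italic runs in tags.

    Adjacent runs with the same italic flag are merged first, so one
    italic phrase split over several runs gets a single pair of tags.
    """
    pieces = []
    for italic, group in groupby(runs, key=lambda run: run[1]):
        text = "".join(t for t, _ in group)
        pieces.append(f"{open_tag}{text}{close_tag}" if italic else text)
    return "".join(pieces)


runs = [("A ", False), ("Schön", True), ("heit", True), (" B", False)]
print(tag_italic_runs(runs))
```

Because each run is visited exactly once, the result does not depend on whether the formatting of already tagged text was cleared.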
09-17-2018, 06:56 PM | #658 |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
Reminds me of a time in history when people thought there was no need for a zero. It took centuries for people to move from Roman numerals to trusting a place-based number system.
|
09-17-2018, 07:56 PM | #659 |
null operator (he/him)
Posts: 20,620
Karma: 26960534
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@Toxaris - Ouch!
Unsurprisingly it affects Tag Formatting too, which I use occasionally. What else? BR |
09-18-2018, 12:46 PM | #660 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
This functionality is used in Post OCR, Tag Formatting and HTML Conversion... So, it is quite essential unfortunately. That is why I am building an alternative.
|
|