#1 |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
Edit Book: How to remove formattings, here in text?
I converted a PDF file to EPUB, similar to the case here.
formatted text (line heights)
How can I remove this formatting (or, ideally, any formatting) in Edit Book, in this case the differing line heights? This is quite irritating! I ran Beautify files and Fix HTML; Check Book reports OK. The yellow markings are EN DASHes (all of the same sort); evidently they have nothing to do with the matter.
Last edited by chaot; 01-25-2017 at 04:51 AM. |
#2 |
Age improves with wine.
Posts: 571
Karma: 95229
Join Date: Nov 2014
Device: Kindle Oasis, Kobo Libra II
|
This can be caused by invisible characters that can't be displayed in the editor. I've seen this happen with strange quote characters that just don't show up in the editor (some bizarre Unicode symbols that not all fonts can represent). One way to see if that's the problem is to copy the text into an editor that doesn't understand Unicode (e.g. Textpad on Windows), and any non-8-bit characters will show up as "?" or some such. You can also use a regex-function to convert characters above U+00FF to entities:
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Rebuild the matched text, replacing every character above U+00FF
    # with its decimal numeric entity so that it becomes visible.
    result = ''
    for c in match.group():
        if ord(c) > 255:
            result += '&#%d;' % ord(c)
        else:
            result += c
    return result

Note that the editor is showing you the raw HTML, so it's not an HTML formatting problem -- it's something that's giving the editor indigestion.
Last edited by Phssthpok; 01-25-2017 at 01:30 PM. |
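To see what this function does outside the editor, here is a minimal sketch that mimics how a regex-function is called once per match. The sample string, the dummy arguments and the `convert` wrapper around `re.sub` are assumptions for illustration only; inside Edit Book, calibre supplies the real arguments.

```python
import re


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Same logic as the function above: keep characters up to U+00FF,
    # turn everything else into a decimal numeric entity.
    result = ''
    for c in match.group():
        if ord(c) > 255:
            result += '&#%d;' % ord(c)
        else:
            result += c
    return result


def convert(text):
    # Hypothetical stand-in for "Replace all" with "." in the Find box:
    # call replace() once per matched character. The extra arguments are
    # dummies, since the function only looks at the match itself.
    return re.sub(r'.', lambda m: replace(m, 0, 'sample.html', None, None, None, None), text)


sample = 'Seite 12\u201313 caf\u00e9'   # contains an EN DASH (U+2013)
converted = convert(sample)
print(converted)   # Seite 12&#8211;13 café
```

The "é" survives unchanged because U+00E9 is below the 255 cut-off; only the EN DASH becomes an entity.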
#3 | |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
Quote:
#4 |
Age improves with wine.
Posts: 571
Karma: 95229
Join Date: Nov 2014
Device: Kindle Oasis, Kobo Libra II
|
Hmm. I'm not sure I can say it any more clearly, but here goes in baby steps:
1) Select the mis-spaced text and press Ctrl-Shift-M (or Search->Mark Selected Text if you don't like using the keyboard). The search range dropdown (bottom line of the search panel) should change to say "Marked text" when you do this.
2) In the "Mode" dropdown to the left of the search range, select "regex function".
3) Press the "create/edit" button next to the "function" dropdown.
4) Paste in the code I gave you.
5) Put "." (without the quotes) in the "Find" box at the top.
6) Press the button marked "Replace all".
7) Change the mode from "regex function" to Regex.
8) Enter "&#\d+;" in the Find box (without the quotes).
9) Press "Find" and look at what it's selected. Does it correspond to a visible character in the preview panel, or what you remember of the original text? If not, delete it and see what happens. Otherwise, replace it with whatever character it should be (whatever it was before you started). You can of course look up the numeric entity code to find out what character it is (just ask Google).
10) Repeat (9) until you get to the end of the problematic text.
Last edited by Phssthpok; 01-26-2017 at 09:02 AM. |
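Steps 8 to 10 can also be approximated outside the editor. The sketch below assumes the text has already been run through the entity conversion; it lists every decimal entity with its code point and Unicode name so that the odd ones stand out. The helper name `list_entities` and the sample string are made up for illustration.

```python
import re
import unicodedata


def list_entities(html):
    """Return (entity, 'U+XXXX', name) for every decimal entity in html."""
    found = []
    for m in re.finditer(r'&#(\d+);', html):
        cp = int(m.group(1))
        # Private-use and unassigned code points have no name, so fall
        # back to a marker instead of raising ValueError.
        name = unicodedata.name(chr(cp), '<no name>')
        found.append((m.group(0), 'U+%04X' % cp, name))
    return found


sample = '<p>12&#8211;13 A&#63342;</p>'
for entity, code_point, name in list_entities(sample):
    print(entity, code_point, name)
```

An entity that comes back without a name is a good candidate for the "delete it and see what happens" test in step 9.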
#5 |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
I’ve got it! Thank you very much!
To be honest, I would like to mention that the earlier guidance was somewhat insufficient for someone who had never dealt with these functions. This is, of course, no reproach to you but to the circumstances; my own still insufficient knowledge is worth mentioning here. I am starting to learn what's what. This may be somewhat long-winded, but I am interested and you seem to be as well (others do not have to read it).
A S. (viewing the typed #numbers is impossible)
Textsites configured! The strange character at the middle left does not appear as a code at all.
A → Anhang
S. → page number wxyz
S. → page number wxyz+1
As for the report, an evaluation will follow (possibly). Would you like a piece of the text for your own tests? I will try to arrange that.
EDIT: Not all the miracles have been reported yet. I have to stop now.
Last edited by chaot; 01-26-2017 at 02:42 PM. Reason: add: EDIT |
#6 |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
Correction of errors and clarifications!
&#8211; (ampersand, then #8211;) is "–", the Unicode EN DASH U+2013, written as a decimal HTML entity (not to be confused with EM DASH). There are lots (dozens) of encodings and notations for something as simple(?) as an EN DASH.
Private word! Spoiler:
S. | U+F644 (0xF644) | U+F64B (0xF64B) | U+F647 (0xF647) | U+F64B (0xF64B) |
S. | U+F644 (0xF644) | U+F64B (0xF64B) | U+F647 (0xF647) | U+F64C (0xF64C) |
A | U+F76E (0xF76E) | U+F768 (0xF768) | U+F761 (0xF761) | U+F76E (0xF76E) | U+F767 (0xF767) |
The line height of some of these characters is more than twice that of a plain character. This can be tested: see the mouse-click line (the text cursor). This is apparently due in part to the characters being displayed as boxes. It is also noteworthy that the mouse-click line there is bold and even longer (poor English, I know).
different line heights (via copy and paste from the gedit text editor to here)
I have not yet received an explanation of why the line heights change (suddenly, without a recognizable reason) in the paragraph shown above in #1, and in many others. However, I can say that since yesterday evening all line heights have gone back to 'normal'. Why? Don't ask me. If I find out, I will report.
Last edited by chaot; 01-27-2017 at 12:23 PM. Reason: add brace: reason), |
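One way to read the listing above: all of these code points fall in Unicode's Private Use Area, which may be why the editor's font cannot display them. The sketch below checks that, and additionally tests an assumption that is not confirmed anywhere in the thread, namely that the U+F7xx values carry the intended ASCII letter in their low byte.

```python
import unicodedata

# Code points copied from the listing above.
S_LINE_1 = [0xF644, 0xF64B, 0xF647, 0xF64B]
S_LINE_2 = [0xF644, 0xF64B, 0xF647, 0xF64C]
A_LINE = [0xF76E, 0xF768, 0xF761, 0xF76E, 0xF767]


def is_private_use(cp):
    # General category 'Co' marks private-use code points.
    return unicodedata.category(chr(cp)) == 'Co'


def low_byte_letters(code_points):
    # Assumption: the font stores small caps at 0xF700 + ASCII code.
    return ''.join(chr(cp - 0xF700) for cp in code_points)


all_private = all(is_private_use(cp) for cp in S_LINE_1 + S_LINE_2 + A_LINE)
print(all_private)                 # True
print(low_byte_letters(A_LINE))    # nhang
```

Under that assumption the "A" line spells out "nhang", which fits the "A → Anhang" reading from post #5. The two "S." lines are not decoded here because the thread gives no hint of their mapping beyond "page number".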
![]() |
![]() |
![]() |
#7 |
null operator (he/him)
Posts: 21,660
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
|
#8 |
Grand Sorcerer
Posts: 6,609
Karma: 12595249
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
|
Last edited by Terisa de morgan; 01-27-2017 at 06:32 AM. |
#9 | |
Grand Sorcerer
Posts: 6,609
Karma: 12595249
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
|
Quote:
ndash / mdash / whatever are not concepts in themselves. The concept is white space, which everybody understands. Yes, an oil maker won't distinguish between an mdash and an ndash, but I challenge you to know all the kinds of oils or olives, and to understand the references and types of oil... Sorry, software is no different from anything else. The main problem is that a lot of people think software should be, I don't know, something you don't have to learn, clear and crystalline for everybody to understand. If you find a subject that is the way you expect software to be, please tell me. |
|
#10 |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
Spoiler:
As for the question of 'how I would prefer software to be', I will think about it.
Last edited by chaot; 02-03-2017 at 10:26 AM. Reason: in the olive→inside the olive |
#11 | |
Age improves with wine.
Posts: 571
Karma: 95229
Join Date: Nov 2014
Device: Kindle Oasis, Kobo Libra II
|
Fascinating though the discussion about olive-picking may be, which I know nothing about, here is a bit more info on a topic which I do know something about...
Quote:
Find: 
Replace: <span class="small-cap">n</span>
Then hit "Replace all". Now repeat this process for all the other letters which appear like this until all the problematic entities have been replaced by the corresponding letters wrapped in <span class="small-cap">...</span>. Then a couple more steps:
(1) merge adjacent small-cap spans:
Find: <span class="small-cap">(.*?)</span><span class="small-cap">
Replace: <span class="small-cap">\1
Do "Replace all" several times until the number of replacements shown in the resulting message box is 0.
(2) add a rule for "small-cap" in your CSS:
.small-cap { font-variant: small-caps; }
Last edited by Phssthpok; 02-03-2017 at 05:12 AM. |
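The repeated "Replace all" in step (1) can be sketched as a loop. This is only an illustration of the same find/replace pair using Python's `re` module on a made-up sample string; the function name `merge_small_caps` is hypothetical and not part of calibre.

```python
import re

FIND = r'<span class="small-cap">(.*?)</span><span class="small-cap">'
REPLACE = r'<span class="small-cap">\1'


def merge_small_caps(html):
    """Apply the find/replace pair until a pass makes no replacements."""
    counts = []
    while True:
        html, n = re.subn(FIND, REPLACE, html)
        counts.append(n)
        if n == 0:
            return html, counts


sample = ('A<span class="small-cap">n</span><span class="small-cap">h</span>'
          '<span class="small-cap">a</span><span class="small-cap">n</span>'
          '<span class="small-cap">g</span>')
merged, counts = merge_small_caps(sample)
print(merged)   # A<span class="small-cap">nhang</span>
print(counts)   # [2, 1, 1, 0]
```

Each pass can only join non-overlapping pairs of spans, which is why several passes are needed before the count drops to 0.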
|
#12 |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
@Phssthpok: What do you think!? I am a hard-working man ...
Then later I thought I would correct this, so to speak, self-produced error manually. This is, of course, unprofessional (I hate unprofessionalism). In addition, I might overlook some of the former small caps when converting them. Luckily, I make copies of the originals before I touch new books for editing, so I am able to work through your suggestions now: I merged all File Browser entries (3.9 MB), then followed the instructions in #4 (I didn't have to create the function again; it is stored under Phssthpok). You are now linked with "problematic Unicode".
Replace all: 3,900,000 occurrences! So I did not need to do steps 8) and 9) of the instructions: the book is full of that stuff.
New problem: copying and pasting your example  doesn't work (Error: Not found). I had to find such a character in the book and copy and paste that one into the Find box. Here it is: - apparently the same! This made me think of @Terisa de morgan (#9): it cannot be ... "clear and crystalline for everybody to understand".
Find: <span class="small-cap">(.*?)</span><span class="small-cap">
Replace: <span class="small-cap">\1
... must be
Find: <span class="small-cap">(.*?)</span>
Replace: <span class="small-cap">\1</pan>
Your help was tremendously valuable. I wouldn't have managed that in years, and especially not so professionally.
Last edited by chaot; 02-03-2017 at 01:05 PM. |
#13 | ||
Age improves with wine.
Posts: 571
Karma: 95229
Join Date: Nov 2014
Device: Kindle Oasis, Kobo Libra II
|
Quote:
Quote:
Code:
<span class="small-cap">x</span><span class="small-cap">y</span><span class="small-cap">z</span>
Code:
<span class="small-cap">xyz</span>
Code:
<span class="small-cap">x</pan><span class="small-cap">y</pan><span class="small-cap">z</pan>
Last edited by Phssthpok; 02-03-2017 at 02:26 PM. |
||
#14 |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
To give the code the honor it deserves, I followed 'all' your instructions and can report: you are 100% right.
I have been busy with these books lately. They are scans from two sources, so the scan errors also differ, and they are almost innumerable. Search and replace is only a small help; a semi-manual procedure is almost always necessary (Find-Replace-Find-Find-Replace-Find, etc.). This is strenuous work if one wants to work quickly and without flaws, and in the end proofreading is still necessary. I like dealing with texts, so what would be a nonsensical and exhausting affair for others is a pleasure for me. But my temperament does not allow me to do a job twice, so I cannot start again. Fortunately, the small caps are limited to definable, known places: selected headings. I now filter these out and change them to small caps. One small drop of bitterness remains (known for a few days): small caps do not work on my e-reader. So for now I write: ANHANG.
A few images and code samples from the procedure:
Original
Small caps conversion
Merging
The first repetition has 104 occurrences; the 2nd, 3rd and 4th repetitions have 52 occurrences each (the reason is clear, without explaining it here). I tested only with "Anhang".
Code:
<p class="calibre1">A<span class="small-cap">n</span><span class="small-cap">h</span><span class="small-cap">a</span><span class="small-cap">n</span><span class="small-cap">g</span> </p>
I am looking forward to the next time. There are always enough problems.
Last edited by chaot; 02-07-2017 at 12:54 PM. Reason: add: image merging |
#15 | |
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
|
Quote:
In #12 I said the book was already nearly finished. That is what one should call a typical misjudgment!
Report (as announced): For some time now I have known what caused these increased line heights. Some bad characters are responsible; here is my collection (so far):
• ■ ° (not º)
Some are easy to spot visually, others are not. These characters always affect the line height of the whole paragraph, from <p> to </p>. As I received some of the 'raw material' unedited, some paragraphs span multiple pages.
Prinz-Eugen-Brucken
Smart comment tags also limit the influence of the bad characters. (Most likely, though unverified, the influence spans from >... to ...<.) So you can probably imagine that the issue gave me some headaches at first. Even today I still find new ones, though more and more rarely.
Last edited by chaot; 03-22-2017 at 01:18 PM. Reason: formulations, trivia |
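A scan for the characters collected above could look roughly like this. The character list comes from the post; the function name `find_suspects`, the paragraph regex and the sample markup are assumptions for illustration, and real book files may need a proper HTML parser.

```python
import re
import unicodedata

# Characters reported above as affecting line height.
SUSPECTS = '\u2022\u25a0\u00b0'   # BULLET, BLACK SQUARE, DEGREE SIGN


def find_suspects(html):
    """Return (paragraph number, character, Unicode name) tuples."""
    hits = []
    # Work paragraph by paragraph, since the effect is reported to span
    # the whole paragraph from <p> to </p>.
    paragraphs = re.findall(r'<p\b[^>]*>(.*?)</p>', html, flags=re.DOTALL)
    for number, body in enumerate(paragraphs, start=1):
        for ch in body:
            if ch in SUSPECTS:
                hits.append((number, ch, unicodedata.name(ch)))
    return hits


sample = '<p class="calibre1">20\u00b0 warm</p><p>clean text</p><p>\u25a0 item</p>'
for hit in find_suspects(sample):
    print(hit)
```

Newly discovered characters can simply be appended to `SUSPECTS`.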
|