Old 07-17-2022, 06:37 AM   #6
chaley
I tried your experiment. Jumping to the end, I found that the problem is caused by the "1000 Words" not being enclosed in HTML.

My apologies for the length of what follows. I wrote down all the steps so I could be sure I knew what I was doing/had done. The steps I ran to discover this:
  1. Empty the comments field.
  2. Use your S&R to copy int Words to the comments field. After the S&R, *but before opening Edit Metadata*, the comments field contains
    Code:
    1000 Words
    I used a database manager to look directly at the data in the comments column.
  3. Look at the book with the Metadata Editor.
    Code:
    <div>
    <p>1000 Words</p></div>
    Some HTML has magically appeared. I checked the database. That HTML isn't actually in the column data, according to the database.
  4. Set a custom comment to the following HTML
    Code:
    <div>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
    Closed the metadata editor and reopened. The text was still correct.
  5. Checked the blurb field in the database. It is the same.
  6. Ran the S&R to append the blurb to comments. S&R says it is putting the following in comments.
    Code:
    1000 Words<div>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
  7. Checking the database, the comments column contains the same text.
    Code:
    1000 Words<div>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
  8. Edit metadata on the book. The comments field shows
    Code:
    <div>
    <p>1000 Words</p>
    <p><br></p>
    <p>This is a line in the comment.</p>
    <p><br></p>
    <p>This is another line in the comment.</p></div>
    This is very different from what is in the database.
  9. Checking the database, it still contains the original text.
    Code:
    1000 Words<div>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
    Let's look at what HTML is being used when showing comments in book details. I changed the code to show it.
  10. Here is the output being sent to book details. It has all the cruft and more.
    Code:
    <p class="description">1000 Words</p>
    <div><br/><p class="description">This is a line in the comment.</p>
    <br/><p class="description">This is another line in the comment.</p>
    </div>
    Why?

    Looking at the code, calibre does this because the content doesn't start with a '<', so it thinks the comments aren't already HTML. Let's try fixing that by changing the original "1000 Words" to "<div>1000 Words</div>" so calibre thinks it is HTML. After the S&R, the comments field contains
    Code:
    <div>1000 Words</div>
  11. Edit metadata shows it as
    Code:
    <div>
    <p>1000 Words</p></div>
    Not the same but not as different as above. The database hasn't changed, though.
  12. Run the second S&R to add the blurb. The comments field now contains
    Code:
    <div>1000 Words</div><div>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
  13. The comments field HTML in edit metadata now shows:
    Code:
    <div>
    <p>1000 Words</p>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
    This isn't what is in the database but it is much closer.
  14. The output in book details is
    Code:
    <div>1000 Words</div><div>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
    This is actually what is in the database. No extra <br> or what-have-you.

    We now know how to "fix" it. Ensure that *everything* that is added to comments is correct HTML. In this case that means surrounding the "words" stuff with tags.
  15. We see that using <div> causes calibre to add <p>. For fun, try it using <p></p> around the word count instead of div. Without showing the intermediate results, we get in edit metadata:
    Code:
    <div>
    <p>1000 Words</p>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
    This isn't bad!

    We see in book details:
    Code:
    <p>1000 Words</p><div>
    <p>This is a line in the comment.</p>
    <p>This is another line in the comment.</p></div>
    which is what we want.
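To make the deciding factor concrete, here is a rough Python sketch of the check as I described it in step 10: content counts as HTML only if it starts with a '<'. The function name looks_like_html is made up for illustration, and calibre's real test may well be more involved. This only mirrors the behaviour seen in the steps above.

```python
def looks_like_html(comments):
    """Hypothetical stand-in for the check described in step 10:
    treat the field as HTML only if its content starts with '<'."""
    return comments.lstrip().startswith('<')


blurb = ('<div>\n'
         '<p>This is a line in the comment.</p>\n'
         '<p>This is another line in the comment.</p></div>')

# The three values that ended up in the database during the experiment.
plain_prefix = '1000 Words' + blurb             # steps 6-9
div_prefix = '<div>1000 Words</div>' + blurb    # steps 12-14
p_prefix = '<p>1000 Words</p>' + blurb          # step 15

for name, value in [('plain', plain_prefix),
                    ('div', div_prefix),
                    ('p', p_prefix)]:
    print(name, looks_like_html(value))
```

Only the first value fails the check, and that is the one that picked up the extra <br> and class="description" markup in book details.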
Bottom line: you must ensure that the resulting comments field contains valid HTML. As such, your first S&R should be like this:
[Attached image: Clipboard01.jpg]
Looking at the code, I thought about why calibre adds all that markup when the contents don't appear to be HTML. The answer (I think) is that calibre has no idea what is in there. It must protect itself against strange and malformed HTML to avoid having book details get scribbled on. One could argue it is being overzealous, but Kovid arrived here after many years of experience with strange things.