I am a little worried that this might be a calibre-related issue. I searched the calibre GitHub repository for QMimeData usage and found the following code snippet, which actually removes tables from the HTML when copying text to the clipboard.
As I am unfamiliar with the calibre code base, I have no idea whether this routine is invoked at all when copying out of calibre's Normal view.
Code:
def copy_all(text_browser):
    mf = getattr(text_browser, 'details', text_browser)
    c = QApplication.clipboard()
    md = QMimeData()
    html = mf.toHtml()
    md.setHtml(html)
    from html5_parser import parse
    from lxml import etree
    root = parse(html)
    tables = tuple(root.iterdescendants('table'))
    for tag in root.iterdescendants(('table', 'tr', 'tbody')):
        tag.tag = 'div'
    parent = root
    is_vertical = getattr(text_browser, 'vertical', True)
    if not is_vertical:
        parent = tables[1]
    for tag in parent.iterdescendants('td'):
        for child in tag.iterdescendants('br'):
            child.tag = 'span'
            child.text = '\ue000'
        tt = etree.tostring(tag, method='text', encoding='unicode')
        tag.tag = 'span'
        for child in tuple(tag):
            tag.remove(child)
        tag.text = tt.strip()
    if not is_vertical:
        for tag in root.iterdescendants('td'):
            tag.tag = 'div'
    for tag in root.iterdescendants('a'):
        tag.attrib.pop('href', None)
    from calibre.utils.html2text import html2text
    simplified_html = etree.tostring(root, encoding='unicode')
    txt = html2text(simplified_html, single_line_break=True).strip()
    txt = txt.replace('\ue000', '\n\t')
    if iswindows:
        txt = os.linesep.join(txt.splitlines())
    # print(simplified_html)
    # print(txt)
    md.setText(txt)
    c.setMimeData(md)
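To make the effect of that retagging easier to see, here is a minimal sketch of the same idea. It is not calibre's code: it swaps in the standard library's xml.etree.ElementTree for html5_parser/lxml and a plain join for html2text, and the names SAMPLE and flatten_tables are hypothetical. It assumes well-formed markup without nested tables.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; real calibre output will look different.
SAMPLE = (
    "<html><body><table><tbody>"
    "<tr><td>Authors</td><td>Jane <b>Doe</b></td></tr>"
    "<tr><td>Series</td><td>Example</td></tr>"
    "</tbody></table></body></html>"
)


def flatten_tables(html):
    """Return (simplified_markup, plain_text) for a well-formed HTML string."""
    root = ET.fromstring(html)
    lines = []
    # Walk the rows before any tag is renamed.
    for row in list(root.iter("tr")):
        cell_texts = []
        for cell in list(row.iter("td")):
            # Collapse the cell to its text content, dropping inline markup.
            text = "".join(cell.itertext()).strip()
            for child in list(cell):
                cell.remove(child)
            cell.text = text
            cell.tag = "span"
            cell_texts.append(text)
        # One plain-text line per former table row.
        lines.append(" ".join(cell_texts))
    # Rename the table scaffolding so no table structure remains.
    for el in list(root.iter()):
        if el.tag in ("table", "tbody", "tr"):
            el.tag = "div"
    return ET.tostring(root, encoding="unicode"), "\n".join(lines)


if __name__ == "__main__":
    simplified, text = flatten_tables(SAMPLE)
    print(simplified)
    print(text)
```

After this runs, the markup contains only div and span elements, which is why a table would not survive if a paste target picked up the simplified version.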
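Not sure why anyone would want to simplify the HTML by removing tables. So, going by this snippet, copy in calibre puts two different formats on the clipboard: the HTML as is (via setHtml), and plain text generated from a simplified HTML with the tables removed (via setText).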
PageEdit on Windows seems to default to the latter one, based on BetterRed's testing.
I will create a debug PageEdit version that lists and dumps all of the formats in the QClipboard QMimeData when Edit->Paste is invoked, just to verify.
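For anyone who wants to try the same check from Python in the meantime, here is a rough sketch of the kind of dump I have in mind. PageEdit itself is C++, so this is only an illustration. The function is duck-typed: it relies only on the formats() and data() methods that QMimeData provides, and FakeMimeData is a hypothetical stand-in so the sketch runs without Qt installed.

```python
def dump_mime_formats(mime, max_bytes=200):
    """Return one report line per format offered by a QMimeData-like object."""
    lines = []
    for fmt in mime.formats():
        # QMimeData.data() returns a QByteArray; bytes() is assumed to convert it.
        raw = bytes(mime.data(fmt))
        preview = raw[:max_bytes].decode("utf-8", errors="replace")
        lines.append(f"{fmt} ({len(raw)} bytes): {preview!r}")
    return lines


class FakeMimeData:
    """Hypothetical stand-in for QMimeData, for illustration only."""

    def __init__(self, payloads):
        self._payloads = payloads

    def formats(self):
        return list(self._payloads)

    def data(self, fmt):
        return self._payloads[fmt]


if __name__ == "__main__":
    fake = FakeMimeData({
        "text/html": b"<table><tr><td>x</td></tr></table>",
        "text/plain": b"x",
    })
    for line in dump_mime_formats(fake):
        print(line)
```

With a Qt Python binding installed, the object to pass in would presumably be QApplication.clipboard().mimeData(), called right after copying from calibre. If the text/html entry still contains the table markup, the table is being lost on the paste side and not by calibre.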