01-26-2021, 06:44 AM | #1 |
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
Two problems with splitting files
First case
Why? After all, it is enough to move the cursor one position to the left, and then the split works.
Second case
Despite the message, the split is still performed, but part of the file is lost: the code BEFORE the split position disappears and only the code AFTER it remains. |
01-26-2021, 10:24 AM | #2 | |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
The cursor is actually on the first <, not "before" it.
So if CV shows something like |<tag>, then the cursor is in the tag. If you had a block cursor it would highlight the <, but CV uses an insertion point cursor. The long-held Sigil rule is that you cannot split when in a tag. If you think this is too tight, then I would be happy to consider a patch or PR that handles just that one case.
Note that a file with an xml file extension can be of any type: it could be an opf, an ncx, an xhtml, a generic xml file, etc. There is no way to know the tag meanings: a body tag may not be what you think it means, there may be no html root tag, and there is no way to tell which tags are void and which are not, which are block tags and which are not, where it might even be safe to split, or how splitting will impact a generic xml file.
And according to the epub3 spec, the file extension must be xhtml if the file is an xhtml file. The first step when seeing lots of xml files that are really xhtml files (or html files that are really xhtml files) is to use Sigil's ability to bulk rename a set of files by just changing the extension, since left as .xml files they do not meet the current epub spec. Again, I would consider a patch or PR to handle this differently, but Sigil has long operated this way, and forcing people to change file extensions to more correct ones that meet the latest spec makes good sense to me.
Please note that the old epub2 spec that allowed pure xml "islands" has long been deprecated, mainly since no reader bothered to support it and just treated all .xml files as .xhtml files. This changed in epub3. Hope this explains why you are seeing what you are seeing.
|
01-26-2021, 10:29 AM | #3 |
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
|
I can reproduce the second issue, but not the first. Sigil 1.4.3 Windows 10.
|
01-26-2021, 10:38 AM | #4 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
The point I was trying to make is both are expected behaviour. So a patch and rationale to change this would be needed for either.
|
01-26-2021, 10:50 AM | #5 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
If a user gets the message "Cannot split since it may not be an HTML file" then why is the split still being performed (with the subsequent loss of data)? Shouldn't the split attempt be aborted under those conditions if possible?
|
01-26-2021, 11:26 AM | #6 | |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
That I can look at. |
|
01-26-2021, 11:41 AM | #7 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Well, the first "issue" can be found in CodeViewEditor.cpp, here. If anyone wants to play with that, they could simply check for a |<tag> here and allow it even if it is "considered in tag".
Code:
QString CodeViewEditor::SplitSection()
{
    QString text = toPlainText();
    int split_position = textCursor().position();
    // Abort splitting the section if user is within a tag - MainWindow will display a status message
    if (IsPositionInTag(split_position, text)) {
        return QString();
    }
Code:
void MainWindow::CreateSectionBreakOldTab(QString content, HTMLResource *originating_resource)
{
    if (content.isEmpty()) {
        QMessageBox::warning(this, tr("Sigil"), tr("File cannot be split at this position."));
        return;
    }
    // XXX: This should be using the mime type not the extension.
    if (!TEXT_EXTENSIONS.contains(QFileInfo(originating_resource->Filename()).suffix().toLower())) {
        QMessageBox::warning(this, tr("Sigil"), tr("Cannot split since it may not be an HTML file."));
        return;
    }
It seems SplitAtCursor is handled inside CodeView, where it takes the bottom half of the file and keeps the current file name for it, then puts the piece above under a new name. All of this was designed before SplitAtSplitMarkers was done. Hmm.... |
01-26-2021, 11:52 AM | #8 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Okay, this needs to be handled in FlowTab.cpp SplitSection() before the OldTabRequest is fired, as the check in MainWindow obviously comes too late.
Code:
void FlowTab::SplitSection()
{
    if (!IsDataWellFormed()) {
        return;
    }
    QWidget *mainWindow_w = Utility::GetMainWindow();
    MainWindow *mainWindow = qobject_cast<MainWindow *>(mainWindow_w);
    if (!mainWindow) {
        Utility::DisplayStdErrorDialog("Could not determine main window.");
        return;
    }
    HTMLResource *nav_resource = mainWindow->GetCurrentBook()->GetConstOPF()->GetNavResource();
    if (nav_resource && (nav_resource == m_HTMLResource)) {
        Utility::DisplayStdErrorDialog("The Nav file can not be split");
        return;
    }
    // Handle warning the user about undefined url fragments.
    if (!mainWindow->ProceedWithUndefinedUrlFragments()) {
        return;
    }
    if (m_wCodeView) {
        emit OldTabRequest(m_wCodeView->SplitSection(), m_HTMLResource);
    }
}
We can then "*assume*" that a .xml file that actually gets successfully loaded into a FlowTab must itself be xhtml (otherwise it would be loaded in an XMLTab), and maybe even remove that test in MainWindow completely since it is already too late. I will try doing that. |
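The ordering problem described in this post can be illustrated outside Sigil. What follows is a minimal sketch in standard C++ (no Qt), not Sigil's code: `Buffer`, `looks_like_xhtml`, and `split_top` are hypothetical names, and the suffix list is an assumption because the contents of TEXT_EXTENSIONS are not shown in the thread. It only demonstrates the principle that every check should run before the editor text is modified, so that a refused split leaves the file intact.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an editor buffer; not a Sigil class.
struct Buffer {
    std::string filename;
    std::string text;
};

// Hypothetical suffix check (case-insensitive). The accepted suffixes are an
// assumption; the real TEXT_EXTENSIONS list is not shown in the thread.
bool looks_like_xhtml(const std::string& filename) {
    const std::size_t dot = filename.rfind('.');
    if (dot == std::string::npos) {
        return false;
    }
    std::string ext = filename.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext == "xhtml" || ext == "html" || ext == "htm";
}

// Runs the check first and only then removes the top part from the buffer.
// Returns false, with the buffer untouched, when the split is refused.
bool split_top(Buffer& buf, std::size_t pos, std::string& top_out) {
    assert(pos <= buf.text.size());
    if (!looks_like_xhtml(buf.filename)) {
        return false;  // abort before anything is modified
    }
    top_out = buf.text.substr(0, pos);  // piece that would go to the new file
    buf.text.erase(0, pos);             // bottom piece stays under the current name
    return true;
}
```

With this ordering, a caller that gets `false` can show its warning without any text having been removed from the buffer.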
01-26-2021, 12:03 PM | #9 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
@BeckyEbook,
I just pushed a change to master for your second "issue". When you get a free moment, please give it a try and let us know. At least half of the file should not be missing anymore, but there is no guarantee that a generic xml file will be split properly either. This is designed to work only for xhtml. |
01-26-2021, 01:09 PM | #10 |
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
I can confirm that it is better now: the second case no longer deletes part of the file.
Thank you. |
01-26-2021, 01:32 PM | #11 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Thanks, Kevin.
|
01-26-2021, 02:24 PM | #12 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
@BeckyEbook,
I looked at the SplitSection code, and the character at the actual split position is *not* included in the top part of the split (it grabs everything before that char), so I can check for text[pos] == '<' and special case that in SplitSection so that |<tag> would then work as a split point. I will push that to master so you can try it and let us know if that works for you.
Kevin |
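The special case described in this post can be sketched in standard C++ (no Qt). This is an illustration under assumptions, not the change pushed to master: `strictly_inside_tag`, `in_tag_old_rule`, `can_split_at`, and `split_at` are hypothetical helpers, and the old rule is modelled only from the description in this thread (a cursor shown as |<tag> counts as being in the tag).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// True when the insertion point `pos` lies strictly inside a tag, i.e. the
// nearest angle bracket to its left is '<'.
bool strictly_inside_tag(const std::string& text, std::size_t pos) {
    assert(pos <= text.size());
    if (pos == 0) {
        return false;
    }
    const std::size_t i = text.find_last_of("<>", pos - 1);
    return i != std::string::npos && text[i] == '<';
}

// Assumed model of the old rule: sitting on a '<' also counts as "in tag".
bool in_tag_old_rule(const std::string& text, std::size_t pos) {
    const bool on_open = pos < text.size() && text[pos] == '<';
    return on_open || strictly_inside_tag(text, pos);
}

// Relaxed rule: |<tag> is allowed because the top part takes only the
// characters before `pos`, so the '<' stays with the bottom part.
bool can_split_at(const std::string& text, std::size_t pos) {
    const bool before_open = pos < text.size() && text[pos] == '<';
    if (before_open && !strictly_inside_tag(text, pos)) {
        return true;  // special case: |<tag>
    }
    return !in_tag_old_rule(text, pos);
}

// Splits into (top, bottom); the character at `pos` goes to the bottom part.
std::pair<std::string, std::string> split_at(const std::string& text, std::size_t pos) {
    assert(pos <= text.size());
    return {text.substr(0, pos), text.substr(pos)};
}
```

For `<div>top</div><div>bottom</div>` with the cursor before the second `<div>`, `can_split_at` accepts the position and `split_at` leaves both pieces with balanced tags, while a cursor inside `<div>` is still rejected.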
01-26-2021, 02:37 PM | #13 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
@BeckyEbook
Okay, just pushed that change. It appears to work for me. I had it look like the following:
Code:
<div>top</div>|<div>bottom</div>
I got back what I expected. If this actually worked in Sigil 1.4.3 and earlier, it was by accident. This change makes that earlier behaviour official. Thanks for your bug reports!
Last edited by KevinH; 01-26-2021 at 07:46 PM. |
01-26-2021, 05:37 PM | #14 |
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
I had exactly this idea for solving the first problem, but the idea and the implementation in code are two different things.
Splitting works great now! Thank you, again. I will try to do more tests in the near future.
BTW, your sample code from the previous post causes an unexpected side effect. Paste this fragment and call Mend & Prettify several times:
Code:
<div>top</div>|<div>bottom</div>
|
01-26-2021, 08:33 PM | #15 |
Bibliophagist
Posts: 35,401
Karma: 145435140
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Cute. So no naked pipe symbols allowed.
|