Old 01-26-2021, 06:44 AM   #1
BeckyEbook
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
Two problems with splitting files

First case
  1. Open any file, place the cursor in front of any open or close tag, and try to split the file (Split At Cursor).
Message: File cannot be split at this position.

Why? After all, it is enough to move the cursor one position to the left and then the split will work.


Second case
  1. Open any file, but change the extension from xhtml to xml (I see files like that in the wild quite often).
  2. Place the cursor anywhere (apart from the one I described in the first case of course) and try to split the file (Split At Cursor).
Message: Cannot split since it may not be an HTML file.

Despite the message, the split is still performed, but part of the file is lost: the code BEFORE the split position disappears and only the code AFTER it remains.
Attached image: sigil-cannot-split-before-tag.png
Old 01-26-2021, 10:24 AM   #2
KevinH
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
The cursor is actually on the first <, not "before" it.
So if Code View (CV) shows something like |<tag>, then the cursor is in the tag. If you had a block cursor it would highlight the <, but CV uses an insertion point cursor.

The long-held Sigil rule is that you cannot split when in a tag. If you think this is too tight, then I would be happy to consider a patch or PR that handles just that one case.


Note, a file with an xml file extension can be of any type, it could be an opf, an ncx, an xhtml, a generic xml file, etc. There is no way to know the tag meanings: a body tag may not be what you think it means, there may be no html root tag, there is no way to tell which are void tags and which are not, which are block tags and which are not, where it might even be safe to split, and how splitting will impact a generic xml file.

And according to the epub3 spec, file extension must be xhtml if the file is an xhtml file.

The first step when seeing lots of xml files that are really xhtml files (or html files that are really xhtml files) is to use Sigil's ability to bulk rename a set of files by changing just the extension, since files left as .xml do not meet the current epub spec.
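
For anyone scripting that rename outside Sigil, here is a minimal sketch of the name mapping only. The helper name to_xhtml_extension is made up for illustration and is not a Sigil function. It assumes the caller has already decided which .xml files really are xhtml, since an .xml file could just as well be an opf or ncx. A real rename inside an epub also has to update the manifest and any links, which this does not attempt.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Sigil): returns `name` with a trailing
// ".xml" extension, matched case-insensitively, replaced by ".xhtml".
// Any other name is returned unchanged.
std::string to_xhtml_extension(const std::string &name)
{
    const std::string old_ext = ".xml";
    if (name.size() <= old_ext.size()) {
        return name;  // too short to be "<something>.xml"
    }
    const std::size_t start = name.size() - old_ext.size();
    for (std::size_t i = 0; i < old_ext.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[start + i]);
        if (std::tolower(c) != old_ext[i]) {
            return name;  // extension is not .xml
        }
    }
    return name.substr(0, start) + ".xhtml";
}
```

Calling to_xhtml_extension("chapter1.xml") gives "chapter1.xhtml", while names such as "content.opf" come back untouched.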

Again, I would consider a patch or PR to handle this differently, but Sigil has long operated this way, and forcing people to change file extensions to more correct ones that meet the latest spec makes good sense to me. Please note that the old epub2 spec, which allowed pure xml "islands", has long been deprecated, mainly because no reader bothered to support it and just treated all .xml files as .xhtml files. This changed in epub3.

Hope this explains why you are seeing what you are seeing.


Old 01-26-2021, 10:29 AM   #3
exaltedwombat
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
I can reproduce the second issue, but not the first. Sigil 1.4.3 Windows 10.
Old 01-26-2021, 10:38 AM   #4
KevinH
Sigil Developer
The point I was trying to make is both are expected behaviour. So a patch and rationale to change this would be needed for either.
Old 01-26-2021, 10:50 AM   #5
DiapDealer
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
If a user gets the message "Cannot split since it may not be an HTML file" then why is the split still being performed (with the subsequent loss of data)? Shouldn't the split attempt be aborted under those conditions if possible?
Old 01-26-2021, 11:26 AM   #6
KevinH
Sigil Developer
Quote:
Originally Posted by DiapDealer View Post
If a user gets the message "Cannot split since it may not be an HTML file" then why is the split still being performed (with the subsequent loss of data)? Shouldn't the split attempt be aborted under those conditions if possible?
Probably. That code is either in CodeViewEditor.cpp itself or in MainWindow.

That I can look at.
Old 01-26-2021, 11:41 AM   #7
KevinH
Sigil Developer
Well, the first "issue" can be found in CodeViewEditor.cpp, here. If anyone wants to play with that, they could simply check for a |<tag> at this point and allow it even if it is "considered in tag".
Code:
QString CodeViewEditor::SplitSection()
{
    QString text = toPlainText();
    int split_position = textCursor().position();

    // Abort splitting the section if user is within a tag - MainWindow will display a status message                           
    if (IsPositionInTag(split_position, text)) {
        return QString();
    }
I do not see that error message "Cannot split since it may not be an HTML file" anywhere in CodeViewEditor though. But in MainWindow.cpp that error message comes from here:

Code:
void MainWindow::CreateSectionBreakOldTab(QString content, HTMLResource *originating_resource)
{
    if (content.isEmpty()) {
        QMessageBox::warning(this, tr("Sigil"), tr("File cannot be split at this position."));
        return;
    }

    // XXX: This should be using the mime type not the extension.                                                               
    if (!TEXT_EXTENSIONS.contains(QFileInfo(originating_resource->Filename()).suffix().toLower())) {
        QMessageBox::warning(this, tr("Sigil"), tr("Cannot split since it may not be an HTML file."));
        return;
    }
But by this point the "damage" has been done and the original file is now doomed.

It seems SplitAtCursor is handled inside CodeView where it takes the bottom half of the file and uses the current file name for it and then the piece above under a new name. All of this was designed before SplitAtSplitMarkers was done.
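
To make the ordering problem concrete, here is a minimal sketch using a made-up Editor model, not Sigil's classes. It assumes, as described above, that the split removes the top part from the current buffer and hands it back to be stored under a new name. The names Editor, split_then_check and check_then_split, and the looks_like_html flag, are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the editor: split_section() removes everything before
// the cursor from the buffer and hands it back to the caller.
struct Editor {
    std::string text;
    std::size_t cursor = 0;

    std::string split_section()
    {
        assert(cursor <= text.size());
        std::string top = text.substr(0, cursor);
        text.erase(0, cursor);  // the buffer now holds only the bottom half
        return top;
    }
};

// Lossy order: the buffer is modified first and the check runs afterwards.
// If the check fails, the removed top half is never stored anywhere.
bool split_then_check(Editor &editor, bool looks_like_html, std::string &new_file)
{
    std::string top = editor.split_section();
    if (!looks_like_html) {
        return false;  // too late: `top` is discarded here
    }
    new_file = top;
    return true;
}

// Safe order: every check that can refuse the split runs before the buffer
// is touched.
bool check_then_split(Editor &editor, bool looks_like_html, std::string &new_file)
{
    if (!looks_like_html) {
        return false;  // nothing has been modified yet
    }
    new_file = editor.split_section();
    return true;
}
```

With the lossy order, a refused split still leaves the editor holding only the bottom half, which matches the reported symptom. With the safe order, a refused split leaves the text unchanged.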

Hmm....
Old 01-26-2021, 11:52 AM   #8
KevinH
Sigil Developer
Okay, this needs to be handled in FlowTab.cpp SplitSection() before the OldTabRequest is fired as that is obviously too late.

Code:
void FlowTab::SplitSection()
{
    if (!IsDataWellFormed()) {
        return;
    }

    QWidget *mainWindow_w = Utility::GetMainWindow();
    MainWindow *mainWindow = qobject_cast<MainWindow *>(mainWindow_w);
    if (!mainWindow) {
        Utility::DisplayStdErrorDialog("Could not determine main window.");
        return;
    }
    HTMLResource * nav_resource = mainWindow->GetCurrentBook()->GetConstOPF()->GetNavResource();
    if (nav_resource && (nav_resource == m_HTMLResource)) {
        Utility::DisplayStdErrorDialog("The Nav file can not be split");
        return;
    }

    // Handle warning the user about undefined url fragments.                                                                   
    if (!mainWindow->ProceedWithUndefinedUrlFragments()) {
        return;
    }

    if (m_wCodeView) {
        emit OldTabRequest(m_wCodeView->SplitSection(), m_HTMLResource);
    }
}
So from this, I think we can simply remove the "Cannot be split" from that part of MainWindow.cpp as it is too late by that point to not hurt something.

We can then "*assume*" that a .xml file that actually gets successfully loaded into a FlowTab must itself be xhtml (otherwise it would be loaded in an XMLTab) and maybe even remove that test in MainWindow completely since it is already too late.

I will try doing that.
Old 01-26-2021, 12:03 PM   #9
KevinH
Sigil Developer
@BeckyEbook,
I just pushed a change to master for your second "issue". When you get a free moment, please give it a try and let us know.

At least half of the file should not be missing anymore, but there is no guarantee that a generic xml file will be split properly either. This is only designed to work for xhtml.
Old 01-26-2021, 01:09 PM   #10
BeckyEbook
Guru
I can confirm that it is better now: the second case no longer deletes part of the file.
Thank you.
Old 01-26-2021, 01:32 PM   #11
DiapDealer
Grand Sorcerer
Thanks, Kevin.
Old 01-26-2021, 02:24 PM   #12
KevinH
Sigil Developer
@BeckyEbook,

I looked at the SplitSection code and the actual split position is *not* included in the split (it grabs everything before that char), so I can check for text[pos] == '<' and special-case that in SplitSection so that |<tag> would then work as a split point.
I will push that to master so you can try it and let us know if that works for you.
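
A minimal sketch of that special case, written against std::string rather than QString so it stands alone. The functions is_position_in_tag and can_split_at are hypothetical stand-ins, not the code in CodeViewEditor.cpp, and the in-tag test here is a simplification that ignores comments and CDATA sections.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the in-tag test: true when the insertion point
// `pos` sits on a '<' or the nearest angle bracket before it is an
// unclosed '<'.
bool is_position_in_tag(std::size_t pos, const std::string &text)
{
    assert(pos <= text.size());
    if (pos < text.size() && text[pos] == '<') {
        return true;  // insertion point sits on the opening '<'
    }
    // Scan backwards for the nearest angle bracket.
    for (std::size_t i = pos; i-- > 0;) {
        if (text[i] == '>') return false;  // previous tag already closed
        if (text[i] == '<') return true;   // still inside an open tag
    }
    return false;
}

// Relaxed rule: a cursor sitting directly on a tag's opening '<' is allowed,
// because the split keeps only the characters before `pos`.
bool can_split_at(std::size_t pos, const std::string &text)
{
    assert(pos <= text.size());
    if (pos < text.size() && text[pos] == '<') {
        return true;
    }
    return !is_position_in_tag(pos, text);
}
```

With this rule a cursor shown as |<tag> is accepted, while a cursor one character further right, inside the tag name, is still refused.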

Kevin
Old 01-26-2021, 02:37 PM   #13
KevinH
Sigil Developer
@BeckyEbook
Okay, just pushed that change. It appears to work for me.

I had it look like the following:

Code:
<div>top</div>|<div>bottom</div>
with the cursor being represented by the '|' char and hit Split At Cursor.
I got back what I expected.

If this actually worked in Sigil 1.4.3 and earlier it was by accident.
Now it will make all earlier behaviour official.

Thanks for your bug reports!

Last edited by KevinH; 01-26-2021 at 07:46 PM.
Old 01-26-2021, 05:37 PM   #14
BeckyEbook
Guru
I had exactly this idea for solving the first problem, but the idea and the implementation in code are two different things.

Splitting works great now! Thank you, again.
I will try to do more tests in the near future.

BTW, your sample code from the previous post causes an unexpected side effect.
Paste this fragment and call Mend & Prettify several times:
Code:
<div>top</div>|<div>bottom</div>
I know this is an incredibly rare situation, I report it as a curiosity.
Old 01-26-2021, 08:33 PM   #15
DNSB
Bibliophagist
Posts: 35,401
Karma: 145435140
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by BeckyEbook View Post
BTW, your sample code from the previous post causes an unexpected side effect.
Paste this fragment and call Mend & Prettify several times:
Code:
<div>top</div>|<div>bottom</div>
I know this is an incredibly rare situation, I report it as a curiosity.
Cute. So no naked pipe symbols allowed.