Old 04-09-2021, 02:45 PM   #31
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by DiapDealer View Post
EDIT: Just to clarify, this is in no way new to Sigil 1.5.1.
Yep, definitely. The issues I brought up have been lurking in Sigil for all the years I've been using it.

I just learned to completely avoid monolithic HTML files + kept Preview closed by default... and now continue to do so out of habit.

(Preview + Sigil overall has gotten much faster/better though.)

Quote:
Originally Posted by DiapDealer View Post
Many of them I would expect, but the UI freezing in general (after the merge is done) seems odd to me (as well as unique to Windows, in my experience).

[...] During that time, processor/ram/disk/gpu usage was not particularly high (both cumulatively--for all processes--and for Sigil specifically). One notable thing is that the Windows 10 task manager indicates that power consumption is "very high" during the merge process. [...]

During the unresponsive periods, Sigil's processor and memory usage spike a bit (but nothing drastic, and not nearly enough to interfere with the running of other applications) and the power consumption indicator immediately spikes to "very high". The gpu % never moves from 0% for Sigil during any of my testing.
Hmmm.... interesting observation. I didn't think to have Task Manager and that column open at the time.

But that "Very High" makes me think... perhaps the laptop saw that, was in power-saving mode, then decided to aggressively throttle Sigil extra hard to "save battery"?

More Testing

During/after the merge, Calibre had a few "calibre worker processes" sitting in Task Manager:
  • Calibre only jumped to "Very High" for a split second while generating the preview, then sat at "High" for ~6 seconds.
    • ~1.5 cores + ~20% GPU usage
    • (Calibre was using "~20% CPU". Since I have 8 cores: 1 core = 12.5%.)
  • During syntax highlighting and everything else, it sat at "Medium".
    • ~1/2 a core + ~5% GPU usage.
    • Full-speed and usable this entire time.
  • When everything was completed, dropped down to "Very Low".

Sigil, during these similar steps, used a full core the entire time (~13-15%).
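
For anyone following the percentages, the per-core arithmetic above can be sketched like this. The function name is made up for illustration, and it assumes Task Manager's CPU figure is averaged over all logical cores:

```python
def cores_in_use(cpu_percent, core_count):
    """Convert a whole-machine CPU percentage into an equivalent number of busy cores."""
    percent_per_core = 100.0 / core_count  # 8 cores -> 12.5% per core
    return cpu_percent / percent_per_core


# Figures quoted in the post, on an 8-core machine.
print(cores_in_use(20.0, 8))  # Calibre while generating the preview
print(cores_in_use(13.0, 8))  # Sigil, low end of the ~13-15% range
```

So "~20% CPU" is roughly one and a half cores, and Sigil's ~13-15% is roughly one core pegged.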

When it finally settled down, and became clickable/scrollable... If I left to do something else for a minute, then clicked back into Sigil... it froze and jumped back to a full core for a while.

Quote:
Originally Posted by KevinH View Post
As for the time to delete the now merged resources, it takes a long time because it requires parsing the opf, updating the manifest, spine, guide, and any nav landmarks (so parsing and recreating the nav each time too), then rebuilding the opf and saving it each time repeated for over 100 resources.
If you need a test EPUB for merging lots of internal links, this may also be a good one:

https://mises.org/library/man-econom...wer-and-market

That was a 1500 page book (each page is marked with an <a id="page_XXX"></a>) with ~6000 links in the Index.

(That's the only one I remember off the top of my head where the Index was >300KBs so had to be split. I definitely know I've worked on multiple books with even larger Indexes though, but I don't recall which ones.)

Quote:
Originally Posted by KevinH View Post
Only the merge case would require deleting so many resources at once, so it was designed for single use and safety not speed. I will look into adding a bulk removal routine to save some time.


And Calibre has this nice menu selection where you pick WHICH file you want to merge into:

[Attached image: Calibre.Merge.Selection.png]

Last edited by Tex2002ans; 04-09-2021 at 03:54 PM. Reason: More details on core usages
Old 04-09-2021, 03:08 PM   #32
repilo
Connoisseur
Posts: 97
Karma: 21870
Join Date: Apr 2021
Location: Spain
Device: Kobo Libra 2
Quote:
Originally Posted by Tex2002ans View Post
Question to repilo: Is this happening while on battery? Or is it happening when plugged in too?
In both cases.

Quote:
And did that "Balanced" power mode solve the freezing issue now for Sigil 1.5.1?
Yes, I'm happy now.
Quote:
Does this happen in anything else on your computer? For example, watching a high resolution video on Youtube? Or opening up a very large document in Word/LibreOffice?
I have noticed that with the new configuration it takes less time to load Firefox with about 20 tabs that I usually have open. Previously that was the only thing that really bothered me about this computer because of its slowness. Now it is noticeably much better.
Old 04-09-2021, 03:48 PM   #33
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Okay,

I have pushed a bunch of changes to master for testing to help speed things up:

1. Removed all of the secondary and tertiary well-formedness tests, because we now check every file in advance using multiple threads.

2. Temporarily disabled the id verification step, as it really needs to detect whether ids are being reused across the files to be merged (because after the merge that could result in duplicate ids).

3. Rewrote the merge code to use multiple threads for speed and to remove redundant checks.

4. Added a bulk resource removal capability to FolderKeeper and the OPF to avoid reparsing the OPF a hundred or more times in a row when deleting them one by one.

5. Made loading Preview be asynchronous and no longer wait for loading to complete
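
To illustrate why item 4 matters, here is a toy sketch (not Sigil's actual code; ToyPackageDoc and its methods are invented for this example) that counts how many parse/rebuild/save cycles each approach needs. It assumes the OPF can be modelled as a serialized list of hrefs that must be parsed before it can be changed:

```python
class ToyPackageDoc:
    """Hypothetical stand-in for an OPF: a manifest that is costly to parse and rebuild."""

    def __init__(self, hrefs):
        self._serialized = "\n".join(hrefs)
        self.parse_count = 0  # how many full parse cycles were needed

    def _parse(self):
        self.parse_count += 1
        return self._serialized.split("\n") if self._serialized else []

    def _save(self, hrefs):
        self._serialized = "\n".join(hrefs)

    def remove_one(self, href):
        # One full parse/rebuild/save cycle for every single resource.
        items = self._parse()
        items.remove(href)
        self._save(items)

    def remove_many(self, hrefs):
        # One parse/rebuild/save cycle for the whole batch.
        doomed = set(hrefs)
        self._save([h for h in self._parse() if h not in doomed])

    def hrefs(self):
        return self._serialized.split("\n") if self._serialized else []


files = ["Text/Chapter%d.html" % n for n in range(1, 101)]

one_by_one = ToyPackageDoc(files)
for name in files[1:]:
    one_by_one.remove_one(name)

bulk = ToyPackageDoc(files)
bulk.remove_many(files[1:])

print(one_by_one.parse_count, bulk.parse_count)
```

Both end up with the same manifest, but the one-by-one path pays for a full parse per deleted file while the bulk path pays once.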

Now to get the fastest time I do the following:

1) Use Preferences to turn *off* Spellcheck highlighting (red squiggly) in CodeView

2) Use Preferences to turn *off* the newly added highlight tag pair (as that requires reparsing the dom)

3) Preview can now be on or off as it seems to do little to impact speed (interestingly)

I added a bunch of qDebug log messages to show what is going on during the merge so that timings can easily be done when in a debugger.

The results show the bottleneck is now loading the over 43000 lines of code into CodeView and then synchronously running the syntax highlighter.

The syntax highlighter has its own thread but still consumes much time since we wait for completion.

The only way to speed up editing any further is to disable syntax highlighting, which would need its own Preferences setting that we do not have (yet).

With this in place, what took over 2 minutes on my machine now takes about 15 seconds or so.

Any other speedups would come from completely redesigning how we do syntax highlighting which is not something I am prepared to do given the workaround above.
Old 04-09-2021, 04:49 PM   #34
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
It seems that syntax highlighting delays things only for initial launch and when CodeView is reloaded.

The real slowdown is actually caused by MainWindow constantly checking if the cursor is in a tag or not so it knows whether to enable or disable some editing icons and menu items. This actually requires CV to search for and identify all tags in the huge document just to know if the cursor is in a tag which might be inside a comment or a cdata or not at all. It needs to do this just to disable icons and things that could happily be ignored after launch if they can do nothing (ie. just to grey things out).

I was able to work around the most commonly employed call "IsPositionInTag()" which tries to determine if a cursor is inside a tag or not and properly handle the case of multiline comments and cdata sections.

I put in a pre-filter that first checks the nearby text to see whether the cursor could be in a tag: if not, it just returns; if so, it then does the full parse to verify that.
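
The pre-filter idea can be sketched roughly as follows. This is an illustration only, not the code that was pushed: the Python names are stand-ins for IsPositionInTag(), and it assumes a ">" never appears unescaped inside an attribute value.

```python
import re

# Comments and CDATA come first so a "<" inside them is not mistaken for a tag.
_TOKEN = re.compile(r"<!--.*?-->|<!\[CDATA\[.*?\]\]>|<[^>]*>", re.DOTALL)


def maybe_in_tag(text, pos):
    """Cheap check: is there a '<' before pos with no '>' between it and pos?"""
    last_open = text.rfind("<", 0, pos)
    if last_open == -1:
        return False
    return text.rfind(">", last_open, pos) == -1


def in_tag_full_scan(text, pos):
    """Slow path: tokenize from the top so comments and CDATA are handled."""
    for m in _TOKEN.finditer(text):
        if m.start() >= pos:
            break
        if m.start() < pos < m.end():
            return not m.group().startswith(("<!--", "<![CDATA["))
    return False


def is_position_in_tag(text, pos):
    # Most cursor positions are in plain text and exit here without a full scan.
    if not maybe_in_tag(text, pos):
        return False
    return in_tag_full_scan(text, pos)


doc = '<p class="x">Hello <!-- <b> note --> world<![CDATA[ <i ]]></p>'
print(is_position_in_tag(doc, doc.index("class")))
print(is_position_in_tag(doc, doc.index("Hello")))
```

The expensive scan only runs when the cheap check says the cursor might be in a tag, which is the uncommon case when the cursor sits in body text.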

I have pushed this to master as well.

So all in all, Merge is now a lot faster than it was, Preview can stay open, but disabling SpellChecking in CodeView and Tag Pair highlighting in CodeView in Sigil Preferences will speed things up considerably when working with huge merged monolithic files.
Old 04-09-2021, 04:50 PM   #35
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by KevinH View Post
With this in place, what took over 2 minutes on my machine now takes about 15 seconds or so.
Incredible.

Quote:
Originally Posted by KevinH View Post
Any other speedups would come from completely redesigning how we do syntax highlighting which is not something I am prepared to do given the workaround above.
Well, we'll see if this squashes the main freezing/stuttering issue. On any normal EPUBs, it'll probably do gloriously.

(And I always forget... we typically have quite beefy computers. So our "few seconds" may = a huge amount of time on an ancient/slow computer. A few years ago, when I was training a few people to use Sigil, I borrowed a person's super slow laptop. Calibre/KindlePreviewer on my computer would convert in seconds, but took MINUTES on the cruddy laptop. I just couldn't believe it... it was like being stuck in the stone ages.)

Last edited by Tex2002ans; 04-09-2021 at 04:56 PM.
Old 04-09-2021, 05:08 PM   #36
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
DiapDealer,

Give current master a try with that testcase but turn off CV spellchecking and open/close tag highlighting first. I am not seeing those long extra pauses anymore after load but I did see them earlier and really have no way to know what is going on in them unless I am running in a debugger and break in and dump all thread backtraces to see where things are.

Please let me know if they help. These changes were done quite quickly so expect some breakage ...

KevinH

Quote:
Originally Posted by DiapDealer View Post
Bad news: turns out spellchecking in Code View was already disabled.

The same cycle of the Preferences dialog repeatedly doing something for 8 minutes or so before finally rendering happens every time (at least while the new monolithic merged html tab is the one active) after the merge is complete.

Sigil debug logging is disabled, by the by. I learned my lesson on that when testing Sigil performance issues on Windows
Old 04-09-2021, 05:08 PM   #37
DiapDealer
Grand Sorcerer
Posts: 28,866
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
As I mentioned: the stuttering and freezing I've described (as well as the 8 minute wait for the Preferences dialog to render) all happen after the massive merge has been completed and Code View has reloaded.

I expect a massive merge like that to take some time and resources (and it's not something I'd be doing myself, or recommending to anyone). I'm more concerned with Sigil's erratic behavior when the massive html file is open (and Code View is fully loaded). I realize that editing such a file (especially when Preview is open) is always going to be problematic, but I don't understand the freeze-ups that are happening when navigating Sigil menus (especially trying to launch the Preferences dialog) when no edits are being made and Preview is closed.

I haven't pulled the changes and retested yet, so I'll do that now. Hopefully they will have an effect on the slowdowns and freezing/stuttering I'm referring to.

EDIT: Whoops! Looks like we cross-posted, there. I'll definitely pull the changes and disable the tag highlighting. Spellchecking has been disabled all along.

Last edited by DiapDealer; 04-09-2021 at 05:10 PM.
Old 04-09-2021, 07:29 PM   #38
DiapDealer
Grand Sorcerer
Posts: 28,866
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
With just the Book Browser and Code View widgets open (and with spell-checking and tag highlighting disabled) the same massive merge takes 1.5 minutes in 1.5.1 and just a hair less than 20 seconds after your changes. That's great!

Unfortunately, the stuttering and freezing continue (as does the delay when trying to launch/render the Preferences dialog afterward), with little change that I can detect. But I'm beginning to detect a pattern that can help eliminate the problem (temporarily at least). The issue seems to be that the Code View tab with the new giant html file (from the merge) has the focus. If I can get the focus on the Book Browser widget (by single-clicking one of the files in the tree and waiting for the inevitable busy cursor to stop spinning) then the various menus (including launching the Preferences dialog) are very responsive and don't suffer from the "freeze-ups"... even though the giant file is still the active tab showing in the central widget.

I know there's no way to keep monolithic html files from slowing down Sigil in general when editing, I'm just wishing there might be a way to mitigate the effect if the monolithic file's tab is open (and active) but is not being actively edited, scrolled, clicked on, or rendered (Preview closed).

P.S. I also think the fact that the drop-down menus (as well as the Preferences widget) can partially occlude the Code View tab of the giant html file might be causing signals to fire and reparsing to happen. Trying to launch Preferences on my laptop with the monolithic file open actually seems to cause a loop of some kind that prevents the widget from ever fully rendering. If I shrink the Code View tab so that none of the drop-downs touch Code View (and make sure the Preferences dialog is positioned to not open over top of the Code View tab), the freezing menus and non-rendering Preferences don't seem to present nearly as much of an issue.

Last edited by DiapDealer; 04-09-2021 at 07:50 PM.
Old 04-09-2021, 07:48 PM   #39
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
I may have been lucky enough to just have focus someplace else. Opening Preview should not change your timings by much.

If you can recreate this slowdown in launching Preferences on Linux, try running Sigil under gdb, and when the beach ball or hang starts, hit Ctrl-C to interrupt it and get a copy of the backtraces of all threads (for example with "thread apply all bt"). Then enter c to continue, wait a second or two, and interrupt it again. Then again print out the backtraces of all threads. Find the thread that has not progressed much across the two sets of backtraces (that is actually doing something or trying to). That will help show us what is happening at this time.

If you get a chance, zip up the two outputs and e-mail me. I will try the same thing under macOS if I can get the big delay after loading to happen.

Once we see where things are stuck, we will have a better idea of how to fix it.
Old 04-09-2021, 09:32 PM   #40
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
@DiapDealer
Your idea about focus was right. It seems that if CV loses the focus because another dialog opens over it, then closing that dialog causes CV to regain focus, which completely rehighlights the entire document and causes a big delay.

I would think that there is no need to rehighlight (synchronously) the entire document when focus is lost then gained. Widget backingstore should take care of that without having to redo the complete highlighting.

So I have just pushed yet another change to master to disable this hopefully unneeded re-syntaxhighlighting with every change in focus in CV.
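
To show the cost being avoided, here is a toy model. It is not the change that was pushed (that change simply disables the rehighlight on focus); this sketch shows a more conservative variant that only rehighlights on focus-in if the text changed since the last pass. ToyCodeView and all of its members are invented for illustration:

```python
class ToyCodeView:
    """Hypothetical editor widget that counts full-document highlight passes."""

    def __init__(self, text):
        self._text = text
        self._revision = 0                  # bumped on every text change
        self._highlighted_revision = None   # revision covered by the last pass
        self.full_highlight_runs = 0

    def set_text(self, text):
        self._text = text
        self._revision += 1

    def _rehighlight(self):
        # Stand-in for the expensive synchronous pass over the whole document.
        self.full_highlight_runs += 1
        self._highlighted_revision = self._revision

    def on_focus_in(self):
        # Regaining focus alone is not a reason to redo the work.
        if self._highlighted_revision != self._revision:
            self._rehighlight()


view = ToyCodeView("<p>huge merged file</p>")
for _ in range(5):          # five dialogs opened and closed over the editor
    view.on_focus_in()
print(view.full_highlight_runs)
```

With the guard, opening and closing dialogs over the editor costs one highlight pass in total rather than one per focus change.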

With this change, I can open and close dialogs that partially cover CV with no more spinning ball.

This is even with Preview open, but spellchecking and open/close tag pair highlighting disabled.

Please give it a try. I have my fingers crossed.

KevinH

Last edited by KevinH; 04-09-2021 at 09:53 PM.
Old 04-10-2021, 09:04 AM   #41
DiapDealer
Grand Sorcerer
Posts: 28,866
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by KevinH View Post
@DiapDealer
Your idea about focus was right.
Even a blind squirrel finds a nut every now and then!

Quote:
Originally Posted by KevinH View Post
With this change, I can open and close dialogs that partially cover CV with no more spinning ball.

This is even with Preview open, but spellchecking and open/close tag pair highlighting disabled.
Same. The UI remains snappy and responsive even when the massive html file's tab is open in Code View and rendered in Preview.

I got an 11 second merge time from end to end on a virtual Windows test machine. But it's got more resources than my tired laptop, where it was taking 20 seconds. I'll test this new fix there next.

Quote:
Originally Posted by KevinH View Post
Please give it a try. I have my fingers crossed.
Very promising. Thanks! Just need to do some testing to make certain that synchronous call to re-highlight wasn't necessary under certain conditions.

Editing huge html files is probably always going to be painful and sluggish, but hopefully this will go a long way toward making sure Sigil's UI doesn't go on the fritz just because a huge file is open. Should make opening and splitting one of these huge buggers up less painful as well.

Last edited by DiapDealer; 04-11-2021 at 05:43 PM.
Old 04-10-2021, 09:13 AM   #42
DiapDealer
Grand Sorcerer
Posts: 28,866
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Still a hair under a minute to merge all 62 files of that epub in the virtual machine with spellcheck and tag-pair highlighting enabled. Plus the UI was just as responsive as when they were disabled (so long as no editing is taking place).

So after the next release we can officially recommend that those who have to (or choose to) deal with these huge monolithic html files should leave spellcheck and tag-pair highlighting off whenever doing the heavy lifting.
Old 04-10-2021, 09:30 AM   #43
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Or try to create a second thread to handle spellchecking specifically. I will look into how doable that would be.
Old 04-10-2021, 09:36 AM   #44
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Also we still have to check for duplicate ids used in files to be merged and abort or fix them before the merge starts.
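
A minimal sketch of such a pre-merge check, using Python's standard html.parser rather than anything from Sigil (find_duplicate_ids and the sample file names are made up for illustration):

```python
from collections import defaultdict
from html.parser import HTMLParser


class _IdCollector(HTMLParser):
    """Collects the value of every id attribute seen in one file."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags via the default handle_startendtag.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def find_duplicate_ids(files):
    """Map each id used in more than one file to the sorted list of those files."""
    users = defaultdict(set)
    for path, markup in files.items():
        collector = _IdCollector()
        collector.feed(markup)
        collector.close()
        for ident in collector.ids:
            users[ident].add(path)
    return {i: sorted(p) for i, p in users.items() if len(p) > 1}


sample = {
    "Text/A.html": '<p id="fn1">x</p><a id="only_a"/>',
    "Text/B.html": '<p id="fn1">y</p>',
}
print(find_duplicate_ids(sample))
```

Anything this reports would collide once the files share one document, so the merge can abort or rename those ids first.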

Also, it seems to me we should insert ids where each section started, so that any link from outside the merged set to the top of one of the files being merged is properly kept instead of ending up at the top of the merged section.

Perhaps adding an id attribute to the first tag of every section with the old filename as the id might be useful as they are being merged.
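
As a rough sketch of that idea (anchor_id_for is a hypothetical helper, not an existing Sigil function), the old filename can be turned into an id that is a legal XML name and does not clash with ids already in use:

```python
import re


def anchor_id_for(filename, taken):
    """Derive a section-start id from an old filename, avoiding ids in `taken`."""
    stem = filename.rsplit("/", 1)[-1]                   # drop any folder part
    candidate = re.sub(r"[^A-Za-z0-9_.-]", "_", stem)    # keep only safe name characters
    if not re.match(r"[A-Za-z_]", candidate):
        candidate = "_" + candidate                      # ids may not start with a digit
    unique, n = candidate, 1
    while unique in taken:
        n += 1
        unique = "%s_%d" % (candidate, n)
    taken.add(unique)
    return unique


taken = {"fn1", "fn2"}
print(anchor_id_for("OEBPS/Text/Chapter87.html", taken))
print(anchor_id_for("OEBPS/Text/01 Intro.html", taken))
```

A link that used to point at the top of Chapter87.html could then be rewritten to point at that id as a fragment in the merged file.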

Last edited by KevinH; 04-10-2021 at 10:08 AM.
Old 04-10-2021, 12:03 PM   #45
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Okay, I have been working on how best to handle the potential duplication of ids across the files to be merged into one.

For this particular test case I see the following output:
Code:
Id duplicated:  "fn_14"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_16"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fnb"  in  ("OEBPS/Text/Chapter61.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter91.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html", "OEBPS/Text/Chapter96.html")

Id duplicated:  "fn4"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn14"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_3"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_10"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html")

Id duplicated:  "fn_7"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_20"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn10"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html")

Id duplicated:  "fnc"  in  ("OEBPS/Text/Chapter67.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html")

Id duplicated:  "fna"  in  ("OEBPS/Text/Chapter61.html", "OEBPS/Text/Chapter71.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter88.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter91.html", "OEBPS/Text/Chapter93.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html", "OEBPS/Text/Chapter96.html")

Id duplicated:  "fn15"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_1"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter90.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn5"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_9"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_21"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fnd"  in  ("OEBPS/Text/Chapter67.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn2"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter90.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn8"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_13"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn20"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_b"  in  ("OEBPS/Text/Chapter61.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter91.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html", "OEBPS/Text/Chapter96.html")

Id duplicated:  "fn21"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn22"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn18"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn3"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_11"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html")

Id duplicated:  "fn9"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_19"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn6"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_4"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn16"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_d"  in  ("OEBPS/Text/Chapter67.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_6"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_5"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_2"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter90.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_22"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn12"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn19"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_8"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_17"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_12"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_c"  in  ("OEBPS/Text/Chapter67.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html")

Id duplicated:  "fn_15"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn1"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter90.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn_18"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn_a"  in  ("OEBPS/Text/Chapter61.html", "OEBPS/Text/Chapter71.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter88.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter91.html", "OEBPS/Text/Chapter93.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html", "OEBPS/Text/Chapter96.html")

Id duplicated:  "fn7"  in  ("OEBPS/Text/Chapter53.html", "OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter85.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html", "OEBPS/Text/Chapter95.html")

Id duplicated:  "fn17"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

Id duplicated:  "fn11"  in  ("OEBPS/Text/Chapter79.html", "OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html", "OEBPS/Text/Chapter94.html")

Id duplicated:  "fn13"  in  ("OEBPS/Text/Chapter87.html", "OEBPS/Text/Chapter89.html")

This is a huge list, and given the names, it seems that many of these duplicated ids will be the targets of hrefs, so those links will fail miserably once the files are merged.

There are too many for people to handle manually.

So I think the only way to deal with this is to replace the duplicate ids with unique ones, and then walk the entire set of html files to update any links that pointed to them so that they use the new fragment.

So it appears we will have to use an approach much like calibre does and automate the renaming to be unique at least among the set of files to be merged. For that we will have to add a SourceUpdater for Fragments to our codebase.
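
To make the idea concrete, here is a rough Python sketch of the two passes described above: pick new names for ids that repeat across files, then rewrite both the id attributes and any href fragments that pointed at them. This is only an illustration, not Sigil's SourceUpdater code and not calibre's implementation. The function names (plan_renames, apply_renames), the "_2" suffix scheme, and the regex-based attribute matching are all assumptions made for the example; real code would use a proper parser and would also have to retarget links to the merged file.

```python
import posixpath
import re

# Regex matching is a simplification for this sketch; real code should parse the XHTML.
ID_ATTR = re.compile(r'(?<![\w:-])(id\s*=\s*)(["\'])(.*?)\2')
HREF_ATTR = re.compile(r'(?<![\w:-])(href\s*=\s*)(["\'])(.*?)\2')


def plan_renames(files):
    """files: dict of book path -> XHTML source, in merge order.

    Returns {(path, old_id): new_id} for every id already used by an
    earlier file. Duplicates inside a single file are not handled here.
    """
    taken = set()
    for src in files.values():
        taken.update(m.group(3) for m in ID_ATTR.finditer(src))

    seen = set()      # ids claimed by earlier files
    renames = {}
    for path, src in files.items():
        ids = dict.fromkeys(m.group(3) for m in ID_ATTR.finditer(src) if m.group(3))
        for old in ids:
            if old in seen:
                n = 2
                while f"{old}_{n}" in taken:   # avoid colliding with existing ids
                    n += 1
                new = f"{old}_{n}"
                taken.add(new)
                renames[(path, old)] = new
        seen.update(ids)
    return renames


def apply_renames(files, renames):
    """Rewrite id attributes and href fragments according to renames."""
    out = {}
    for path, src in files.items():

        def fix_id(m, path=path):
            new = renames.get((path, m.group(3)), m.group(3))
            return f"{m.group(1)}{m.group(2)}{new}{m.group(2)}"

        def fix_href(m, path=path):
            target, sep, frag = m.group(3).partition("#")
            if not sep:
                return m.group(0)              # no fragment, nothing to update
            if target:
                # Resolve the link relative to the file that contains it.
                dest = posixpath.normpath(
                    posixpath.join(posixpath.dirname(path), target))
            else:
                dest = path                    # "#frag" points into the same file
            new = renames.get((dest, frag), frag)
            return f"{m.group(1)}{m.group(2)}{target}{sep}{new}{m.group(2)}"

        out[path] = HREF_ATTR.sub(fix_href, ID_ATTR.sub(fix_id, src))
    return out


if __name__ == "__main__":
    book = {
        "OEBPS/Text/Chapter87.html": '<p id="fn1">a</p><a href="#fn1">1</a>',
        "OEBPS/Text/Chapter89.html": '<p id="fn1">b</p><a href="#fn1">1</a>',
    }
    for name, text in apply_renames(book, plan_renames(book)).items():
        print(name, text)
```

The first file to use an id keeps it, so only later files (and the links into them) get touched. Keying the rename map on (file, id) rather than on the id alone matters here, because a link such as Chapter87.html#fn1 and a link such as Chapter89.html#fn1 look identical at the fragment level but must end up pointing at different targets.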

Thoughts?

KevinH