Old 07-01-2013, 08:58 PM   #1
automa
Connoisseur
Posts: 93
Karma: 972092
Join Date: Jan 2012
Device: iPhone
Are there efficient ways to make the table of contents for an EPUB of scanned files?

Is there a way to at least partially automate the finding of different chapters of the book and automatically mark them up as heading 1, heading 2, etc.?
Old 07-01-2013, 10:32 PM   #2
theducks
Grand Sorcerer
Posts: 14,869
Karma: 5654321
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by automa
Is there a way to at least partially automate the finding of different chapters of the book and automatically mark them up as heading 1, heading 2, etc.?
You might spend some time looking over ALL the options (for your source input and desired output) in the section:
Preferences: Conversion.

There are 3 areas: Input, Common, and Output.

There is a large number of choices (some are mutually exclusive, while others enable MORE).

There is a section on detecting unmarked-up headings.

BTW, I prefer to run a 'Vanilla' Calibre conversion and then find and fix headings with Sigil. It is so easy to LOOK at the code and write the perfect regex for that case (and step through and see it really does what you want without the part)
Old 07-01-2013, 11:01 PM   #3
Sabardeyn
Guru
Posts: 629
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
The only thing you might be able to do is use Search & Replace to find the next use of the word "Chapter" (if it's used in your book) and then manually change the title's paragraph formatting.
Old 07-02-2013, 03:15 AM   #4
Steadyhands
Enthusiast
Posts: 32
Karma: 10
Join Date: Dec 2011
Location: Brisbane, Oz
Device: iPad2
I've ended up building a series of regexes that fit different scenarios. I inspect the document, then choose the most appropriate one and modify it if needed. I've got a TOC search group with entries for Chapter + Number, Roman Numbers, Numbers only, Numbers as words, etc.
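
The "inspect the document, then pick a pattern" step can be partly scripted. What follows is only a rough sketch in Python, not anything Sigil does itself: the pattern names, the `count_heading_candidates` helper, and the sample markup are all made up for illustration, and Python's `re` is not the same engine Sigil uses.

```python
import re

# Hypothetical pattern set; each one is matched against the whole text
# of a paragraph that might be a heading.
HEADING_PATTERNS = {
    "chapter_number": re.compile(r"(?:Chapter|CHAPTER) \d+"),
    "roman": re.compile(r"[IVXLC]+"),
    "digits": re.compile(r"\d+"),
    "number_words": re.compile(
        r"(?:One|Two|Three|Four|Five|Six|Seven|Eight|Nine|Ten)", re.IGNORECASE
    ),
}

# Very loose paragraph matcher, good enough for a quick survey only.
PARAGRAPH = re.compile(r"<p[^>]*>(.*?)</p>", re.DOTALL)


def count_heading_candidates(xhtml):
    """Count, per pattern, the paragraphs whose entire text matches it."""
    counts = {name: 0 for name in HEADING_PATTERNS}
    for text in PARAGRAPH.findall(xhtml):
        for name, pattern in HEADING_PATTERNS.items():
            if pattern.fullmatch(text.strip()):
                counts[name] += 1
    return counts


# Made-up sample: two Roman-numeral headings and two body paragraphs.
sample = (
    '<p class="c1">I</p><p class="c2">It was a dark night.</p>'
    '<p class="c1">II</p><p class="c2">Morning came.</p>'
)
counts = count_heading_candidates(sample)
best = max(counts, key=counts.get)
print(counts, best)
```

The highest count only suggests which saved search to start from; you still have to look at the code and tune the pattern for the book in front of you.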
Old 07-02-2013, 08:49 AM   #5
Sabardeyn
Guru
If you feel like sharing, I'm sure many folks would appreciate having your regexes readily available. You could post them here or in the Regex Examples topic. (If you're inclined.)
Old 07-02-2013, 10:28 AM   #6
theducks
Grand Sorcerer
I wonder if we should start a 'Saved Search' thread for frequently reused snippets.

I define frequent use as code that will see usage over many books, rather than a one-time fix-and-patch job (which can be far more complicated in many cases).

Some of my saved searches are really models that need per-case tuning before use.

Code:
69\Name=Fixup/Promote Headings/Roman
69\Find="<p class=\"\\w\">([CLXVI]{1,7})</p>"
69\Replace="<hr class=\"sigil_split_marker\" /><h3 class=\"chapno\">\\1</h3>"
The capture group ([CLXVI]{1,7}), shown in green in the original post, matches a series of 1 to 7 possible Roman numeral characters in a separate paragraph tag. Everything surrounding the capture group needs to be adjusted on a case-by-case basis.

(I just copied the above direct from my saved search file)
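
To see what the Roman search above does outside Sigil, here is a small Python sketch. It assumes the saved-file escaping is undone first (`\"` becomes `"` and `\\` becomes `\`) and that Python's `re` behaves like Sigil's engine for this simple pattern; the `promote_roman_headings` name and the sample markup are invented for the example.

```python
import re

# The Find/Replace pair from saved search 69, with the file escaping removed.
FIND = re.compile(r'<p class="\w">([CLXVI]{1,7})</p>')
REPLACE = r'<hr class="sigil_split_marker" /><h3 class="chapno">\1</h3>'


def promote_roman_headings(xhtml):
    """Return (new_text, number_of_replacements)."""
    return FIND.subn(REPLACE, xhtml)


# Made-up sample. The class attribute in the model pattern is a single
# word character (\w), so only the first paragraph is picked up.
sample = (
    '<p class="a">XIV</p>\n'
    '<p class="b">Some body text.</p>\n'
    '<p class="calibre1">XV</p>'
)
result, n = promote_roman_headings(sample)
print(n)
print(result)
```

The untouched `calibre1` paragraph shows why this is a model rather than a ready-made search: the part around the capture group has to be adapted to the class names in each book.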
Old 07-02-2013, 01:02 PM   #7
mrmikel
Book Twiddler
Posts: 2,086
Karma: 1444487
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
It should be clear from the above that there are ways, but there are few easy ways because none answers every case.

In the time it takes to fiddle with a regex, you can do an ordinary search for whatever starts the chapters, highlight it, and hit h1, at least for any ordinary number of chapters. Of course, if this is a rocket manual or the like, that's different.
Old 07-02-2013, 02:56 PM   #8
Steadyhands
Enthusiast
OK, here are some of mine. Remember that you'll usually have to tweak these before use.

Code:
52\Name=TOC & Metadata/Part
52\Find="<p class=\".*?\">(?:(Part|PART)) (?:(One|Two|Three|Four|Five|Six|ONE|TWO|THREE|FOUR|FIVE|SIX))</p>"
52\Replace="<hr class=\"sigilChapterBreak\" /><h1>\\1 \\2</h1><hr class=\"sigilChapterBreak\" />"
53\Name=TOC & Metadata/Chapter Finder
53\Find="<p class=\".*?\">(Prologue|PROLOGUE|Epilogue|EPILOGUE|Chapter|CHAPTER)([^>]*)</p>"
53\Replace="<hr class=\"sigilChapterBreak\" /><h2>\\1 \\2</h2>"
55\Name=TOC & Metadata/Numbered Chapters
55\Find="<p class=\".*?\">(\\d+)</p>"
55\Replace="<hr class=\"sigilChapterBreak\" /><h2>\\1</h2>"
56\Name=TOC & Metadata/Roman Chapters
56\Find="<p class=\".*?\">([XVI]+)</p>"
56\Replace="<hr class=\"sigilChapterBreak\" /><h2>\\1</h2>"
57\Name=TOC & Metadata/Numbers
57\Find="<p class=\".*?\">(?:(ONE|TWO|THREE|FOUR|FIVE|SIX|SEVEN|EIGHT|NINE|TEN|ELEVEN|TWELVE|THIRTEEN|FOURTEEN|FIFTEEN|SIXTEEN|SEVENTEEN|EIGHTEEN|NINETEEN|TWENTY|TWENTY-ONE|TWENTY-TWO|TWENTY-THREE|TWENTY-FOUR|TWENTY-FIVE|TWENTY-SIX|TWENTY-SEVEN|TWENTY-EIGHT|TWENTY-NINE|THIRTY|THIRTY-ONE|THIRTY-TWO|THIRTY-THREE|THIRTY-FOUR|THIRTY-FIVE|THIRTY-SIX|THIRTY-SEVEN|THIRTY-EIGHT|THIRTY-NINE|FORTY|FORTY-ONE|FORTY-TWO|FORTY-THREE|FORTY-FOUR|FORTY-FIVE|FORTY-SIX|FORTY-SEVEN|FORTY-EIGHT|FORTY-NINE|FIFTY|FIFTY-ONE|FIFTY-TWO|FIFTY-THREE|FIFTY-FOUR|FIFTY-FIVE|FIFTY-SIX|FIFTY-SEVEN|FIFTY-EIGHT|FIFTY-NINE|SIXTY|SIXTY-ONE|SIXTY-TWO|SIXTY-THREE|SIXTY-FOUR|SIXTY-FIVE|SIXTY-SIX|SIXTY-SEVEN|SIXTY-EIGHT|SIXTY-NINE|SEVENTY|SEVENTY-ONE|SEVENTY-TWO|SEVENTY-THREE|SEVENTY-FOUR|SEVENTY-FIVE|SEVENTY-SIX|SEVENTY-SEVEN|SEVENTY-EIGHT|SEVENTY-NINE|EIGHTY|EIGHTY-ONE|EIGHTY-TWO|EIGHTY-THREE|EIGHTY-FOUR|EIGHTY-FIVE|EIGHTY-SIX|EIGHTY-SEVEN|EIGHTY-EIGHT|EIGHTY-NINE|NINETY|NINETY-ONE|NINETY-TWO|NINETY-THREE|NINETY-FOUR|NINETY-FIVE|NINETY-SIX|NINETY-SEVEN|NINETY-EIGHT|NINETY-NINE))</p>"
57\Replace="<hr class=\"sigilChapterBreak\" /><h2>\\1</h2>"
58\Name=TOC & Metadata/Numbers2
58\Find="<p class=\".*?\">(?:(One|Two|Three|Four|Five|Six|Seven|Eight|Nine|Ten|Eleven|Twelve|Thirteen|Fourteen|Fifteen|Sixteen|Seventeen|Eighteen|Nineteen|Twenty|Twenty-One|Twenty-Two|Twenty-Three|Twenty-Four|Twenty-Five|Twenty-Six|Twenty-Seven|Twenty-Eight|Twenty-Nine|Thirty|Thirty-One|Thirty-Two|Thirty-Three|Thirty-Four|Thirty-Five|Thirty-Six|Thirty-Seven|Thirty-Eight|Thirty-Nine|Forty|Forty-One|Forty-Two|Forty-Three|Forty-Four|Forty-Five|Forty-Six|Forty-Seven|Forty-Eight|Forty-Nine|Fifty|Fifty-One|Fifty-Two|Fifty-Three|Fifty-Four|Fifty-Five|Fifty-Six|Fifty-Seven|Fifty-Eight|Fifty-Nine|Sixty|Sixty-One|Sixty-Two|Sixty-Three|Sixty-Four|Sixty-Five|Sixty-Six|Sixty-Seven|Sixty-Eight|Sixty-Nine|Seventy|Seventy-One|Seventy-Two|Seventy-Three|Seventy-Four|Seventy-Five|Seventy-Six|Seventy-Seven|Seventy-Eight|Seventy-Nine|Eighty|Eighty-One|Eighty-Two|Eighty-Three|Eighty-Four|Eighty-Five|Eighty-Six|Eighty-Seven|Eighty-Eight|Eighty-Nine|Ninety|Ninety-One|Ninety-Two|Ninety-Three|Ninety-Four|Ninety-Five|Ninety-Six|Ninety-Seven|Ninety-Eight|Ninety-Nine))</p>"
58\Replace="<hr class=\"sigilChapterBreak\" /><h2>\\1</h2>"
Like theducks, I've just cut this from the Sigil saved searches file, so in a lot of cases it can't just be cut and pasted into a Find & Replace dialog, as there will be extra \ characters in there (the file stores " as \" and \ as \\). I imagine it could be pasted into the bottom of the sigil_searches file, with the numbers adjusted, and it would be fine.
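
For anyone who wants the plain regex back out of a saved-search line, here is a minimal Python sketch. It only assumes the two escapes visible in the listings above (`\"` and `\\`); the real file format may have other rules, and the function names are made up for this example.

```python
def unescape_saved_value(raw):
    """Turn one value from the saved-searches listing into plain text."""
    value = raw.strip()
    # Drop the surrounding double quotes, if present.
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        # A backslash followed by a quote or another backslash is an escape.
        if ch == "\\" and i + 1 < len(value) and value[i + 1] in '"\\':
            out.append(value[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_entry_line(line):
    # Split a line shaped like  number\Field=value  into its three parts.
    key, _, raw = line.partition("=")
    number, _, field = key.partition("\\")
    return int(number), field, unescape_saved_value(raw)


line = r'55\Find="<p class=\".*?\">(\\d+)</p>"'
print(parse_entry_line(line))
```

The third item of the printed tuple is the text you would paste into the Find field.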

I agree, a saved searches sticky would be very handy.
Old 07-02-2013, 03:55 PM   #9
DaleDe
Grand Sorcerer
Posts: 9,657
Karma: 5072002
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2
A sticky will end up with people asking questions in it and raising other sidetracked issues. It would be better to build a reference document in the wiki and link to the forum for discussion.

Dale
Old 07-02-2013, 04:00 PM   #10
theducks
Grand Sorcerer
Quote:
Originally Posted by DaleDe
A sticky will end up with people asking questions in it and raising other sidetracked issues. It would be better to build a reference document in the wiki and link to the forum for discussion.

Dale
Good point
But since this is aimed at the Sigil (saved) Search and Replace interface, is that a good idea?
Old 07-02-2013, 07:11 PM   #11
DaleDe
Grand Sorcerer
Quote:
Originally Posted by theducks
Good point
But since this is aimed at the Sigil (saved) Search and Replace interface, is that a good idea?
Why not? Calibre has an article in the wiki on searches. Anything eBook related is fair game in our wiki.

Dale
Old 07-02-2013, 07:41 PM   #12
theducks
Grand Sorcerer
Quote:
Originally Posted by DaleDe
Why not? Calibre has an article in the wiki on searches. Anything eBook related is fair game in our wiki.

Dale
If the Greenies did not have an objection, I sure don't have one.
I wonder what format it should take?

Cookbook (type) Sections? I want to do:


Chapter/Part Headings (Finding and restyling)

Repair (broken paragraphs, mangled quotes, bad/invalid HTML ...)

Cleanup (removal of OCR leftovers: headers and footers, Word cruft, excessive spans)

Other (?)

And I guess there should be some way of indicating which programs/where each search applies (footnotes),
e.g.
1 Sigil
2 Calibre Add Books
3 Calibre conversions
4 Notepad++

Ideas?
Old 07-03-2013, 04:08 AM   #13
Steadyhands
Enthusiast
Quote:
Originally Posted by theducks
I wonder what format it should take?

Cookbook (type) Sections? I want to do:
The names on mine differ slightly but they follow the same theme.

Formatting/Join Paragraphs
Formatting/Speech/Broken dialog
Formatting/Line Endings
Formatting/Format Change
Formatting/Quotes
Formatting/Dashes
TOC & Metadata
Old 07-03-2013, 06:21 AM   #14
Sabardeyn
Guru
Hmm... not sure what you might be looking for exactly, but I know I would like to see any "Known Exclusions" for a particular regex: things that the regex will not find, so there is some idea of its usefulness/limitations.
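
One way such a "Known Exclusions" list could be produced is to run a regex over a handful of sample headings and record the ones it misses. This is just a sketch in Python under that assumption: the `known_exclusions` helper and the sample lines are made up, and the pattern is the Roman Chapters search from earlier in the thread with the saved-file escaping removed.

```python
import re


def known_exclusions(pattern, samples):
    """Return the sample lines that the pattern fails to match."""
    compiled = re.compile(pattern)
    return [s for s in samples if not compiled.search(s)]


# "Roman Chapters" Find expression, unescaped.
ROMAN = r'<p class=".*?">([XVI]+)</p>'

# Made-up sample headings covering a few likely variations.
samples = [
    '<p class="c1">IX</p>',    # plain case, expected to match
    '<p class="c1">XL</p>',    # chapter 40 needs an L
    '<p class="c1">ix</p>',    # lowercase numerals
    '<p>IV</p>',               # paragraph without a class attribute
    '<p class="c1">IV.</p>',   # trailing full stop
]
missed = known_exclusions(ROMAN, samples)
for line in missed:
    print("not matched:", line)
```

Everything printed would go into the exclusions note for that search: chapters from 40 up, lowercase numerals, paragraphs with no class, and trailing punctuation.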

As for the Formatting/Quotes section, we might need two of them: Normal Speech Quotes and Smart/Curly Quotes. I've seen a few posts about people hating one or the other. Not to mention that some non-English languages use other symbols (I think).

My two cents for the moment.
Old 07-03-2013, 03:14 PM   #15
Hitch
Bookmaker & Cat Slave
Posts: 2,509
Karma: 13904601
Join Date: Apr 2010
Location: Phoenix, AZ
Device: Kindle2, iPad, KindleFire and NookColor
Quote:
Originally Posted by Sabardeyn
Hmm... not sure what you might be looking for exactly, but I know I would like to see any "Known Exclusions" for a particular regex: things that the regex will not find, so there is some idea of its usefulness/limitations.

As for the Formatting/Quotes section, we might need two of them: Normal Speech Quotes and Smart/Curly Quotes. I've seen a few posts about people hating one or the other. Not to mention that some non-English languages use other symbols (I think).

My two cents for the moment.
The part that cracks me up is that everyone here instantly assumed that the OP was seeking a regex solution. Somehow....something makes me think that wasn't actually the question. ;-)

Hitch