Old 10-23-2012, 04:38 AM   #1
sperrmull
how to get chapter title from filename

Hi

In my document, each chapter is a separate XHTML file. In the original, every chapter had only '*' as its heading, so the table of contents contains nothing but a lot of '*' entries.
Is there a way to use the file names or numbers (all XHTML files are in the right order, from 0001 to 0020) for the table of contents and fill them in automatically?

Thank you,
Andi
Old 10-23-2012, 12:32 PM   #2
ghostyjack
If I understand you correctly, you want a ToC that lists the chapters (e.g. Chapter 1, Chapter 2, etc.), but you don't want that text shown on the actual page, where a * should remain the displayed chapter heading.

How odd.

What you could do is add a title to each of the * headings like the following example:

Code:
<h2 title="Chapter 1">*</h2>
This method tells Sigil to use the text in the title attribute for the ToC instead of the displayed heading text.
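
If typing the title into 20 files by hand is not appealing, something like the following Python sketch could generate the attribute from the file names outside of Sigil. It is only an illustration under assumptions: the helper names (chapter_title, add_title, process_folder) are made up here, the chapter number is taken to be the first run of digits in the file name (0001 to 0020, as described in the first post), and only attribute-free headings whose whole content is * are touched. Run it on a copy of the unpacked book.

```python
import re
from pathlib import Path

# Assumption: the first run of digits in the file name is the chapter number,
# e.g. "0007.xhtml" -> 7.
def chapter_title(filename: str) -> str:
    match = re.search(r"\d+", Path(filename).stem)
    if match is None:
        raise ValueError(f"no chapter number in {filename!r}")
    return f"Chapter {int(match.group())}"

# Matches an h1-h6 heading that has no attributes and contains only "*".
STAR_HEADING = re.compile(r"<(h[1-6])>\s*\*\s*</\1>")

def add_title(xhtml: str, filename: str) -> str:
    """Add title="Chapter N" to every bare "*" heading in one file's markup."""
    title = chapter_title(filename)
    return STAR_HEADING.sub(
        lambda m: f'<{m.group(1)} title="{title}">*</{m.group(1)}>', xhtml
    )

def process_folder(folder: str) -> None:
    """Rewrite every .xhtml file in the folder in place (use a backup copy)."""
    for path in sorted(Path(folder).glob("*.xhtml")):
        text = path.read_text(encoding="utf-8")
        path.write_text(add_title(text, path.name), encoding="utf-8")
```

After that, regenerating the ToC in Sigil should pick up the new titles, since they are stored in the headings themselves.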
Old 10-23-2012, 03:24 PM   #3
DiapDealer
I think he/she means the opposite. The chapter headings are already "*". So that's what appears in the ToC. I believe they're looking for a way to use the physical individual chapter file (*.html) names to automatically fix the chapter heading and subsequently (I'm guessing) rebuild the ToC with the then corrected names.

Find and Replace with/without regex is the only possible way of automatically generating/changing those chapter headers (and thus the ToC) ... and to my knowledge, there's no provision for search and replace to access (or use) the underlying file's name in an F&R routine.

Although technically, you could use regex on the toc.ncx file ... and as long as the play orders start with 1 and increment by 1 (you can automatically renumber them if not), you could capture the play order with regex and then use that number to build a new <text> entry:

Find:
Code:
<navPoint id="(.*?)" playOrder="(\d+)">\s+<navLabel>\s+<text>.*?</text>
Replace:
Code:
<navPoint id="\1" playOrder="\2"><navLabel><text>Chapter \2</text>
(or something to that effect. I was just winging that example, so be sure to check the toc.ncx to make sure I didn't make some glaring error)

The expression could probably be made a lot cleaner, but I thought someone mentioned that there are issues with \K and look(ahead|behind)s in the latest beta.

NOTE: the above would fix the ToC, but do nothing to the chapter headings.
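
To check the expression before running it on a real book, it can be tried outside Sigil. The Python sketch below applies the same Find and Replace strings with the standard re module. The sample navMap is invented for illustration and a real toc.ncx may carry extra attributes (for example class) that the pattern would not match.

```python
import re

# Invented sample in the general shape of a toc.ncx navMap; real files may differ.
SAMPLE_NCX = """\
<navMap>
  <navPoint id="navPoint-1" playOrder="1">
    <navLabel>
      <text>*</text>
    </navLabel>
    <content src="Text/0001.xhtml"/>
  </navPoint>
  <navPoint id="navPoint-2" playOrder="2">
    <navLabel>
      <text>*</text>
    </navLabel>
    <content src="Text/0002.xhtml"/>
  </navPoint>
</navMap>
"""

# The Find and Replace strings from the post, unchanged.
FIND = r'<navPoint id="(.*?)" playOrder="(\d+)">\s+<navLabel>\s+<text>.*?</text>'
REPLACE = r'<navPoint id="\1" playOrder="\2"><navLabel><text>Chapter \2</text>'

def relabel(ncx: str) -> str:
    """Replace each navLabel text with "Chapter <playOrder>"."""
    return re.sub(FIND, REPLACE, ncx)

fixed = relabel(SAMPLE_NCX)
print(fixed)
```

The whitespace between the matched tags is dropped by the replacement, which is harmless in XML but makes the file a little less tidy.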
Old 10-23-2012, 05:33 PM   #4
ghostyjack
You may be right.

I think we need the OP to better explain the situation, but on closer inspection, your response may be what they are after.
Old 10-24-2012, 11:46 AM   #5
Man Eating Duck
Quote:
Originally Posted by ghostyjack
You may be right.

I think we need the OP to better explain the situation, but on closer inspection, your response may be what they are after.
Unless the OP has a huge number of files to process, I would advise him to just fix the titles in the source manually and regenerate the TOC; it's only 20 (searchable) entries. If he for some reason wants to keep the star instead of Chapter XX (which would be a bit confusing to the reader), your title="" method is probably better than toc.ncx regex trickery, as the titles are stored in the source and survive later TOC regenerations.
Old 10-24-2012, 12:45 PM   #6
DiapDealer
Quote:
Originally Posted by Man Eating Duck
If he for some reason wants to keep the star instead of Chapter XX (which would be a bit confusing to the reader) your title="" method is probably better than toc.ncx regex trickery
Hey pal... my "toc.ncx regex trickery" is the biggety-bomb! Don't you forget it.

Old 10-25-2012, 09:20 AM   #7
Man Eating Duck
Quote:
Originally Posted by DiapDealer
Hey pal... my "toc.ncx regex trickery" is the biggety-bomb! Don't you forget it.

I love regex trickery as much as the next guy, but in this case... well OK, it's the bomb
Old 12-28-2012, 05:52 AM   #8
wobohohoho
I have a somewhat similar problem. I have more than 500 htm files from a decompiled CHM (this is just one book, but a few others have the same problem):

1. The individual sub-chapters are separate htm files, stored in different folders that denote the main chapters (numbered 1 to 12).
2. The original CHM creators never bothered to add headings to their documents, so there is nothing to generate a TOC from in the first place.

Since we can't use folders in Sigil (or in EPUB, for that matter) to distinguish the many htm files, I would have to batch rename the files to match their main chapters, as a few of them share the same name (e.g. Introduction). I'm okay with that.
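
For the batch rename, a small script may be less tedious than doing it by hand. The Python sketch below copies every .htm file into one flat folder and prefixes each file name with its folder name so that duplicates such as Introduction stay distinct. The helper names and the underscore scheme are assumptions made for illustration, and any links between the files would still need to be updated to the new names afterwards.

```python
import shutil
from pathlib import Path, PurePosixPath

def flat_name(relative_path: str) -> str:
    """Join folder names and file name with underscores.

    For example "01/Introduction.htm" becomes "01_Introduction.htm".
    """
    return "_".join(PurePosixPath(relative_path).parts)

def flatten(source_root: str, target_dir: str) -> None:
    """Copy all .htm files under source_root into target_dir with flat names."""
    src = Path(source_root)
    dst = Path(target_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(src.rglob("*.htm")):
        new_name = flat_name(path.relative_to(src).as_posix())
        # Copy rather than move, so the decompiled originals stay untouched.
        shutil.copy2(path, dst / new_name)
```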

Finding and replacing tags across all selected HTML files seems the way to go to create a TOC. But the only suitable tag I could find is, for example, the one below:

Code:
 
    <div id="printheader_crumbtrail">
      &gt; Front of Book &gt; Conflicts of interest: none declared
    </div>

Question is, how do I find and replace these tags (or what regex do I use)?

Or maybe other ways?
Old 12-28-2012, 06:50 AM   #9
Doitsu
Quote:
Originally Posted by wobohohoho

Code:
 
    <div id="printheader_crumbtrail">
      &gt; Front of Book &gt; Conflicts of interest: none declared
    </div>

Question is, how do I find and replace these tags (or what regex do I use)?
You could use the following very simple regular expression:

Find: <div id="printheader_crumbtrail">\s+(.*?)\s+</div>
Replace: <h3>\1</h3>

However, before you do that, you may want to look for dedicated CHM-to-EPUB converters. There are a couple of standalone programs and some free websites (e.g. Zamzar) that will convert .chm files to EPUBs.
You could then use Sigil to fine-tune the epub.
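
For anyone who wants to see what the expression does before using it in Sigil, here is a rough Python equivalent using the standard re module. The Find and Replace strings are the ones above; the sample markup is the snippet quoted from the question.

```python
import re

# The snippet quoted in the question, including its indentation.
SAMPLE = """
    <div id="printheader_crumbtrail">
      &gt; Front of Book &gt; Conflicts of interest: none declared
    </div>
"""

FIND = r'<div id="printheader_crumbtrail">\s+(.*?)\s+</div>'
REPLACE = r'<h3>\1</h3>'

# \s+ swallows the line breaks and indentation around the text, so the
# captured group holds only the breadcrumb line itself.
converted = re.sub(FIND, REPLACE, SAMPLE)
print(converted)
```

Because the dot does not match line breaks here, the pattern only works when the text inside the div sits on a single line, as it does in the sample.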
Old 12-28-2012, 07:24 AM   #10
wobohohoho
Oops. Terribly sorry, I didn't specify that what I want is to turn it into this:

Code:
    <h1 id="printheader_crumbtrail">
      &gt; Front of Book &gt; Conflicts of interest: none declared
    </h1>
That is what happens when I manually apply the h1 tag in Book View.

And <div id="printheader_crumbtrail">\s+(.*?)\s+</div> doesn't seem to work.

Also, I can't convert the CHM to EPUB directly because its broken TOC omits one main chapter (I've tried Calibre and a few other options), so building it in Sigil seems a much more straightforward job. (There's also a weird loss of character sets when I try rebuilding the CHM with WinCHM, so I'd rather keep the decompiled files, which retain their appearance in Sigil.)
Old 12-28-2012, 07:37 AM   #11
Doitsu
Quote:
Originally Posted by wobohohoho
And <div id="printheader_crumbtrail">\s+(.*?)\s+</div> doesn't seem to work.
Did you download the latest Sigil version and change Mode from Normal to Regex?
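
For the h1 form that keeps the id, the same Find expression should work with a replacement along the lines of <h1 id="printheader_crumbtrail">\1</h1>. The Python sketch below illustrates the idea and also reports how many replacements were made, which is a quick way to notice when a pattern is not matching at all. The function name div_to_heading and its level parameter are made up for this example.

```python
import re
from typing import Tuple

CRUMB_DIV = re.compile(r'<div id="printheader_crumbtrail">\s+(.*?)\s+</div>')

def div_to_heading(html: str, level: int = 1) -> Tuple[str, int]:
    """Turn the crumbtrail div into a heading that keeps the same id.

    Returns the converted markup and the number of replacements made.
    """
    tag = f"h{level}"
    return CRUMB_DIV.subn(
        lambda m: f'<{tag} id="printheader_crumbtrail">{m.group(1)}</{tag}>',
        html,
    )
```

A count of 0 means nothing matched, which in Sigil usually points to the search mode or to markup that differs from the sample.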
Old 12-28-2012, 09:03 AM   #12
wobohohoho
Ahaha! Sorry, I didn't know I had to change the mode to Regex. Thanks! Works perfectly!
Old 12-28-2012, 09:39 AM   #13
wobohohoho
Also, with this method I think I'd rather run all my other decompiled CHMs through Sigil than Calibre (since they are bereft of headings, too)! Thanks again!