10-23-2012, 04:38 AM | #1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Oct 2012
Device: generic
|
how to get chapter title from filename
Hi
In my document each chapter is a separate XHTML file. In the original, all chapters had only '*' as a heading, so the table of contents shows nothing but a lot of *. Is there a way to use the file names/numbers (all XHTML files are in the right order, from 0001 to 0020) for the table of contents and fill them in automatically? Thank you, Andi |
10-23-2012, 12:32 PM | #2 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
If I understand you correctly, you want a ToC that lists the chapters (e.g. Chapter 1, Chapter 2, etc.), but you don't want this shown on the actual page; the displayed chapter name should stay a *.
How odd. What you could do is add a title to each of the * headings like the following example: Code:
<h2 title="Chapter 1">*</h2> |
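If the number in the label should come from the file name, as the original question asks, the title attribute could also be filled in by a small script outside Sigil. What follows is only a rough Python sketch under some assumptions: files are named like 0001.xhtml, each chapter heading contains nothing but *, and the "Chapter N" label format is wanted. The function name add_title_from_filename is made up for illustration and is not part of Sigil.

```python
import re
from pathlib import Path

# Matches an <h1>..<h6> heading whose only content is "*", with or without attributes.
STAR_HEADING = re.compile(r'<(h[1-6])((?:\s[^>]*)?)>\s*\*\s*</\1>')

def add_title_from_filename(filename: str, xhtml: str) -> str:
    """Add title="Chapter N" to each '*' heading; N comes from the digits in the file name."""
    digits = re.search(r'\d+', Path(filename).stem)
    if digits is None:
        return xhtml  # no number in the name: leave the markup alone
    label = f"Chapter {int(digits.group())}"  # hypothetical label format

    def repl(match):
        tag, attrs = match.group(1), match.group(2)
        if 'title=' in attrs:
            return match.group(0)  # heading already carries a title
        return f'<{tag}{attrs} title="{label}">*</{tag}>'

    return STAR_HEADING.sub(repl, xhtml)

print(add_title_from_filename("0007.xhtml", "<body><h2>*</h2><p>Text</p></body>"))
```

The page still shows *, and the title attribute is there for the next ToC generation to pick up. Run it on copies of the files first.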
10-23-2012, 03:24 PM | #3 |
Grand Sorcerer
Posts: 27,468
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I think he/she means the opposite. The chapter headings are already "*". So that's what appears in the ToC. I believe they're looking for a way to use the physical individual chapter file (*.html) names to automatically fix the chapter heading and subsequently (I'm guessing) rebuild the ToC with the then corrected names.
Find and Replace, with or without regex, is the only possible way of automatically generating/changing those chapter headers (and thus the ToC) ... and to my knowledge, there's no provision for search and replace to access (or use) the underlying file's name in an F&R routine.
Although technically, you could use regex on the toc.ncx file ... and as long as the play orders start with 1 and increment by 1 (you can automatically renumber them if not), you could capture the play order with regex and then use that number to build a new <text> entry:
Find:
Code:
<navPoint id="(.*?)" playOrder="(\d+)">\s+<navLabel>\s+<text>.*?</text>
Replace:
Code:
<navPoint id="\1" playOrder="\2"><navLabel><text>Chapter \2</text>
The expression could probably be made a lot cleaner, but I thought someone mentioned that there are issues with \K and look(ahead|behind)s in the latest beta.
NOTE: the above would fix the ToC, but do nothing to the chapter headings. |
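To try the toc.ncx expression outside Sigil, the same find/replace can be run with Python's re module. This is only a sketch: the sample navPoint is invented, relabel_navpoints is a made-up helper name, and \s* is used in place of \s+ so that markup without line breaks between the tags also matches.

```python
import re

# Same idea as the Find expression above: capture the id and the playOrder.
FIND = re.compile(
    r'<navPoint id="(.*?)" playOrder="(\d+)">\s*<navLabel>\s*<text>.*?</text>'
)
# Reuse the captured playOrder (\2) as the chapter number in the new label.
REPLACE = r'<navPoint id="\1" playOrder="\2"><navLabel><text>Chapter \2</text>'

def relabel_navpoints(ncx: str) -> str:
    """Replace every navPoint label with 'Chapter <playOrder>'."""
    return FIND.sub(REPLACE, ncx)

# Invented sample entry, shaped like a typical toc.ncx navPoint.
sample = '''<navPoint id="navPoint-3" playOrder="3">
  <navLabel>
    <text>*</text>
  </navLabel>
  <content src="Text/0003.xhtml"/>
</navPoint>'''

print(relabel_navpoints(sample))
```

As with the Sigil version, this only changes the labels in toc.ncx. The headings in the chapter files stay as they are.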
10-23-2012, 05:33 PM | #4 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
You may be right.
I think we need the OP to better explain the situation, but on closer inspection, your response may be what they are after. |
10-24-2012, 11:46 AM | #5 |
Addict
Posts: 254
Karma: 69786
Join Date: May 2006
Location: Oslo, Norway
Device: Kobo Aura, Sony PRS-650
|
Unless the OP has a huge number of files to process, I would advise him to just fix the titles in the source manually and regenerate the TOC; it's only 20 (searchable) entries. If he for some reason wants to keep the star instead of Chapter XX (which would be a bit confusing to the reader), your title="" method is probably better than toc.ncx regex trickery, as the title will be kept through later TOC regenerations.
|
10-24-2012, 12:45 PM | #6 | |
Grand Sorcerer
Posts: 27,468
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
|
10-25-2012, 09:20 AM | #7 |
Addict
Posts: 254
Karma: 69786
Join Date: May 2006
Location: Oslo, Norway
Device: Kobo Aura, Sony PRS-650
|
|
12-28-2012, 05:52 AM | #8 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2012
Location: KL, Malaysia
Device: Freda (WP 7.8) EPUB reader app
|
I have a somewhat similar problem. I have in excess of 500 htm files from a decompiled CHM (for just one book, but a few other books have the same problem):
1. The individual sub-chapters are separate htm files, kept in different folders that denote the main chapters (numbered 1 to 12).
2. The original CHM creators never bothered to give their documents headings, so there is no way to generate a TOC in the first place.
Since we can't use folders in Sigil (or in EPUB, for that matter) to distinguish the loads of htm files, I would have to batch rename the files to match their main chapters, as there are a few with the same name (e.g. Introduction), which I'm okay with. Finding and replacing tags through all selected html files seems the way to go to create a TOC. But the only suitable tag I could find is, for example, the one below:
Code:
<div id="printheader_crumbtrail"> > Front of Book > Conflicts of interest: none declared </div>
Question is, how do I find and replace these tags (or what regex do I use)? Or are there maybe other ways? |
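For the batch rename, one option is to fold the folder name into the file name so that same-named files from different chapters no longer collide. The Python sketch below only computes the new names. It assumes the folders are literally named 1 to 12, and flat_name is a made-up helper. The actual copying or renaming is left out.

```python
from pathlib import PurePosixPath

def flat_name(relative_path: str) -> str:
    """Turn 'folder/file.htm' into 'folder_file.htm', zero-padding numeric folder names."""
    path = PurePosixPath(relative_path)
    # Zero-pad numeric folders so that 2 sorts before 12 in a flat file list.
    folders = [p.zfill(2) if p.isdigit() else p for p in path.parts[:-1]]
    return "_".join(folders + [path.name])

# Hypothetical paths, following the layout described above.
for p in ["1/Introduction.htm", "12/Introduction.htm", "cover.htm"]:
    print(p, "->", flat_name(p))
```

The zero-padding is a design choice to keep the files in chapter order once they sit in one flat list.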
12-28-2012, 06:50 AM | #9 | |
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Find: <div id="printheader_crumbtrail">\s+(.*?)\s+</div>
Replace: <h3>\1</h3>
However, before you do that you may want to look for dedicated CHM-to-EPUB converters. There are a couple of standalone programs and some free websites, e.g. Zamzar, that will convert .chm files to epubs. You could then use Sigil to fine-tune the epub. |
|
12-28-2012, 07:24 AM | #10 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2012
Location: KL, Malaysia
Device: Freda (WP 7.8) EPUB reader app
|
Ooops. Terribly sorry, I didn't specify that what I want is to make it:
Code:
<h1 id="printheader_crumbtrail"> > Front of Book > Conflicts of interest: none declared </h1>
And <div id="printheader_crumbtrail">\s+(.*?)\s+</div> doesn't seem to work. Also, I can't convert the CHM to epub due to a broken TOC that omits one main chapter (and I've tried Calibre and a few other options), so building it through Sigil seems a much more straightforward job (there's also a weird problem of loss of character sets when I try rebuilding the CHM using WinCHM, so I'd rather preserve the decompiled files that retain their appearance in Sigil). |
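For the h1 variant that keeps the id, the expression can be checked outside Sigil with Python's re module. This is a sketch and a guess at a suitable replacement, not a quote from any reply in the thread. The helper name crumbtrail_to_h1 is made up, \s* is used instead of \s+ so that missing padding around the text does not stop a match, and re.DOTALL is added in case the div spans several lines.

```python
import re

# Capture the text inside the crumbtrail div, without the surrounding whitespace.
FIND = re.compile(r'<div id="printheader_crumbtrail">\s*(.*?)\s*</div>', re.DOTALL)
# Rebuild the element as an h1 that keeps the original id.
REPLACE = r'<h1 id="printheader_crumbtrail">\1</h1>'

def crumbtrail_to_h1(html: str) -> str:
    """Turn each crumbtrail div into an h1 with the same id."""
    return FIND.sub(REPLACE, html)

# The example element from the post above.
sample = ('<div id="printheader_crumbtrail"> > Front of Book > '
          'Conflicts of interest: none declared </div>')
print(crumbtrail_to_h1(sample))
```

In Sigil, the same Find and Replace strings would have to be used with the search mode set to regex. In normal mode the pattern is treated as literal text and finds nothing.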
12-28-2012, 07:37 AM | #11 |
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
|
12-28-2012, 09:03 AM | #12 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2012
Location: KL, Malaysia
Device: Freda (WP 7.8) EPUB reader app
|
Ahaha! Sorry, I didn't know I had to change the mode to regex. Thanks! Works perfectly!
|
12-28-2012, 09:39 AM | #13 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2012
Location: KL, Malaysia
Device: Freda (WP 7.8) EPUB reader app
|
Also, it seems like I'd rather put my work through Sigil than Calibre on all the other decompiled CHMs with this method (since they are bereft of headings)! Thanks again!
|