#1
Fanatic
Posts: 574
Karma: 32228
Join Date: Feb 2012
Device: Onyx Boox Leaf
Splitting multiple html files?
Hi guys,
I know that the Editor can split a single HTML file using XPath, which is great. But I wonder if there is a way to split all the HTML files at the same time (something like "split mark" in Sigil). Until now I kept the footnotes at the end of their respective HTML files; now I want to merge them into a single endnote file, which means going to every HTML file to split and merge...
Also, I used file_name in a Regex Function and it returns the whole HTML path. I can use a regex to strip off the unwanted part, but is there a way to get only the name, without the extension? (I use it for note IDs.)

#2
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
I also use file_name (the full path) for note IDs. But why do you need to remove the extension?
To extract the notes from all files and dump them into one specific file (notas.xhtml), since I don't know enough Python, I do the following:
1. I create notas.xhtml.
2. I run this regex function. Code:
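# Search: (<p class="nota".+?>.+?</p>)
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Append each matched note to the collection file, then remove it from the chapter.
    # The with block closes the file after every write.
    with open('e:/Libros/Taller/En curso/notas.txt', 'a', encoding='utf-8') as notas:
        notas.write(match.group() + '\n')
    return ''

# Process the files in spine (reading) order so the notes stay in sequence.
replace.file_order = 'spine'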
And sorry for my English.
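For anyone who wants to try the idea outside the Editor first, here is a minimal sketch using only Python's standard `re` module. It mirrors what the regex function above does: collect every matched note paragraph and return the text with those paragraphs removed. The names `NOTE_RE` and `extract_notes` are made up for this illustration and are not part of calibre.

```python
import re

# Same search expression as in the post above.
NOTE_RE = re.compile(r'(<p class="nota".+?>.+?</p>)')

def extract_notes(html):
    """Return (html_without_notes, list_of_note_paragraphs)."""
    notes = []

    def collect(match):
        # Keep the matched note and replace it with nothing,
        # like the `return ''` in the regex function.
        notes.append(match.group())
        return ''

    stripped = NOTE_RE.sub(collect, html)
    return stripped, notes

sample = '<p>Body</p><p class="nota" id="n1">First note</p><p>More</p>'
stripped, notes = extract_notes(sample)
print(stripped)
print(notes)
```

Note that the expression expects at least one character between `"nota"` and the closing `>`, so it is meant for note paragraphs that carry an extra attribute such as an id.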

#3
Fanatic
Thank you. I will try to play with that. I'm no programmer, though.
What I did was search for "#n(\d+)" (matching #n1, for example) with this function. Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Build the new ID from the file name and the captured note number.
    return '#' + file_name + '_' + match.group(1)
This returns #OEBPS/1.html_1, but I only want #1_1. (Of course, I could use a regex to clean up the unwanted portion afterward, but it would be nicer to have it done in one regex function, and I could learn something as well.)
Last edited by nqk; 11-24-2015 at 03:32 AM.

#4
Interested in the matter
You can use:
file_name = file_name[6:-5]
This drops the first 6 characters (OEBPS/) and the last 5 (.html).
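The slice only works when every file sits in OEBPS/ and ends in .html. A sketch that does not depend on those lengths could use the standard `posixpath` module instead, assuming file_name always uses forward slashes, as in the OEBPS/1.html example. The helper names `base_name` and `note_id` are made up for this illustration.

```python
import posixpath

def base_name(file_name):
    """Return the file name without its folder or extension."""
    # basename() drops the folder part, splitext() splits off the extension.
    return posixpath.splitext(posixpath.basename(file_name))[0]

def note_id(file_name, note_number):
    """Build an ID such as #1_1 from OEBPS/1.html and note number 1."""
    return '#' + base_name(file_name) + '_' + note_number

print(note_id('OEBPS/1.html', '1'))
print(base_name('Text/chapter02.xhtml'))
```

Inside the regex function, the return line would then be something like `return note_id(file_name, match.group(1))`, with the two helpers defined above `replace` in the same function source.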

#5
Fanatic

#6
Interested in the matter
You are welcome.