Old 02-11-2012, 08:58 PM   #1
ctop
Member
Posts: 14
Karma: 60
Join Date: Jun 2008
Device: zaurus->palm->iPad->Sony PRS-T1
epub to epub conversion problem with regex spanning multiple input files

Hi there,

I am trying to fix some ePubs I got, since they can't be used on my Sony PRS-T1. There were many problems, but I managed to write regular expressions to deal with them and got most things straightened out. Now, for some reason the ePub file contains one HTML file for each page (!) of the book, which makes the reader start a new page every time, which I find very annoying. So one of my regular expressions removes everything from the </body> at the end of one HTML file to the <body> element at the beginning of the next. This works quite nicely when I test it in the Search and Replace wizard, but fails on actual conversion.
Now, my question is: could it be that the conversion engine treats the files one by one? In that case my regular expression could not match, since it spans files, whereas in the wizard the whole ePub is presented as one file, so the expression matches.
Is there maybe an option to make the conversion treat the whole book as one file as well?

Any help appreciated,

Ctop
Old 02-11-2012, 10:32 PM   #2
kovidgoyal
creator of calibre
Posts: 24,817
Karma: 4369673
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
regexes apply to a single file at a time. That cannot be changed. If you want to merge files, convert to htmlz in calibre.
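
To illustrate why a pattern that spans two files can match in the wizard but not during conversion, here is a minimal Python sketch. The pattern and the two page strings are made up for illustration (they are not ctop's actual regex or book content), and the sketch only models "one file at a time" versus "one merged text"; it does not call calibre itself.

```python
import re

# Hypothetical pattern, similar in spirit to the one described in the first post:
# match from a closing </body> through to the next opening <body ...> tag.
PATTERN = re.compile(r"</body>.*?<body[^>]*>", re.DOTALL)

# Two made-up page files standing in for the per-page HTML files of the ePub.
page1 = "<html><body><p>End of page one,</p></body></html>"
page2 = "<html><body><p>start of page two.</p></body></html>"

# Applied file by file: no single file contains a </body> followed by a <body>,
# so the pattern never matches and both files come back unchanged.
per_file = [PATTERN.sub("", page) for page in (page1, page2)]

# Applied to the concatenated text: the span between the two bodies now exists
# inside one string, so the pattern matches and the join is removed.
merged = PATTERN.sub("", page1 + "\n" + page2)

print(per_file == [page1, page2])
print(merged)
```

In the per-file case the substitution is a no-op, which is consistent with the regex appearing to "fail" on conversion even though it matches when the whole book is presented as one text.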
Old 02-12-2012, 01:56 AM   #3
ctop
Member
Posts: 14
Karma: 60
Join Date: Jun 2008
Device: zaurus->palm->iPad->Sony PRS-T1
Quote:
Originally Posted by kovidgoyal
regexes apply to a single file at a time. That cannot be changed. If you want to merge files, convert to htmlz in calibre.
Thanks, Kovid, this works.

So I converted the files to HTMLZ, which produced one single HTML file each, and then converted them back to ePub. As it turned out, I had to enable the "Do not split on page breaks" option for the ePub output, because otherwise I would get exactly the same problem as in the original file.
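
For anyone who prefers the command line, the same round trip could be scripted roughly as follows. This is only a sketch: it assumes calibre's `ebook-convert` tool is on the PATH and that `--dont-split-on-page-breaks` is the command-line counterpart of the GUI option mentioned above. The function names and the `_merged` output file name are hypothetical choices for this example.

```python
import subprocess
from pathlib import Path


def build_commands(src_epub):
    """Return the two ebook-convert command lines for ePub -> HTMLZ -> ePub."""
    src = Path(src_epub)
    htmlz = src.with_suffix(".htmlz")                # intermediate single-file HTMLZ
    out = src.with_name(src.stem + "_merged.epub")   # hypothetical output name
    to_htmlz = ["ebook-convert", str(src), str(htmlz)]
    # Assumed CLI equivalent of the "Do not split on page breaks" option.
    to_epub = ["ebook-convert", str(htmlz), str(out), "--dont-split-on-page-breaks"]
    return [to_htmlz, to_epub]


def run(src_epub):
    """Run both conversions in order, stopping if either one fails."""
    for cmd in build_commands(src_epub):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the commands; call run("book.epub") to actually convert.
    for cmd in build_commands("book.epub"):
        print(" ".join(cmd))
```

Any search-and-replace expressions that need to span the former page files would then be applied in the second step, where the input is a single HTML file.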

BTW, is it mentioned somewhere that HTMLZ is always one single file? That is good to know and I am glad I found out now!

All the best,

Ctop
All times are GMT -4.