Thread: I need help!!!
Old 11-02-2010, 05:02 PM   #8
KevinH
Sigil Developer
Hi,

If you are okay running Python, you could try something like the following on each file in the epub that has stanzas in it (after unzipping the epub, of course):

Code:
#!/usr/bin/env python3
import sys
import os


def main(argv=sys.argv):
    if len(argv) != 3:
        print("syntax is:  python fixme.py INPUTFILE OUTPUTFILE")
        return 1
    infile = argv[1]
    outfile = argv[2]
    if not os.path.exists(infile):
        print("input file was not found")
        return 1

    # Work on raw bytes so the file's encoding and line endings are untouched.
    with open(infile, 'rb') as f:
        data = f.read()
    instanza = False
    res = []
    # keepends=True keeps each line's own terminator (\n or \r\n).
    for line in data.splitlines(keepends=True):
        if b'<div class="stanza">' in line:
            instanza = True
        if instanza:
            # Inside a stanza: drop <p> and turn </p> into a line break.
            line = line.replace(b'<p>', b'')
            line = line.replace(b'</p>', b'<br />')
            if b'</div>' in line:
                instanza = False
        res.append(line)
    with open(outfile, 'wb') as of:
        of.write(b''.join(res))
    return 0


if __name__ == '__main__':
    sys.exit(main())

My test.html is:
Code:
<html>
<body>
<p>do not change me</p>
<div class="stanza">
  <p>“‘Tell me, my old friend, tell me why</p>
  <p>You sit and softly laugh by yourself.’</p>
  <p>‘It is because I am repeating to myself,</p>
  <p>Write! write</p>
  <p>Of the valiant strength,</p>
  <p>The calm, brave bearing</p>
  <p>Of the sons of the sea.’”</p>
</div>
<p>do not change me either</p>
</body>
</html>
And running:

python fixme.py test.html test_fixed.html

gives the following for test_fixed.html:

Code:
<html>
<body>
<p>do not change me</p>
<div class="stanza">
  “‘Tell me, my old friend, tell me why<br />
  You sit and softly laugh by yourself.’<br />
  ‘It is because I am repeating to myself,<br />
  Write! write<br />
  Of the valiant strength,<br />
  The calm, brave bearing<br />
  Of the sons of the sea.’”<br />
</div>
<p>do not change me either</p>
</body>
</html>


This is of course only a simple test, and pasting it here may have mangled the spacing, but a similar approach can be used for almost any mass change you want.
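
For example, to run the same stanza fix over every file in an unzipped epub instead of one file at a time, a rough Python 3 sketch might look like the following. The names fix_stanzas and fix_tree, the list of file extensions, and the assumption that the files are UTF-8 are all mine for illustration; they are not part of fixme.py above. It rewrites files in place, so work on a copy.

```python
import os


def fix_stanzas(text):
    """Apply the line-by-line stanza rewrite to a string and return the result."""
    instanza = False
    out = []
    for line in text.splitlines(keepends=True):
        if '<div class="stanza">' in line:
            instanza = True
        if instanza:
            line = line.replace('<p>', '').replace('</p>', '<br />')
            if '</div>' in line:
                instanza = False
        out.append(line)
    return ''.join(out)


def fix_tree(root, exts=('.html', '.xhtml', '.htm')):
    """Rewrite matching files under root in place; return the paths that changed."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            # Assumption: files are UTF-8; newline='' keeps line endings as-is.
            with open(path, 'r', encoding='utf-8', newline='') as f:
                old = f.read()
            new = fix_stanzas(old)
            if new != old:
                with open(path, 'w', encoding='utf-8', newline='') as f:
                    f.write(new)
                changed.append(path)
    return changed
```

Calling fix_tree on the folder you unzipped the epub into returns the list of files it actually modified, which is handy for spot-checking a few of them before zipping everything back up.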

If you are desperate enough to want to give it a try, PM me with your e-mail and I will send you the Python file.

If you are on Mac OS X, Linux, or another Unix-based system, you can also do this with sed or awk, or with almost any simple scripting language such as Python (above), Perl, or PHP.
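
If you would rather stay in Python but like the sed style of "substitute only inside this range", a regular expression can pick out each stanza block and rewrite just that. This is only a sketch in Python 3: the name fix_stanzas_re is made up for illustration, and it assumes a stanza div never contains another div, since the match stops at the first closing div tag.

```python
import re

# Non-greedy match from the opening stanza div to the first closing div.
STANZA = re.compile(r'<div class="stanza">.*?</div>', re.DOTALL)


def fix_stanzas_re(text):
    """Drop <p> and turn </p> into <br /> inside stanza divs only."""
    def rewrite(match):
        block = match.group(0)
        return block.replace('<p>', '').replace('</p>', '<br />')
    return STANZA.sub(rewrite, text)
```

Unlike the line-by-line version, this one does not care how the markup is split across lines, so it also copes with a stanza that sits entirely on one line.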

KevinH

Last edited by KevinH; 11-02-2010 at 05:16 PM. Reason: add test example and output