Alas, I don't know any programming languages, but I'm getting a little better at adapting found code as a template.
The form of the ABBYY output is very straightforward:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=WINDOWS-1252">
<meta name="generator" content="ABBYY FineReader 9.0">
<meta name="author" content="">
<meta name="description" content="">
<meta name="keywords" content="">
<title></title>
<style type="text/css">
table.main {}
tr.row {}
td.cell {}
div.block {}
div.paragraph {}
.font0 { font:6.00pt "Arial", sans-serif; }
.font1 { font:40.00pt "Arial", sans-serif; }
.font2 { font:5.00pt "Arial Narrow", sans-serif; }
.font3 { font:6.00pt "Arial Narrow", sans-serif; }
.font4 { font:7.00pt "Arial Narrow", sans-serif; }
.font5 { font:8.00pt "Arial Narrow", sans-serif; }
.font6 { font:11.00pt "Arial Narrow", sans-serif; }
.font7 { font:12.00pt "Arial Narrow", sans-serif; }
.font8 { font:13.00pt "Arial Narrow", sans-serif; }
.font9 { font:15.00pt "Arial Narrow", sans-serif; }
.......
</style>
</head>
<body>
<p></p>
<p><span class=font9>CHAPTER I</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<hr>
<p></p>
<p><span class=font9>CHAPTER I</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<hr>
<p></p>
<p><span class=font9>CHAPTER I</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<p><span class=font9>text</span></p>
<hr>
<p></p>
<p><span class=font9>CHAPTER 2</span></p>
<p><span class=font6>text</span></p>
<p><span class=font3>text</span></p>
<p><span class=font4>text</span></p>
<p><span class=font2>text</span></p>
<hr>
<p><span class=font9>text</span></p>
<p><span class=font8>text</span></p>
<p><span class=font4>text</span></p>
<p><span class=font9>text</span></p>
<hr>
So that would be pages 1-5. Each chapter begins with an empty <p></p>,
and each page break is represented by <hr>.
That's it.
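For anyone who wants to adapt something, here is a minimal Python sketch that takes the two rules above at face value: split the body on <hr> to get pages, and treat a page that opens with an empty <p></p> as the start of a chapter. The function names (split_pages, starts_chapter, group_chapters) are made up for illustration and are not part of ABBYY or any library. It uses plain regular expressions rather than a real HTML parser, which is probably only safe because the output is this regular.

```python
import re

def split_pages(html):
    """Return the body content as a list of pages, split on <hr>."""
    # Keep only what sits between <body> and </body> (or end of text).
    m = re.search(r"<body[^>]*>(.*?)(?:</body>|\Z)", html, flags=re.I | re.S)
    body = m.group(1) if m else html
    pieces = re.split(r"<hr\s*/?>", body, flags=re.I)
    # Every page ends with <hr>, so the final piece is empty; drop blanks.
    return [p.strip() for p in pieces if p.strip()]

def starts_chapter(page):
    """True if the page opens with the empty <p></p> marker."""
    return re.match(r"<p>\s*</p>", page, flags=re.I) is not None

def group_chapters(pages):
    """Group pages into chapters; a new chapter starts at each marker page."""
    chapters = []
    for page in pages:
        if starts_chapter(page) or not chapters:
            chapters.append([])
        chapters[-1].append(page)
    return chapters

# Tiny hypothetical sample in the same shape as the ABBYY output.
sample = (
    "<html><body>\n"
    "<p></p>\n<p><span class=font9>CHAPTER I</span></p>\n<hr>\n"
    "<p></p>\n<p><span class=font9>CHAPTER 2</span></p>\n<hr>\n"
    "<p><span class=font9>text</span></p>\n<hr>\n"
    "</body></html>"
)
pages = split_pages(sample)
chapters = group_chapters(pages)
print(len(pages), "pages in", len(chapters), "chapters")
```

When reading the real file, opening it with encoding="cp1252" should match the charset declared in the meta tag.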