TXT to EPUB: how to create a TOC
Hello,
A few days ago I installed iBooks for the iPhone, and since then I have been interested in creating books in the EPUB format. I tried to convert a PDF document, but I still have a few problems. I hope you can help me solve them.
I use the command line to convert my books because the GUI is inaccessible to me.
My starting point is a PDF document of about 300 pages. In the end I want to create an EPUB file with a table of contents. First I extracted the text of the document (with Acrobat Reader). After that I edited the .txt file; for example, I removed the page numbers. Then I used the following command to convert the book:
ebook-convert file.txt file.epub --level1-toc="//*[re:test(., 'PROLOGUE|PART\s+|EPILOGUE', '')]"
But this didn't work: the book was split into 7 parts and a TOC was created, but the entries didn't match "PROLOGUE", "PART" or "EPILOGUE". The same happens when I use --chapter instead of --level1-toc. Later I read something about Markdown, so I manually put a "#" before each chapter heading and converted the file with:
ebook-convert file.txt file.epub --markdown
Now I get an error saying that the program couldn't find a reasonable point at which to split, although my document has enough empty lines.
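In case it helps to reproduce the input: the manual "#" step could be scripted roughly as follows. This is only a minimal sketch. The function name mark_headings and the sample text are made up for illustration, and it assumes that every heading starts its own line with PROLOGUE, PART or EPILOGUE.

```python
import re

# Assumption: a heading is a line that begins with one of the words from the
# XPath expression above (same alternatives, anchored to the start of the line).
HEADING = re.compile(r"^\s*(?:PROLOGUE|PART\s+|EPILOGUE)")


def mark_headings(text: str) -> str:
    """Prefix each heading line with '# ' so Markdown reads it as a level-1 heading."""
    marked = []
    for line in text.splitlines():
        if HEADING.match(line):
            marked.append("# " + line.strip())
        else:
            marked.append(line)
    return "\n".join(marked)


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the real book.
    sample = "PROLOGUE\nSome text.\n\nPART 1\nMore text.\n\nEPILOGUE\nThe end."
    print(mark_headings(sample))
```

Lines that do not start with one of the three words are passed through unchanged, so the empty lines between paragraphs are kept.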
Then I tried to convert the PDF file directly with the following command:
ebook-convert file.pdf file.epub --remove-footer --footer-regex="Page\s\d.*" --level1-toc="//*[re:test(., 'PROLOGUE|PART\s+|EPILOGUE', '')]"
This almost worked. I got a table of contents with the right entries, but the page numbers weren't removed completely; about half of them are still in the EPUB. With Notepad++ and the regular expression "Page\s\d.*" I could remove all page numbers. I also tried --header, with no success.
I don't understand why the XPath expression works for a PDF file but not for a TXT file. Both conversions turn the text into HTML, and after that it is only a matter of matching the text against the XPath expression.
By the way: whenever I start a conversion, I get the message "Initial parse failed", but the process continues afterwards. I don't know whether this matters.
I hope that someone can help me and maybe also explain why the creation of the TOC doesn't work.
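For comparison, the Notepad++ replacement can be reproduced on the extracted text with a few lines of Python. This is a sketch under assumptions: the function name strip_page_numbers is hypothetical, and it only mirrors the search-and-replace on plain text, not whatever ebook-convert does internally with --footer-regex.

```python
import re

# The same regular expression as in Notepad++. The '.' does not match a
# newline, so each match ends at the end of its own line.
PAGE_NUMBER = re.compile(r"Page\s\d.*")


def strip_page_numbers(text: str) -> str:
    """Delete every 'Page <digit>...' match; the now-empty line itself stays."""
    return PAGE_NUMBER.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample, not taken from the real book.
    sample = "First paragraph.\nPage 12 of 300\nSecond paragraph."
    print(strip_page_numbers(sample))
```

The matched text is removed but the line break is kept, so each page number leaves an empty line behind.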
Thank you and regards