Txt to Epub: how to create a toc

nestol · 08-09-2010, 08:56 PM

Hello,

A few days ago I installed iBooks for the iPhone, and since then I have been interested in creating books in the EPUB format. I tried to convert a PDF document, but I still have a few problems. I hope you can help me solve them.
I use the command line to convert my books because the GUI is inaccessible to me.
My starting point is a PDF document of about 300 pages. In the end I want to create an EPUB file with a table of contents. First I extracted the text of the document (with Acrobat Reader). Then I cleaned up the .txt file, for example by removing the page numbers. After that I used the following command to convert the book:
ebook-convert file.txt file.epub --level1-toc="//*[re:test(., 'PROLOGUE|PART\s+|EPILOGUE', '')]"
But this didn't work: the book was split into 7 parts and a TOC was created, but the entries didn't match "PROLOGUE", "PART" or "EPILOGUE". The same happens when I use --chapter instead of --level1-toc. Later I read something about Markdown, so I manually put a "#" before each chapter heading and converted the file with:
ebook-convert file.txt file.epub --markdown
Now I get an error saying that the program couldn't find a reasonable point at which to split, although my document has plenty of empty lines.
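
For illustration, putting a "#" before each chapter heading could also be scripted instead of done by hand. This is only a minimal Python sketch, assuming every heading sits at the start of its own line and begins with PROLOGUE, PART followed by a space or tab, or EPILOGUE. The function name `mark_headings` and the sample text are made up for this example and are not part of calibre.

```python
import re

# Assumption: a chapter heading starts at the beginning of a line with one of
# these words. "[ \t]+" is used instead of "\s+" so that a match cannot run
# across a line break in multiline mode.
HEADING = re.compile(r"^(PROLOGUE|PART[ \t]+|EPILOGUE)", re.MULTILINE)


def mark_headings(text):
    """Prefix every heading line with "# " (a Markdown level-1 heading)."""
    return HEADING.sub(r"# \1", text)


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the actual book.
    sample = "PROLOGUE\n\nSome text.\n\nPART ONE\n\nA line that mentions PART of it.\n"
    print(mark_headings(sample))
```

Lines that only mention one of the words in the middle of a sentence are left alone, because the pattern is anchored to the start of the line.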

Then I tried to convert the PDF file directly with the following command:
ebook-convert file.pdf file.epub --remove-footer --footer-regex="Page\s\d.*" --level1-toc="//*[re:test(., 'PROLOGUE|PART\s+|EPILOGUE', '')]"
This almost worked. I got a table of contents with the right entries. But the page numbers weren't removed completely; about half of them are still in the EPUB. In Notepad++, the regular expression "Page\s\d.*" removed all page numbers. I also tried --header, with no success.
I don't understand why the XPath expression works for a PDF file but not for a TXT file. Both conversions turn the text into HTML, and after that it is only a matter of matching the text against the XPath expression.
By the way: whenever I start a conversion, I get the message "Initial parse failed", but the process continues afterwards. I don't know whether this matters.
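
For reference, the Notepad++ cleanup can be reproduced on the extracted text with the same expression. This is a minimal Python sketch, not what ebook-convert does internally; the function name `strip_page_numbers` and the sample text are assumptions made for illustration.

```python
import re

# The same expression that worked in Notepad++. In Python, "." does not match
# a newline by default, so ".*" stops at the end of the line.
PAGE_FOOTER = re.compile(r"Page\s\d.*")


def strip_page_numbers(text):
    """Delete everything from "Page <digit>" to the end of that line."""
    return PAGE_FOOTER.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the actual book.
    sample = "First paragraph.\nPage 12\nSecond paragraph.\n"
    print(strip_page_numbers(sample))
```

Note that the pattern is not anchored to the start of a line, so a sentence in the body text that contains something like "Page 5" would also be cut from that point to the end of the line.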

I hope someone can help me and maybe also explain why creating the TOC doesn't work.
Thank you and regards

