Sort Order and Depth-First
I recently did some experiments with Calibre to see what actually happens when you specify an index file referencing several other html files. I had several questions:
1. Does Calibre change the order of included files?
I created 5 html files named file_1.html, file_2.html, file_10.html, file_11.html and file_20.html. Each file looks like:
Code:
<html><body>
<p>This is file 1.</p>
</body></html>
with the "1" replaced with 2, 10, 11 or 20, as appropriate for each file.
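If you'd rather not type the five files by hand, here is a minimal Python sketch that writes them. The helper name write_test_files is my own invention, and it assumes the file_N.html names that the index file refers to:

```python
from pathlib import Path

# Same body as the hand-written test files, with the number filled in.
TEMPLATE = "<html><body>\n<p>This is file {n}.</p>\n</body></html>\n"


def write_test_files(directory, numbers=(1, 2, 10, 11, 20)):
    """Write one small html file per number and return the paths.

    Hypothetical helper for reproducing the experiment; the names
    follow the file_N.html pattern used in index.html.
    """
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in numbers:
        path = target / f"file_{n}.html"
        path.write_text(TEMPLATE.format(n=n), encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Writes the five files into the current directory.
    for p in write_test_files("."):
        print(p)
```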
I tried to create an epub from just these five files:
Code:
ebook-convert *.html foo.epub
but calibre wouldn't take it. So, following Kovid's recommendation in the documentation, I created an index file that looks like:
Code:
<html><body>
<p><a href=file_1.html>file 1</a></p>
<p><a href=file_2.html>file 2</a></p>
<p><a href=file_10.html>file 10</a></p>
<p><a href=file_11.html>file 11</a></p>
<p><a href=file_20.html>file 20</a></p>
</body></html>
and created an epub:
Code:
ebook-convert index.html foo.epub
The order of the files in the epub was 1, 2, 10, 11, 20, just as in the index.html. From this, I conclude that calibre doesn't just bunch all the references together and then retrieve them in sort order. Further, there's no need to rename the files with leading zeros before single digits.
2. So when does depth-first order matter?
Following Kovid's example, I changed file_2.html to include a link to file_11.html and made an epub out of this. The sequence of files turned out to be 1, 2, 11, 10, 20, just as Kovid warned it would be. The difference here is that there was a forward reference (in file_2.html) that needed to be chased down first.
3. What happens if you reverse the files in index.html?
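I arranged the links in index.html to be in reverse order. File_2.html still linked to file_11.html, but now it's a backward link, since file_11.html comes before file_2.html in the index. I made an epub out of this and the order was 20, 11, 10, 2, 1, exactly as listed in the index. The link in file_2.html changed nothing because file_11.html had already been included by the time it was reached.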
Conclusions:
Simple forward links at the top (index) level behave nicely, coming out in the same order you list them. Actually, it's a depth-first search where the depth is 1.
Forward links at the second (or lower) levels will include files in depth-first order.
Backward links don't change the include order at all.
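
For anyone who wants to play with other link layouts without running a conversion each time, here is a small Python sketch of a depth-first walk that reproduces the three orders I saw. It is only my model of the observed behavior, not calibre's actual code, and the function name reading_order and the dictionary layout are made up for illustration:

```python
def reading_order(links, root="index"):
    """Return the files in the order a depth-first walk first reaches them.

    `links` maps a file name to the list of files it links to, in the
    order the links appear. This is a toy model of what the experiments
    showed, not calibre's implementation.
    """
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return  # already included, so a backward link changes nothing
        seen.add(name)
        order.append(name)
        for target in links.get(name, []):
            visit(target)  # chase each link before moving to the next one

    visit(root)
    return order[1:]  # leave out the index file itself


# Experiment 1: plain index, no links between the files.
plain = {"index": ["1", "2", "10", "11", "20"]}
print(reading_order(plain))

# Experiment 2: file 2 has a forward link to file 11.
forward = {"index": ["1", "2", "10", "11", "20"], "2": ["11"]}
print(reading_order(forward))

# Experiment 3: index reversed, so the link from 2 to 11 points backward.
backward = {"index": ["20", "11", "10", "2", "1"], "2": ["11"]}
print(reading_order(backward))
```

The seen set is what makes backward links harmless: a file that has already been placed is never moved, so only a link to a file that has not been reached yet can pull it forward.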