Originally Posted by cjallan
We don't create the TOC by hand... but with MS Word's wizard. It takes no time at all once the headings are tagged.
From what I've heard you say, I think you adhere to the practice that is so often recommended... that is, get your files out of Word as soon as possible, to avoid errors Word can introduce... so I guess I can see why you don't use the Word wizard.
But what I don't understand is how do you get rid of user errors such as extra spaces, extra paragraph returns... extra page breaks, etc, etc...
Do you make the user clean up their Word files before you accept them?
No, although we do upcharge for messed-up Word files. We use regex, either through NoteTabPro clips or Perl (everyone floats their own boat around here, so to speak). We don't spend a lot of time worrying about extra spaces, in the literal sense of "spaces," as HTML collapses runs of spaces when it renders. We clean up extra page breaks with a simple regex that yanks them ALL out (they're almost always misplaced anyway) and then regex the Chapter heads, which are styled specifically, to put the chapter breaks behind them (or any other class that uses that same "section" styling). The extra returns of course take more passes: start with 4 in a row, clean those out, then do 3; then use regex that recognizes alpha characters to check the doubles, so we can spot the scenebreaks.
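For anyone who wants to picture the passes described above, here is a rough sketch in Python rather than NoteTabPro clips or Perl. It is not our actual regex, just an illustration under some loud assumptions: page breaks show up as form-feed characters, chapter heads are lines starting with "Chapter" plus a number, runs of 3 or more returns get collapsed to a double, and `<<PAGEBREAK>>` is a made-up placeholder for the chapter break. Every name in it is hypothetical.

```python
import re

# Assumption: page breaks survive the export as form-feed characters.
PAGE_BREAK = re.compile(r"\f")
# Assumption: chapter heads are lines like "Chapter 12". In a real file they
# would be found by their specific styling or class instead.
CHAPTER_HEAD = re.compile(r"^(Chapter \d+.*)$", re.MULTILINE)
# Hypothetical placeholder for a chapter break; not a real tag.
BREAK_MARKER = "<<PAGEBREAK>>"


def strip_page_breaks(text):
    """Yank out ALL page breaks, misplaced or not."""
    return PAGE_BREAK.sub("", text)


def add_chapter_breaks(text):
    """Put one break marker on its own line ahead of each chapter head."""
    return CHAPTER_HEAD.sub(BREAK_MARKER + r"\n\1", text)


def collapse_returns(text):
    """Clean out runs of 4 or more returns first, then runs of 3."""
    text = re.sub(r"\n{4,}", "\n\n", text)
    return re.sub(r"\n{3}", "\n\n", text)


def scene_break_candidates(text):
    """List lines that start with a letter and follow a double return.

    These are only candidates; a person still decides which are scenebreaks.
    """
    pattern = re.compile(r"(?<!\n)\n\n([A-Za-z][^\n]*)")
    return [m.group(1)[:40] for m in pattern.finditer(text)]


def clean(text):
    """Run the passes in the order described in the post."""
    return collapse_returns(add_chapter_breaks(strip_page_breaks(text)))


sample = (
    "First para.\f\n\n\n\nSecond para.\n\n\n"
    "Chapter 2\nThird para.\n\nLater that night, a scene.\n"
)
cleaned = clean(sample)
print(cleaned)
print(scene_break_candidates(cleaned))
```

The last function only reports the doubles instead of rewriting them, since spotting a real scenebreak is still a judgment call.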
Our repeat clients, of which, fortunately, we have many, tend to get "trained" to use specific multiple characters--e.g., *** or ### to indicate certain items, like chapter heads, scenebreaks, POV breaks, etc.
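As an illustration of why those agreed markers help, here is a minimal Python sketch that turns marked-up lines into HTML. Which marker means what is an assumption for this example only (`***` alone on a line as a scenebreak, `###` in front of a chapter head), and the class names are invented.

```python
import re
from html import escape

# Hypothetical convention for this sketch only.
MARKER_SCENE = re.compile(r"^\*{3}\s*$")
MARKER_CHAPTER = re.compile(r"^###\s*(.+?)\s*$")


def mark_up(lines):
    """Convert marker lines to tagged HTML and wrap other text in <p>."""
    out = []
    for line in lines:
        if MARKER_SCENE.match(line):
            out.append('<hr class="scenebreak"/>')
        elif (m := MARKER_CHAPTER.match(line)):
            out.append(f'<h2 class="chapter">{escape(m.group(1))}</h2>')
        elif line.strip():
            # Blank lines are dropped; everything else becomes a paragraph.
            out.append(f"<p>{escape(line.strip())}</p>")
    return out


manuscript = ["### Chapter One", "It was late.", "***", "", "Morning came."]
for tag in mark_up(manuscript):
    print(tag)
```

With an unambiguous marker in the file, nobody has to guess whether a blank line was meant as a break.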
A large percentage of what we do is fairly automated now. It took us several years' worth of every type of bizarro-world thing you can think of (my favorites are the double-spaced ms's wherein someone types to the right margin and hits "enter" twice to "doublespace" his text for submission) to get some pretty reliable and steady regex, but that's basically how we do it. I don't think we do anything particularly magical or unusual; I think everyone who does volume ends up doing essentially the same thing, eventually. It's just more efficient than working in Word, plus, of course, we get a LOT of INDD and PDF clients, as well as Word, OO, LO, RTF, Pages and even Works (which always bakes my noodle--who even knew that was still floating around?).
I think what separates one firm from another is the QA, and the "hands in file" time that's expended to give the client a special experience when they open their book. We do, though, get a shocking number of authors who WANT their finished files to look like a Word file, which always--ALWAYS--catches me by surprise. Me, I want my books (I mean the ones I'm reading--not the ones I am not writing) to look like books, not Word files, but...each to their own, eh?
Although, the one we received in PowerPoint was quite the doozy! ;-)