|02-13-2011, 12:12 PM||#1|
Join Date: Feb 2011
Device: Kindle DX
Batch doc conversion
Here I am again.
I had a problem converting a bunch of .doc files to .mobi. Calibre doesn't handle .doc conversion, so one workaround is to go through HTML first. That's fine for three or four docs, but with fifty it doesn't look so good anymore.
What I managed to do is write a macro that opens, converts and saves a group of files from MS Word. That means, of course, that you must have MS Word installed. I used 2007, but 2003 should work as well.
I'm not going to detail how to write a VBA macro; try Google first. If you're still stuck, I'll try to help.
The macro is this:
Sub BatchConvertToHTML()
    Dim dlgInputFolder As New CommonDialog
    Dim strNames() As String
    Dim x As Long

    dlgInputFolder.MaxFileSize = 32000
    dlgInputFolder.Flags = cdlOFNAllowMultiselect + cdlOFNExplorer + cdlOFNLongNames
    dlgInputFolder.Filter = "Word document (*.doc)|*.doc"
    dlgInputFolder.ShowOpen

    'Nothing selected (dialog cancelled)
    If Len(dlgInputFolder.FileName) = 0 Then Exit Sub

    'With several files selected, the dialog returns the folder followed by
    'each file name, separated by zero (null) characters.
    'Put each piece in a string array: strNames(0) is the folder
    strNames = Split(dlgInputFolder.FileName, vbNullChar)

    'Conversion starting
    If UBound(strNames) = 0 Then
        'a single selected file comes back as one full path
        Wordconvert strNames(0)
    Else
        For x = 1 To UBound(strNames)
            If Len(strNames(x)) > 0 Then Wordconvert strNames(0) & "\" & strNames(x)
        Next
    End If
End Sub

Sub Wordconvert(strDoc As String)
    Dim strAuthor As String, strTitle As String
    Dim strDocName As String, strFileName As String
    Dim docCurrent As Document

    'extract author and title from file name
    strDocName = Left(strDoc, Len(strDoc) - 4)
    strFileName = extractfilename(strDocName)
    strAuthor = Trim(Left(strFileName, InStr(1, strFileName, " - ")))
    strTitle = Trim(Right(strFileName, Len(strFileName) - InStr(1, strFileName, " - ") - 2))
    strDocName = strDocName & ".html"

    'open the document and keep a reference to it
    Set docCurrent = Documents.Open(strDoc)

    'if your word document already has title and author correctly set, just comment or delete the two following lines
    docCurrent.BuiltInDocumentProperties(wdPropertyAuthor).Value = strAuthor
    docCurrent.BuiltInDocumentProperties(wdPropertyTitle).Value = strTitle

    'save and close the document
    docCurrent.SaveAs FileFormat:=wdFormatFilteredHTML, FileName:=strDocName
    docCurrent.Close
End Sub

Function extractfilename(strfile As String) As String
    'Simply put, the file string passed by common dialog is complete with the full path
    'Here I strip out the path and take only the file name, to extract author and title
    Dim pos As Long, pos1 As Long

    pos = 1
    Do
        pos1 = InStr(pos, strfile, "\")
        If pos1 > 0 Then pos = pos1 + 1
    Loop Until pos1 = 0

    'pos now points at the first character after the last backslash
    extractfilename = Mid(strfile, pos)
End Function
Another important thing is to add a reference to the common dialog control. In the Visual Basic editor, click Tools, then References, find "Microsoft Common Dialog Control 6.0" in the list and tick it. If it isn't in the list, browse for the file "COMDLG32.OCX". This is needed for the "Open File" form to work.
Create a new module in the Visual Basic window of an empty document and paste the code there. Then click the "play macro" (Run) button on the toolbar. An "Open File" window will appear: select all the doc files you need to convert and click "Open". Word will then quickly open each file and save it as HTML.
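If the Common Dialog control can't be found on a given machine, one possible alternative is Word's built-in file picker, Application.FileDialog. This is only a sketch of a swapped-in technique, not the method used in the macro above. The name BatchConvertWithFileDialog is made up for illustration, and the code assumes that the Wordconvert sub from the macro is in the same module and that the project has the default reference to the Microsoft Office object library (that is where msoFileDialogFilePicker is defined).

```vba
' Hypothetical alternative entry point: uses Word's own file picker
' instead of the Common Dialog control. Assumes Wordconvert (from the
' macro in this post) is available in the same module.
Sub BatchConvertWithFileDialog()
    Dim vItem As Variant

    With Application.FileDialog(msoFileDialogFilePicker)
        .AllowMultiSelect = True
        .Filters.Clear
        .Filters.Add "Word document", "*.doc"

        ' Show returns -1 when the user confirms, 0 on Cancel
        If .Show = -1 Then
            ' Each selected item is already a full path
            For Each vItem In .SelectedItems
                Wordconvert CStr(vItem)
            Next vItem
        End If
    End With
End Sub
```

With this version there are no zero characters to parse, because SelectedItems returns one complete path per file.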
Just one thing is important to remember: if you plan to extract the title and author from the file, the filename MUST be in the following format:
<author> - <title>.doc
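The macro splits the name at the first " - " (space, dash, space), so that sequence must not appear in the author name. To be safe, avoid dashes in both the author and the title.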
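To make the naming rule concrete, here is a minimal sketch of the split on " - ". SplitAuthorTitle and DemoSplitAuthorTitle are hypothetical helpers written for illustration, not part of the macro, and the file names are made-up examples. The sketch expects the file name without its path and without the ".doc" extension.

```vba
' Hypothetical helper: splits "<author> - <title>" at the first " - ".
Sub SplitAuthorTitle(ByVal strBaseName As String, _
                     ByRef strAuthor As String, ByRef strTitle As String)
    Dim sepPos As Long

    sepPos = InStr(1, strBaseName, " - ")
    If sepPos = 0 Then
        ' No separator found: treat the whole name as the title
        strAuthor = ""
        strTitle = Trim(strBaseName)
    Else
        strAuthor = Trim(Left(strBaseName, sepPos - 1))
        strTitle = Trim(Mid(strBaseName, sepPos + 3))
    End If
End Sub

Sub DemoSplitAuthorTitle()
    Dim strAuthor As String, strTitle As String

    ' Well-formed name
    SplitAuthorTitle "Jules Verne - Around the World in Eighty Days", strAuthor, strTitle
    Debug.Print strAuthor   ' Jules Verne
    Debug.Print strTitle    ' Around the World in Eighty Days

    ' A " - " inside the author part breaks the split
    SplitAuthorTitle "Smith - Jones - Some Title", strAuthor, strTitle
    Debug.Print strAuthor   ' Smith
    Debug.Print strTitle    ' Jones - Some Title
End Sub
```

Run DemoSplitAuthorTitle and check the Immediate window (Ctrl+G in the Visual Basic editor) to see the results.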
That's the best I could do in half an hour, but I hope that helps.
PS: That's the last one, I swear...
|02-13-2011, 11:59 PM||#2|
US Navy, Retired
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7