#1
Member
Posts: 20
Karma: 10
Join Date: Feb 2011
Device: Kindle DX
Batch doc conversion
Here I am again.
I needed to convert a batch of .doc files to .mobi. The problem is that Calibre doesn't handle .doc conversion, so one workaround is to go through the HTML format. That is fine for three or four documents, but with fifty it doesn't look so good anymore. What I managed to do is write a macro that opens, converts and saves a group of files from MS Word, which of course means you must have MS Word installed. I used 2007, but 2003 should work as well. I'm not going to detail how to write a VBA macro; try Google first. If you're still stuck, I'll try to help. The macro is this:
Code:
Sub BatchConvertToHTML()
    Dim dlgInputFolder As New CommonDialog
    Dim strSelection As String
    Dim strNames() As String
    Dim x As Long

    dlgInputFolder.MaxFileSize = 32000
    dlgInputFolder.Flags = cdlOFNAllowMultiselect + cdlOFNExplorer + cdlOFNLongNames
    dlgInputFolder.Filter = "Word document (*.doc)|*.doc"
    dlgInputFolder.ShowOpen
    strSelection = dlgInputFolder.FileName
    'Nothing selected (dialog cancelled): nothing to do
    If Len(strSelection) = 0 Then Exit Sub
    'With multiselect the dialog returns the folder followed by each file name,
    'separated by null characters; put each part in a string array
    strNames = Split(strSelection, vbNullChar)
    'Conversion starting
    If UBound(strNames) = 0 Then
        'a single selected file comes back as one full path
        Wordconvert strNames(0)
    Else
        For x = 1 To UBound(strNames)
            If Len(strNames(x)) > 0 Then Wordconvert strNames(0) & "\" & strNames(x)
        Next
    End If
End Sub
Sub Wordconvert(strDoc As String)
    Dim strAuthor As String, strTitle As String
    Dim strDocName As String, strFileName As String
    Dim docCurrent As Document
    'extract author and title from file name
    strDocName = Left(strDoc, Len(strDoc) - 4)
    strFileName = extractfilename(strDocName)
    strAuthor = Trim(Left(strFileName, InStr(1, strFileName, " - ")))
    strTitle = Trim(Right(strFileName, Len(strFileName) - InStr(1, strFileName, " - ") - 2))
    strDocName = strDocName & ".html"
    'open the document and keep a reference to it,
    'so it can still be closed after SaveAs changes its name
    Set docCurrent = Documents.Open(strDoc)
    'if your word document already has title and author correctly set, just comment or delete the two following lines
    docCurrent.BuiltInDocumentProperties(wdPropertyAuthor).Value = strAuthor
    docCurrent.BuiltInDocumentProperties(wdPropertyTitle).Value = strTitle
    'save and close the document
    docCurrent.SaveAs FileFormat:=wdFormatFilteredHTML, FileName:=strDocName
    docCurrent.Close
End Sub
Function extractfilename(strfile As String) As String
    'Simply put, the file string passed by common dialog is complete with the full path
    'Here I strip out the path and take only the file name, to extract author and title
    Dim pos As Long, pos1 As Long
    pos = 1
    Do
        pos1 = InStr(pos, strfile, "\")
        If pos1 > 0 Then pos = pos1 + 1
    Loop Until pos1 = 0
    'pos now points at the first character after the last backslash
    extractfilename = Mid(strfile, pos)
End Function
Another important thing is to activate a reference to the common dialog control. In the Visual Basic editor, click Tools, then References, look for "Microsoft Common Dialog Control 6.0" and select it. If you don't find it in the list, browse for the file "COMDLG32.OCX". This is needed for the "Open File" form to work.

Create a new module in the Visual Basic window of an empty document and paste the code there. Then click the "play macro" button on the toolbar. An "open file" window will appear: select all the doc files you need to convert and click "Open". Word will then quickly open each file and save it as HTML.

Just one thing is important to remember: if you plan to extract the title and author from the file name, the name MUST be in the following format:
<author> - <title>.doc
and no dashes are allowed in the author name or the title.

That's the best I could do in half an hour, but I hope it helps.

Bye
Paul

PS: That's the last one, I swear...
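Since the macro relies on the `<author> - <title>.doc` naming rule, it may help to check file names before running the batch. The following is only a sketch of such a check, not part of the macro above: `IsAuthorTitleName` and `DemoIsAuthorTitleName` are hypothetical names, and the function expects a bare file name without its path.

```vba
'Hypothetical helper, not part of the original macro.
'Returns True when a file name (without path) looks like "<author> - <title>.doc"
Function IsAuthorTitleName(strName As String) As Boolean
    Dim lngSep As Long
    Dim strBase As String
    Dim strAuthor As String, strTitle As String

    IsAuthorTitleName = False
    'the name must end in .doc
    If LCase(Right(strName, 4)) <> ".doc" Then Exit Function
    strBase = Left(strName, Len(strName) - 4)
    'the " - " separator must be present
    lngSep = InStr(1, strBase, " - ")
    If lngSep = 0 Then Exit Function
    strAuthor = Trim(Left(strBase, lngSep - 1))
    strTitle = Trim(Mid(strBase, lngSep + 3))
    'both parts must be non-empty
    If Len(strAuthor) = 0 Or Len(strTitle) = 0 Then Exit Function
    'no other dashes are allowed in the author or the title
    If InStr(1, strAuthor, "-") > 0 Then Exit Function
    If InStr(1, strTitle, "-") > 0 Then Exit Function
    IsAuthorTitleName = True
End Function

Sub DemoIsAuthorTitleName()
    'results go to the Immediate window (Ctrl+G in the Visual Basic editor)
    Debug.Print IsAuthorTitleName("Jane Austen - Emma.doc")
    Debug.Print IsAuthorTitleName("Emma.doc")
End Sub
```

In the demo, the first name follows the convention and should be accepted, while the second has no separator and should be rejected. One possible use is to call the function on each selected name and skip, or at least report, the files that fail the check.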
#2
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen