Old 02-13-2011, 12:12 PM   #1
webwizard
Batch doc conversion

Here I am again.

I needed to convert a batch of .doc files to .mobi. Calibre doesn't handle .doc conversion, so one workaround is to go through HTML first. That is fine for three or four documents, but with fifty it stops looking so good.

What I came up with is a macro that opens, converts and saves a group of files from MS Word. That means, of course, that you must have MS Word installed. I used 2007, but 2003 should work as well.

I'm not going to explain how to write a VBA macro; try Google first. If you're still stuck, I'll try to help.

The macro is this:
Code:
Sub BatchConvertToHTML()

Dim dlgInputFolder As New CommonDialog
Dim strFileName As String
Dim strNames() As String

dlgInputFolder.MaxFileSize = 32000
dlgInputFolder.Flags = cdlOFNAllowMultiselect + cdlOFNExplorer + cdlOFNLongNames
dlgInputFolder.Filter = "Word document (*.doc)|*.doc"

dlgInputFolder.ShowOpen
'Parse the zeroes in the string
For x = 1 To Len(dlgInputFolder.FileName)
If Asc(Mid(dlgInputFolder.FileName, x, 1)) = 0 Then int_zeroes = int_zeroes + 1
Next

ReDim strNames(int_zeroes) As String
int_index = 0

'put each file name in a string array
For x = 1 To Len(dlgInputFolder.FileName)
If Asc(Mid(dlgInputFolder.FileName, x, 1)) = 0 Then
    int_index = int_index + 1
Else
    strNames(int_index) = strNames(int_index) + (Mid(dlgInputFolder.FileName, x, 1))
End If
Next
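'Conversion starting
If int_zeroes = 0 Then
    'A single selection is returned as one full path without separators;
    'an empty string means the dialog was cancelled
    If Len(strNames(0)) > 0 Then Wordconvert strNames(0)
Else
    For x = 1 To int_zeroes
        Wordconvert strNames(0) & "\" & strNames(x)
    Next
End If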
    
End Sub


Sub Wordconvert(strDoc As String)

   Dim strAuthor As String, strTitle As String
   Dim strDocName As String, strFileName As String
'extract author and title from file name
   strDocName = Left(strDoc, Len(strDoc) - 4)
   strFileName = extractfilename(strDocName)
   strAuthor = Trim(Left(strFileName, InStr(1, strFileName, " - ")))
   strTitle = Trim(Right(strFileName, Len(strFileName) - InStr(1, strFileName, " - ") - 2))
   strDocName = strDocName & ".html"
'open the document and keep a reference to it
   Dim doc As Document
   Set doc = Documents.Open(strDoc)
'if your word document already has title and author correctly set, just comment or delete the two following lines
   doc.BuiltInDocumentProperties(wdPropertyAuthor).Value = strAuthor
   doc.BuiltInDocumentProperties(wdPropertyTitle).Value = strTitle
'save as filtered HTML and close the document
   doc.SaveAs FileFormat:=wdFormatFilteredHTML, FileName:=strDocName
   doc.Close
End Sub

Function extractfilename(strfile As String) As String

'Simply put, the file string passed by common dialog is complete with the full path
'Here I strip out the path and take only the file name, to extract author and title

pos = 1
pos1 = 1

Do

    pos1 = InStr(pos, strfile, "\")
    If pos1 > 0 Then pos = pos1 + 1

Loop Until pos1 = 0

'pos now points at the first character after the last backslash
extractfilename = Mid(strfile, pos)

End Function
I know the code isn't perfect, but it was written in half an hour, so take it as it is.
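
One possible simplification, as a sketch only: the two loops that count the zeroes and fill the array could probably be replaced by VBA's Split function. The sub name and the sample string below are made up for illustration; the sample just imitates what the common dialog returns for a multiple selection.

```vb
Sub SplitSelectionDemo()
    'Hypothetical sample of what the dialog returns for a multiple selection:
    'the folder first, then each file name, separated by null characters
    Dim strSelection As String
    Dim strNames() As String
    Dim i As Long

    strSelection = "C:\Books" & vbNullChar & "A - One.doc" & vbNullChar & "B - Two.doc"
    strNames = Split(strSelection, vbNullChar)

    'Element 0 is the folder, elements 1 to UBound are the files
    For i = 1 To UBound(strNames)
        Debug.Print strNames(0) & "\" & strNames(i)
    Next i
End Sub
```

Run it from the Visual Basic editor and look at the Immediate window (Ctrl+G) to see the full paths that would be passed to Wordconvert.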

Another important step is to add a reference to the common dialog control. In the Visual Basic editor, click Tools, then References, look for "Microsoft Common Dialog Control 6.0" and tick it. If it isn't in the list, browse for the file "COMDLG32.OCX". This is needed for the "Open File" form to work.

Create a new module in the Visual Basic editor of an empty document and paste the code there. Then click the "play" (Run) button on the toolbar. An "Open file" window will appear: select all the .doc files you need to convert and click "Open". Word will then quickly open each file and save it as HTML.
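
If COMDLG32.OCX is not available on your machine, an alternative worth trying is the file picker built into Office (Application.FileDialog), which should not need the extra reference. This is an untested sketch, not what I used: the sub name BatchConvertWithFileDialog is made up, and it assumes the Wordconvert sub from the macro above is in the same module.

```vb
Sub BatchConvertWithFileDialog()
    'Hypothetical variant of BatchConvertToHTML that uses the file picker
    'built into Office instead of the Common Dialog control
    Dim dlg As FileDialog
    Dim varItem As Variant

    Set dlg = Application.FileDialog(msoFileDialogFilePicker)
    dlg.AllowMultiSelect = True
    dlg.Filters.Clear
    dlg.Filters.Add "Word document", "*.doc"

    'Show returns -1 when the user confirms and 0 on Cancel
    If dlg.Show = -1 Then
        'Each selected item is already a full path, so no parsing is needed
        For Each varItem In dlg.SelectedItems
            Wordconvert CStr(varItem)
        Next varItem
    End If
End Sub
```

Since every selected item comes back as a complete path, the part of the macro that parses the zeroes is not needed with this approach.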

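One important thing to remember: if you plan to extract the title and author from the file name, the file name MUST be in the following format:
<author> - <title>.doc
The macro splits the name at the first " - " (space, dash, space), so the author in particular must not contain that sequence. The safest choice is to avoid dashes in both the author and the title.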
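
To make the naming rule concrete, here is a small stand-alone sketch of the split. The sub name and the sample file name are only examples, and the name is assumed to have its path and extension already removed.

```vb
Sub SplitNameDemo()
    'Hypothetical helper, not part of the macro: shows how a file name
    'without path and extension is divided at the first " - "
    Dim strFileName As String
    Dim lngSep As Long

    strFileName = "Jules Verne - Around the World in 80 Days"
    lngSep = InStr(1, strFileName, " - ")

    If lngSep = 0 Then
        Debug.Print "No separator found, name cannot be split"
    Else
        'Everything before the separator is the author, everything after is the title
        Debug.Print "Author: " & Trim(Left(strFileName, lngSep))
        Debug.Print "Title:  " & Trim(Mid(strFileName, lngSep + 3))
    End If
End Sub
```

For the sample name this prints "Jules Verne" as the author and "Around the World in 80 Days" as the title in the Immediate window.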

That's the best I could do in half an hour, but I hope that helps.

Bye

Paul

PS: That's the last one, I swear...
Old 02-13-2011, 11:59 PM   #2
DoctorOhh
Quote:
Originally Posted by webwizard
I needed to convert a batch of .doc files to .mobi. Calibre doesn't handle .doc conversion, so one workaround is to go through HTML first. That is fine for three or four documents, but with fifty it stops looking so good.
Excellent info, but for anyone reading this thread whose eyes just glazed over and who wants to do a mass conversion of doc files to mobi files: read this post and this post, and change any occurrence of epub to mobi.