|06-07-2011, 02:00 PM||#1|
Join Date: Sep 2010
Device: Sony PRS 650
Batch convert MS Word to other formats
I wrote this in partial response to a request, but because of its complexity it's getting its own thread.
Here's a macro to batch convert .doc files. If there are any errors blame it on Cabernet Merlot and report problems here.
Please note that I'm including full instructions to help anyone who is not familiar with VBA macros. Please do not be offended if you already know this stuff. It's intended also for beginners who happen upon this thread.
Word docs can be batch converted to TXT, RTF, or Filtered HTML and in Word 2007 you can 'export' to PDF.
I created the macro to be used with Word 2003 and 2007.
If you wish to take advantage of Word 2007's ability to export as PDF you must remove the apostrophe at the start of this line:-
'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
- as unfortunately I haven't the time to find a way to work around the compile error that occurs with this line in Word 2003. As you'd expect Word 2003 simply doesn't know 'wdExportFormatPDF' which became available in Word 2007.
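One possible workaround for that compile error, offered as an untested sketch rather than as part of the original macro: make the call through a variable declared As Object and pass the number 17 (the value of wdExportFormatPDF) instead of the named constant. A call on an Object variable is only looked up at run time, so Word 2003 should have nothing to object to when compiling. The helper name ExportDocAsPdf is made up for this illustration.

```vba
' Hypothetical helper, not part of the original macro.
' The parameter is declared As Object so the call is late-bound:
' the method name is resolved at run time, not at compile time.
Sub ExportDocAsPdf(ByVal doc As Object, ByVal strPdfName As String)
    Const PDF_FORMAT As Long = 17   ' numeric value of wdExportFormatPDF

    ' Only attempt the export in Word 2007 (version 12) or later.
    If Val(Application.Version) >= 12 Then
        doc.ExportAsFixedFormat OutputFileName:=strPdfName, ExportFormat:=PDF_FORMAT
    End If
End Sub
```

If this works for you, the commented-out line in the macro could be replaced with: ExportDocAsPdf d, strDocName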
All your doc files go in one folder. You open Word which has this macro in it. You run the macro. All the doc files are loaded, converted and saved in a new folder. Your original docs are unchanged in the first folder.
Here's the code.
Option Explicit

Sub ChangeDocsToTxtOrRTFOrHTML()
'with export to PDF in Word 2007
    Dim fs As Object
    Dim oFolder As Object
    Dim tFolder As Object
    Dim oFile As Object
    Dim strDocName As String
    Dim intPos As Integer
    Dim locFolder As String
    Dim fileType As String

    On Error Resume Next
    locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\myDocs")
    Select Case Application.Version
        Case Is < 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
        Case Is >= 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
    End Select
    Application.ScreenUpdating = False
    Set fs = CreateObject("Scripting.FileSystemObject")
    Set oFolder = fs.GetFolder(locFolder)
    Set tFolder = fs.CreateFolder(locFolder & "Converted")
    Set tFolder = fs.GetFolder(locFolder & "Converted")
    For Each oFile In oFolder.Files
        Dim d As Document
        Set d = Application.Documents.Open(oFile.Path)
        strDocName = ActiveDocument.Name
        intPos = InStrRev(strDocName, ".")
        strDocName = Left(strDocName, intPos - 1)
        ChangeFileOpenDirectory tFolder
        Select Case fileType
            Case Is = "TXT"
                strDocName = strDocName & ".txt"
                ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
            Case Is = "RTF"
                strDocName = strDocName & ".rtf"
                ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
            Case Is = "HTML"
                strDocName = strDocName & ".html"
                ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
            Case Is = "PDF"
                strDocName = strDocName & ".pdf"
                ' *** Word 2007 users - remove the apostrophe at the start of the next line ***
                'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
        End Select
        d.Close
        ChangeFileOpenDirectory oFolder
    Next oFile
    Application.ScreenUpdating = True
End Sub
Copy the code into Notepad
In Notepad > Format > uncheck 'Word Wrap'. IMPORTANT as broken lines won't compile.
In Word 2003:
From the View menu select Toolbars > Visual Basic.
On the Visual Basic toolbar click the Visual Basic Editor icon (hover the cursor to find it) or press ALT + F11.
In Word 2007:
Click the big multicoloured cloverleaf icon at top left. Click the Word Options button (at the very bottom). Check 'Show Developer tab in the Ribbon' and click 'OK'. Now on the same line as 'Home', at the far right, you'll see 'Developer'. Click this. At the left end of this toolbar click 'Visual Basic'.
In the Visual Basic Editor > View > Project Explorer (but it may be showing already).
Click the plus sign next to 'Normal'.
Click the plus sign next to 'Modules'.
EITHER double click 'NewMacros' to open its code panel and scroll to the end of any macros, if present.
OR right click Normal > Insert > Module to create a new module instead. This will probably be named 'Module 1' and appears in the 'Modules' folder above 'NewMacros'. Double click 'Module 1' to open its code panel.
Copy ALL the code (Ctrl A, Ctrl C) from Notepad and paste it into the code panel in the Visual Basic Editor. (If you are pasting below existing macros, leave out the 'Option Explicit' line, as it is only allowed at the very top of a module.)
Organise your Doc files:
Leave Word for the moment.
You must now place your Word docs into a single folder. (Start with a couple of doc files to try it out)
I suggest that you put this folder somewhere its 'long path name' will be short, e.g. in your root directory. You'll need the 'long path name' of this folder.*(see next)
*How to get the file path:
Open the folder with your docs in and copy the path from the address bar.
If it's not showing: Tools > Folder Options > View > CHECK 'Display the full path in the address bar' > OK.
You can make this the default path in your macro by changing the line:
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\myDocs")
to:
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "Put YOUR Path Here")
'Default' means if you click 'OK' instead of inserting a new path this is the path used.
Your converted files (Text, RTF, HTML files) will go into a folder adjacent to the folder holding the doc files.
This folder will be created if it doesn't already exist and be named 'myDocsConverted' or 'whatever you've called the folder' + 'Converted'.
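As written, the macro simply tries to create the folder every time and relies on 'On Error Resume Next' to carry on if it is already there. For anyone who would rather make that check explicit, here is a small sketch. Both function names are invented for this example, and stripping a trailing backslash from the path is my own assumption about what people might type into the prompt.

```vba
' Hypothetical helper: builds the output folder path the same way the macro
' does (source path & "Converted"), but first drops a trailing backslash
' so that "C:\myDocs\" and "C:\myDocs" give the same result.
' A bare drive root such as "C:\" is left alone.
Function ConvertedFolderPath(ByVal locFolder As String) As String
    If Len(locFolder) > 3 And Right$(locFolder, 1) = "\" Then
        locFolder = Left$(locFolder, Len(locFolder) - 1)
    End If
    ConvertedFolderPath = locFolder & "Converted"
End Function

' Hypothetical helper: creates the folder only when it is missing,
' then returns the FileSystemObject Folder object for it.
Function EnsureFolder(ByVal folderPath As String) As Object
    Dim fs As Object
    Set fs = CreateObject("Scripting.FileSystemObject")
    If Not fs.FolderExists(folderPath) Then
        fs.CreateFolder folderPath
    End If
    Set EnsureFolder = fs.GetFolder(folderPath)
End Function
```

In the macro, the two 'Set tFolder' lines could then become: Set tFolder = EnsureFolder(ConvertedFolderPath(locFolder))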
To run the code:
From the Visual Basic Editor: either place the cursor somewhere inside the code and click the 'Run' icon (a triangle), or press F5.
Or go into Word. In Word 2003, click the 'Run' icon and select this macro 'ChangeDocsToTxtOrRTFOrHTML'.
In Word 2007, go to the Developer tab, click Macros and select this macro 'ChangeDocsToTxtOrRTFOrHTML'.
You will be asked for the location of the folder (entered as a long path name). If you've amended the default path in the macro you can simply click OK.
You will be asked if you want to save copies as Text, RTF or HTML (filtered) with TXT the default. (Also PDF in Word 2007)
There will be some screen flicker as each file is loaded, saved and closed.
That's it. Done!
When you close Word, the macro will be saved in 'Normal.dot' ('Normal.dotm' in Word 2007), either in 'NewMacros' or 'Module 1'. You can re-use it whenever you open Word.
To remove it, select all the code and delete.
~ "Word cannot give a document the same name as an open document" ~
If you have a txt, rtf or html file already open in Word and try to SaveAs with the same filename and same extension, it will cause the macro to error. If this happens, click 'End' in the dialog that appears and close the offending file.
If you click 'Debug' by mistake, go into the Visual Basic Editor (ALT + F11) and click the 'Stop' icon (a square near the triangle). Close the offending file.
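If you want the macro to steer clear of this situation by itself, one option is to look through Word's open documents before saving. This is only a sketch and I have not tried it against every Word version. The function name IsOpenInWord is made up, and it compares file names only (not full paths), on the assumption that this is what Word is complaining about.

```vba
' Hypothetical helper: returns True when a document with the given
' file name (e.g. "letter.txt") is already open in this Word session.
Function IsOpenInWord(ByVal strFileName As String) As Boolean
    Dim doc As Document
    For Each doc In Application.Documents
        ' vbTextCompare makes the comparison case-insensitive.
        If StrComp(doc.Name, strFileName, vbTextCompare) = 0 Then
            IsOpenInWord = True
            Exit Function
        End If
    Next doc
End Function
```

Inside the loop you could then skip a file when IsOpenInWord(strDocName) is True, checking after the new extension has been added to strDocName.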
Be aware that there is considerable variation in file size as you change format:
From smallest to largest, with a text-only file it's often:- txt < html < doc < rtf < pdf
but with an image included:- html - doc < pdf < rtf. (Watch out for a file with images becoming too big as a single file for your ebook reader.)
If you get notices regarding Macro Security then you'll need to alter your security settings within MS Office.
|06-11-2011, 03:27 PM||#3|
Join Date: Sep 2010
Device: Sony PRS 650
Totally ignore negative, unhelpful comments. Small minds...
BTW, I wrote the macro assuming sensible use, but if you want to experiment try putting files other than .docs in the start folder, eg RTF, HTML, TXT and for a laugh try JPG!!! It would be easy to include a check for the file-type before converting but currently too many other irons in the fire.
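For what it's worth, such a check might look something like the sketch below. The function name IsWordDoc is invented for this example, accepting .docx as well as .doc is an assumption on my part, and skipping names that begin with "~$" is there because Word leaves temporary lock files with that prefix next to open documents.

```vba
' Hypothetical helper: True only for names ending in .doc or .docx
' (case-insensitive), excluding Word's "~$" temporary lock files.
Function IsWordDoc(ByVal strFileName As String) As Boolean
    Dim intPos As Long
    Dim strExt As String

    If Left$(strFileName, 2) = "~$" Then Exit Function

    intPos = InStrRev(strFileName, ".")
    If intPos = 0 Then Exit Function          ' no extension at all

    strExt = LCase$(Mid$(strFileName, intPos + 1))
    IsWordDoc = (strExt = "doc" Or strExt = "docx")
End Function
```

To use it, wrap everything between 'For Each oFile In oFolder.Files' and 'Next oFile' in: If IsWordDoc(oFile.Name) Then ... End If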
If anyone is averse to using VBA macros then may I suggest you take a look at the Atlantis Word Processor (shareware) which has a batch convert facility.
Last edited by Faster; 06-18-2011 at 05:52 PM.
|08-28-2011, 05:46 AM||#4|
Join Date: Apr 2011
Device: Yarvik 310
At the risk of committing thread necromancy I have to first thank Faster for the help, and then ask if there is any way of writing that Put YOUR Path Here part so that it can handle subfolders?
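One way subfolders could be handled, as a rough sketch and not something taken from the original macro: walk the folder tree with the FileSystemObject first and collect the paths of the .doc files, then run the conversion over that list. The names ListDocsRecursive and DemoListDocs are invented here, and "C:\myDocs" is just the macro's default path used as a placeholder.

```vba
' Hypothetical helper: adds the full path of every .doc file in oFolder,
' and in all folders below it, to colPaths.
Sub ListDocsRecursive(ByVal oFolder As Object, ByVal colPaths As Collection)
    Dim oFile As Object
    Dim oSub As Object

    For Each oFile In oFolder.Files
        If LCase$(Right$(oFile.Name, 4)) = ".doc" Then
            colPaths.Add oFile.Path
        End If
    Next oFile

    ' Recurse into each subfolder.
    For Each oSub In oFolder.SubFolders
        ListDocsRecursive oSub, colPaths
    Next oSub
End Sub

' Small demo: prints the collected paths to the Immediate window.
Sub DemoListDocs()
    Dim fs As Object
    Dim colPaths As New Collection
    Dim v As Variant

    Set fs = CreateObject("Scripting.FileSystemObject")
    ListDocsRecursive fs.GetFolder("C:\myDocs"), colPaths   ' placeholder path

    For Each v In colPaths
        Debug.Print v
    Next v
End Sub
```

The macro's loop could then open each collected path in turn instead of looping over oFolder.Files. Note that all converted files would still land in the single 'Converted' folder, so two documents with the same name in different subfolders would collide.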
|04-12-2013, 08:00 AM||#6|
Join Date: Apr 2013
First time writer, long time reader!
I have used this macro and it works great. I have a couple of questions though:
Is there a way of moving the original documents to be converted to an Archive folder as part of this macro, to avoid them being converted more than once?
Is there a way of setting the macro so you don't have to click OK on the two options that come up and it just uses pre-defined answers, i.e. I always want it to convert to the same folder and I always want it to convert to PDF.
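Both look doable, with the caveat that what follows is a sketch and has not been tested as part of the original macro. For the fixed answers, the two InputBox prompts can be replaced by plain assignments (shown as comments below; the values are placeholders). For the archive, a separate routine can move the originals after the conversion loop has finished. The name ArchiveOriginals and the idea of a dedicated archive folder are assumptions made for this example.

```vba
' In place of the two InputBox prompts (placeholder values, change to suit):
'     locFolder = "C:\myDocs"
'     fileType = "PDF"

' Hypothetical helper: moves every file in strSourceFolder into
' strArchiveFolder. Run it after the conversion loop, once the
' documents have been closed.
Sub ArchiveOriginals(ByVal strSourceFolder As String, ByVal strArchiveFolder As String)
    Dim fs As Object
    Dim oFile As Object
    Dim colPaths As New Collection
    Dim v As Variant

    Set fs = CreateObject("Scripting.FileSystemObject")
    If Not fs.FolderExists(strArchiveFolder) Then
        fs.CreateFolder strArchiveFolder
    End If

    ' Collect the paths first so the folder is not changed
    ' while it is still being enumerated.
    For Each oFile In fs.GetFolder(strSourceFolder).Files
        colPaths.Add oFile.Path
    Next oFile

    For Each v In colPaths
        fs.MoveFile CStr(v), fs.BuildPath(strArchiveFolder, fs.GetFileName(CStr(v)))
    Next v
End Sub
```

A call such as ArchiveOriginals locFolder, "C:\myDocsArchive" (the second path being a placeholder) would go just before 'End Sub' in the macro. Be aware that MoveFile raises an error if a file of the same name is already in the archive folder.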
|07-05-2013, 03:46 PM||#7|
Join Date: Jul 2013
worth registering just to say thank you
It is not an exaggeration to say that this post just saved me about a day of work. I hate registering on sites I don't use regularly just to post a comment, but it was worth it just to say thank you!
I've experimented with at least 10 different ways of converting Word DOCX -> PDF, and all of them had drawbacks, sometimes really horrible drawbacks like badly mangling the document. This allows me to use Word's native PDF conversion routine, which is generally the highest quality PDF I can generate from a Word document, in a big batch.
If someone could post a way to eliminate the "save document" dialog as this macro rotates through documents -- I do not want to edit the original docs in any way -- that would save me additional minutes of productivity. :^)
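Assuming the dialog in question is Word asking whether to save changes as each document is closed, one thing to try is telling Close not to save. This is a sketch only; the wrapper name CloseWithoutSaving is made up for illustration.

```vba
' Hypothetical helper: closes the document and discards any unsaved
' changes, so Word has no reason to prompt. Files already written by
' SaveAs or ExportAsFixedFormat are not affected.
Sub CloseWithoutSaving(ByVal d As Document)
    d.Close SaveChanges:=wdDoNotSaveChanges
End Sub
```

In the macro itself it is enough to change the line 'd.Close' to 'd.Close SaveChanges:=wdDoNotSaveChanges'. Since nothing is written back, the original documents stay exactly as they were.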