Old 06-07-2011, 02:00 PM   #1
Faster
Connoisseur
Posts: 61
Karma: 12096
Join Date: Sep 2010
Location: Tasmania
Device: Sony PRS 650
Batch convert MS Word to other formats

I wrote this in partial response to a request, but because of its complexity it's getting its own thread.

Here's a macro to batch convert .doc files. If there are any errors blame it on Cabernet Merlot and report problems here.

Please note that I'm including full instructions to help anyone who is not familiar with VBA macros. Please do not be offended if you already know this stuff. It's intended also for beginners who happen upon this thread.

Word docs can be batch converted to TXT, RTF, or Filtered HTML and in Word 2007 you can 'export' to PDF.
I created the macro to be used with Word 2003 and 2007.
If you wish to take advantage of Word 2007's ability to export as PDF you must remove the apostrophe at the start of this line:-

'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF

- as unfortunately I haven't the time to find a way to work around the compile error that occurs with this line in Word 2003. As you'd expect Word 2003 simply doesn't know 'wdExportFormatPDF' which became available in Word 2007.
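One possible way round that compile error, offered as a sketch rather than a tested replacement: call the export through a variable declared As Object, so Word resolves the method when the macro runs instead of when it compiles, and pass the numeric value 17 in place of the named constant wdExportFormatPDF. The names ExportActiveDocAsPdf, strPdfPath and PDF_FORMAT are made up for this example.

```vba
Sub ExportActiveDocAsPdf(ByVal strPdfPath As String)
    ' 17 is the numeric value of wdExportFormatPDF, which Word 2003 doesn't define
    Const PDF_FORMAT As Long = 17
    ' Late-bound Object: the method name is only looked up at run time,
    ' so this still compiles in Word 2003
    Dim oDoc As Object
    Set oDoc = ActiveDocument
    If Val(Application.Version) >= 12 Then
        oDoc.ExportAsFixedFormat OutputFileName:=strPdfPath, ExportFormat:=PDF_FORMAT
    Else
        MsgBox "PDF export needs Word 2007 or later."
    End If
End Sub
```

With something like this in the module, the PDF branch could call ExportActiveDocAsPdf strDocName without anyone having to edit the code for their Word version.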

Overview:
All your doc files go in one folder. You open Word which has this macro in it. You run the macro. All the doc files are loaded, converted and saved in a new folder. Your original docs are unchanged in the first folder.

Here's the code.
Code:
Option Explicit

Sub ChangeDocsToTxtOrRTFOrHTML()
'with export to PDF in Word 2007
    Dim fs As Object
    Dim oFolder As Object
    Dim tFolder As Object
    Dim oFile As Object
    Dim strDocName As String
    Dim intPos As Integer
    Dim locFolder As String
    Dim fileType As String
    On Error Resume Next
    locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\myDocs")
    'Val() turns the version string, eg "11.0", into a number before comparing
    Select Case Val(Application.Version)
        Case Is < 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
        Case Is >= 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
    End Select
    Application.ScreenUpdating = False
    Set fs = CreateObject("Scripting.FileSystemObject")
    Set oFolder = fs.GetFolder(locFolder)
    Set tFolder = fs.CreateFolder(locFolder & "Converted")
    Set tFolder = fs.GetFolder(locFolder & "Converted")
    For Each oFile In oFolder.Files
        Dim d As Document
        Set d = Application.Documents.Open(oFile.Path)
        strDocName = ActiveDocument.Name
        intPos = InStrRev(strDocName, ".")
        strDocName = Left(strDocName, intPos - 1)
        ChangeFileOpenDirectory tFolder
        Select Case fileType
        Case Is = "TXT"
            strDocName = strDocName & ".txt"
            ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
        Case Is = "RTF"
            strDocName = strDocName & ".rtf"
            ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
        Case Is = "HTML"
            strDocName = strDocName & ".html"
            ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
        Case Is = "PDF"
            strDocName = strDocName & ".pdf"

            ' *** Word 2007 users - remove the apostrophe at the start of the next line ***
            'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
            
        End Select
        d.Close
        ChangeFileOpenDirectory oFolder
    Next oFile
    Application.ScreenUpdating = True
End Sub
Putting the macro into Word:
Copy the code into Notepad
In Notepad > Format > uncheck 'Word Wrap'. IMPORTANT as broken lines won't compile.

Open Word
WORD 2003
From the View Menu select Toolbars > Visual Basic
On the Visual Basic toolbar click the Visual Basic icon (hover cursor to find it) or Press ALT + F11
WORD 2007
Click the big multicoloured cloverleaf icon in top left. Click Word Options button (at very bottom). Check 'Show Developer tab in the Ribbon' and click 'OK'. Now on the same line as 'Home', at the far right, you'll see 'Developer'. Click this. At the left end of this toolbar click 'Visual Basic'.

In the Visual Basic Editor > View > Project Explorer (but it may be showing already).

EITHER
Click the plus sign next to 'Normal'.
Click the plus sign next to 'Modules'.
Double click 'NewMacros' to open its code panel.
Scroll to the end of any macros, if present.

OR
Right click Normal > Insert > Module
This will probably be named 'Module 1' and is in the 'Modules' folder above 'NewMacros'.
Double click 'Module 1' to open its code panel.

Copy ALL the code (Ctrl A, Ctrl C) from Notepad and paste into the code panel in the Visual Basic Editor.

Organise your Doc files:
Leave Word for the moment.
You must now place your Word docs into a single folder. (Start with a couple of doc files to try it out)
I suggest putting this folder somewhere that keeps its full path (its 'long path name') short, eg in your root directory. You'll need the full path of this folder.*(see next)

*How to get the file path:
Open the folder with your docs in and copy the path from the address bar.
If it's not showing: Tools > Folder Options > View > CHECK 'Display the full path in the address bar' > OK.

You can make this the default path in your macro by changing the line:
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\myDocs")
to:
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "Put YOUR Path Here")

'Default' means if you click 'OK' instead of inserting a new path this is the path used.
Your converted files (Text, RTF, HTML files) will go into a folder adjacent to the folder holding the doc files.
This folder will be created if it doesn't already exist and be named 'myDocsConverted' or 'whatever you've called the folder' + 'Converted'.
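The folder naming described here can be sketched as a small helper. This is an illustration, not part of the macro above: GetTargetFolder and strTarget are hypothetical names, and the trailing-backslash handling is an added assumption so that "C:\myDocs\" and "C:\myDocs" both lead to "C:\myDocsConverted".

```vba
Function GetTargetFolder(ByVal locFolder As String) As String
    Dim fs As Object
    Dim strTarget As String
    Set fs = CreateObject("Scripting.FileSystemObject")
    ' Drop a trailing backslash, but leave a bare drive root such as "C:\" alone
    If Len(locFolder) > 3 And Right$(locFolder, 1) = "\" Then
        locFolder = Left$(locFolder, Len(locFolder) - 1)
    End If
    strTarget = locFolder & "Converted"
    ' Only create the folder when it isn't there yet
    If Not fs.FolderExists(strTarget) Then
        fs.CreateFolder strTarget
    End If
    GetTargetFolder = strTarget
End Function
```

Checking FolderExists first means the helper doesn't have to rely on On Error Resume Next to get past an already existing folder.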

To run the code:
From the Visual Basic Editor - Either place the cursor somewhere inside the code and click the 'Run' icon (a triangle) or press F5
or
Go into Word, Word 2003 - click the 'Run' icon and select this macro 'ChangeDocsToTxtOrRTFOrHTML'.
Or in Word 2007 - Developer tab - click Macros - select this macro 'ChangeDocsToTxtOrRTFOrHTML'.

Operation:
You will be asked for the location of the folder (entered as a long path name). If you've amended the default path in the macro you can simply click OK.
You will be asked if you want to save copies as Text, RTF or HTML (filtered) with TXT the default. (Also PDF in Word 2007)
There will be some screen flicker as each file is loaded, saved and closed.
That's it. Done!

When you close Word, the macro will be saved in 'Normal.dot' ('Normal.dotm' in Word 2007), either in 'NewMacros' or 'Module 1'. You can re-use it whenever you open Word.
To remove it, select all the code and delete.

Possible problems:
~ "Word cannot give a document the same name as an open document" ~
If you have a txt, rtf or html file already open in Word and the macro tries to SaveAs with the same filename and extension, the macro will stop with an error. If this happens, click 'End' in the dialog that appears and close the offending file.
If you click DeBug by mistake - Go into the Visual Basic Editor (ALT + F11) and click the 'Stop' icon (a square near the triangle). Close the offending file.

Be aware that there is considerable variation in file size as you change format:
From smallest to largest, with a text-only file it's often:- txt < html < doc < rtf < pdf
but with an image included:- html - doc < pdf < rtf. (Watch out for a file with images becoming too big a single file for your ebook reader.)

If you get notices regarding Macro Security then you'll need to alter your security settings within MS Office.
Old 06-10-2011, 02:07 AM   #2
judahis
Enthusiast
Posts: 31
Karma: 68
Join Date: Dec 2005
Thanks a lot. The guy who told me "come on, why so many files, concentrate on 'quality', not quantity" has no idea of why I have so many files, and your macro is greatly appreciated.
Old 06-11-2011, 03:27 PM   #3
Faster
Connoisseur
Posts: 61
Karma: 12096
Join Date: Sep 2010
Location: Tasmania
Device: Sony PRS 650
You're welcome.

Totally ignore negative, unhelpful comments. Small minds...

BTW, I wrote the macro assuming sensible use, but if you want to experiment try putting files other than .docs in the start folder, eg RTF, HTML, TXT and for a laugh try JPG!!! It would be easy to include a check for the file-type before converting but currently too many other irons in the fire.
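For anyone who wants that file-type check, here is a minimal sketch. IsWordDoc is a hypothetical helper name, and treating only .doc and .docx as convertible is an assumption you can widen. It also skips the "~$" lock files that Word leaves beside open documents.

```vba
Function IsWordDoc(ByVal strFileName As String) As Boolean
    Dim fs As Object
    Set fs = CreateObject("Scripting.FileSystemObject")
    IsWordDoc = False
    ' Ignore Word's temporary lock files, eg "~$report.doc"
    If Left$(strFileName, 2) = "~$" Then Exit Function
    ' GetExtensionName returns the extension without the dot
    Select Case LCase$(fs.GetExtensionName(strFileName))
        Case "doc", "docx"
            IsWordDoc = True
    End Select
End Function
```

In the macro, the body of the For Each loop could then be wrapped in If IsWordDoc(oFile.Name) Then ... End If so that anything else in the folder is left alone.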

If anyone is averse to using VBA macros then may I suggest you take a look at the Atlantis Word Processor (shareware) which has a batch convert facility.

Last edited by Faster; 06-18-2011 at 05:52 PM.
Old 08-28-2011, 05:46 AM   #4
ryublu
Member
Posts: 20
Karma: 10
Join Date: Apr 2011
Device: Yarvik 310
At the risk of committing thread necromancy I have to first thank Faster for the help, and then ask if there is any way of writing that Put YOUR Path Here part so that it can handle subfolders?
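One way subfolders might be handled, as a sketch only: walk the folder tree recursively with the FileSystemObject's SubFolders collection and gather the file paths first, then convert from that list. ListDocsInTree, DemoListDocs and colPaths are hypothetical names, and "C:\myDocs" is just the macro's default path.

```vba
Sub ListDocsInTree(ByVal oFolder As Object, ByRef colPaths As Collection)
    Dim oFile As Object
    Dim oSub As Object
    ' Collect the .doc files in this folder
    For Each oFile In oFolder.Files
        If LCase$(Right$(oFile.Name, 4)) = ".doc" Then colPaths.Add oFile.Path
    Next oFile
    ' Then repeat the same steps for every subfolder
    For Each oSub In oFolder.SubFolders
        ListDocsInTree oSub, colPaths
    Next oSub
End Sub

Sub DemoListDocs()
    Dim fs As Object
    Dim colPaths As New Collection
    Dim v As Variant
    Set fs = CreateObject("Scripting.FileSystemObject")
    ListDocsInTree fs.GetFolder("C:\myDocs"), colPaths
    ' Print each path found to the Immediate window (Ctrl+G in the editor)
    For Each v In colPaths
        Debug.Print v
    Next v
End Sub
```

Be aware that if everything is saved into one 'Converted' folder, two files with the same name in different subfolders would overwrite each other.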
Old 01-19-2013, 10:50 AM   #5
randyp1234
Member
Posts: 12
Karma: 10
Join Date: Jul 2010
Device: Nook
Thanks....

Just found this thread via Google looking for this exact solution.

Thanks so much!!!

Randy
Old 04-12-2013, 08:00 AM   #6
azwalker
Junior Member
Posts: 1
Karma: 10
Join Date: Apr 2013
Device: none
Question

Hello,

First time writer, long time reader!

I have used this macro and it works great, I have a couple of questions though:

Is there a way of moving the original documents to be converted to an Archive folder as part of this macro, to avoid them being converted more than once?

Is there a way of setting the macro so you don't have to click OK on the two options that come up and it just uses pre-defined answers, i.e. I always want it to convert to the same folder and I always want it to convert to PDF?

Cheers.
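Both requests look doable. The sketch below is untested against the macro above: SOURCE_FOLDER, FILE_TYPE and ArchiveOriginal are hypothetical names, and the 'Archive' folder naming simply mirrors the way the 'Converted' folder is named.

```vba
' Fixed answers that could stand in for the two InputBox prompts.
' Module-level constants must sit above the first Sub in the module.
Const SOURCE_FOLDER As String = "C:\myDocs"
Const FILE_TYPE As String = "PDF"

Sub ArchiveOriginal(ByVal strDocPath As String)
    Dim fs As Object
    Dim strArchive As String
    Set fs = CreateObject("Scripting.FileSystemObject")
    ' eg "C:\myDocsArchive", next to the source folder
    strArchive = SOURCE_FOLDER & "Archive"
    If Not fs.FolderExists(strArchive) Then fs.CreateFolder strArchive
    ' The trailing backslash tells MoveFile that the destination is a folder
    fs.MoveFile strDocPath, strArchive & "\"
End Sub
```

In the macro you would set locFolder = SOURCE_FOLDER and fileType = FILE_TYPE in place of the InputBox lines, and call ArchiveOriginal only after the document has been closed. MoveFile raises an error if a file of the same name is already in the archive, and moving files while the loop is still running through oFolder.Files may upset the enumeration, so it's probably safer to collect the paths first and move them afterwards.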
Old 07-05-2013, 03:46 PM   #7
rhaining
Junior Member
Posts: 1
Karma: 10
Join Date: Jul 2013
Device: none
Worth registering just to say thank you

It is not an exaggeration to say that this post just saved me about a day of work. I hate registering on sites I don't use regularly just to post a comment, but it was worth it just to say thank you!

I've experimented with at least 10 different ways of converting Word DOCX -> PDF, and all of them had drawbacks, sometimes really horrible drawbacks like badly mangling the document. This allows me to use Word's native PDF conversion routine, which is generally the highest quality PDF I can generate from a Word document, in a big batch.

If someone could post a way to eliminate the "save document" dialog as this macro rotates through documents -- I do not want to edit the original docs in any way -- that would save me additional minutes of productivity. :^)
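Assuming the dialog in question is Word's "do you want to save changes?" prompt when each document closes, one likely fix is to tell Close explicitly not to save. CloseWithoutSaving is a hypothetical wrapper name used only to keep the sketch self-contained.

```vba
Sub CloseWithoutSaving(ByVal d As Document)
    ' Discard any pending changes instead of prompting.
    ' The copy written by SaveAs or the PDF export is already on disk.
    d.Close SaveChanges:=wdDoNotSaveChanges
End Sub
```

In the macro itself this comes down to replacing the plain d.Close line with d.Close SaveChanges:=wdDoNotSaveChanges.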
Old 10-24-2015, 02:17 PM   #8
adoucette
Member
Posts: 24
Karma: 140
Join Date: Sep 2011
Device: Nook Color (rooted?)
Thank you, this saved me a good deal of time
Old 05-26-2016, 07:30 PM   #9
Gardoglee
Junior Member
Posts: 1
Karma: 10
Join Date: May 2016
Device: Nook and Kindle (have and use both)
Extremely useful, and a good base for other macros

This is a great macro, and has already saved me several hours of work, with the promise to save many more in the future. I think it will also serve as the basis for some additional variations I will write. I'm not good enough at VBA to have written this myself, but I will be able to tweak it a bit to do some other useful things. And for those who use sed, grep, Notepad++ or the Powershell regex equivalents, this macro can be the magic doorway to allow you to use those tools against folders full of Word documents.

Thanks!

Gardoglee

