Old 05-21-2015, 08:03 AM   #1
rebl
r.eads e.njoys b.ooks lol
Posts: 69
Karma: 560010
Join Date: Mar 2010
Location: It's time to get this Book a Rest
Device: Kindle 4 NT
RTF files with wrong DOC extension - batch identify and rename?

Hello community,

I have a number of *.doc files, but some of them are not really Word documents: they are RTF files with a wrong .doc extension.
If I open such a file in a plain text editor such as Notepad++, I can see the RTF syntax, so they really are RTF files.
Normally, if DOC files are associated in Windows with a viewer that also supports RTF (most of them do), one wouldn't even notice that the DOC file is not a DOC file but an RTF.
The problem is that in some circumstances, such as with programs that use Microsoft's wordconv.exe utility (included with the "Office Compatibility Pack") to batch convert DOC files to DOCX, the RTF files won't be converted and lead to errors or software freezing, depending on the software. The same applies to the doc-to-docx plugin in calibre.

In an older post (here) someone mentioned that there is a tool that can automatically scan for RTF files with a wrong DOC extension and rename them.
Does anyone know about such a tool?

Thank you.
Old 05-21-2015, 10:39 PM   #2
eschwartz
Ex-Helpdesk Junkie
Posts: 19,239
Karma: 83049305
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
The Linux `file` command identifies files by their content and does not look at the file extension. It will correctly identify an RTF misnamed as a DOC.

You could use that command in Cygwin on Windows, or ... quickly googles ... awesome, gnuwin32 has a native windows binary here: http://gnuwin32.sourceforge.net/packages/file.htm

Should be simple to write a batch script to test the output (e.g. "sample.doc: Rich Text Format data, version 1, ANSI") and if it matches RTF, then do a `ren sample.doc sample.rtf`.
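
For anyone who would rather not install `file`, the same idea (look at the leading bytes, ignore the extension) can be sketched in a few lines of Python. This is only an illustration, not how `file` itself is implemented: the function name `sniff_format` and the constant names are made up for this example, and it only knows the three signatures listed.

```python
# Leading byte signatures ("magic numbers") for the formats discussed here.
RTF_MAGIC = b"{\\rtf"                            # RTF files start with {\rtf
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")   # container used by binary .doc
ZIP_MAGIC = b"PK\x03\x04"                        # .docx files are ZIP archives


def sniff_format(path):
    """Classify a file by its first bytes; the extension is never consulted."""
    with open(path, "rb") as handle:
        head = handle.read(8)  # 8 bytes cover the longest signature above
    if head.startswith(RTF_MAGIC):
        return "rtf"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

Calling `sniff_format("sample.doc")` on a misnamed RTF should return "rtf", which a script could then use to decide whether to rename the file. Only the first 8 bytes are read, so the size of the file does not matter.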
Old 05-23-2015, 04:45 PM   #3
Doitsu
Wizard
Posts: 4,225
Karma: 13659369
Join Date: Dec 2010
Device: Kindle PW2
The following simple batch file should do the job if the .rtf files were originally created with Word or Wordpad. (Even though it's highly unlikely that it'll damage any of your files, you may want to back up your Word documents before running it.)

Code:
FOR %%f IN ("*.doc") DO  (
    findstr /m /c:"rtf1" %%f && REN "%%f" "%%~nf.rtf"
)
Simply copy the above code to a text file, rename it to word.cmd, copy it to the folder with the Word files and double-click it. (The batch file will automatically rename every RTF file that has a .doc extension to .rtf.)

@eschwartz: findstr is the Windows equivalent of grep.
Old 05-25-2015, 03:05 AM   #4
rebl
r.eads e.njoys b.ooks lol
Thanks! I had to add another pair of double quotes because the file names contain spaces, but after that it worked fine:

Code:
FOR %%f IN ("*.doc") DO (
    findstr /m /c:"rtf1" "%%f" && REN "%%f" "%%~nf.rtf"
)

I think a bulk rename utility with file search capabilities would be good for this task too, especially when working in multiple folders (but Flexible Renamer doesn't filter files by file content). Or I suppose the FOR loop could be modified to search recursively (with the /R parameter?). I know some MS-DOS, but not well enough for this.

I tried to modify the batch file to work recursively for all doc files:
Code:
D:
CD "D:\testRTF"
FOR /R "D:\testRTF" %%f IN ("*.doc") DO  (
findstr /m /c:"rtf1" "%%f" && REN "%%f" "%%~nf.rtf"
)
PAUSE
REM REN "%%f" "%%~df%%~pf%%~nf.rtf"
Initially I used the wrong syntax for REN, REN "%%f" "%%~df%%~pf%%~nf.rtf", and I was getting an error (because the new file name mustn't include the path).
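
For comparison, here is a rough Python sketch of the same recursive pass. It is not a translation of the batch file: the function name `rename_misnamed_rtf` and the `dry_run` switch are invented for this example, and it checks only the first five bytes of each file instead of searching every line. Unlike REN, `Path.rename` takes a full target path, so the path problem described above does not come up.

```python
from pathlib import Path


def rename_misnamed_rtf(root, dry_run=True):
    """Rename *.doc files under root that begin with the RTF header to *.rtf.

    Returns the (old_path, new_path) pairs that were, or would be, renamed.
    """
    renamed = []
    # sorted() builds the full listing first, so renaming cannot disturb the walk.
    # Note: whether "*.doc" also matches "*.DOC" depends on the platform.
    for doc in sorted(Path(root).rglob("*.doc")):
        if not doc.is_file():
            continue
        with doc.open("rb") as handle:
            if handle.read(5) != b"{\\rtf":
                continue  # not RTF, leave it alone
        target = doc.with_suffix(".rtf")
        if target.exists():
            continue  # never overwrite an existing file
        if not dry_run:
            doc.rename(target)
        renamed.append((doc, target))
    return renamed
```

With the default `dry_run=True` nothing is touched and the returned list shows what would happen; calling `rename_misnamed_rtf(r"D:\testRTF", dry_run=False)` performs the renames.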

Last edited by rebl; 05-25-2015 at 07:49 AM.
Old 05-25-2015, 08:12 AM   #5
rebl
r.eads e.njoys b.ooks lol
I am trying to optimize the above batch file for working on a large number of files.
I was wondering if the findstr in the syntax above looks for the string "rtf1" throughout the whole file; that could take quite a long time. If the string is not found in the first line, the file should be skipped.
I found that the /B option matches the string only at the beginning of a line, but I couldn't find any option for matching the beginning of the file.
Also, I'm wondering if I could use "{\rtf1" for the string (should I escape the "\")?

It seems "{\rtf" works. I'm not sure if it is really faster with /B, but here is what I'm going to test:

Code:
PAUSE
D:
CD "D:\FolderName"
FOR /R "D:\FolderName" %%f IN ("*.doc") DO  (
findstr /B /M /C:"{\rtf" "%%f" && REN "%%f" "%%~nf.rtf"
)
PAUSE

Last edited by rebl; 05-25-2015 at 08:21 AM.
Old 05-25-2015, 10:09 AM   #6
Doitsu
Wizard
Quote:
Originally Posted by rebl
It seems "{\rtf" works. I'm not sure if it is really faster with /B, but here is what I'm going to test:
If you add the /R switch and remove the /B switch you could use a regular expression, but I doubt that it'd make the batch file run faster:

Code:
findstr /R /M /c:"^\{\\rtf1" "%%f" && REN "%%f" "%%~nf.rtf"
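
One thing to keep in mind with both /B and ^ is that they anchor to the start of any line, not to the start of the file. The small Python sketch below illustrates the difference. It only approximates findstr's behaviour with Python's re module, and the function names and sample data are made up for the example.

```python
import re


def any_line_starts_with_header(data):
    """Roughly what a line-anchored search (findstr /B or ^) tests."""
    return re.search(rb"^\{\\rtf1", data, re.MULTILINE) is not None


def file_starts_with_header(data):
    """Stricter test: the header must be the very first bytes of the file."""
    return re.match(rb"\{\\rtf1", data) is not None


# A genuine RTF document, and a text file whose second line merely quotes the header.
genuine = b"{\\rtf1\\ansi Hello}"
lookalike = b"Notes on RTF\r\n{\\rtf1 is the usual header\r\n"
```

Both functions accept `genuine`, but only the line-anchored one accepts `lookalike`. For files that Word or Wordpad actually wrote as RTF the two tests agree, so the line-anchored search is usually good enough in practice.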
Old 05-26-2015, 04:10 AM   #7
rebl
r.eads e.njoys b.ooks lol
Thank you, I've read about that too, but the regex only offers a beginning-of-line anchor, so it's the same as /B.
Nevertheless, the batch script ran quite fast even without the /B option. I still prefer to use it, though; maybe it makes a difference.
I was able to rename all (I suppose) RTF files.
Thank you again for the help!