Dead plugin, but there's a better alternative: the LibreOffice converter, which offers 1:1 conversion from doc to odt. It can convert from anything it can read to anything it can write, probably with better quality than Python's internal conversion options.
Tested with the following batch file on Windows. I don't know Python, but is there an easy way to adjust or fork the current "Microsoft Doc Input Plugin" so that it uses LibreOffice in the background?
Code:
@echo off
REM Search recursively for documents in the "docs" folder next to this batch file.
REM The extension check skips files such as .docx, which the *.doc wildcard may also match.
for /R docs %%f IN (*.doc) do if "%%~xf"==".doc" (
    REM Write the converted files to the "docs" output directory
    "C:\Program Files\LibreOffice\program\soffice" --convert-to odt --outdir "docs" "%%f"
    REM LibreOffice --convert-to documentation:
    REM https://help.libreoffice.org/latest/...rtfilters.html
)
In Python, probably something like this:
Code:
import os  # Used to build the output path and to check that the file was created
import subprocess  # Used to call the LibreOffice command line

def convert_doc_to_odt(file_path, output_dir):
    # Pass the arguments as a list (no shell=True) so paths with spaces work.
    subprocess.run([
        '/opt/libreoffice7.3/program/soffice',
        '--headless',
        '--convert-to', 'odt',
        '--outdir', output_dir,
        file_path,
    ])
    # LibreOffice writes <original name>.odt into output_dir.
    base_name = os.path.splitext(os.path.basename(file_path))[0]
    odt_file_path = os.path.join(output_dir, base_name + '.odt')
    # Return the path only if the file was actually created.
    if os.path.exists(odt_file_path):
        return odt_file_path
    return None

file_path = '/home/tarik/docs/file.docx'
output_dir = '/tmp/'
file = convert_doc_to_odt(file_path, output_dir)
if file:
    print(f'File converted to {file}.')
else:
    print('Unable to convert the file.')
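To mirror what the batch file does (walk a folder recursively and convert every .doc), a rough Python sketch could look like the one below. This is untested against the actual plugin and only an illustration: the function names (find_doc_files, build_command, convert_folder) are made up for this post, the soffice path is the default Windows install location from the batch file, and the extension check here ignores case, which the batch file does not.

```python
import subprocess
from pathlib import Path

# Assumption: default Windows install location, as used in the batch file.
SOFFICE = r"C:\Program Files\LibreOffice\program\soffice"

def find_doc_files(root):
    """Return every file below root whose extension is exactly .doc (any case)."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".doc"
    )

def build_command(source, output_dir, soffice=SOFFICE):
    """Build the soffice argument list for a single doc-to-odt conversion."""
    return [
        str(soffice),
        "--headless",
        "--convert-to", "odt",
        "--outdir", str(output_dir),
        str(source),
    ]

def convert_folder(root, output_dir, soffice=SOFFICE, run=subprocess.run):
    """Convert every .doc below root and return the .odt files that were created."""
    converted = []
    for source in find_doc_files(root):
        run(build_command(source, output_dir, soffice), check=False)
        # LibreOffice names the result after the source file.
        target = Path(output_dir) / (source.stem + ".odt")
        if target.exists():
            converted.append(target)
    return converted
```

Calling convert_folder("docs", "docs") would then correspond to the batch file. The run parameter is only there so the LibreOffice call can be swapped out for a dummy when trying the logic without LibreOffice installed.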