Hello all,
Does anybody know of a programme or other tool to convert .doc format into .docx format? I'm using AntFileConverter to transform my files into .txt format to be able to create a corpus, but AntFileConverter only seems to work with .docx format.
Thanks! Laura
You can convert it using MS Word, or even OpenOffice. I believe there are also libraries you can use to do this en masse.
On Mon, 17 Apr 2023, 17:57 Laura Narisano via Corpora, <corpora@list.elra.info> wrote:
Hello all,
Does anybody know of a programme or other tool to convert .doc format into .docx format? I'm using AntFileConverter to transform my files into .txt format to be able to create a corpus, but AntFileConverter only seems to work with .docx format.
Thanks! Laura
_______________________________________________
Corpora mailing list -- corpora@list.elra.info
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to corpora-leave@list.elra.info
You can use pandoc: https://pandoc.org/MANUAL.html
On Mon, 17 Apr 2023 at 18:08, Nikola Milosevic via Corpora <corpora@list.elra.info> wrote:
You can convert it using MS Word, or even OpenOffice. I believe there are also libraries you can use to do this en masse.
If you're on Linux, you can use unoconv (also for batch conversions). See e.g. for Ubuntu-based distros: https://stackoverflow.com/questions/62251410/convert-old-doc-format-to-the-n...
Good luck! Yannis
On Mon, 17 Apr 2023 at 19:11, Francis Bond via Corpora <corpora@list.elra.info> wrote:
You can use pandoc: https://pandoc.org/MANUAL.html
-- Francis Bond https://fcbond.github.io/
Hi Laura,
You can use LibreOffice in batch mode to do this.
See for example the section "Converting DOC to DOCX with LibreOffice" here: http://www.cantoni.org/2020/01/15/how-to-convert-word-doc-to-docx-format
You need to know how to use the command line (a terminal window) for this.
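For anyone who would rather script the batch conversion, here is a minimal Python sketch of the LibreOffice approach. It is an illustration under stated assumptions, not a tested recipe: it assumes the LibreOffice executable is available on the PATH as `soffice` (on some systems it is called `libreoffice` instead), and the function names and folder names are made up for this example. It relies only on LibreOffice's `--headless`, `--convert-to` and `--outdir` options.

```python
import shutil
import subprocess
from pathlib import Path


def find_doc_files(in_dir):
    """List legacy .doc files in in_dir (skipping .docx and everything else)."""
    return sorted(
        p for p in Path(in_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".doc"
    )


def build_convert_command(doc_files, out_dir, soffice="soffice"):
    """Build the LibreOffice command line that converts doc_files to .docx."""
    return [
        soffice,                 # assumed executable name, adjust if needed
        "--headless",            # run without opening the GUI
        "--convert-to", "docx",  # target format
        "--outdir", str(out_dir),
        *[str(p) for p in doc_files],
    ]


def convert_folder(in_dir, out_dir, soffice="soffice"):
    """Convert every .doc file in in_dir to .docx inside out_dir."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} was not found on the PATH")
    doc_files = find_doc_files(in_dir)
    if not doc_files:
        return []
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if LibreOffice exits with an error
    subprocess.run(build_convert_command(doc_files, out_dir, soffice), check=True)
    # Output paths LibreOffice is expected to have written
    return [Path(out_dir) / (p.stem + ".docx") for p in doc_files]


# Example call (hypothetical folder names):
# convert_folder("doc_files", "docx_files")
```

Keeping the command-building step separate from the call that runs LibreOffice makes it easy to print the command and check it by hand before converting a whole folder.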
Best, Serge
On 17/04/2023 at 18:16, Ioannis Saridakis via Corpora wrote:
If you're on Linux, you can use unoconv (also for batch conversions). See e.g. for Ubuntu-based distros: https://stackoverflow.com/questions/62251410/convert-old-doc-format-to-the-n...
Good luck! Yannis
-- ΠΑΣΑ ΕΠΙΣΤΗΜΗ ΧΩΡΙΖΟΜΕΝΗ ΔΙΚΑΙΟΣΥΝΗΣ ΚΑΙ ΤΗΣ ΑΛΛΗΣ ΑΡΕΤΗΣ ΠΑΝΟΥΡΓΙΑ ΟΥ ΣΟΦΙΑ ΦΑΙΝΕΤΑΙ - Πλάτων All knowledge, when separated from justice and virtue, is seen to be cunning and not wisdom - Plato
Ioannis E. Saridakis
Associate Professor, Corpus Linguistics and Translation Studies
Department of Turkish Studies and Modern Asian Studies (Deputy president)
School of Economics and Political Science
National and Kapodistrian University of Athens
http://en.uoa.gr
Hi Laura,
I haven't used or encountered DOC in a long time, but perhaps pandoc (https://pandoc.org/) can handle it? It handles DOCX and a wide variety of other formats reasonably well.
It might also be possible to use LibreOffice or similar to bulk-process the files from the command line, without having to open each one individually.
There's also this answer on StackOverflow which uses only software from Microsoft to do the conversion: https://stackoverflow.com/a/2405508
Good luck!
Peace, Dave
----
David M. Howcroft
https://www.davehowcroft.com