Converting Documents From Markdown Into Microsoft Word Format

Chưa phân loại
Among other activities, writing and editing text documents belongs to the most common actions we use our (desktop) computers for. The exact way it is done follows different paths — from using a bare text editor like Vim to graphical applications like Open/Libre Office or cloud-based services that are accessible via webbrowser like Google Docs. To our disadvantage, every tool comes with its own native document format as well as selection of other supported document formats. The quality of the conversion between these formats varies widely, and can lead to a lot of frustration when crossing format boundaries.

In this article we have a look at the conversion between Markdown [1] and DOCX — the native document format of Microsoft Word that is in use since 2007. You may wonder why an enthusiast of Markdown and Asciidoc (like me) deals with this case. Well, collaborating with a group of other writers can lead to a situation whereas one or more participants request DOCX as the output format. Don’t let anybody down, and find out which limitations exist, instead, and how we can try to make all group members happy.

What is Markdown?

As already pointed out in “An Introduction into Markdown” [2], the intention for Markdown is a simple text to HTML conversion. The idea behind it was to make writing web pages, documentation and especially blog entries as easy as writing an e-mail. As of today it is the de facto-synonym for a class of lightweight markup description languages, and the goal can be seen as reached.

Markdown uses a plain text formatting syntax. With a similar approach as HTML a number of markers indicate headlines, lists, images, and references in your text. The few lines below illustrate a basic document that contains two headlines (1st and 2nd level) as well as two paragraphs, and a list environment.

# Recommended Places To Visit In Europe
## France
This is a selection of places:
* Paris (_Ile de France_)
* Strasbourg (_Alsace_)
For a proper visit plan about a week.

Conversion to DOCX

In order to convert your Markdown document to DOCX, use the tool pandoc [3]. Pandoc is a Haskell library, and describes itself as “the universal document converter”, or the “Swiss army knife for document conversions”. It is available for a variety of platforms such as Linux, Microsoft Windows, Mac OS X, and BSD. Pandoc is commonly included as a package for Linux distributions like Debian GNU/Linux, Ubuntu, and CentOS.

A simple call for a conversion is as follows:

$ pandoc -o test.docx

The first parameter `-o` refers to the output file, followed by the name of the file (`test.docx`). The file extension helps pandoc to identify the desired output format. The second parameter names the input file — in our case it is simply ``.

The long version of the command shown above contains the two parameters `-f markdown` and `-t docx`. The first one abbreviates the term `flavour`, and describes the format of the input file. The second one does the same for the output file, and abbreviates `-to`.

The full command is as follows:

$ pandoc -o test.docx -f markdown -t docx

Opening the converted file using Microsoft Word results in the following output:

For the different text elements Pandoc uses stylesheets. This allows you to adjust these elements later according to your needs throughout the entire document. The newer versions of Pandoc also offer the other way around — you can convert a DOCX file into Markdown as follows:

$ pandoc -o test.docx

Then, the generated file has the following content:

Recommended Places To Visit In Europe
This is a selection of places:
–   Paris (*Ile de France*)
–   Strasbourg (*Alsace*)
For a proper visit plan about a week.

Useful Command-line Options

The list of Pandoc options is rather long. The following ones help you to produce better results, and make your life much easier:

* `-P` (long version `–preserve-tabs`): Preserve tabs instead of converting them to spaces. This is useful for code blocks with  indented lines that are part of your text.

* `-S` (long version `–smart`): Produce typographically correct output.

This option corrects quotes, hyphens/dashes as well as ellipses  (“…”). Additional, non-breaking spaces are added after certain abbreviations such as “Mr.”.

* `–track-changes=value`: Specifies what to do with insertions, deletions, and comments that are produced with the help of the Microsoft Word  “Track Changes” feature. The value can be either accept, reject, or all in order to include or remove the changes made in the document. The result is a flat file.

For more options have a look at the documentation, and the manual page of Pandoc.


The conversion between Markdown and DOCX is no longer a mystery. It is done within a few steps, and works very well. Happy hacking 🙂

Links and References

* [1] Markdown
* [2] Frank Hofmann: Introduction to Markdown
* [3] Pandoc


The author would like to thank Annette Kalbow for her help while preparing the article.

ONET IDC thành lập vào năm 2012, là công ty chuyên nghiệp tại Việt Nam trong lĩnh vực cung cấp dịch vụ Hosting, VPS, máy chủ vật lý, dịch vụ Firewall Anti DDoS, SSL… Với 10 năm xây dựng và phát triển, ứng dụng nhiều công nghệ hiện đại, ONET IDC đã giúp hàng ngàn khách hàng tin tưởng lựa chọn, mang lại sự ổn định tuyệt đối cho website của khách hàng để thúc đẩy việc kinh doanh đạt được hiệu quả và thành công.
Bài viết liên quan

Python SciPy Tutorial

In this lesson, we will see what is the use of SciPy library in Python and how it helps us to work with mathematical equations...

Install SmartGit Git Client on Ubuntu, Linux Mint, CentOS, Fedora, RHEL

SmartGit is an efficient Git Client user interface with support for GitHub, Pull Requests + Comments, SVN as well as Mercurial....

Best Linux Distros for Beginners

Do you feel that you’ve had enough hearing about Linux and never trying it yourself? You’re not alone! Many people...
Bài Viết

Bài Viết Mới Cập Nhật


Mua Proxy v6 US Private chạy PRE, Face, Insta, Gmail

Mua shadowsocks và hướng dẫn sữ dụng trên window

Tại sao Proxy Socks lại được ưa chuộng hơn Proxy HTTP?

Mua thuê proxy v4 nuôi zalo chất lượng cao, kinh nghiệm tránh quét tài khoản zalo