Mammoth .docx converter


This plugin is outdated and might not be supported anymore.

Description

Mammoth is designed to take .docx documents, such as those created by Microsoft Word, Google Docs and LibreOffice, and convert them to HTML. Mammoth aims to produce simple and clean HTML by using semantic information in the document and ignoring other details. For instance, Mammoth converts any paragraph with the style Heading 1 to an h1 element, rather than attempting to exactly copy the styling (font, text size, colour, etc.) of the heading. This allows you to paste from Word documents without the usual mess.

There’s a large mismatch between the structure used by .docx and the structure of HTML, meaning that the conversion is unlikely to be perfect for more complicated documents. Mammoth works best if you only use styles to semantically mark up your document.

The following features are currently supported:

  • Headings.

  • Lists.

  • Tables. The formatting of the table itself, such as borders, is currently ignored, but the formatting of the text is treated the same as in the rest of the document.

  • Footnotes and endnotes.

  • Images.

  • Bold, italics, superscript and subscript.

  • Links.

  • Text boxes. The contents of the text box are treated as a separate paragraph that appears after the paragraph containing the text box.

Embedded style maps

By default, Mammoth maps some common .docx styles to HTML elements. For instance, a paragraph with the style name Heading 1 is converted to an h1 element. If you have a document with your own custom styles, you can use an embedded style map to tell Mammoth how those styles should be mapped. For instance, you could convert paragraphs with the style named WarningHeading to h1 elements with class="warning" using the style mapping:

p[style-name='WarningHeading'] => h1.warning:fresh
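To make the effect of that mapping concrete, here is a toy sketch of the output it describes. It is not how mammoth.js parses or applies style maps: the single mapping is hard-coded, and both the `renderParagraph` function and the `{styleName, text}` paragraph shape are hypothetical names invented for this illustration.

```javascript
// Toy illustration only; NOT mammoth.js internals.
// Hard-codes the effect of: p[style-name='WarningHeading'] => h1.warning:fresh
// The paragraph shape {styleName, text} is a hypothetical simplification.
function renderParagraph(paragraph) {
    // Escape characters that are significant in HTML text content.
    const escaped = paragraph.text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");

    if (paragraph.styleName === "WarningHeading") {
        // The mapped style becomes an h1 carrying the "warning" class.
        return '<h1 class="warning">' + escaped + "</h1>";
    }
    // Anything else falls back to a plain paragraph in this sketch.
    return "<p>" + escaped + "</p>";
}

console.log(renderParagraph({styleName: "WarningHeading", text: "Do not unplug"}));
// prints: <h1 class="warning">Do not unplug</h1>
```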

An online tool can be used to embed style maps into an existing document. Details of how to write style maps can be found in the mammoth.js documentation.

A style map to be used for all documents can be set by configuring Mammoth (see below).

Configuration

Mammoth can be configured by writing a separate plugin. For instance, this example plugin adds a custom style map, and uses a document transform to detect paragraphs of monospace text and convert them to paragraphs with the style “Code Block”.

As a WordPress plugin, Mammoth uses the JavaScript library mammoth.js to convert documents. Whenever it calls mammoth.js, Mammoth uses the JavaScript global MAMMOTH_OPTIONS, which allows for some customisation. MAMMOTH_OPTIONS should be defined as a function that returns an options object. This options object is then passed as the options argument to convertToHtml. The mammoth.js docs describe the various options available.
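As a minimal sketch, a MAMMOTH_OPTIONS function that only sets a style map for all documents might look like the following. It assumes the script is loaded as an ordinary (non-module) script on the editor page, so that the function declaration becomes a global; how you enqueue the script from your own plugin is not shown here. The `styleMap` option is the one described in the mammoth.js docs, and the mapping reuses the WarningHeading example from above.

```javascript
// Sketch: defines the global that the Mammoth plugin looks for.
// Assumes this file is loaded as a classic script, so the declaration is global.
function MAMMOTH_OPTIONS() {
    // The returned object is passed as the options argument to convertToHtml.
    return {
        styleMap: [
            "p[style-name='WarningHeading'] => h1.warning:fresh"
        ]
    };
}
```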

MAMMOTH_OPTIONS will be called with `mammoth` as the first argument. This can be useful if you need to use a function from mammoth.js, such as `mammoth.transforms.getDescendantsOfType`.
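The following sketch shows one way the `mammoth` argument could be used to build a document transform that marks monospace paragraphs as “Code Block”. It is an illustration under assumptions rather than the example plugin's actual code: the font list, the `CodeBlock` style ID and the `pre:fresh` mapping are choices made here, and it assumes that runs expose a `font` property, as in the transform examples in the mammoth.js docs.

```javascript
// Hypothetical font list, chosen for illustration only.
const MONOSPACE_FONTS = ["consolas", "courier", "courier new"];

function MAMMOTH_OPTIONS(mammoth) {
    // Re-style a paragraph as "Code Block" when every run in it uses a monospace font.
    function transformParagraph(paragraph) {
        const runs = mammoth.transforms.getDescendantsOfType(paragraph, "run");
        const isMonospace = runs.length > 0 && runs.every(function (run) {
            // Assumption: runs carry a `font` property naming their typeface.
            return Boolean(run.font) &&
                MONOSPACE_FONTS.indexOf(run.font.toLowerCase()) !== -1;
        });
        if (!isMonospace) {
            return paragraph;
        }
        // Return a copy rather than mutating the original element.
        // "CodeBlock" is a hypothetical style ID picked for this sketch.
        return Object.assign({}, paragraph, {
            styleId: "CodeBlock",
            styleName: "Code Block"
        });
    }

    return {
        styleMap: [
            // Map the style assigned above to a pre element.
            "p[style-name='Code Block'] => pre:fresh"
        ],
        // mammoth.transforms.paragraph applies the function to each paragraph.
        transformDocument: mammoth.transforms.paragraph(transformParagraph)
    };
}
```

Building the transform inside MAMMOTH_OPTIONS, rather than at load time, means the helpers are taken from the `mammoth` object the plugin passes in, so the script does not depend on mammoth.js being available as a separate global.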

FAQs

Answers to some frequently asked questions about Mammoth.

Donations

If you’d like to say thanks, feel free to make a donation through Ko-fi.

If you use Mammoth as part of your business, please consider supporting the ongoing maintenance of Mammoth by making a weekly donation through Liberapay.