Working with OpenXML is not easy. The format is very powerful, but its XML structure is complex, especially when you want to edit existing OpenXML documents programmatically.
One tool that can make this job somewhat easier is PowerTools for Open XML. It contains a wide range of features and guidance for accomplishing various tasks using the Open XML SDK.
While browsing through the code, a colleague discovered a hidden gem inside the PowerTools: the MarkupSimplifier class. This class allows you to clean up an OpenXML document before you start editing it. The result of the cleanup operation is a simplified XML structure that is easier to manipulate using the OpenXML SDK.
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

private static void SimplifyMarkup(WordprocessingDocument wordDoc)
{
    SimplifyMarkupSettings settings = new SimplifyMarkupSettings
    {
        NormalizeXml = true, // Merges runs in a paragraph with similar formatting
        // Additional settings if required
        AcceptRevisions = true,
        RemoveBookmarks = true,
        RemoveComments = true,
        RemoveGoBackBookmark = true,
        RemoveWebHidden = true,
        RemoveContentControls = true,
        RemoveEndAndFootNotes = true,
        RemoveFieldCodes = true,
        RemoveLastRenderedPageBreak = true,
        RemovePermissions = true,
        RemoveProof = true,
        RemoveRsidInfo = true,
        RemoveSmartTags = true,
        RemoveSoftHyphens = true,
        ReplaceTabsWithSpaces = true
    };
    // Apply the cleanup to the document that was passed in
    MarkupSimplifier.SimplifyMarkup(wordDoc, settings);
}
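A cleanup like this could be applied to a document on disk roughly as follows. This is only a sketch under a few assumptions: the project references the DocumentFormat.OpenXml and OpenXmlPowerTools packages, and the class name, method name, and file paths are hypothetical placeholders, not part of the PowerTools. It works on a copy of the file because the simplification modifies the document in place.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

// Hypothetical helper class, for illustration only.
public static class SimplifyExample
{
    // sourcePath and targetPath are placeholder parameters,
    // e.g. "report.docx" and "report.simplified.docx".
    public static void SimplifyCopy(string sourcePath, string targetPath)
    {
        // Work on a copy so the original document stays untouched.
        File.Copy(sourcePath, targetPath, overwrite: true);

        // Open the copy in editable mode (second argument = true).
        using (WordprocessingDocument wordDoc =
            WordprocessingDocument.Open(targetPath, true))
        {
            // Enable only the settings you need; these two are
            // picked as examples from the list in the post.
            SimplifyMarkupSettings settings = new SimplifyMarkupSettings
            {
                RemoveRsidInfo = true,
                RemoveProof = true
            };

            MarkupSimplifier.SimplifyMarkup(wordDoc, settings);
        } // Leaving the using block disposes the document and closes the package.
    }
}
```

Enabling only a small subset of settings first makes it easier to see in the resulting XML what each option removes, before switching on the more destructive ones such as RemoveComments or AcceptRevisions.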