CASE 03 / 03 · 2024 · Technical training

Custom DITA-OT Publishing Automation

Two branded DITA-OT plugins and a multi-format build pipeline producing HTML5, print-ready PDF, and PWA-installable output from one DITA source.

Duration: 10 weeks
Team: 1 publishing engineer · 1 XSLT specialist
Engagement: Project-based, fixed scope
Status: Shipped · in production for ongoing rebuilds

Challenge and approach

The challenge

Default DITA-OT output is functional but generic — no branding, no interactive navigation, no mobile optimization, and no progressive web app support. The training curriculum needed professional, branded deliverables across multiple formats, all generated from the same DITA source with zero manual post-processing.

The approach

We developed two custom DITA-OT plugins — one for HTML5 and one for PDF — with XSLT template overrides, branded CSS, custom JavaScript, and XSL-FO page layouts. Shell scripts automate the full build cycle: plugin installation, output generation, asset copying, and quality verification.

Artifact Ledger

4 output formats
2 custom plugins
0 manual steps
1 single source

Stack

Schema: DITA 1.3
Plugins: com.extense.html5.branded · com.extense.pdf.branded
HTML5: DITA-OT 4.2 · custom CSS · PWA manifest + service worker
PDF: Apache FOP 2.9 · XSL-FO · MINITOC chapter mode
Preview: Python 3.12 · live-reload topic server
Build: Shell scripts · containerized DITA-OT

Plugin architecture.

com.extense.html5.branded

XSLT topic-level and TOC navigation overrides
Branded CSS with custom highlight classes and responsive layout
JavaScript for interactive features
Full PWA support — manifest.json, service worker, installable icons
Custom font configuration
XSLT parameter injection via insertparams.xml
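To make the PWA piece concrete, here is a minimal sketch of how a manifest.json for the HTML5 output could be generated. The member names (name, short_name, start_url, display, icons) are standard Web App Manifest members; the application name, icon paths, and icon sizes are assumptions for illustration and are not taken from the shipped plugin.

```python
import json


def build_manifest(name: str, short_name: str, icon_sizes: list[int]) -> dict:
    """Return a Web App Manifest as a plain dict.

    The start URL and icon path pattern below are hypothetical placeholders.
    """
    return {
        "name": name,
        "short_name": short_name,
        "start_url": "./index.html",   # assumed entry page of the HTML5 output
        "display": "standalone",       # open without browser chrome once installed
        "icons": [
            {
                "src": f"icons/icon-{size}.png",  # hypothetical icon location
                "sizes": f"{size}x{size}",
                "type": "image/png",
            }
            for size in icon_sizes
        ],
    }


def manifest_json(name: str, short_name: str, icon_sizes: list[int]) -> str:
    """Serialize the manifest, ready to be written to manifest.json."""
    return json.dumps(build_manifest(name, short_name, icon_sizes), indent=2)
```

Calling manifest_json("Training Guide", "Training", [192, 512]) returns text that a build step could write next to the generated index page.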

com.extense.pdf.branded

XSL-FO customizations for page layout, headers, and footers
Custom cover page with branding
Chapter-level mini-TOC layout (MINITOC mode)
Print-optimized font configuration
Yellow highlight support for instructional callouts
Consistent visual treatment across 218 pages

Build pipeline.

HTML5 build

Automated shell script: installs the branded plugin, cleans output, runs the DITA-OT transformation, copies static assets (JS, CSS, PWA manifests, fonts), and produces an interactive guide ready for web deployment.
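The production build is a shell script; the same sequence can be sketched in Python to show the order of operations. The dita install subcommand and the --input, --format, and --output options are part of the DITA-OT command line. The plugin archive name, the transtype name, and all paths are assumptions introduced for illustration.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical values: the archive name and transtype are placeholders.
PLUGIN_ZIP = "com.extense.html5.branded.zip"
TRANSTYPE = "html5-branded"


def build_commands(ditamap: str, out_dir: str) -> list[list[str]]:
    """Return the two DITA-OT invocations: plugin install, then transformation."""
    return [
        # --force reinstalls the plugin if a previous build already installed it.
        ["dita", "install", PLUGIN_ZIP, "--force"],
        ["dita", f"--input={ditamap}", f"--format={TRANSTYPE}", f"--output={out_dir}"],
    ]


def copy_assets(asset_dir: Path, out_dir: Path) -> list[Path]:
    """Copy static assets (JS, CSS, PWA files, fonts) into the output tree."""
    copied = []
    for src in sorted(asset_dir.rglob("*")):
        if src.is_file():
            dest = out_dir / src.relative_to(asset_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            copied.append(dest)
    return copied


def run_build(ditamap: str, asset_dir: str, out_dir: str) -> None:
    """Clean the output folder, run both dita commands, then copy assets."""
    shutil.rmtree(out_dir, ignore_errors=True)
    for command in build_commands(ditamap, out_dir):
        subprocess.run(command, check=True)  # stop at the first failing step
    copy_assets(Path(asset_dir), Path(out_dir))
```

A call such as run_build("guide.ditamap", "assets", "out/html5") would run the full cycle, assuming the dita executable is on the PATH.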

PDF build

Installs the PDF plugin, builds with MINITOC chapter layout, and reports final page count and file size via pdfinfo. Produces a 218-page branded document from a single command.
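The reporting step can be sketched as a small parser over pdfinfo's text output, which includes a "Pages:" line and a "File size:" line in bytes. This is an illustrative Python version, not the shell code used on the project, and the function names are assumptions.

```python
import re
import subprocess


def parse_pdfinfo(text: str) -> dict:
    """Extract page count and file size in bytes from pdfinfo's text output."""
    pages = re.search(r"^Pages:\s+(\d+)", text, re.MULTILINE)
    size = re.search(r"^File size:\s+(\d+) bytes", text, re.MULTILINE)
    if pages is None or size is None:
        raise ValueError("unexpected pdfinfo output")
    return {"pages": int(pages.group(1)), "bytes": int(size.group(1))}


def report(pdf_path: str) -> dict:
    """Run pdfinfo on a built PDF and return the parsed figures."""
    result = subprocess.run(
        ["pdfinfo", pdf_path], check=True, capture_output=True, text=True
    )
    return parse_pdfinfo(result.stdout)
```

Raising on unexpected output makes the build fail loudly if the PDF is missing or malformed, rather than silently reporting nothing.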

Instant preview

A Python-based preview server renders individual DITA topics in the browser without a full build. Enables a rapid edit → reload authoring workflow during content development.

Decisions and trade-offs.

The choices that shaped the engagement, recorded with the option taken and what was rejected. The reasoning matters more than the outcome.

Plugin architecture

Chosen

Two separate branded plugins (HTML5 + PDF)

Rejected

Single multi-output plugin

Why: Separate plugins isolate format-specific concerns. A single plugin entangles XSL-FO logic with HTML5 CSS and grows harder to maintain as one format's needs diverge from the other's.

PDF rendering engine

Chosen

XSL-FO + Apache FOP

Rejected

CSS Paged Media (PrinceXML class)

Why: FOP is open-source, integrates cleanly with the existing DITA-OT pipeline, and avoids per-build commercial licensing. That matters in the client's distribution scenario, where the document is rebuilt on every curriculum update.

Mini-TOC layout

Chosen

Chapter-level MINITOC at every entry

Rejected

Single document-level TOC only

Why: A 218-page document with a single front TOC is hard to scan. Chapter-level mini-TOCs let learners orient themselves at the start of every chapter without paging back to the document-level TOC.

Preview workflow

Chosen

Python-based instant topic preview server

Rejected

Full DITA-OT build per content change

Why: Full builds take 90+ seconds. Instant preview renders one topic in under a second, removing the wait that otherwise discourages authors from checking small, incremental edits.

A note on these numbers.

The figures in the artifact ledger are direct counts from the deliverables shipped on this engagement — not ROI projections or aggregated averages. Outcome percentages referenced anywhere on this site reflect industry benchmarks published by OASIS, Gartner, and CIDM for organizations that achieve 40%+ content reuse with structured metadata. Your actual results depend on content volume, language count, update frequency, and current toolchain maturity. Every engagement begins by measuring your baseline so projections are defensible.

Sample Content Assessment

Submit a 20-page sample. We'll return conversion feasibility, content recovery rate, and engineering effort within two business days. The analysis is the basis for any further engagement, with no obligation to proceed.
