The challenge
Legacy documentation existed as generic XML and Markdown with no semantic typing, no metadata, and no section identifiers. AI systems therefore treated all content as undifferentiated plain text: chatbot retrieval precision was roughly 25%, and there was no way to filter by audience, difficulty, or intent. Content chunking broke at arbitrary points because there were no semantic boundaries to split on.
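
To make the chunking problem concrete, here is a minimal sketch of what splitting without semantic boundaries can look like. It assumes a fixed-size character window, which is only one possible strategy and not necessarily the one the original pipeline used; the function name `chunk_fixed`, the sample text, and the window size are all hypothetical.

```python
def chunk_fixed(text: str, size: int) -> list[str]:
    """Split text into consecutive windows of `size` characters.

    The splitter knows nothing about sentences, sections, or topics,
    so a window boundary can fall anywhere, including mid-word.
    """
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical untyped content: two instructions, no metadata, no section IDs.
doc = "Install the agent. Restart the service afterwards."

# An assumed window of 20 characters, chosen only to keep the example small.
chunks = chunk_fixed(doc, 20)

for n, chunk in enumerate(chunks):
    # Each chunk is just a string; nothing records audience, difficulty, or intent.
    print(n, repr(chunk))
```

In this sketch the second instruction is split across all three chunks, so no single chunk contains it whole, and none of the chunks carries any field a retriever could filter on.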