Summary

Microsoft released MarkItDown 0.1.6 with an OCR layer service for embedded images and scanned PDFs, a fix for linear memory growth in PDF conversion, deeper security-posture documentation, and an Azure Content Understanding converter. The release strengthens MarkItDown as a document-ingestion utility for LLM, RAG, and enterprise knowledge workflows.

What changed

MarkItDown 0.1.6 shipped on May 26, 2026, with OCR for embedded images and scanned PDFs, a PDF memory-growth fix, clarified security documentation, and an Azure Content Understanding converter.

Why it matters

Document-to-Markdown conversion remains a core bottleneck for enterprise RAG and agent context pipelines. MarkItDown's OCR and Azure integration make it more useful for messy scanned documents and Microsoft-centered enterprise ingestion workflows, while the PDF memory fix and the clarified security documentation matter for production deployment.
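
To make the ingestion step concrete, here is a minimal sketch of how converted Markdown might feed a RAG pipeline. It assumes the `markitdown` package is installed and uses only its basic `MarkItDown().convert(...)` call and the `text_content` attribute of the result. It does not exercise the new OCR or Azure Content Understanding features, whose options are not described in this summary. The helper names `convert_to_markdown` and `split_by_headings` are hypothetical, and heading-based chunking is just one illustrative strategy, not something MarkItDown provides.

```python
import re
from typing import List, Tuple


def convert_to_markdown(path: str) -> str:
    """Convert a document to Markdown with MarkItDown (hypothetical wrapper)."""
    # Imported lazily so the chunking helper below works without the package.
    from markitdown import MarkItDown

    result = MarkItDown().convert(path)
    return result.text_content


def split_by_headings(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown into (heading, body) chunks at ATX headings (#, ##, ...).

    Text before the first heading is returned with an empty heading.
    Limitation: lines starting with '#' inside fenced code are also treated
    as headings; a real pipeline would need to track code fences.
    """
    chunks: List[Tuple[str, str]] = []
    heading, body = "", []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s+\S", line):
            # Close the previous chunk if it has a heading or any content.
            if heading or any(l.strip() for l in body):
                chunks.append((heading, "\n".join(body).strip()))
            heading, body = line.lstrip("#").strip(), []
        else:
            body.append(line)
    if heading or any(l.strip() for l in body):
        chunks.append((heading, "\n".join(body).strip()))
    return chunks
```

In use, something like `split_by_headings(convert_to_markdown("report.pdf"))` would yield heading-scoped chunks that could then be embedded and indexed; the file name here is a placeholder.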

Evidence excerpt

The MarkItDown 0.1.6 release notes list an OCR layer service for embedded images and PDF scans, a PDF memory-growth fix, clarified security posture, and an Azure Content Understanding converter.

Sources