PaddleOCR 3.5 adds Transformers backend

Source: Hugging Face blog, "PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend"

Level: technical

PaddleOCR 3.5 introduces a flexible inference engine interface. Developers choose the backend by setting engine="transformers" and pass options such as dtype and device placement through engine_config. PaddleOCR still manages the OCR and document parsing pipelines, so users do not need to wire up internal components themselves. The release focuses on the inference backend layer, giving supported models another runtime option that fits naturally into Hugging Face environments.
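
The backend selection described above can be sketched roughly as follows. This is a minimal illustration, not the library's documented usage: the engine and engine_config parameter names come from the blog post, but the keys inside engine_config ("dtype", "device"), their example values, and the helper names build_engine_kwargs and create_ocr are assumptions made here for illustration. Check the PaddleOCR 3.5 documentation for the exact schema.

```python
from typing import Any, Dict, Optional


def build_engine_kwargs(
    engine: Optional[str] = None,
    dtype: str = "bfloat16",
    device: str = "cuda:0",
) -> Dict[str, Any]:
    """Assemble keyword arguments for a PaddleOCR pipeline constructor.

    Hypothetical helper. The "dtype" and "device" keys are assumed names
    for the dtype and device-placement options the post mentions.
    """
    if engine is None:
        # Pass nothing so the library falls back to its default backend.
        return {}
    return {
        "engine": engine,
        "engine_config": {"dtype": dtype, "device": device},
    }


def create_ocr(**kwargs: Any):
    """Build the pipeline; paddleocr is imported lazily so that the
    helper above stays usable without the package installed."""
    from paddleocr import PaddleOCR

    return PaddleOCR(**kwargs)


# Intended usage (requires PaddleOCR 3.5 and its dependencies):
#   ocr = create_ocr(**build_engine_kwargs("transformers"))
#   result = ocr.predict("scanned_page.png")  # hypothetical input file
```

Keeping the backend choice in one small function means a project can switch between the default backend and the Transformers one from a single configuration value, without touching the code that consumes the OCR results.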

The update helps with document ingestion for retrieval-augmented generation, document AI, and agent applications. Turning PDFs, scans, tables, and complex layouts into structured data is often the hardest step before using a language model. PaddleOCR provides models like PP-OCRv5 for OCR and PaddleOCR-VL 1.5 for document parsing. With the Transformers backend, these capabilities connect more easily to PyTorch and Transformers stacks, making it simpler to build downstream workflows.

To get started, install PaddleOCR 3.5, PaddleX, Transformers, and a compatible PyTorch build, then use the command line or the Python API with engine="transformers". The Transformers backend is useful when teams already use Hugging Face tools and want easier model discovery and integration. For maximum throughput, the default paddle_static backend is still recommended. The release adds an option rather than a replacement, letting developers pick the backend that fits their stack.
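
Before switching backends, it can help to confirm that the packages listed above are present. The sketch below uses only the Python standard library. The distribution names (paddleocr, paddlex, transformers, torch) are the usual PyPI names and are assumed here, not taken from the post, and the check does not verify that the installed versions are compatible with each other.

```python
from importlib import metadata
from typing import Dict, Iterable, List, Optional

# Assumed PyPI distribution names for the packages the post lists.
REQUIRED = ("paddleocr", "paddlex", "transformers", "torch")


def installed_versions(names: Iterable[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the distributions that are not installed, in input order."""
    return [name for name, ver in installed_versions(names).items() if ver is None]


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

If missing_packages() returns an empty list, the prerequisites are at least importable by name; whether the PyTorch build matches the local hardware still has to be checked separately.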

Why it matters: it simplifies connecting OCR and document parsing to Transformers-based AI pipelines, reducing setup work for RAG and document AI systems.
