RK-Transformers Documentation
RK-Transformers is a runtime library that integrates Hugging Face transformers and sentence-transformers with Rockchip’s RKNN Neural Processing Units (NPUs). It enables efficient deployment of transformer models on edge devices powered by Rockchip SoCs (RK3588, RK3576, etc.).
Key Features
Model Export & Conversion
Automatic ONNX Export: Converts Hugging Face models to ONNX with input detection
RKNN Optimization: Exports to RKNN format with configurable optimization levels (0-3)
Quantization: INT8 (w8a8) quantization with calibration dataset support
Push to Hub: Direct integration with Hugging Face Hub for model versioning
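To give a feel for what INT8 (w8a8) quantization with a calibration dataset involves, here is a minimal, library-independent sketch of symmetric per-tensor quantization. It is an illustration under simplifying assumptions, not the RKNN toolkit's or RK-Transformers' actual implementation. The function names (`calibrate_scale`, `quantize`, `dequantize`) and the sample values are hypothetical.

```python
# Illustrative sketch only: symmetric per-tensor INT8 quantization.
# This does NOT reproduce how RKNN or RK-Transformers quantize models.

def calibrate_scale(calibration_batches):
    """Derive a scale from the largest absolute value in the calibration data."""
    max_abs = max(abs(x) for batch in calibration_batches for x in batch)
    # Map [-max_abs, max_abs] onto the signed 8-bit range [-127, 127].
    return max_abs / 127.0 if max_abs > 0 else 1.0

def quantize(values, scale):
    """Round each value to the nearest INT8 step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(q_values, scale):
    """Map INT8 values back to approximate floating-point values."""
    return [q * scale for q in q_values]

# Hypothetical calibration data standing in for real activations.
calibration_batches = [[0.5, -1.27, 0.3], [1.0, -0.2, 0.75]]
scale = calibrate_scale(calibration_batches)

# 2.0 lies outside the calibrated range, so it saturates at 127.
q = quantize([0.5, -1.27, 2.0], scale)
restored = dequantize(q, scale)
```

The saturated value shows why the calibration dataset should be representative of real inputs: anything outside the observed range is clipped, while values inside it are recovered to within one quantization step.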
High-Performance Inference
NPU Acceleration: Leverage Rockchip’s hardware NPU for 10-20x speedup
Multi-Core Support: Automatic core selection and load balancing across NPU cores
Memory Efficient: Optimized for edge devices with limited RAM
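The idea behind load balancing across NPU cores can be sketched as a least-loaded selection policy. This is a conceptual illustration only: the `CoreBalancer` class and its methods are hypothetical and are not part of the RK-Transformers or RKNN APIs, and the library's actual scheduling strategy may differ. The core count of three is an assumption matching the RK3588.

```python
# Conceptual sketch: pick the NPU core with the fewest in-flight requests.
# CoreBalancer is a hypothetical helper, not an RK-Transformers class.

class CoreBalancer:
    def __init__(self, num_cores):
        # Number of requests currently running on each core.
        self.in_flight = [0] * num_cores

    def acquire(self):
        """Return the index of the least-loaded core and mark it busy."""
        core = min(range(len(self.in_flight)), key=self.in_flight.__getitem__)
        self.in_flight[core] += 1
        return core

    def release(self, core):
        """Mark one request on the given core as finished."""
        self.in_flight[core] -= 1

balancer = CoreBalancer(num_cores=3)  # assumed core count (RK3588)
assigned = [balancer.acquire() for _ in range(4)]

# Once core 1 finishes its request, it is the idle core and is chosen next.
balancer.release(1)
next_core = balancer.acquire()
```

Ties are broken in favor of the lowest core index, so the first three requests spread across all cores before any core receives a second one.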
Framework Integration
Sentence Transformers: Drop-in replacement with RKSentenceTransformer and RKCrossEncoder
Transformers API: Compatible with standard Hugging Face pipelines
Quick Links
User Guide
API Reference
Development
Community & Support
GitHub Repository: emapco/rk-transformers
Issue Tracker: Report a bug
Hugging Face Hub: rk-transformers models
License
This project is licensed under the Apache License 2.0.