RK-Transformers

User Guide

The guides in this section cover exporting models, running inference, configuring NPU cores, integrating with other frameworks, and working within RKNN limitations.

  • Model Export
    • Export Workflow
    • Command-Line Export
    • Optimization Levels
    • Programmatic Export
    • Push to Hugging Face Hub
    • Troubleshooting
  • Inference
    • Loading Models
    • Running Inference
    • Supported Tasks
    • Performance Optimization
    • Error Handling
  • NPU Core Configuration
    • Available Core Masks
    • Platform-Specific Notes
    • Usage Examples
    • Performance Considerations
    • Best Practices
    • Troubleshooting
  • Framework Integrations
    • Sentence Transformers
    • Hugging Face Transformers
    • Custom Integrations
    • Migration Guide
  • RKNN Limitations
    • Dynamic Inputs & Static Shapes
    • Quantization Support
    • Operator Support
    • Dtype Limitations
    • Memory Constraints
    • Platform Compatibility
    • Known Issues
    • Getting Help

© Copyright 2025 Emmanuel Cortes. All rights reserved.
