lite-converter
// Model conversion pipeline, parser development, optimization passes and quantization. Use when converting models to .ms, writing parser code, implementing optimizer passes, or configuring quantization.
| name | lite-converter |
| description | Model conversion pipeline, parser development, optimization passes and quantization. Use when converting models to .ms, writing parser code, implementing optimizer passes, or configuring quantization. |
| paths | ["mindspore-lite/tools/converter/**","mindspore-lite/tools/optimizer/**","mindspore-lite/schema/**","mindspore-lite/tools/schema_gen/**"] |
Input Model (MindIR/TF/Caffe/ONNX/TFLite/PyTorch)
-> Parse (framework-specific Parser) -> Unified MindIR (ANF Graph)
-> Import -> Internal graph representation
-> Optimize (Constant Folding, Op Fusion, Format Transform, Parallel Split, Redundancy Elimination)
-> Quantize (optional: Weight / Full / Mixed Precision)
-> Export (.ms for LiteRT or .mindir for ExtendRT)
# MindIR -> .ms (device-side)
./converter_lite --fmk=MINDIR --modelFile=model.mindir --outputFile=model
# TensorFlow / TFLite / Caffe / ONNX / PyTorch -> .ms
./converter_lite --fmk=TF --modelFile=model.pb --outputFile=model
./converter_lite --fmk=TFLITE --modelFile=model.tflite --outputFile=model
./converter_lite --fmk=CAFFE --modelFile=model.prototxt --weightFile=model.caffemodel --outputFile=model
./converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model
./converter_lite --fmk=PYTORCH --modelFile=model.pt --outputFile=model
# With quantization
./converter_lite --fmk=MINDIR --modelFile=model.mindir --outputFile=model \
--quantType=WeightQuant --bitNum=8
# Cloud-side optimization
# Ascend-specific
./converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model --optimize=ascend_oriented
# CPU
./converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model
| Parameter | Description |
|---|---|
| --fmk | Input framework: MINDIR/TF/TFLITE/CAFFE/ONNX/PYTORCH |
| --modelFile | Input model file path |
| --weightFile | Caffe weight file (Caffe only) |
| --outputFile | Output file path (without extension) |
| --quantType | WeightQuant / FullQuant / NoQuant |
| --bitNum | Quantization bits: 1-8 (default 8) |
| --optimize | ascend_oriented / general / none |
| --configFile | Quantization or runtime config file |
| --inputShape | Dynamic shape input (e.g., input1:1,3,224,224;input2:1,3,256,256) |
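The `--inputShape` value packs several `name:dims` entries into one string, separated by `;`. As a rough illustration of that format, the sketch below splits such a string into a map from tensor name to dimensions. `ParseInputShape` is a hypothetical helper written for this example; it is not the converter's own parsing code, and the converter may accept or reject inputs differently.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of converter_lite): splits a string such as
// "input1:1,3,224,224;input2:1,3,256,256" into {tensor name -> dimensions}.
std::map<std::string, std::vector<int64_t>> ParseInputShape(const std::string &spec) {
  std::map<std::string, std::vector<int64_t>> shapes;
  std::stringstream entries(spec);
  std::string entry;
  while (std::getline(entries, entry, ';')) {
    if (entry.empty()) continue;  // tolerate a trailing ';'
    // Split on the last ':' so tensor names that themselves contain ':' still work.
    const auto colon = entry.rfind(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == entry.size()) {
      throw std::invalid_argument("expected name:d1,d2,... but got '" + entry + "'");
    }
    std::vector<int64_t> dims;
    std::stringstream dim_list(entry.substr(colon + 1));
    std::string dim;
    while (std::getline(dim_list, dim, ',')) {
      dims.push_back(std::stoll(dim));  // throws on non-numeric input
    }
    shapes[entry.substr(0, colon)] = dims;
  }
  return shapes;
}
```

Calling `ParseInputShape("input1:1,3,224,224;input2:1,3,256,256")` yields two entries, one per input tensor, each holding four dimensions.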
mindspore-lite/tools/converter/parser/
onnx/ # onnx_model_parser.cc + per-operator parsers
tf/ # tf_model_parser.cc + per-operator parsers
tflite/ # tflite_model_parser.cc + per-operator parsers
caffe/ # caffe_model_parser.cc + per-operator parsers
tools/converter/parser/
XxxModelParser with Parse() method (original model -> ANF Graph)
tools/converter/converter_flags.cc
Each Parser maps original operators to MindSpore internals.
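The idea that each parser maps a source-framework operator to an internal one can be pictured as a lookup table keyed by operator type. The sketch below is a toy, self-contained version of that idea. `SourceOp`, `LiteOp`, `OpParserRegistry`, and `MakeDemoRegistry` are all hypothetical names invented for illustration, and the primitive and attribute names in the demo are placeholders. The real parsers operate on framework protos and ANF nodes, and their registration mechanism may look quite different.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-ins for a source-framework operator and an internal operator.
struct SourceOp {
  std::string type;
  std::map<std::string, std::string> attrs;
};
struct LiteOp {
  std::string primitive;
  std::map<std::string, std::string> attrs;
};

using OpParser = std::function<LiteOp(const SourceOp &)>;

// Hypothetical registry: one parser function per source operator type.
class OpParserRegistry {
 public:
  void Register(const std::string &source_type, OpParser parser) {
    parsers_[source_type] = std::move(parser);
  }
  // Looks up the parser for op.type and applies it; fails loudly if none exists.
  LiteOp Parse(const SourceOp &op) const {
    const auto it = parsers_.find(op.type);
    if (it == parsers_.end()) {
      throw std::runtime_error("unsupported operator: " + op.type);
    }
    return it->second(op);
  }

 private:
  std::map<std::string, OpParser> parsers_;
};

// Demo mappings; the target names and attributes are illustrative placeholders.
OpParserRegistry MakeDemoRegistry() {
  OpParserRegistry registry;
  registry.Register("Relu", [](const SourceOp &) {
    return LiteOp{"Activation", {{"activation_type", "RELU"}}};
  });
  registry.Register("Conv", [](const SourceOp &op) {
    return LiteOp{"Conv2DFusion", op.attrs};  // attributes forwarded unchanged
  });
  return registry;
}
```

Keeping one small parser per operator means an unsupported operator fails at lookup time with a clear message instead of producing a malformed graph.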
mindspore-lite/tools/converter/
adapter/ # Format adapters
config_parser/ # Config file parsing
converter_lite/ # CLI tool entry point
cxx_api/ # Converter C++ API
decomposer/ # Operator decomposition
import/ # Model import (mindir_importer, primitive_adjust)
legacy_optimizer/ # Legacy optimization passes
micro/ # Micro code generation
ops/ # Operator utilities
parser/ # Framework-specific parsers
preprocess/ # Data preprocessing pipeline
quantizer/ # Quantization implementations
  calibrator.cc # Calibration data processing
  full_quant_quantizer/ # Full quantization
  weight_quantizer/ # Weight-only quantization
  gptq_quantizer/ # GPTQ quantization
  fse_encoder/ # FSE encoding
  huffman_encode/ # Huffman encoding
registry/ # Extension registration
session/ # Conversion session
mindspore-lite/tools/optimizer/
common/ # Common optimization utilities
const_fold/ # Constant folding passes
fisson/ # Operator fission passes (note: directory is "fisson")
format/ # Format transform passes
fusion/ # Operator fusion passes (conv_bn, conv_activation, matmul_add, etc.)
graph/ # Graph-level optimization
parallel/ # Parallel split passes
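class MyFusionPass : public Pass {
 public:
  bool Run(const FuncGraphPtr &graph) override {
    bool changed = false;  // tracks whether any node was rewritten
    // Visit nodes in topological order, starting from the graph's return node.
    auto node_list = TopoSort(graph->get_return());
    for (auto &node : node_list) {
      if (!CheckPattern(node)) continue;  // skip nodes that do not match the pattern
      DoFusion(node);
      changed = true;
    }
    return changed;
  }
  // CheckPattern() and DoFusion() are pass-specific helpers to be implemented.
};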
Passes execute during the conversion phase in a fixed order.
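To make the fixed ordering concrete, here is a toy sketch that runs two passes over a simplified graph. It assumes a graph that is just a linear list of operator names. `ToyGraph`, `ToyPass`, `ConvBnFusion`, `RemoveIdentity`, and `RunPasses` are hypothetical names for this example only and do not correspond to MindSpore Lite classes.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy graph: a linear sequence of operator names (not a MindSpore FuncGraph).
using ToyGraph = std::vector<std::string>;

struct ToyPass {
  virtual ~ToyPass() = default;
  virtual bool Run(ToyGraph *graph) = 0;  // returns true if the graph changed
};

// Replaces each adjacent ("Conv", "BatchNorm") pair with a single "ConvBN" op.
struct ConvBnFusion : ToyPass {
  bool Run(ToyGraph *graph) override {
    bool changed = false;
    ToyGraph fused;
    for (std::size_t i = 0; i < graph->size(); ++i) {
      const bool is_pair = (*graph)[i] == "Conv" && i + 1 < graph->size() &&
                           (*graph)[i + 1] == "BatchNorm";
      if (is_pair) {
        fused.push_back("ConvBN");
        ++i;  // skip the BatchNorm that was absorbed
        changed = true;
      } else {
        fused.push_back((*graph)[i]);
      }
    }
    *graph = std::move(fused);
    return changed;
  }
};

// Drops "Identity" ops, which carry no computation.
struct RemoveIdentity : ToyPass {
  bool Run(ToyGraph *graph) override {
    const std::size_t old_size = graph->size();
    graph->erase(std::remove(graph->begin(), graph->end(), std::string("Identity")),
                 graph->end());
    return graph->size() != old_size;
  }
};

// Runs the passes once each, in the order given; true if any pass changed the graph.
bool RunPasses(const std::vector<std::unique_ptr<ToyPass>> &passes, ToyGraph *graph) {
  bool changed = false;
  for (const auto &pass : passes) {
    changed = pass->Run(graph) || changed;
  }
  return changed;
}
```

Order matters in this toy: for the sequence Conv, Identity, BatchNorm, the fusion only fires if `RemoveIdentity` runs before `ConvBnFusion`, which is one reason a pass chain is kept in a deliberate order.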
[common_quant_param]
quant_type=WEIGHT_QUANT
bit_num=8
[data_preprocess]
calibrate_path=/path/to/calibration/images/
calibrate_size=100
[input_format]
input_type=IMAGE
resize_height=224
resize_width=224
./converter_lite --fmk=MINDIR --modelFile=model.mindir --outputFile=model_quant --configFile=quant_config.cfg
| Type | Description | Use Case |
|---|---|---|
| WeightQuant | Weight-only | Reduce model size, minimal accuracy loss |
| FullQuant | Weight + activation | Maximum compression, needs calibration data |
| Mixed Precision | Partial INT8 + FP32 | Balance accuracy and performance |
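For intuition about what weight-only quantization does to the stored values, the sketch below applies a generic symmetric per-tensor INT8 scheme: the scale is the largest absolute weight divided by 127, and each weight is rounded to the nearest multiple of that scale. This is a textbook scheme chosen for illustration, assumed rather than taken from the MindSpore Lite quantizer, which may use per-channel scales, zero points, or other bit widths. `QuantizedWeights`, `QuantizeSymmetricInt8`, and `Dequantize` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Quantized tensor: int8 values plus the single scale needed to recover floats.
struct QuantizedWeights {
  std::vector<int8_t> values;
  float scale;
};

// Symmetric per-tensor quantization: w is approximated by values[i] * scale.
QuantizedWeights QuantizeSymmetricInt8(const std::vector<float> &weights) {
  float max_abs = 0.0f;
  for (float w : weights) max_abs = std::max(max_abs, std::fabs(w));
  // Avoid dividing by zero for an all-zero tensor.
  const float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;

  QuantizedWeights q{{}, scale};
  q.values.reserve(weights.size());
  for (float w : weights) {
    const float rounded = std::round(w / scale);
    const float clamped = std::max(-127.0f, std::min(127.0f, rounded));
    q.values.push_back(static_cast<int8_t>(clamped));
  }
  return q;
}

// Maps the int8 values back to floats; the error per weight is at most scale / 2.
std::vector<float> Dequantize(const QuantizedWeights &q) {
  std::vector<float> restored;
  restored.reserve(q.values.size());
  for (int8_t v : q.values) restored.push_back(static_cast<float>(v) * q.scale);
  return restored;
}
```

Storage drops from four bytes per weight to one, plus one scale per tensor. Full quantization would additionally need activation ranges, which is why it requires calibration data.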
| Format | Target Runtime | Serialization |
|---|---|---|
| .ms | LiteRT (device-side) | FlatBuffers, zero-copy deserialization |
| .mindir | ExtendRT (cloud-side) | Protobuf, supports large models |
Schema files: mindspore-lite/schema/ops.fbs (~1.3K lines), model.fbs, ops_types.fbs
Check the schema/ops.fbs operator list.
Use --inputShape to specify input shapes.
Check that the .caffemodel weight file path is correct.