lite-debug-test
// Debugging, unit testing, benchmarking and performance analysis. Use when running gtest, benchmark tools, profiling latency or accuracy, diagnosing operator precision issues, delegate fallback, or memory leaks.
| name | lite-debug-test |
| description | Debugging, unit testing, benchmarking and performance analysis. Use when running gtest, benchmark tools, profiling latency or accuracy, diagnosing operator precision issues, delegate fallback, or memory leaks. |
| paths | ["mindspore-lite/test/**","mindspore-lite/tools/benchmark/**","mindspore-lite/tools/benchmark_train/**","mindspore-lite/python/test/**"] |
Unit tests are based on Google Test (gtest/gmock) and are defined in mindspore-lite/test/.
mindspore-lite/test/
  ut/
    src/
      api/                 # C/C++ API tests
      registry/            # Custom operator registration tests
      runtime/
        kernel/arm/        # ARM kernel tests
        kernel/cuda/       # CUDA kernel tests
        kernel/dsp/        # DSP kernel tests
        kernel/opencl/     # OpenCL kernel tests
    nnacl/                 # NNACL kernel tests (fp32, int8, infer)
    python/                # Python API tests
  st/
    java/                  # Java system tests (gradlew)
  config_level0/           # Level 0 baseline tests (~99 .cfg files)
    ascend/                # Ascend-specific configs
    micro/                 # Micro-specific configs
  config_level1/           # Level 1 extended tests
  runtest.sh               # Test runner script
# Build with tests
bash build.sh -I x86_64 -e cpu -t on -j8 # -t on = MSLITE_ENABLE_TESTCASES=ON
# Run all UT
cd output/bin && bash test/runtest.sh
# Run specific tests
./test/ut_linux_x86 --gtest_filter=TestConv2D.*
./test/ut_linux_x86 --gtest_filter=TestMatMulFp32.*
# Python tests
cd mindspore-lite/python && python -m pytest test/ut/python/ -v
# Java tests
cd mindspore-lite/test/st/java && ./gradlew test
System tests use .cfg configuration files containing model path, input data, expected output (golden data), and accuracy threshold.
cd test/config_level0 # Basic functionality verification
cd test/config_level1 # Extended functionality verification
mindspore-lite/tools/benchmark/ provides performance benchmarking and accuracy verification.
# Measure latency
./benchmark --modelFile=model.ms --inputShapes=1,224,224,3
# Specify device and threads
./benchmark --modelFile=model.ms --device=CPU --numThreads=4
# Accuracy verification vs golden data
./benchmark --modelFile=model.ms \
--inDataFile=input.bin \
--benchmarkDataFile=golden.bin \
--accuracyThreshold=0.001
# Warmup and repeat
./benchmark --modelFile=model.ms --warmUpLoopCount=3 --loopCount=100
# Per-kernel timing
./benchmark --modelFile=model.ms --timeProfiling=true
# NPU/DSP device
./benchmark --modelFile=model.ms --device=NPU
./benchmark --modelFile=model.ms --device=DSP
| Parameter | Default | Description |
|---|---|---|
| --modelFile | Required | Path to .ms model |
| --device | CPU | CPU / GPU / Kirin NPU / DSP |
| --numThreads | 2 | Thread count |
| --loopCount | 10 | Test iterations |
| --warmUpLoopCount | 3 | Warmup iterations |
| --accuracyThreshold | 0.5 | Accuracy threshold (float) |
| --inDataFile | Null | Input data; multiple files separated by `,` |
| --benchmarkDataFile | Null | Golden output for accuracy comparison |
| --inputShapes | Null | NHWC format; dims separated by `,`; multiple inputs separated by `:` |
| --timeProfiling | false | Print per-kernel timing |
| --perfProfiling | false | CPU PMU data (aarch64 only) |
| --enableFp16 | false | Prefer FP16 operators |
| --cosineDistanceThreshold | -1.1 | Cosine distance threshold |
| --cpuBindMode | 1 | 0=none, 1=big cores, 2=mid cores |
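To make the `--inputShapes` syntax concrete, here is a minimal sketch of how a string such as `1,224,224,3:1,10` breaks down into one shape per input. `ParseInputShapes` is a hypothetical helper written for illustration; it is not the benchmark tool's own parser, and the validation rules (positive integers only) are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the benchmark tool): splits a string such
// as "1,224,224,3:1,10" into one shape per model input.
std::vector<std::vector<int64_t>> ParseInputShapes(const std::string &spec) {
  std::vector<std::vector<int64_t>> shapes;
  std::stringstream inputs(spec);
  std::string one_input;
  while (std::getline(inputs, one_input, ':')) {  // ':' separates inputs
    std::vector<int64_t> dims;
    std::stringstream fields(one_input);
    std::string field;
    while (std::getline(fields, field, ',')) {  // ',' separates dimensions
      std::size_t used = 0;
      const int64_t dim = std::stoll(field, &used);  // throws on non-numeric text
      // Assumption: every dimension must be a fully parsed positive integer.
      if (used != field.size() || dim <= 0) {
        throw std::invalid_argument("bad dimension: " + field);
      }
      dims.push_back(dim);
    }
    if (dims.empty()) {
      throw std::invalid_argument("empty shape in: " + spec);
    }
    shapes.push_back(dims);
  }
  return shapes;
}
```

With this sketch, `ParseInputShapes("1,224,224,3:1,10")` yields two shapes, `{1, 224, 224, 3}` and `{1, 10}`, while a malformed field such as `x` raises `std::invalid_argument`.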
The benchmark reports average, maximum and minimum inference time (ms), FPS, memory usage, and accuracy against the golden data.
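The timing metrics can be reproduced from raw per-iteration latencies, which helps when cross-checking numbers collected outside the tool. The sketch below is illustrative only: `LatencySummary` and `Summarize` are hypothetical names, and deriving FPS as `1000 / average_ms` is an assumption about how the figure is defined, not a statement about the benchmark's internals.

```cpp
#include <algorithm>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical summary type for illustration.
struct LatencySummary {
  double avg_ms;
  double max_ms;
  double min_ms;
  double fps;
};

// Summarizes per-iteration latencies given in milliseconds.
// Assumption: FPS is derived from the average latency as 1000 / avg_ms.
LatencySummary Summarize(const std::vector<double> &samples_ms) {
  if (samples_ms.empty()) {
    throw std::invalid_argument("no latency samples");
  }
  const double total = std::accumulate(samples_ms.begin(), samples_ms.end(), 0.0);
  const double avg = total / static_cast<double>(samples_ms.size());
  if (avg <= 0.0) {
    throw std::invalid_argument("average latency must be positive");
  }
  const double max_ms = *std::max_element(samples_ms.begin(), samples_ms.end());
  const double min_ms = *std::min_element(samples_ms.begin(), samples_ms.end());
  return LatencySummary{avg, max_ms, min_ms, 1000.0 / avg};
}
```

Warmup iterations would normally be excluded from `samples_ms` so that one-time initialization cost does not skew the average.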
./benchmark_train --modelFile=train_model.ms --epochs=10 --timeProfiling=true
{
"common_dump_settings": {
"dump_mode": 1,
"path": "/absolute_path",
"net_name": "ResNet50",
"input_output": 0,
"kernels": ["Default/Conv-op12"]
}
}
- dump_mode: 0=all kernels, 1=listed kernels only
- input_output: 0=input+output, 1=input only, 2=output only
export MINDSPORE_DUMP_CONFIG=/path/to/data_dump.json
bash build.sh -I x86_64 -e cpu -p on -j8 # -p on = MSLITE_ENABLE_PROFILE=ON
export MSLITE_PROFILE=1
./benchmark --modelFile=model.ms
perf record -g ./benchmark --modelFile=model.ms && perf report
valgrind --leak-check=full ./benchmark --modelFile=model.ms --loopCount=1
export OPENCL_PROFILING=1
./benchmark --modelFile=model.ms --device=GPU
Use MSKernelCallBack for per-node comparison:
auto after_call = [&](const std::vector<MSTensor> &inputs,
const std::vector<MSTensor> &outputs,
const MSCallBackParam &param) -> bool {
MS_LOG(INFO) << param.node_name << " " << param.node_type;
return true;
};
model->Predict(inputs, &outputs, nullptr, after_call);
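Once the callback has recorded each node's output, the comparison itself can be done offline. The following is a minimal sketch, assuming the outputs were copied into plain `float` vectors in execution order and that a reference run (for example on CPU) is available under the same node names. `CosineSimilarity`, `FirstDivergentNode`, and the `NodeOutputs` alias are hypothetical helpers, and cosine similarity is used here only as an illustrative metric; it is not necessarily what the benchmark tool computes.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Node outputs in execution order: (node name, flattened output values).
using NodeOutputs = std::vector<std::pair<std::string, std::vector<float>>>;

// Cosine similarity of two vectors. Size mismatch or empty input is treated
// as maximally different (-1); two all-zero vectors count as identical (1).
double CosineSimilarity(const std::vector<float> &a, const std::vector<float> &b) {
  if (a.size() != b.size() || a.empty()) {
    return -1.0;
  }
  double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    dot += static_cast<double>(a[i]) * b[i];
    norm_a += static_cast<double>(a[i]) * a[i];
    norm_b += static_cast<double>(b[i]) * b[i];
  }
  if (norm_a == 0.0 && norm_b == 0.0) {
    return 1.0;
  }
  if (norm_a == 0.0 || norm_b == 0.0) {
    return 0.0;
  }
  return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Returns the first node (in execution order) whose output falls below
// min_similarity against the reference, or an empty string if none does.
// Nodes without a reference entry are skipped.
std::string FirstDivergentNode(const NodeOutputs &actual,
                               const std::map<std::string, std::vector<float>> &reference,
                               double min_similarity) {
  for (const auto &entry : actual) {
    const auto ref = reference.find(entry.first);
    if (ref == reference.end()) {
      continue;
    }
    if (CosineSimilarity(entry.second, ref->second) < min_similarity) {
      return entry.first;
    }
  }
  return "";
}
```

Reporting the first divergent node rather than all of them is a deliberate choice: errors propagate downstream, so the earliest mismatch is usually the operator worth inspecting.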
Expected NPU but fell back to CPU:
Malloc/Free pairing
Test compilation is configured in test/CMakeLists.txt (~302 lines) and controlled by MSLITE_ENABLE_TESTCASES, which is disabled by default.