profile
Profile ExecuTorch model execution. Use when measuring performance, analyzing operator timing, or debugging slow models.
| name | profile |
| description | Profile ExecuTorch model execution. Use when measuring performance, analyzing operator timing, or debugging slow models. |
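from executorch.devtools import Inspector
from executorch.runtime import Runtime

# Load the program with ETDump collection enabled; the debug buffer stores intermediate outputs.
runtime = Runtime.get()
program = runtime.load_program("model.pte", enable_etdump=True, debug_buffer_size=int(1e7))
# `inputs` is a list of tensors matching the signature of the model's forward method.
outputs = program.load_method("forward").execute(inputs)
program.write_etdump_result_to_file("etdump.etdp", "debug.bin")
# Combine the ETDump with the ETRecord saved at export time and print per-operator timing.
inspector = Inspector(etrecord="model.etrecord", etdump_path="etdump.etdp")
inspector.print_data_tabular()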
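
When debugging a slow model, it can help to rank events by average latency once the timings are out of the profiler. The following is a minimal standard-library sketch of that ranking step only. It does not use the Inspector API: the helper `slowest_events`, the dictionary layout, and the sample numbers are all hypothetical and chosen purely for illustration.

```python
from statistics import mean


def slowest_events(timings, top_n=3):
    """Return up to top_n (name, avg_ms) pairs, slowest first.

    `timings` maps an event name to a list of per-run latencies in
    milliseconds. This layout is an assumption made for this sketch,
    not a format produced by ExecuTorch.
    """
    # Skip events with no samples so mean() never sees an empty list.
    averages = [(name, mean(samples)) for name, samples in timings.items() if samples]
    averages.sort(key=lambda item: item[1], reverse=True)
    return averages[:top_n]


# Hypothetical latencies in milliseconds, invented for illustration only.
example_timings = {
    "conv2d": [4.0, 6.0],
    "relu": [1.0, 1.0],
    "linear": [2.0, 4.0],
}
ranked = slowest_events(example_timings, top_n=2)
```

In practice the per-event samples would come from the profiling data collected above; how they are extracted is left out here because it depends on the devtools version in use.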