- Experience working with AI model training or deployment pipelines
- Understanding of GPU/MLU compute principles and model optimization
- Basic familiarity with performance profiling tools and metrics
Audience
- Performance engineers
- Machine learning infrastructure teams
- AI system architects
Huawei Ascend, Biren, and Cambricon are AI accelerator platforms developed in China, each with its own acceleration and profiling toolchain for production-scale AI workloads.
This instructor-led, live training (online or onsite) is aimed at advanced-level AI infrastructure and performance engineers who wish to optimize model inference and training workflows across multiple Chinese AI chip platforms.
By the end of this training, participants will be able to:
- Benchmark models on Ascend, Biren, and Cambricon platforms.
- Identify system bottlenecks and memory/compute inefficiencies.
- Apply graph-level, kernel-level, and operator-level optimizations.
- Tune deployment pipelines to improve throughput and latency.
Format of the Course
- Interactive lecture and discussion.
- Hands-on use of profiling and optimization tools on each platform.
- Guided exercises focused on practical tuning scenarios.
Course Customization Options
- To request customized training for this course, tailored to your performance environment or model type, please contact us to arrange.
Performance Concepts and Metrics
- Latency, throughput, power usage, resource utilization
- System-level vs model-level bottlenecks
- Profiling for inference vs training
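As a rough illustration of the latency and throughput metrics listed above, the following sketch times an arbitrary callable and reports median latency, tail latency, and throughput. It is vendor-neutral: `benchmark` and `fake_inference` are hypothetical names, and the workload is a CPU stand-in. On a real accelerator, the timed call would need to include a device synchronization so that asynchronous kernel launches are not under-counted.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(run_once: Callable[[], None], batch_size: int,
              warmup: int = 5, iters: int = 50) -> Dict[str, float]:
    """Time a callable and report latency percentiles and throughput."""
    for _ in range(warmup):
        run_once()  # warm-up runs are excluded from the measurements

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()
    total_s = sum(latencies_ms) / 1000.0
    # Nearest-rank style index for the 99th percentile, clamped to the last sample.
    p99_index = min(len(latencies_ms) - 1, int(0.99 * len(latencies_ms)))
    return {
        "p50_ms": statistics.median(latencies_ms),
        "p99_ms": latencies_ms[p99_index],
        "throughput_per_s": batch_size * iters / total_s,
    }


def fake_inference() -> None:
    # Hypothetical stand-in workload; replace with a synchronized device call.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(benchmark(fake_inference, batch_size=8))
```

Reporting a tail percentile next to the median matters for inference serving, where occasional slow requests can dominate user-visible latency even when the median looks healthy.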
Profiling on Huawei Ascend
- Using CANN Profiler and MindInsight
- Kernel and operator diagnostics
- Offload patterns and memory mapping
Profiling on Biren GPU
- Biren SDK performance monitoring features
- Kernel fusion, memory alignment, and execution queues
- Power and temperature-aware profiling
Profiling on Cambricon MLU
- BANGPy and Neuware performance tools
- Kernel-level visibility and log interpretation
- MLU profiler integration with deployment frameworks
Graph and Model-Level Optimization
- Graph pruning and quantization strategies
- Operator fusion and computational graph restructuring
- Input size standardization and batch tuning
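Batch tuning usually comes down to a trade-off: larger batches raise throughput but also raise per-request latency. A minimal sketch of that selection step, assuming per-batch latencies have already been measured, is shown here. The function name `pick_batch_size` is hypothetical, and the numbers in the usage example are illustrative placeholders, not measurements from any platform.

```python
from typing import Dict, Optional


def pick_batch_size(latency_ms_by_batch: Dict[int, float],
                    latency_budget_ms: float) -> Optional[int]:
    """Return the batch size with the highest throughput within the latency budget."""
    best_batch: Optional[int] = None
    best_throughput = 0.0
    for batch, latency_ms in sorted(latency_ms_by_batch.items()):
        if latency_ms > latency_budget_ms:
            continue  # this batch size violates the latency budget
        throughput = batch / (latency_ms / 1000.0)  # samples per second
        if throughput > best_throughput:
            best_batch, best_throughput = batch, throughput
    return best_batch  # None if no batch size fits the budget


if __name__ == "__main__":
    # Placeholder latencies in milliseconds, keyed by batch size.
    measured = {1: 4.0, 8: 10.0, 16: 18.0, 32: 40.0}
    print(pick_batch_size(measured, latency_budget_ms=20.0))
```

Standardizing input sizes beforehand keeps these measurements stable, since variable shapes can trigger recompilation or different kernel selections between runs.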
Memory and Kernel Optimization
- Optimizing memory layout and reuse
- Efficient buffer management across chipsets
- Kernel-level tuning techniques per platform
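The idea behind buffer reuse can be sketched without any vendor runtime: instead of allocating a fresh buffer for every request, released buffers are kept in size buckets and handed out again. The `BufferPool` class below is a hypothetical host-memory illustration using `bytearray`. A device-memory pool would follow the same pattern but call the platform's own allocator, which is not shown here.

```python
from collections import defaultdict
from typing import DefaultDict, List


class BufferPool:
    """Size-bucketed pool that reuses released buffers instead of reallocating."""

    def __init__(self) -> None:
        self._free: DefaultDict[int, List[bytearray]] = defaultdict(list)
        self.allocations = 0  # number of fresh allocations performed

    @staticmethod
    def _bucket(nbytes: int) -> int:
        # Round up to the next power of two so similar sizes share a bucket.
        if nbytes < 1:
            raise ValueError("nbytes must be at least 1")
        return 1 << (nbytes - 1).bit_length()

    def acquire(self, nbytes: int) -> bytearray:
        size = self._bucket(nbytes)
        if self._free[size]:
            return self._free[size].pop()  # reuse a previously released buffer
        self.allocations += 1
        return bytearray(size)

    def release(self, buf: bytearray) -> None:
        # Buffers are always bucket-sized, so len(buf) identifies the bucket.
        self._free[len(buf)].append(buf)


if __name__ == "__main__":
    pool = BufferPool()
    first = pool.acquire(1000)
    pool.release(first)
    second = pool.acquire(900)  # falls in the same 1024-byte bucket
    print(second is first, pool.allocations)
```

Power-of-two bucketing trades some wasted memory for a higher reuse rate. Tighter bucket schemes reduce the waste at the cost of more allocator calls.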
Cross-Platform Best Practices
- Performance portability: abstraction strategies
- Building shared tuning pipelines for multi-chip environments
- Example: tuning an object detection model across Ascend, Biren, and Cambricon MLU
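One common abstraction strategy for performance portability is to hide each platform behind a small shared interface, so that benchmarking and tuning code is written once. The sketch below is a hypothetical illustration: `Backend`, `register`, `get_backend`, and `ReferenceCpuBackend` are made-up names, and only a CPU reference backend is implemented. Backends for Ascend, Biren, or Cambricon would wrap each vendor's own runtime, which is deliberately not shown.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class Backend(ABC):
    """Minimal interface that shared tuning code depends on."""

    name: str

    @abstractmethod
    def run(self, inputs: List[float]) -> List[float]:
        """Execute the model on one batch of inputs."""


_REGISTRY: Dict[str, Type[Backend]] = {}


def register(cls: Type[Backend]) -> Type[Backend]:
    """Class decorator that makes a backend selectable by name."""
    _REGISTRY[cls.name] = cls
    return cls


@register
class ReferenceCpuBackend(Backend):
    name = "cpu-reference"

    def run(self, inputs: List[float]) -> List[float]:
        # Stand-in "model": a ReLU, useful as a correctness baseline.
        return [max(0.0, x) for x in inputs]


def get_backend(name: str) -> Backend:
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name!r}") from None


if __name__ == "__main__":
    backend = get_backend("cpu-reference")
    print(backend.run([-1.0, 2.0]))
```

Keeping a reference backend in the registry gives every accelerator-specific backend a numerical baseline to be checked against before any performance comparison is made.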
Summary and Next Steps