CNNOpt 2022

Effective Performance Modeling and Domain-Specific Compiler Optimization of CNNs for GPUs

Posted by Treaseven on February 19, 2025

CNNOpt Overview

Design Details

  • Pruning Register Tiles for Input Channel
  • Design space pruning via capacity constraints
  • Impact of Thread Occupancy: S Kernel
  • Tail effect and Synchronizations: Reduction Parallelism along Input Channels

Performance modeling for rapid design space exploration

Reference

Effective Performance Modeling and Domain-Specific Compiler Optimization of CNNs for GPUs