Nimble NIPS 2021

Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning

Posted by Treaseven on December 11, 2024

Motivation

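Existing deep learning frameworks incur substantial scheduling overhead and run GPU tasks sequentially even when they do not need to. The authors propose ahead-of-time scheduling, which removes most of the scheduling overhead from run time.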

  1. High scheduling overhead leaves the GPU idle
  2. GPU tasks are executed serially rather than in parallel

System Design

Ahead-of-time scheduling
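
To make the idea concrete, here is a toy sketch of ahead-of-time scheduling in plain Python. It is not Nimble's implementation: `ToyScheduler`, `run_eager`, `record`, and `replay` are hypothetical names invented for illustration, and the sketch assumes the operator sequence is identical on every iteration. The point is only that the per-operator scheduling work is paid once, while recording, and skipped on every later run.

```python
from typing import Callable, Dict, List

Kernel = Callable[[float], float]


class ToyScheduler:
    """Hypothetical stand-in for a framework's per-operator scheduling work."""

    def __init__(self) -> None:
        self.calls = 0  # how many times scheduling work was performed
        self._kernels: Dict[str, Kernel] = {
            "double": lambda x: 2.0 * x,
            "inc": lambda x: x + 1.0,
        }

    def schedule(self, op_name: str) -> Kernel:
        # In a real framework this step would cover dispatch, shape checks,
        # kernel selection and so on. Here it is only counted.
        self.calls += 1
        return self._kernels[op_name]


def run_eager(sched: ToyScheduler, ops: List[str], x: float) -> float:
    """Schedule every operator again on every run."""
    for name in ops:
        x = sched.schedule(name)(x)
    return x


def record(sched: ToyScheduler, ops: List[str]) -> List[Kernel]:
    """Do the scheduling work once and keep the resulting kernel sequence."""
    return [sched.schedule(name) for name in ops]


def replay(trace: List[Kernel], x: float) -> float:
    """Run the recorded kernels directly, with no scheduling work."""
    for kernel in trace:
        x = kernel(x)
    return x
```

With `ops = ["double", "inc"]`, calling `run_eager` three times performs six scheduling calls, whereas one `record` followed by three `replay` calls performs only two.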

Stream assignment algorithm

  • Stream Synchronization
  • Goal of the Algorithm: maximize logical concurrency and minimize the number of synchronizations
  • Algorithm Description
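
The two goals above can be illustrated with a small sketch. It follows my reading of the approach in the paper (reduce the task DAG to its minimum equivalent graph, find a maximum bipartite matching over the remaining edges, and put tasks joined by matched edges on the same stream), but the code is a simplified illustration rather than the authors' implementation. All function names are hypothetical, and tasks are assumed to be numbered 0..n-1.

```python
from collections import defaultdict


def transitive_reduction(n, edges):
    """Drop every edge (u, v) of a DAG that is implied by a longer u -> v path."""
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)

    def reachable_without(u, v):
        # Search from u's other successors; reaching v means (u, v) is redundant.
        stack = [w for w in succ[u] if w != v]
        seen = set(stack)
        while stack:
            x = stack.pop()
            if x == v:
                return True
            for y in succ[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return False

    return [(u, v) for u, v in sorted(set(edges)) if not reachable_without(u, v)]


def maximum_matching(n, edges):
    """Maximum bipartite matching (augmenting paths); left = sources, right = targets."""
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
    match_of_right = {}  # right node v -> left node u currently matched to it

    def try_augment(u, visited):
        for v in succ[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_of_right or try_augment(match_of_right[v], visited):
                match_of_right[v] = u
                return True
        return False

    for u in range(n):
        try_augment(u, set())
    return {(u, v) for v, u in match_of_right.items()}


def assign_streams(n, edges):
    """Return (stream id per task, dependency edges that need a cross-stream sync)."""
    meg = transitive_reduction(n, edges)
    matching = maximum_matching(n, meg)

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Tasks joined by a matched edge form a chain and share one stream.
    for u, v in matching:
        parent[find(u)] = find(v)

    ids = {}
    streams = [ids.setdefault(find(x), len(ids)) for x in range(n)]
    # Remaining dependencies cross streams, so each needs a synchronization.
    syncs = [(u, v) for u, v in meg if streams[u] != streams[v]]
    return streams, syncs


# Diamond graph 0 -> {1, 2} -> 3, plus a redundant edge 0 -> 3.
streams, syncs = assign_streams(4, [(0, 1), (0, 2), (1, 3), (2, 3), (0, 3)])
print(streams, syncs)
```

In the diamond example, tasks 1 and 2 do not depend on each other, so they end up on different streams. The redundant edge 0 -> 3 is removed before matching, so it never costs a synchronization.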

Evaluation

Reference

Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning