Research

Achieve Real-Time DNN Execution on Off-the-Shelf Mobile Devices with Compression-Compilation Co-Design

The PCONV acceleration framework introduces a new pruning dimension: fine-grained pruning patterns inside coarse-grained structures, designed to guarantee both high accuracy and hardware friendliness. This flexibility relies on mathematically sound pattern-based sparsity, which maintains the accuracy of the sparse DNN even at high sparsity ratios. Pattern-based sparsity also naturally matches source-to-source compilation and code optimization techniques, which bridge the gap between algorithm-level compression and embedded hardware acceleration. PCONV achieves real-time execution of most representative DNN structures on off-the-shelf smartphones.
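To make the idea of pattern-based sparsity concrete, the sketch below prunes flattened 3x3 convolution kernels against a small, fixed set of patterns. It is only an illustration under stated assumptions, not the PCONV implementation: the pattern set `PATTERNS`, the choice of keeping 4 of 9 weights, and the selection rule (keep the pattern that preserves the most squared weight magnitude) are all hypothetical, as are the function names `prune_kernel` and `prune_layer`.

```python
# Hypothetical pattern set (an assumption, not taken from PCONV):
# each pattern lists the 4 positions kept in a flattened 3x3 kernel,
# using row-major indices 0..8. Index 4 is the kernel centre.
PATTERNS = [
    (0, 1, 3, 4),  # top-left block
    (1, 2, 4, 5),  # top-right block
    (3, 4, 6, 7),  # bottom-left block
    (4, 5, 7, 8),  # bottom-right block
]


def prune_kernel(kernel, patterns=PATTERNS):
    """Prune one flattened 3x3 kernel to a single pattern.

    Returns (pattern_id, pruned_kernel). The pattern chosen is the one
    that retains the largest sum of squared weights (assumed criterion);
    every weight outside that pattern is set to zero.
    """
    if len(kernel) != 9:
        raise ValueError("expected a flattened 3x3 kernel with 9 weights")

    def kept_energy(pattern):
        return sum(kernel[i] ** 2 for i in pattern)

    # On ties, max() returns the first (lowest-index) pattern.
    best = max(range(len(patterns)), key=lambda p: kept_energy(patterns[p]))
    keep = set(patterns[best])
    pruned = [w if i in keep else 0.0 for i, w in enumerate(kernel)]
    return best, pruned


def prune_layer(kernels, patterns=PATTERNS):
    """Apply prune_kernel to every kernel in a layer (a list of kernels)."""
    return [prune_kernel(k, patterns) for k in kernels]


if __name__ == "__main__":
    layer = [
        [0.9, 0.8, 0.1, 0.7, 0.5, 0.0, 0.1, 0.0, 0.2],
        [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 3.0, 4.0],
    ]
    for pattern_id, pruned in prune_layer(layer):
        print(pattern_id, pruned)
```

Because every pruned kernel follows one of a few known patterns, a compiler could in principle generate specialized code per pattern ID rather than handling arbitrary sparse indices, which is the kind of regularity that makes this form of sparsity a good fit for code generation.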
