Accelerated Parallel Library: A .NET library for GPGPU programming using the Task Parallel Library abstraction

Authors

Hørup, Søren Alsbjerg ; Juul, Søren Andreas ; Larsen, Henrik Holtegaard

Term

4th term

Education

Software, Master

Publication year

2011

Submitted on

2011-06-07

Abstract

This thesis introduces the Accelerated Parallel Library (APL), a .NET library for GPGPU programming in Common Language Infrastructure languages such as C# and VB.NET. The goal is to enable GPU acceleration with minimal code changes by exposing the same programming interface as the Parallel class of the Task Parallel Library (TPL). APL uses reflection to access Common Intermediate Language (CIL) code, just-in-time compiles it to Parallel Thread Execution (PTX) code, and runs it on the GPU via the CUDA Driver API. The library currently supports the CIL opcodes required by a suite of four benchmarks. Experiments show that APL is generally slower than handwritten CUDA C, but on Vector Addition and Black Scholes it achieves steady-state speedups of 1.03x and 1.02x, respectively, over the CUDA C implementations. APL also outperforms TPL in most cases, with a maximum observed speedup of 82x. These results demonstrate the viability of a TPL-like abstraction for leveraging GPUs in .NET, while showing that a performance gap to specialized CUDA C remains in several scenarios.

This thesis presents the Accelerated Parallel Library (APL), a .NET library for GPGPU programming in Common Language Infrastructure languages such as C# and VB.NET. The aim is to make GPU acceleration available to .NET developers with minimal code changes by offering the same programming interface as the parallel loops of the Task Parallel Library’s Parallel class. APL uses reflection to read Common Intermediate Language (CIL), JIT-compiles it to Parallel Thread Execution (PTX), and executes it on the GPU via the CUDA Driver API. The library supports the CIL opcodes required by a benchmark suite of four benchmarks. The evaluation shows that APL is generally slower than handwritten CUDA C, but in two benchmarks (Vector Addition and Black Scholes) APL achieves steady-state speedups of 1.03x and 1.02x, respectively, over CUDA C. At the same time, APL outperforms the Task Parallel Library (TPL) in most cases and in one case delivers a speedup of 82x. The results indicate that a TPL-like abstraction in .NET can exploit GPUs effectively and in a user-friendly way, although a performance gap to specialized CUDA C remains in several scenarios.

[This abstract has been generated with the help of AI directly from the project full text]
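
To make the programming model in the abstract concrete, the following is a minimal sketch of Vector Addition (one of the benchmarks named above) written against the standard Task Parallel Library `Parallel.For` method. It is not code from the thesis: the class name `VectorAddition` and the method `Add` are hypothetical, and the APL entry point is deliberately not shown because this record does not give its name. According to the abstract, APL exposes the same interface as the `Parallel` class, so a loop of this shape is the kind of code it is meant to accelerate.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not taken from the thesis.
public static class VectorAddition
{
    // Element-wise sum of two equal-length vectors using the TPL Parallel class.
    public static float[] Add(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        var c = new float[a.Length];

        // Every iteration writes a distinct element, so iterations are independent
        // and may run concurrently on any worker thread.
        // Assumption: a lambda like this is the loop body that APL would read as CIL
        // and compile to PTX; here it simply runs on the CPU through the TPL.
        Parallel.For(0, a.Length, i =>
        {
            c[i] = a[i] + b[i];
        });

        return c;
    }
}
```

Calling `VectorAddition.Add(x, y)` on two arrays of equal length returns a new array holding their element-wise sums. Because the loop body only indexes arrays and adds floats, it stays within the sort of restricted instruction set that a CIL-to-PTX translator could plausibly cover, which is consistent with the abstract's note that opcode support is limited to what the benchmarks need.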

Keywords

GPGPU ; APL

A master's thesis from Aalborg University
