Course Outline

Introduction

Grasping the Fundamentals of Heterogeneous Computing Methodology

The Case for Parallel Computing: Understanding Its Necessity

Multi-Core Processors: Architecture and Design

Introduction to Threads: Core Concepts and Basics of Parallel Programming

Mastering GPU Software Optimization Techniques

OpenMP: A Standard for Directive-Based Parallel Programming

Practical Demonstrations of Programs on Multicore Systems

Introduction to GPU Computing

Leveraging GPUs for Parallel Computing

The GPU Programming Model

Practical Demonstrations of Programs on GPU Hardware

SDK, Toolkit, and Environment Setup for GPU Development

Working with Various Libraries

Demo of GPU Capabilities, Tools, Sample Programs, and OpenACC

Understanding the CUDA Programming Model

Studying the CUDA Architecture

Setting Up and Exploring the CUDA Development Environment

Working with the CUDA Runtime API

Understanding the CUDA Memory Model

Exploring Additional CUDA API Features

Efficient Global Memory Access in CUDA: Optimization Strategies

Optimizing Data Transfers in CUDA Using CUDA Streams

Leveraging Shared Memory in CUDA

Understanding and Implementing Atomic Operations and Instructions in CUDA

Case Study: Basic Digital Image Processing with CUDA

Working with Multi-GPU Programming

Advanced Hardware Profiling and Sampling on NVIDIA / CUDA

Utilizing the CUDA Dynamic Parallelism API for Dynamic Kernel Launch

Summary and Conclusion

Requirements

  • Proficiency in C Programming
  • Familiarity with GCC on Linux

Duration: 21 Hours
