GPU Programming with C++

28/12/2020

Overview

In this guide, we’ll explore GPU programming with C++. C++ already delivers strong performance, and pairing a low-level language with the parallel hardware of a GPU can yield some of the fastest computation currently available.

Requirements

While any machine capable of running a modern version of Linux can support a C++ compiler, you’ll need an NVIDIA-based GPU to follow along with this exercise. If you don’t have a GPU, you can spin up a GPU-powered instance in Amazon Web Services or another cloud provider of your choice.

If you choose a physical machine, please ensure you have the NVIDIA proprietary drivers installed. You can find instructions for this here: https://linuxhint.com/install-nvidia-drivers-linux/

In addition to the driver, you’ll need the CUDA toolkit. In this example, we’ll use Ubuntu 16.04 LTS, but there are downloads available for most major distributions at the following URL: https://developer.nvidia.com/cuda-downloads

For Ubuntu, you would choose the .deb based download. The downloaded file will not have a .deb extension by default, so I recommend renaming it to have a .deb at the end. Then, you can install with:

sudo dpkg -i package-name.deb

You will likely be prompted to install a GPG key. If so, follow the instructions shown in the terminal.

Once you’ve done that, update your repositories and install CUDA:

sudo apt-get update
sudo apt-get install cuda -y

Once done, I recommend rebooting to ensure everything is properly loaded.

The Benefits of GPU Development

CPUs handle many different inputs and outputs and offer a large set of functions, both to meet a wide range of program needs and to manage varying hardware configurations. They also handle memory, caching, the system bus, segmentation, and I/O, making them a jack of all trades.

GPUs are the opposite: they contain many individual processors that are focused on very simple mathematical functions. Because the work is spread across so many processors at once, they can process highly parallel tasks many times faster than CPUs. By specializing in scalar functions (functions that take one or more inputs but return only a single output), they achieve extreme performance at the cost of extreme specialization.
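
To make the idea of a scalar function concrete, here is a minimal CPU-side sketch in plain C++ (no GPU required). The names add_pair and apply_elementwise are made up for illustration and are not part of CUDA. Each output element depends only on the matching pair of input elements, which is the kind of independent work a GPU can spread across its many processors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar function: two inputs in, a single output out.
int add_pair(int x, int y) {
    return x + y;
}

// Apply the scalar function to every pair of elements. No iteration reads
// another iteration's result, so the iterations could run in any order,
// or all at the same time.
std::vector<int> apply_elementwise(const std::vector<int>& a,
                                   const std::vector<int>& b) {
    assert(a.size() == b.size());  // both vectors must have the same length
    std::vector<int> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = add_pair(a[i], b[i]);
    }
    return out;
}
```

On a CPU the loop visits the elements one after another; on a GPU, each iteration would typically be handed to its own thread.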

Example Code

In the example code, we add two vectors together element by element. I have included both a CPU and a GPU version of the code for speed comparison.

gpu-example.cpp contents below:

#include <cuda_runtime.h>
#include <chrono>
#include <cstdlib>
#include <iostream>

typedef std::chrono::high_resolution_clock Clock;

#define ITER 65535

// CPU version of the vector add function
void vector_add_cpu(int *a, int *b, int *c, int n) {
    // Add the vector elements a and b and store the result in c
    for (int i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// GPU version of the vector add function
__global__ void vector_add_gpu(int *gpu_a, int *gpu_b, int *gpu_c, int n) {
    // No for loop needed: each thread handles a single element.
    // The global index combines the block index and the thread index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // The last block may contain threads past the end of the vectors
    if (i < n) {
        gpu_c[i] = gpu_a[i] + gpu_b[i];
    }
}

int main() {
    int *a, *b, *c;
    int *gpu_a, *gpu_b, *gpu_c;

    a = (int *)malloc(ITER * sizeof(int));
    b = (int *)malloc(ITER * sizeof(int));
    c = (int *)malloc(ITER * sizeof(int));

    // We need variables accessible to the GPU. cudaMallocManaged
    // allocates memory that both the CPU and the GPU can access.
    cudaMallocManaged(&gpu_a, ITER * sizeof(int));
    cudaMallocManaged(&gpu_b, ITER * sizeof(int));
    cudaMallocManaged(&gpu_c, ITER * sizeof(int));

    // Fill the CPU and GPU inputs with the same values
    for (int i = 0; i < ITER; ++i) {
        a[i] = i;
        b[i] = i;
        c[i] = i;
        gpu_a[i] = i;
        gpu_b[i] = i;
        gpu_c[i] = i;
    }

    // Call the CPU function and time it
    auto cpu_start = Clock::now();
    vector_add_cpu(a, b, c, ITER);
    auto cpu_end = Clock::now();
    std::cout << "vector_add_cpu: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(cpu_end - cpu_start).count()
              << " nanoseconds.\n";

    // Call the GPU function and time it.
    // The triple angle brackets are a CUDA extension that passes the
    // launch parameters of a kernel call: the number of thread blocks
    // and the number of threads per block.
    // A block holds at most 1024 threads on current NVIDIA GPUs, so we
    // launch enough 256-thread blocks to cover all ITER elements.
    int threads_per_block = 256;
    int blocks = (ITER + threads_per_block - 1) / threads_per_block;
    auto gpu_start = Clock::now();
    vector_add_gpu<<<blocks, threads_per_block>>>(gpu_a, gpu_b, gpu_c, ITER);
    cudaDeviceSynchronize();
    auto gpu_end = Clock::now();
    std::cout << "vector_add_gpu: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(gpu_end - gpu_start).count()
              << " nanoseconds.\n";

    // Free the GPU-function based memory allocations
    cudaFree(gpu_a);
    cudaFree(gpu_b);
    cudaFree(gpu_c);

    // Free the CPU-function based memory allocations
    free(a);
    free(b);
    free(c);

    return 0;
}
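
The launch configuration deserves a closer look. On current NVIDIA GPUs a single block holds at most 1024 threads, so a vector of ITER elements is normally split across several blocks, with each thread deriving its element index from blockIdx.x * blockDim.x + threadIdx.x. The sketch below reproduces that arithmetic in plain C++ on the CPU, so it can be checked without a GPU. The helper names blocks_needed and count_covered_elements are hypothetical, and the block size of 256 used in the note afterwards is an assumed, commonly chosen value rather than a requirement.

```cpp
#include <cassert>
#include <vector>

// Number of blocks required so that blocks * threads_per_block >= n
// (integer ceiling division).
int blocks_needed(int n, int threads_per_block) {
    assert(n >= 0 && threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

// Simulate the index each GPU thread would compute and count how many
// elements in [0, n) are visited exactly once.
int count_covered_elements(int n, int threads_per_block) {
    std::vector<int> visits(n, 0);
    int blocks = blocks_needed(n, threads_per_block);
    for (int block = 0; block < blocks; ++block) {                    // stands in for blockIdx.x
        for (int thread = 0; thread < threads_per_block; ++thread) {  // stands in for threadIdx.x
            int i = block * threads_per_block + thread;               // global element index
            if (i < n) {                                              // skip spare threads in the last block
                ++visits[i];
            }
        }
    }
    int covered = 0;
    for (int v : visits) {
        if (v == 1) {
            ++covered;
        }
    }
    return covered;
}
```

With 65535 elements and 256 threads per block, blocks_needed returns 256. Those blocks provide 65536 threads in total, so the bounds check skips exactly one spare thread in the last block.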

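Makefile contents below. The indented recipe lines must begin with a tab character, and the -x cu option tells nvcc to compile the .cpp file as CUDA source:

INC=-I/usr/local/cuda/include
NVCC=/usr/local/cuda/bin/nvcc
NVCC_OPT=-std=c++11 -x cu

all:
	$(NVCC) $(NVCC_OPT) $(INC) gpu-example.cpp -o gpu-example

clean:
	-rm -f gpu-example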

To run the example, compile it:

make

Then run the program:

./gpu-example

On my machine, the CPU version (vector_add_cpu) ran considerably slower than the GPU version (vector_add_gpu).

If your results differ, you may need to adjust the ITER define in gpu-example.cpp to a higher number, because the GPU's setup time can be longer than the run time of a small CPU loop. I found 65535 to work well on my machine, but your mileage may vary. Once you clear this threshold, the GPU should pull well ahead of the CPU.
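
One way to look for that threshold is to time the CPU loop at several sizes and compare the numbers against the GPU timing printed by the program. The following is a rough sketch in plain C++. The helper name time_cpu_add_ns is hypothetical, and the sizes mentioned afterwards are arbitrary. Timings vary between runs and machines, so treat any single measurement with caution.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Time one element-wise addition of two n-element vectors on the CPU and
// return the elapsed time in nanoseconds. The sums are written to `out`.
long long time_cpu_add_ns(int n, std::vector<int>& out) {
    assert(n >= 0);
    std::vector<int> a(n), b(n);
    for (int i = 0; i < n; ++i) {
        a[i] = i;
        b[i] = i;
    }
    out.assign(n, 0);

    // Only the addition loop is timed, not the setup above
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
    auto end = std::chrono::high_resolution_clock::now();

    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

Calling it with, say, 1 << 10, 1 << 16, and 1 << 22 elements and printing each result gives a CPU baseline at every size. Repeating each measurement a few times helps smooth out noise.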

Conclusion

I hope you’ve learned a lot from this introduction to GPU programming with C++. The example above doesn’t accomplish a great deal, but the concepts it demonstrates give you a framework for building on your own ideas and putting your GPU to work.
