Mixed-compiler programming with nvcc, gcc, and g++

Posted by 风吹稻花香 on 2021/06/05 00:33:19


Many readers ask how to use CUDA together with other compilers, in other words how to do mixed programming.

Let's start with the code:

File 1: test1.cu

// File: test1.cu
#include <stdio.h>  
#include <stdlib.h>  
#include <cuda_runtime.h>  
  
#define ROWS 32  
#define COLS 16  
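// Print the CUDA error string before exiting so failures are not silent
#define CHECK(res) if((res)!=cudaSuccess){fprintf(stderr,"CUDA error: %s\n",cudaGetErrorString(res));exit(-1);}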
__global__ void Kerneltest(int **da, unsigned int rows, unsigned int cols)  
{  
  unsigned int row = blockDim.y*blockIdx.y + threadIdx.y;  
  unsigned int col = blockDim.x*blockIdx.x + threadIdx.x;  
  if (row < rows && col < cols)  
  {  
    da[row][col] = row*cols + col;  
  }  
}  
  
extern "C" int func() // Note the extern "C": it gives func C linkage so that C code can call it
{  
  int **da = NULL;  
  int **ha = NULL;  
  int *dc = NULL;  
  int *hc = NULL;  
  cudaError_t res;  
  int r, c;  
  bool is_right=true;  
  
  res = cudaMalloc((void**)(&da), ROWS*sizeof(int*));CHECK(res)  
  res = cudaMalloc((void**)(&dc), ROWS*COLS*sizeof(int));CHECK(res)  
  ha = (int**)malloc(ROWS*sizeof(int*));  
  hc = (int*)malloc(ROWS*COLS*sizeof(int));  
  
  for (r = 0; r < ROWS; r++)  
  {  
    ha[r] = dc + r*COLS;  
  }  
  res = cudaMemcpy((void*)(da), (void*)(ha), ROWS*sizeof(int*), cudaMemcpyHostToDevice);CHECK(res)  
  dim3 dimBlock(16,16);  
  dim3 dimGrid((COLS+dimBlock.x-1)/(dimBlock.x), (ROWS+dimBlock.y-1)/(dimBlock.y));  
  Kerneltest<<<dimGrid, dimBlock>>>(da, ROWS, COLS);  
  res = cudaMemcpy((void*)(hc), (void*)(dc), ROWS*COLS*sizeof(int), cudaMemcpyDeviceToHost);CHECK(res)  
  
  for (r = 0; r < ROWS; r++)  
  {  
    for (c = 0; c < COLS; c++)  
    {     
      printf("%4d ", hc[r*COLS+c]);  
      if (hc[r*COLS+c] != (r*COLS+c))  
      {     
        is_right = false;  
      }     
    }     
    printf("\n");  
  }  
  printf("the result is %s!\n", is_right? "right":"false");  
  
  cudaFree((void*)da);  
  cudaFree((void*)dc);  
  free(ha);  
  free(hc);  
//  getchar();  
  return 0;  
}  
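
The kernel indexes `da[row][col]`, which works because `ha` is a table of row pointers into the single flat buffer `dc`. As a minimal host-only sketch of that layout (plain C, no GPU involved; the helper names `build_row_table` and `fill_rows` are hypothetical and do not appear in the original code):

```c
#include <stdlib.h>

/* Hypothetical helper: build a table of row pointers into one flat buffer,
   mirroring the loop `ha[r] = dc + r*COLS` in test1.cu. */
int **build_row_table(int *flat, unsigned int rows, unsigned int cols)
{
    int **table = (int **)malloc(rows * sizeof(int *));
    if (table == NULL) {
        return NULL;
    }
    for (unsigned int r = 0; r < rows; r++) {
        table[r] = flat + (size_t)r * cols;
    }
    return table;
}

/* Hypothetical helper: host-side stand-in for Kerneltest. It writes the same
   value, row*cols + col, through the row-pointer table. */
void fill_rows(int **table, unsigned int rows, unsigned int cols)
{
    for (unsigned int row = 0; row < rows; row++) {
        for (unsigned int col = 0; col < cols; col++) {
            table[row][col] = (int)(row * cols + col);
        }
    }
}
```

After `fill_rows`, element `flat[r*cols + c]` holds `r*cols + c`, which is the same condition `func` checks on `hc`. In the CUDA version the table itself also has to live in device memory, which is why `ha` is copied into `da` with `cudaMemcpy` before the kernel launch.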

File 2: test2.c

#include <stdio.h>  
  
int func(); // Note the declaration: plain C, so no extern "C" here
int main()  
{  
    func();  
    return 0;  
}  

File 3: test3.cpp

#include <iostream>  
using namespace std;  
  
extern "C" int func(); // Note the declaration: extern "C" matches the definition in test1.cu
int main()  
{  
    func();  
    return 0;  
}  
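
test2.c and test3.cpp each declare `func` by hand, one without and one with `extern "C"`. A common alternative is a single header that both can include, guarded by `__cplusplus`. The sketch below assumes a hypothetical header name `test1.h` (the original article has no header) and uses a stub body in place of the CUDA implementation so that it compiles on its own:

```c
/* ---- contents of a hypothetical test1.h ---- */
#ifndef TEST1_H
#define TEST1_H

#ifdef __cplusplus
extern "C" {            /* C++ callers: use C linkage, no name mangling */
#endif

int func(void);         /* implemented in test1.cu in the article */

#ifdef __cplusplus
}
#endif

#endif /* TEST1_H */

/* ---- stub standing in for test1.cu (assumption, for illustration only) ---- */
#include <stdio.h>

int func(void)
{
    printf("func called\n");
    return 0;
}
```

With such a header, test2.c and test3.cpp would each replace their hand-written declaration with `#include "test1.h"`, and test1.cu could include it too so that the compiler checks the definition against the declaration.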

There are several ways to build this:

Option 1:

Compile each file separately, then link everything together at the end.

For the C program:

[]$nvcc -c test1.cu  
[]$gcc  -c test2.c  
[]$gcc  -o testc test1.o test2.o -lcudart -L/usr/local/cuda/lib64  

For the C++ program:

[]$nvcc -c test1.cu  
[]$g++  -c test3.cpp  
[]$g++  -o testcpp test1.o test3.o -lcudart -L/usr/local/cuda/lib64  

Option 2:

Build the CUDA code into a static library.

For the C program:

[]$nvcc -lib test1.cu -o libtestcu.a  
[]$gcc       test2.c -ltestcu -L. -lcudart -L/usr/local/cuda/lib64 -o testc  

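Important: test2.c must come before the libraries on the command line, because the linker only pulls from a static library the symbols that are still undefined when it reaches that library.

For C++:

Exactly the same as for C: replace gcc with g++ and test2.c with test3.cpp.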

Option 3:

Build the CUDA code into a shared library.

makefile

.PHONY : all c cpp clean

all : c cpp

c : libtestcu.so
	gcc test2.c   -ltestcu -L. -lcudart -L/usr/local/cuda/lib64 -o testc

cpp : libtestcu.so
	g++ test3.cpp -ltestcu -L. -lcudart -L/usr/local/cuda/lib64 -o testcpp

# The prerequisite is test1.cu, the file that the recipe actually compiles
libtestcu.so : test1.cu
	nvcc -o libtestcu.so -shared -Xcompiler -fPIC test1.cu

clean :
	rm -f *.so testc testcpp

This should be self-explanatory.

Source: blog.csdn.net, author: 网奇. Copyright belongs to the original author; please contact the author before reposting.

Original link: blog.csdn.net/jacke121/article/details/80402205
