Commit | Line | Data |
---|---|---|
1 | .. | |
2 | Copyright 1988-2022 Free Software Foundation, Inc. | |
3 | This is part of the GCC manual. | |
4 | For copying conditions, see the copyright.rst file. | |
5 | ||
6 | First invocation: OpenACC library API | |
7 | ************************************* | |
8 | ||
9 | In this second use case (see below), a function in the OpenACC library, | |
10 | namely ``acc_set_device_num()``, is called prior to any of the functions | |
11 | in the CUBLAS library. | |
12 | ||
13 | In the use case presented here, the function ``acc_set_device_num()`` | |
14 | is used to both initialize the OpenACC library and allocate the hardware | |
15 | resources on the host and the device. The arguments to the call specify | |
16 | which device to use and which device type to use, here | |
17 | ``acc_device_nvidia``. This is only one way to initialize the OpenACC | |
18 | library and allocate the appropriate hardware resources. Other methods are | |
19 | available through environment variables; these are discussed in the | |
20 | next section. | |
21 | ||
22 | Once the call to ``acc_set_device_num()`` has completed, other OpenACC | |
23 | functions can be called, as seen with the multiple calls to | |
24 | ``acc_copyin()``. In addition, calls can be made to functions in the | |
25 | CUBLAS library. In this use case, a call to ``cublasCreate()`` is made | |
26 | after the calls to ``acc_copyin()``. | |
27 | In the previous use case, the call to ``cublasCreate()`` | |
28 | initialized the CUBLAS library and allocated the hardware resources on the | |
29 | host and the device. Here, since the device has already been allocated, | |
30 | ``cublasCreate()`` only initializes the CUBLAS library and allocates | |
31 | the appropriate hardware resources on the host. The context that was created | |
32 | as part of the OpenACC initialization is shared with the CUBLAS library, | |
33 | similarly to the first use case. | |
34 | ||
35 | .. code-block:: c++ | |
36 | ||
37 | dev = 0; | |
38 | ||
39 | acc_set_device_num(dev, acc_device_nvidia); | |
40 | ||
41 | /* Copy the first set to the device */ | |
42 | d_X = acc_copyin(&h_X[0], N * sizeof (float)); | |
43 | if (d_X == NULL) | |
44 | { | |
45 | fprintf(stderr, "copyin error h_X\n"); | |
46 | exit(EXIT_FAILURE); | |
47 | } | |
48 | ||
49 | /* Copy the second set to the device */ | |
50 | d_Y = acc_copyin(&h_Y1[0], N * sizeof (float)); | |
51 | if (d_Y == NULL) | |
52 | { | |
53 | fprintf(stderr, "copyin error h_Y1\n"); | |
54 | exit(EXIT_FAILURE); | |
55 | } | |
56 | ||
57 | /* Create the handle */ | |
58 | s = cublasCreate(&h); | |
59 | if (s != CUBLAS_STATUS_SUCCESS) | |
60 | { | |
61 | fprintf(stderr, "cublasCreate failed %d\n", s); | |
62 | exit(EXIT_FAILURE); | |
63 | } | |
64 | ||
65 | /* Perform saxpy using CUBLAS library function */ | |
66 | s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1); | |
67 | if (s != CUBLAS_STATUS_SUCCESS) | |
68 | { | |
69 | fprintf(stderr, "cublasSaxpy failed %d\n", s); | |
70 | exit(EXIT_FAILURE); | |
71 | } | |
72 | ||
73 | /* Copy the results from the device */ | |
74 | acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float)); |
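
One way to sanity-check the values copied back into ``h_Y1`` is to compute the same saxpy on the host and compare. The sketch below is not part of the original example: the helper names ``saxpy_host`` and ``count_mismatches`` and the tolerance parameter ``tol`` are hypothetical. It assumes unit strides, matching the ``1, 1`` increments passed to ``cublasSaxpy()`` above.

```c
#include <stddef.h>

/* Hypothetical host-side reference: y[i] = alpha * x[i] + y[i].
   Unit stride only, mirroring incx = 1 and incy = 1 in the call above.  */
static void
saxpy_host (size_t n, float alpha, const float *x, float *y)
{
  for (size_t i = 0; i < n; i++)
    y[i] = alpha * x[i] + y[i];
}

/* Hypothetical checker: return how many elements of GOT differ from
   WANT by more than TOL in absolute value.  */
static size_t
count_mismatches (size_t n, const float *got, const float *want, float tol)
{
  size_t bad = 0;

  for (size_t i = 0; i < n; i++)
    {
      float d = got[i] - want[i];
      if (d < 0.0f)
        d = -d;
      if (d > tol)
        bad++;
    }

  return bad;
}
```

To use it, keep a host copy of the original ``h_Y1`` contents before the ``acc_copyin()`` call and run ``saxpy_host()`` on that copy. After ``acc_memcpy_from_device()``, pass both arrays to ``count_mismatches()``. A nonzero return suggests the device result and the host reference disagree beyond the chosen tolerance.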