gdata.io.handleScriptLoaded({"version":"1.0","encoding":"UTF-8","feed":{"xmlns":"http://www.w3.org/2005/Atom","xmlns$openSearch":"http://a9.com/-/spec/opensearchrss/1.0/","xmlns$gd":"http://schemas.google.com/g/2005","xmlns$georss":"http://www.georss.org/georss","xmlns$thr":"http://purl.org/syndication/thread/1.0","xmlns$blogger":"http://schemas.google.com/blogger/2008","id":{"$t":"tag:blogger.com,1999:blog-913600556879440043"},"updated":{"$t":"2024-01-01T15:38:08.302+05:30"},"category":[{"term":"Data Structures"},{"term":"Vedic Mathematics"},{"term":"My Vlogs"},{"term":"Website Designing"},{"term":"Guest Blogging"},{"term":"Youtube"},{"term":"PPL"},{"term":"Android"},{"term":"Android App Development"},{"term":"High Performance Computing"},{"term":"Socket Programming"},{"term":"Java"},{"term":"Cloud Computing"},{"term":"Unboxing \u0026 Review"},{"term":"Database"},{"term":"OpenMPI"},{"term":"OPENCL"},{"term":"CUDA"},{"term":"LEX \u0026 YACC"},{"term":"Vocabulary"},{"term":"Compiler"},{"term":"Blogging Tips"},{"term":"Networking"},{"term":"Linux"},{"term":"Nanded City Pune"},{"term":"Parallel Computing"},{"term":"SDL"},{"term":"Fedora"},{"term":"Udemy Courses"},{"term":"Dia Software"},{"term":"MPI"},{"term":"Multithreading"},{"term":"Computer Networks"},{"term":"Abbreviations in Computer Science"},{"term":"Salesforce"},{"term":"Lisp"},{"term":"YouTube Tips"},{"term":"MS Excel Formulas \u0026 Functions"},{"term":"C Plus Plus Programming"},{"term":"GATE"},{"term":"Mysql"},{"term":"Google Forms"},{"term":"Wine"},{"term":"Swing"},{"term":"Mathematics"},{"term":"SQL"},{"term":"Amazon Links Summary"},{"term":"Thread Pool"},{"term":"General"},{"term":"Amazon Affiliate Program"},{"term":"How To Write Blog"},{"term":"C Programming"},{"term":"Applet"},{"term":"Selenium Automation Testing"},{"term":"Skill Development Lab"},{"term":"OPENMP"},{"term":"Python"},{"term":"Ubuntu"}],"title":{"type":"text","$t":"Computer Revolution 
(www.comrevo.com)"},"subtitle":{"type":"html","$t":""},"link":[{"rel":"http://schemas.google.com/g/2005#feed","type":"application/atom+xml","href":"https://www.blogger.com/feeds/913600556879440043/posts/default/-/OPENCL?alt\u003djson-in-script\u0026max-results\u003d6"},{"rel":"self","type":"application/atom+xml","href":"https://www.blogger.com/feeds/913600556879440043/posts/default/-/OPENCL?alt\u003djson-in-script\u0026max-results\u003d6"},{"rel":"alternate","type":"text/html","href":"http://www.comrevo.com/search/label/OPENCL"},{"rel":"hub","href":"http://pubsubhubbub.appspot.com/"}],"author":[{"name":{"$t":"Parag Jambhulkar"},"uri":{"$t":"https://www.blogger.com/profile/13991750622483538113"},"email":{"$t":"noreply@blogger.com"},"gd$image":{"rel":"http://schemas.google.com/g/2005#thumbnail","width":"35","height":"35","src":"//www.blogger.com/img/blogger_logo_round_35.png"}}],"generator":{"version":"7.00","uri":"https://www.blogger.com","$t":"Blogger"},"openSearch$totalResults":{"$t":"1"},"openSearch$startIndex":{"$t":"1"},"openSearch$itemsPerPage":{"$t":"6"},"entry":[{"id":{"$t":"tag:blogger.com,1999:blog-913600556879440043.post-455554984900573235"},"published":{"$t":"2017-03-09T12:01:00.000+05:30"},"updated":{"$t":"2017-08-31T09:09:50.153+05:30"},"category":[{"scheme":"http://www.blogger.com/atom/ns#","term":"CUDA"},{"scheme":"http://www.blogger.com/atom/ns#","term":"OPENCL"}],"title":{"type":"text","$t":"OpenCL Program for Vector / Array Addition"},"content":{"type":"html","$t":"\u003cdiv dir\u003d\"ltr\" style\u003d\"text-align: left;\" trbidi\u003d\"on\"\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; To learn parallel computing with OpenCL, you should start with the example of array addition, as it illustrates the proper use of the multi-threading paradigm.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; 
\u0026nbsp;In this post, we will look at an OpenCL program for array / vector addition.\u003c/span\u003e\u003cbr /\u003e\n\u003cbr /\u003e\n\u003ca name\u003d'more'\u003e\u003c/a\u003e\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;An OpenCL host program follows these steps:\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u003cbr /\u003e\u003c/span\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e1. Get a list of the platforms (i.e. OpenCL installations) available on your system.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e2. Get a list of the devices (CPU or GPU) supported by the available OpenCL platforms.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e3. Set up the context properties list.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e4. Create a context (a collection of devices, such as GPUs and CPUs).\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e5. Create a command queue for the context and device.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e6. Create a program object for all kernels.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e7. Build/compile the program.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e8. Create a kernel object for the specific kernel you want to execute.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e9. Load data into the input buffers.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e10. 
Set the arguments for the kernel.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e11. Enqueue the kernel for execution.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e12. Copy the results from the output buffer to a host (CPU) variable.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e13. Print the results.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e14. Free the OpenCL resources.\u003c/span\u003e\u003cbr /\u003e\n\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; The following program was run on a system with an NVIDIA GPU and CUDA installed.\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; The program code and its output are as follows:\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026nbsp;\u003cb\u003eProgram: (openclprogram.c)\u003c/b\u003e\u003c/span\u003e\u003cbr /\u003e\n\u003cdiv style\u003d\"background-color: lightgreen;\"\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e#include \u0026lt;stdio.h\u0026gt;\u003cbr /\u003e#include \u0026lt;stdlib.h\u0026gt;\u003cbr /\u003e#include \u0026lt;CL/cl.h\u0026gt;\u003cbr /\u003e\u0026nbsp;\u0026nbsp; \u003cbr /\u003e/* Kernel (function to be run on device) */\u0026nbsp;\u0026nbsp;\u0026nbsp; \u003cbr /\u003econst char *code \u003d\u003cbr /\u003e\u0026nbsp;\"__kernel void arrayadd(__global int *x, __global int *y, __global int *z)\\n\"\u003cbr 
/\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; \"{\\n\"\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; \"\u0026nbsp; size_t id \u003d get_global_id(0);\\n\"\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; \"\u0026nbsp; z[id] \u003d x[id] + y[id];\\n\"\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; \"}\\n\";\u003cbr /\u003e\u003cbr /\u003eint main()\u003cbr /\u003e{\u003cbr /\u003e\u0026nbsp; cl_context context;\u003cbr /\u003e\u0026nbsp; cl_context_properties properties[3];\u003cbr /\u003e\u0026nbsp; cl_kernel kernel;\u003cbr /\u003e\u0026nbsp; cl_command_queue commandqueue;\u003cbr /\u003e\u0026nbsp; cl_program program;\u003cbr /\u003e\u0026nbsp; cl_int err;\u003cbr /\u003e\u0026nbsp; cl_uint num_of_platforms;\u003cbr /\u003e\u0026nbsp; cl_platform_id platform_id;\u003cbr /\u003e\u0026nbsp; cl_device_id device_id;\u003cbr /\u003e\u0026nbsp; cl_uint num_of_devices;\u003cbr /\u003e\u0026nbsp; cl_mem buffer1, buffer2, outputbuffer;\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; size_t global;\u003cbr /\u003e\u0026nbsp; int arraysize;\u003cbr /\u003e\u0026nbsp; int a[20];\u003cbr /\u003e\u0026nbsp; int b[20];\u003cbr /\u003e\u0026nbsp; int results[20];\u003cbr /\u003e\u0026nbsp; int i;\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; printf(\"Enter the size of arrays\\n\");\u003cbr /\u003e\u0026nbsp; /* a, b and results hold at most 20 elements, so reject any other size. */\u003cbr /\u003e\u0026nbsp; if (scanf(\"%d\",\u0026amp;arraysize) !\u003d 1 || arraysize \u0026lt; 1 || arraysize \u0026gt; 20)\u003cbr /\u003e\u0026nbsp;\u0026nbsp; {\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; printf(\"Array size must be between 1 and 20\\n\");\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; return 1;\u003cbr /\u003e\u0026nbsp;\u0026nbsp; }\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; printf(\"Enter %d elements of First Array\\n\",arraysize);\u003cbr /\u003e\u0026nbsp; for(i\u003d0;i\u0026lt;arraysize;i++)\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp; scanf(\"%d\",\u0026amp;a[i]);\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; printf(\"Enter %d elements of Second Array\\n\",arraysize);\u003cbr /\u003e\u0026nbsp; for(i\u003d0;i\u0026lt;arraysize;i++)\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp; scanf(\"%d\",\u0026amp;b[i]);\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; /* Get a list of platforms 
available on your system. (i.e. OpenCL installations on your system).*/\u003cbr /\u003e\u0026nbsp; if (clGetPlatformIDs(1, \u0026amp;platform_id, \u0026amp;num_of_platforms)!\u003d CL_SUCCESS)\u003cbr /\u003e\u0026nbsp;\u0026nbsp; {\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; printf(\"Not getting Platform id\\n\");\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; return 1;\u003cbr /\u003e\u0026nbsp;\u0026nbsp; }\u003cbr /\u003e\u0026nbsp; \u003cbr /\u003e\u0026nbsp; /* Get a list of devices(cpu or gpu) supported for available OpenCL platforms.*/\u003cbr /\u003e\u0026nbsp; if (clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_ALL, 1, \u0026amp;device_id, \u0026amp;num_of_devices) !\u003d CL_SUCCESS)\u003cbr /\u003e\u0026nbsp;\u0026nbsp; {\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; printf(\"Not getting Device id\\n\");\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; return 1;\u003cbr /\u003e\u0026nbsp;\u0026nbsp; }\u003cbr /\u003e\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp;\u0026nbsp; /* Get context properties list.*/\u003cbr /\u003e\u0026nbsp; properties[0]\u003d CL_CONTEXT_PLATFORM;\u003cbr /\u003e\u0026nbsp; properties[1]\u003d (cl_context_properties) platform_id;\u003cbr /\u003e\u0026nbsp; properties[2]\u003d 0;\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; /* Create a context (i.e. 
collection of GPU and CPU).*/\u003cbr /\u003e\u0026nbsp; context \u003d clCreateContext(properties,1,\u0026amp;device_id,NULL,NULL,\u0026amp;err);\u003cbr /\u003e\u0026nbsp; \u003cbr /\u003e\u0026nbsp; /* Create command queue for the context and device.*/\u003cbr /\u003e\u0026nbsp; commandqueue \u003d clCreateCommandQueue(context, device_id, 0, \u0026amp;err);\u003cbr /\u003e\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp;\u0026nbsp; /* Create a program object for all kernels.*/\u003cbr /\u003e\u0026nbsp;\u0026nbsp; program \u003d clCreateProgramWithSource(context,1,(const char **) \u0026amp;code, NULL, \u0026amp;err);\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp; /* Build/Compile the program.*/\u003cbr /\u003e\u0026nbsp; if (clBuildProgram(program, 0, NULL, NULL, NULL, NULL) !\u003d CL_SUCCESS)\u003cbr /\u003e\u0026nbsp;\u0026nbsp; {\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; printf(\"Error during building program\\n\");\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; return 1;\u003cbr /\u003e\u0026nbsp;\u0026nbsp; }\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp; /* Create kernel object for the specific kernel which you want to execute.*/\u003cbr /\u003e\u0026nbsp; kernel \u003d clCreateKernel(program, \"arrayadd\", \u0026amp;err);\u003cbr /\u003e\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; buffer1 \u003d clCreateBuffer(context, CL_MEM_READ_ONLY, sizeof(int) * arraysize, NULL, NULL);\u003cbr /\u003e\u0026nbsp; buffer2 \u003d clCreateBuffer(context, CL_MEM_READ_ONLY, sizeof(int) * arraysize, NULL, NULL);\u003cbr /\u003e\u0026nbsp; outputbuffer \u003d clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(int) * arraysize, NULL, NULL);\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; /* Load data into the input buffer.*/\u003cbr /\u003e\u0026nbsp; clEnqueueWriteBuffer(commandqueue, buffer1, CL_TRUE, 0, sizeof(int) * arraysize, a, 0, NULL, NULL);\u003cbr /\u003e\u0026nbsp; 
clEnqueueWriteBuffer(commandqueue, buffer2, CL_TRUE, 0, sizeof(int) * arraysize, b, 0, NULL, NULL);\u003cbr /\u003e\u0026nbsp; \u003cbr /\u003e\u0026nbsp; /* Set the arguments for the kernel.*/\u003cbr /\u003e\u0026nbsp; clSetKernelArg(kernel, 0, sizeof(cl_mem), \u0026amp;buffer1);\u003cbr /\u003e\u0026nbsp; clSetKernelArg(kernel, 1, sizeof(cl_mem), \u0026amp;buffer2);\u003cbr /\u003e\u0026nbsp; clSetKernelArg(kernel, 2, sizeof(cl_mem), \u0026amp;outputbuffer);\u003cbr /\u003e\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp; global\u003darraysize;\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp; /* Enqueue the kernel for execution. */\u003cbr /\u003e\u0026nbsp; clEnqueueNDRangeKernel(commandqueue, kernel, 1, NULL, \u0026amp;global, NULL, 0, NULL, NULL);\u003cbr /\u003e\u0026nbsp; clFinish(commandqueue);\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; /* Copy the results from Output buffer to Host(CPU) variable. */\u003cbr /\u003e\u0026nbsp; clEnqueueReadBuffer(commandqueue, outputbuffer, CL_TRUE, 0, sizeof(int) *arraysize, results, 0, NULL, NULL);\u003cbr /\u003e\u003cbr /\u003e\u0026nbsp; /* Print the results. */\u003cbr /\u003e\u0026nbsp; printf(\"Addition of Two Arrays is as follows: \\n\");\u003cbr /\u003e\u0026nbsp; \u003cbr /\u003e\u0026nbsp; for(i\u003d0;i\u0026lt;arraysize; i++)\u003cbr /\u003e\u0026nbsp;\u0026nbsp; {\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; printf(\"%d\\n\",results[i]);\u003cbr /\u003e\u0026nbsp;\u0026nbsp; }\u003cbr /\u003e\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp;/* Free OpenCL resources. 
*/\u003cbr /\u003e\u0026nbsp; clReleaseMemObject(buffer1);\u003cbr /\u003e\u0026nbsp; clReleaseMemObject(buffer2);\u003cbr /\u003e\u0026nbsp; clReleaseMemObject(outputbuffer);\u003cbr /\u003e\u0026nbsp; clReleaseProgram(program);\u003cbr /\u003e\u0026nbsp; clReleaseKernel(kernel);\u003cbr /\u003e\u0026nbsp; clReleaseCommandQueue(commandqueue);\u003cbr /\u003e\u0026nbsp; clReleaseContext(context);\u003cbr /\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp; \u003cbr /\u003e\u0026nbsp; return 0;\u003cbr /\u003e}\u003c/span\u003e\u003c/div\u003e\n\u003cdiv\u003e\n\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u003cb\u003eOutput:\u003c/b\u003e\u003c/span\u003e\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\n\n\u003c/span\u003e\u003c/div\u003e\n\u003cdiv style\u003d\"background-color: lightgreen;\"\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u0026gt;\u0026gt;\u0026gt; gcc openclprogram.c -I /usr/local/cuda/include/ -L /usr/local/cuda/lib64/ -lOpenCL\u003cbr /\u003e\u0026gt;\u0026gt;\u0026gt; ./a.out\u003cbr /\u003eEnter the size of arrays\u003cbr /\u003e5\u003cbr /\u003eEnter 5 elements of First Array\u003cbr /\u003e4 2 7 3 8\u003cbr /\u003eEnter 5 elements of Second Array\u003cbr /\u003e9 10 22 3 6\u003cbr /\u003eAddition of Two Arrays is as follows: \u003cbr /\u003e13\u003cbr /\u003e12\u003cbr /\u003e29\u003cbr /\u003e6\u003cbr /\u003e14\u003c/span\u003e\u003cbr /\u003e\n\u003cbr /\u003e\u003c/div\u003e\n\u003cbr /\u003e\n\u003cbr /\u003e\n\u003cbr /\u003e\n\u003cbr /\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u003cbr /\u003e\u003c/span\u003e\n\u003cspan style\u003d\"font-size: large;\"\u003e\u003cbr /\u003e\u003c/span\u003e\u003c/div\u003e\n"},"link":[{"rel":"replies","type":"application/atom+xml","href":"https://www.comrevo.com/feeds/455554984900573235/comments/default","title":"Post 
Comments"},{"rel":"replies","type":"text/html","href":"https://www.comrevo.com/2017/03/opencl-program-for-array-addition.html#comment-form","title":"0 Comments"},{"rel":"edit","type":"application/atom+xml","href":"https://www.blogger.com/feeds/913600556879440043/posts/default/455554984900573235"},{"rel":"self","type":"application/atom+xml","href":"https://www.blogger.com/feeds/913600556879440043/posts/default/455554984900573235"},{"rel":"alternate","type":"text/html","href":"https://www.comrevo.com/2017/03/opencl-program-for-array-addition.html","title":"OpenCL Program for Vector / Array Addition"}],"author":[{"name":{"$t":"Parag Jambhulkar"},"uri":{"$t":"https://www.blogger.com/profile/13991750622483538113"},"email":{"$t":"noreply@blogger.com"},"gd$image":{"rel":"http://schemas.google.com/g/2005#thumbnail","width":"35","height":"35","src":"//www.blogger.com/img/blogger_logo_round_35.png"}}],"thr$total":{"$t":"0"}}]}});