In this week's lab, we saw that the benefit of using CUDA depends on the amount of computation performed on the GPU compared to the time spent transferring the data. We varied the amount of data transferred and the expense of the operations being performed by the GPU. To recap:

The final part of this week's exercise is to transfer three vectors (as in Parts I and II) but perform a more expensive computation on them (the hypotenuse calculation).

Part V: Homework

Your task is to write a program, vectorHypot.cu, that computes C[i] = sqrt( A[i]*A[i] + B[i]*B[i] ) for every index i of the input arrays A and B, storing the results in array C. Your program should perform this computation both sequentially and using CUDA, time both versions, and verify the correctness of the results, as we did in this week's lab.
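To make the shape of such a program concrete, here is one possible sketch. It is not the lab's reference code, and several details are assumptions: the element type (float), the helper names (hypotKernel, hypotSequential, hypotCuda, countMismatches, check), the block size of 256 threads, the test data, and the relative tolerance used for verification. Treat it as a starting point to compare against your own design.

```cuda
// vectorHypot.cu (sketch). Assumed build line: nvcc vectorHypot.cu -o vectorHypot
#include <chrono>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Each thread computes one element of C; surplus threads do nothing.
__global__ void hypotKernel(const float* A, const float* B, float* C, size_t n) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) C[i] = sqrtf(A[i] * A[i] + B[i] * B[i]);
}

// Sequential reference version, run on the CPU.
void hypotSequential(const float* A, const float* B, float* C, size_t n) {
    for (size_t i = 0; i < n; ++i) C[i] = sqrtf(A[i] * A[i] + B[i] * B[i]);
}

// Hypothetical helper: stop with a message if a CUDA call failed.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// CUDA version: allocate, copy A and B to the device, launch, copy C back.
void hypotCuda(const float* A, const float* B, float* C, size_t n) {
    size_t bytes = n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    check(cudaMalloc((void**)&dA, bytes), "cudaMalloc A");
    check(cudaMalloc((void**)&dB, bytes), "cudaMalloc B");
    check(cudaMalloc((void**)&dC, bytes), "cudaMalloc C");
    check(cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice), "copy A");
    check(cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice), "copy B");

    const unsigned threads = 256;  // assumed block size
    const unsigned blocks = (unsigned)((n + threads - 1) / threads);
    hypotKernel<<<blocks, threads>>>(dA, dB, dC, n);
    check(cudaGetLastError(), "kernel launch");

    // This copy waits for the kernel to finish before transferring C.
    check(cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost), "copy C");
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
}

// Count elements whose relative difference exceeds tol. A tolerance is used
// because CPU and GPU floating-point results may differ in the last bits.
size_t countMismatches(const float* x, const float* y, size_t n, float tol) {
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i)
        if (fabsf(x[i] - y[i]) > tol * fmaxf(1.0f, fabsf(x[i]))) ++bad;
    return bad;
}

int main(int argc, char** argv) {
    size_t n = (argc > 1) ? strtoull(argv[1], nullptr, 10) : 50000;
    std::vector<float> A(n), B(n), seq(n), gpu(n);
    for (size_t i = 0; i < n; ++i) {  // arbitrary test data
        A[i] = (float)(i % 1000);
        B[i] = (float)((i * 7) % 1000);
    }

    using clk = std::chrono::steady_clock;
    auto t0 = clk::now();
    hypotSequential(A.data(), B.data(), seq.data(), n);
    auto t1 = clk::now();
    hypotCuda(A.data(), B.data(), gpu.data(), n);  // includes allocation and transfers
    auto t2 = clk::now();

    double seqMs = std::chrono::duration<double, std::milli>(t1 - t0).count();
    double gpuMs = std::chrono::duration<double, std::milli>(t2 - t1).count();
    size_t bad = countMismatches(seq.data(), gpu.data(), n, 1e-5f);
    printf("n=%zu sequential=%.3f ms cuda=%.3f ms mismatches=%zu\n",
           n, seqMs, gpuMs, bad);
    return bad == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Under these assumptions, running ./vectorHypot 500000 would time one array size. Note that the first CUDA call in a process also pays a one-time start-up cost, so the CUDA time measured this way includes more than the transfers and the kernel.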

As before, create a line chart that compares the run times of your sequential and CUDA computations for arrays of size 50,000; 500,000; 5,000,000; and 50,000,000; and a stacked bar chart showing the time spent in each portion of the CUDA computation.
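For the stacked bar chart, one way to separate the portions of the CUDA computation is to record CUDA events between them. The sketch below assumes the portions of interest are the copy to the device, the kernel, and the copy back to the host. If the lab also timed other portions, such as allocation, add events for those as well. The names PhaseTimes and timePhases are hypothetical, float elements are assumed, and error checking is left out to keep the sketch short.

```cuda
// phaseTimes.cu (sketch). Prints one CSV row per array size.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void hypotKernel(const float* A, const float* B, float* C, size_t n) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) C[i] = sqrtf(A[i] * A[i] + B[i] * B[i]);
}

// Elapsed time of each portion, in milliseconds.
struct PhaseTimes { float toDevice, kernel, toHost; };

// NOTE: return codes are ignored here for brevity; real code should check them.
PhaseTimes timePhases(size_t n) {
    std::vector<float> A(n, 3.0f), B(n, 4.0f), C(n);  // arbitrary test data
    size_t bytes = n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc((void**)&dA, bytes);
    cudaMalloc((void**)&dB, bytes);
    cudaMalloc((void**)&dC, bytes);

    cudaEvent_t e0, e1, e2, e3;
    cudaEventCreate(&e0); cudaEventCreate(&e1);
    cudaEventCreate(&e2); cudaEventCreate(&e3);

    cudaEventRecord(e0);  // start of host-to-device transfer
    cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(e1);  // transfer done, kernel starts

    const unsigned threads = 256;  // assumed block size
    const unsigned blocks = (unsigned)((n + threads - 1) / threads);
    hypotKernel<<<blocks, threads>>>(dA, dB, dC, n);
    cudaEventRecord(e2);  // kernel done, device-to-host transfer starts

    cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(e3);  // everything done
    cudaEventSynchronize(e3);  // wait until the last event has occurred

    PhaseTimes t;
    cudaEventElapsedTime(&t.toDevice, e0, e1);
    cudaEventElapsedTime(&t.kernel, e1, e2);
    cudaEventElapsedTime(&t.toHost, e2, e3);

    cudaEventDestroy(e0); cudaEventDestroy(e1);
    cudaEventDestroy(e2); cudaEventDestroy(e3);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return t;
}

int main() {
    const size_t sizes[] = {50000, 500000, 5000000, 50000000};
    printf("n,toDevice_ms,kernel_ms,toHost_ms\n");
    for (size_t n : sizes) {
        PhaseTimes t = timePhases(n);
        printf("%zu,%.3f,%.3f,%.3f\n", n, t.toDevice, t.kernel, t.toHost);
    }
    return 0;
}
```

The CSV output can be pasted into a spreadsheet, with each column becoming one segment of a stacked bar. Timings from a single run can be noisy, so you may want to average several runs per size.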

This is a one-week project.

Hand In

Please staple or paperclip your pages together.

