# Produce a single graph comparing the speed of all implementations. Explain what SIMD, OMP, and MPI are. What are the differences between them?

ASSIGNMENT

Goal of this assignment

Distributed parallel computing

SIMD using vectorized data

Symmetric Multiprocessing using OpenMP

Distributed Memory using Message Passing Interface (MPI)

Problem Statement

Given the following matrix multiplication algorithm, implement multiple variations using the types of parallel processing we saw in class: SIMD, OMP, MPI, and OMP+MPI.

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            c[i][j] = 0;
            for (k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
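
The loop above can be wrapped as a compilable C function to serve as the baseline for the later versions. This is a minimal sketch that assumes square matrices of `double` stored row-major in flat arrays; the assignment does not fix the element type or the storage layout, and the name `matmul_naive` is only illustrative.

```c
#include <stddef.h>

/* Baseline: c = a * b for n x n matrices.
 * Assumption: row-major flat arrays, element (i, j) at index i * n + j. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

Accumulating into a local `sum` gives the same result as updating `c[i][j]` in place, but avoids re-reading the output element on every iteration.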

Step 1

Run Matrix Multiplication non-vectorized in C

Create a vectorized SIMD matrix multiplication version in C

Run HelloWorld.c MPI
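
One common way to make the multiplication SIMD-friendly is to interchange the two inner loops (i-k-j order), so the innermost loop reads `b` and writes `c` with unit stride. The sketch below assumes flat row-major `double` arrays and non-overlapping buffers (hence `restrict`); the name `matmul_ikj` is hypothetical, and the course may expect a different rewrite, such as explicit intrinsics.

```c
#include <stddef.h>

/* i-k-j ordering: the innermost loop walks b and c contiguously,
 * which compilers can usually auto-vectorize at -O3.
 * Assumption: a, b and c do not overlap in memory. */
void matmul_ikj(size_t n, const double *restrict a,
                const double *restrict b, double *restrict c)
{
    for (size_t i = 0; i < n; i++) {
        /* Clear row i of the result before accumulating into it. */
        for (size_t j = 0; j < n; j++)
            c[i * n + j] = 0.0;

        for (size_t k = 0; k < n; k++) {
            const double aik = a[i * n + k];
            /* Unit-stride inner loop: c[i][j] += a[i][k] * b[k][j]. */
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

With GCC, compiling once with and once without `-O3` and adding `-fopt-info-vec` should report whether the inner loop was actually vectorized.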

Lab Parallel Computing – On

Read the input matrices from two files as described in "MPI and OpenMP Approaches to consider .docx", section "main program". This will be used for demo and grading.

Matrix multiplication in C on Wolfgand cluster with OpenMP.

Matrix multiplication in C on Wolfgand cluster with MPI (Distributed Memory)

Update the graph to include the SIMD, OpenMP, and MPI versions. (You can remove the unoptimized algorithm, as it is expected to be “off the chart” and would make the chart difficult to read.)

Extra credit: Matrix multiplication in C on Wolfgand cluster with both OpenMP and MPI.

Automate running matrix multiplication on matrices of different sizes and generating data in tabular format for graph production.

Matrix multiplication in C on Wolfgand cluster without SIMD and without parallelization.

Matrix multiplication in C on Wolfgand cluster with the SIMD version compiled non-vectorized (without -O3) and vectorized (with -O3). Rewrite the algorithm accordingly and try it with and without -O3.

Produce a single graph comparing the speed of all implementations
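
The automation and graphing items above could be driven by a small timing harness like the following. It is a sketch under assumptions: matrices are flat row-major `double` arrays filled with arbitrary values, wall-clock time comes from C11 `timespec_get`, and the names `matmul_fn`, `time_matmul`, and `print_timing_table` are hypothetical. The tab-separated output format is a choice made here, not one the assignment prescribes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Any implementation with this signature can be timed. */
typedef void (*matmul_fn)(size_t n, const double *a, const double *b, double *c);

/* Same triple loop as the problem statement, on flat row-major arrays. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

/* Reading the result into a volatile keeps the compiler from discarding the work. */
static volatile double sink;

/* Wall-clock time in seconds (C11 timespec_get). */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Time one n x n multiplication; returns seconds, or -1.0 if allocation fails. */
double time_matmul(matmul_fn fn, size_t n)
{
    double *a = malloc(n * n * sizeof *a);
    double *b = malloc(n * n * sizeof *b);
    double *c = malloc(n * n * sizeof *c);
    double elapsed = -1.0;

    if (a && b && c) {
        for (size_t i = 0; i < n * n; i++) {
            a[i] = (double)(i % 7);   /* arbitrary test data */
            b[i] = (double)(i % 5);
        }
        double start = now_seconds();
        fn(n, a, b, c);
        elapsed = now_seconds() - start;
        if (n > 0)
            sink = c[0];
    }
    free(a);
    free(b);
    free(c);
    return elapsed;
}

/* Print one tab-separated row per size: label, n, seconds. */
void print_timing_table(FILE *out, const char *label, matmul_fn fn,
                        const size_t *sizes, size_t count)
{
    for (size_t i = 0; i < count; i++)
        fprintf(out, "%s\t%zu\t%.6f\n", label, sizes[i],
                time_matmul(fn, sizes[i]));
}
```

Calling `print_timing_table` once per implementation and redirecting the output to a file yields rows that a spreadsheet or plotting tool can turn into the comparison graph.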

Writing

Research Question: What are SIMD, OMP, and MPI? What are the differences between them?

Describe what is shown on the graph you have produced.
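
To make the shared-memory versus distributed-memory distinction concrete, the sketch below contrasts the two styles. With OpenMP, one pragma splits the outer loop across threads that all see the same arrays. With MPI, each process has its own memory, so the program must decide explicitly which rows each rank owns. The MPI calls themselves are left out so the block compiles without an MPI installation: in a real program `rank` and `nprocs` would come from `MPI_Comm_rank` and `MPI_Comm_size`. The names `matmul_omp` and `row_range` are hypothetical, and flat row-major `double` arrays are assumed.

```c
#include <stddef.h>

/* OpenMP (shared memory): threads split the rows of c; a, b and c are
 * visible to every thread. Build with -fopenmp (GCC/Clang); without it
 * the pragma is ignored and the loop runs serially. */
void matmul_omp(int n, const double *a, const double *b, double *c)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[(size_t)i * n + k] * b[(size_t)k * n + j];
            c[(size_t)i * n + j] = sum;
        }
    }
}

/* MPI (distributed memory): rows must be divided explicitly.
 * Computes the half-open row range [*first, *last) owned by `rank`
 * out of `nprocs` processes; leftover rows go to the lowest ranks. */
void row_range(int n, int rank, int nprocs, int *first, int *last)
{
    int base = n / nprocs;    /* rows every rank gets */
    int extra = n % nprocs;   /* ranks 0..extra-1 get one more row */

    *first = rank * base + (rank < extra ? rank : extra);
    *last = *first + base + (rank < extra ? 1 : 0);
}
```

For example, 10 rows over 4 ranks gives the ranges [0, 3), [3, 6), [6, 8), and [8, 10). Each rank would then multiply only its rows of `a` by the full `b`.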
