Tuesday June 3, 2014
This problem challenges you to create a shared-memory parallel version of the precipitating particles application.
It is based on the parallel
algorithm for particle precipitation linked below:
http://hpcuniversity.org/students/weeklyChallenge/75/
Your first task is to translate that algorithm into a shared-memory
parallel implementation of the serial code linked below:
http://hpcuniversity.org/students/weeklyChallenge/74/
You should use the parallel algorithm document you designed last time to
develop this shared memory algorithm.
There are a variety of tools that allow you to write shared-memory
parallel code, from the low-level procedural code of pthreads, all the
way to the declarative code of OpenMP. While OpenMP does not give you
much idea of how the parallelism is being implemented (in fact, you
might even suspect it's implemented in pthreads under the covers), it
does allow you to add parallelism to an existing serial algorithm
implementation without changing the overall structure. For this reason,
we recommend that you approach the problem using OpenMP, although you
can use any method you choose.
Your second task is to demonstrate measurable speedup, or explain why
your implementation did not achieve it. You should place timing
statements in both your serial and parallel implementations and make a
graph of how run time changes as a function of the thread count.
Some things to keep in mind while you're implementing this algorithm:
1. Not all parallelizable regions can actually demonstrate speedup. It
might cost more to fire up a team of threads than it does for a single
thread to do the work. You can use timing statements in your code or a
profiler like gprof (see http://hpcuniversity.org/students/weeklyChallenge/55/)
to make that decision.
2. While you might have an overall region that is parallelizable, there
might be a particular portion of that region that needs to be executed
serially. You should identify these regions based on the algorithm
document you wrote last time, and use some kind of locking to ensure
that only one thread can be in that region at a time. The OpenMP ATOMIC
and CRITICAL directives can help. Note that these directives serialize part
of the work and will slow your code down, so be judicious in their use.
Challenge Resources:
Precipitate Shared Memory Solution zip: solution to the "Precipitate: Shared Memory Implementation" challenge problem