Present your own fully documented and tested programming example illustrating the problem of unbalanced loads. Describe the use of OpenMP's scheduler as a means of mitigating this problem.
The below example shows a number of tasks that all update a global counter. Since threads share the same memory space, they indeed see and update the same memory location. The code returns a false result because updating the variable is much quicker than creating the thread as on a multicore processor the chance of errors will greatly increase. If we artificially increase the time for the update, we will no longer get the right result. All threads read out the value of sum, wait a while (presumably calculating something) and then update.
#include
#include
#include "pthread.h"
int sum=0;
void adder() {
int sum = 0;
int t = sum; sleep(1); sum = t+1;
return;
}
#define NTHREADS 50
int main() {
int i;
pthread_t threads[NTHREADS];
printf("forking\n");
for (i=0; i
if (pthread_create(threads+i,NULL,&adder,NULL)!=0) return i+1;
printf("joining\n");
for (i=0; i
{
if (pthread_join(threads[i],NULL)!=0) return NTHREADS+i+1;
printf("Sum computed: %d\n",sum);
}
return 0;
}
The use of OpenMP is the parallel loop. Here, all iterations can be executed independently and in any order. The pragma CPP directive then conveys this fact to the compiler. A sequential code can be easily parallelized this way.
#include
#include
#include "pthread.h"
int sum=0;
void adder() {
int sum = 0;
int t = sum; sleep(1); sum = t+1;
return;
}
#define NTHREADS 50
int main() {
int i;
pthread_t threads[NTHREADS];
printf("forking\n");
#pragma omp for
for (i=0; i
if (pthread_create(threads+i,NULL,&adder,NULL)!=0) return i+1;
}
printf("joining\n");
for (i=0; i
{
if (pthread_join(threads[i],NULL)!=0) return NTHREADS+i+1;
printf("Sum computed: %d\n",sum);
}
return 0;
}