First, remember that different processes keep their own data in distinct address spaces. Threads, on the other hand, explicitly share their entire address space with one another. Although this can make things a lot faster, it comes with the cost of making programming a lot more complicated.
In Unix/POSIX, the threads API is composed of two main calls:
pthread create(), which starts a separate thread;
pthread join(), which waits for a thread to complete.
Notice that the general syntax for using these is:
pid = pthread_create(&tid, NULL, function_ptr, argument);
pthread_join(tid, &result);
Example:
void *run(void *d)
{
int q = ((int) d);
int v = 0;
for (int i=0; iv = v + some_expensive_function_call();
return (void *) v;
}
int main()
{
pthread_t t1, t2;
int *r1, *r2;
int arg1=100;
int arg2=666;
pthread_create(&t1, NULL, run, &arg1);
pthread_create(&t2, NULL, run, &arg2);
pthread_join(t1, (void **) &r1);
pthread_join(t2, (void **) &r2);
cout << "r1= " << *r1 << ", r2=" << *r2 << endl;
}
Notice that the above threads maintain different stacks and different sets of registers; except for those, however, they share all their address spaces. Also notice that if you were to run this code in a 2 core machine, it would be expected that it ran roughly twice as fast as it would in a single core machine. If you ran it in a 4 core machine, however, it would run as fast as in the 2 core machine, since there would be no suf?cient threads to exploit the available parallelism.