Parallel Programming Using Threads 355
int val = 0;
ThreadData arg;
arg.intptr = &val;
arg.mlock = &mlock;
pthread_t tid[NUMBER_THREAD];
int rtv; // return value of pthread_create
int ind;
for (ind = 0; ind < NUMBER_THREAD; ind++)
{
    rtv = pthread_create(&tid[ind], NULL,
                         threadfunc, (void *) &arg);
    if (rtv != 0)
    {
        printf("pthread_create() fail %d\n", rtv);
        return EXIT_FAILURE;
    }
}
for (ind = 0; ind < NUMBER_THREAD; ind++)
{
    rtv = pthread_join(tid[ind], NULL);
    if (rtv != 0)
    {
        printf("pthread_join() fail %d\n", rtv);
        return EXIT_FAILURE;
    }
}
pthread_mutex_destroy(&mlock);
return EXIT_SUCCESS;
}
The program has a structure ThreadData that includes two pointers: one for the integer’s
address (i.e., the shared memory) and the other for the mutex’s address. For a critical section
to work as intended, all the threads must attempt to lock and unlock the same mutex.
Hence a pointer to the mutex is passed in ThreadData. If each thread had its own mutex,
it would be like the library having a key available for every student even though there is
only one study room: all of the students could enter the room at the same time, which
would create problems.
The critical section includes the code that reads and writes the shared variable. Each
thread obtains the lock by calling pthread_mutex_lock right after entering the while block.
If a thread cannot lock the mutex (because some other thread is in the critical section),
then that thread waits inside pthread_mutex_lock until it can obtain the lock.
The mutex is unlocked by calling pthread_mutex_unlock at the end of the while block.
The main function creates a single ThreadData object shared by all threads. Before call-
ing pthread_mutex_lock or pthread_mutex_unlock, the lock must be initialized by calling
pthread_mutex_init. This is done in the main function. What is the output of this pro-
gram? Nothing. The if condition in threadfunc is never true and nothing is printed. This
means that all threads keep running indefinitely.
There is much more to say about critical sections of code, and thread synchronization.
This chapter is only an introduction, and covers the most important concepts. Writing
correct multi-threaded programs can be challenging, and developing better tools and pro-
gramming languages for this purpose is an ongoing topic of research. When writing multi-
threaded programs, it is important to identify critical sections and make them atomic.
Failure to do so will result in unpredictable and difficult-to-reproduce bugs. It is usually
difficult to test and detect synchronization problems. You must think very carefully before
writing the programs.
21.7 Amdahl’s Law
A typical multi-threaded program starts with one thread (the main thread). This thread
may do some work (such as reading data from a file) before creating more threads. Then,
multiple threads are created for computing the results. The results from multiple threads
are later combined into the final result. If a program is structured this way, then there is an
obvious problem: The initialization and the finalization steps are sequential and can become
the bottleneck.
Consider the following scenario. A sequential (single-threaded) program takes 100 sec-
onds. The program runs on a computer with an infinite number of cores. The initialization
and the finalization steps account for 1% of the execution time. The remaining 99% of the
execution time is divided between two threads. What is the performance improvement of
the two-thread solution? The new execution time is 1 + 99/2 = 50.5 seconds. Now suppose
that the remaining 99% of the execution time is divided among three threads. The execution
time becomes 1 + 99/3 = 34 seconds.
What is the execution time when 99 threads are used? 1 + 99/99 = 2 seconds. How about
990 threads? 1 + 99/990 = 1.1 seconds. The reduction in execution time is less than 50%
(from 2 seconds to 1.1 seconds) after increasing the number of threads by an order of
magnitude. In fact, even if the program uses an infinite number of threads, the execution
time is still 1 second.
What does this mean? The initialization and the finalization steps seem like a small por-
tion of the total execution time for the single-threaded program, but they come to dominate
the total execution time as more and more threads are created: the sequential steps become
the bottleneck. The mathematical model for this is called
Amdahl’s Law. The model says that adding more threads has diminishing returns. In this
model, the total execution time becomes shorter as more threads are used. In reality, dou-
bling the number of threads rarely shortens the total execution time by half. Consider the
subset sum program again. We ran this program on a 32-core server using different num-
bers of threads. As the number of threads is increased from one to 32, the program’s total
execution time is reduced by 72%. Even though this is a noticeable improvement, it is far
from the ideal improvement of a 1 - 1/32 ≈ 97% reduction.
This chapter provides an introduction to the subject of parallel programming using
threads. You can find many books and courses about this important topic.