Two important measures of the quality of parallel algorithms are speedup and efficiency :
If Ts is the time taken to run the fastest serial algorithm on one processor and if Tp is the time taken by a parallel algorithm on N processors then
NOTE![]()
If the best known serial algorithm takes 8 seconds i.e. Ts = 8, while a parallel algorithm takes 2 seconds using 5 processors, theni.e. the parallel algorithm exhibits a speedup of 4 with 5 processors giving an 80% efficiency.
- SN = Ts / Tp = 8 / 2 = 4 and
- EN = SN / N = 4 / 5 = 0.8 = 80%
Care should be taken as to exactly what is meant by Ts i.e. the time taken to run the fastest serial algorithm on one processor - which processor ?
A slightly different definition of speedup also exists.
The time taken by the parallel algorithm on one processor divided by the time taken by the parallel algorithm on N processors.
However this is misleading since many parallel algorithms contain extra operations to accomodate the parallelism (e.g the communication)
so the result is Ts is increased thus exaggerating the speedup.
Which ever definition is used the ideal is to produce linear speedup i.e. produce a speedup of N using N processors and an efficiency of 1 ( 100% ).
However in practice the speedup is reduced from its ideal value of N (the efficiency is bounded from above by 1 ).
Because of this, a goal of the parallel algorithm designer should be make the grain size ( relative amount of work done between synchronizations - communications) as large as possible, while keeping all the processors busy.
The effect of communication on speedup is reduced, in relative terms, as the grain size increases.
Another important parameter when considering communication is the machine dependant ratio
4. Amdahls Law
This states that the speedup of a parallel algorithm is effectively limited by the number of operations which must be performed sequentially, i.e its Serial Fraction
Let S be the amount of time spent (by one processor) on serial parts of the program and P be the amount of time spent ( by one processor ) on parts of the program that could be done in parallel.
i.e. Tseq = S + P and Tpar = S + P/N ............... ( 0 )
where N is the number of processors.
Therefore Speedup = Tseq / Tpar
SPEEDUP = S + P / (S + P/N ) ( 1 )
![]()
Say we have a program containing 100 operations each of which take 1 time unit.
If 80 operations can be done in parallel i.e. P = 80
and 20 operations must be done sequentially i.e. S = 20
then using 80 processors
Speedup = 100 / (20 + 80/80) = 100 / 21 < 5
i.e. a speedup of only 5 is possible no matter how many processors are available.
If we define the serial fraction F to be.
SPEEDUP = 1 / (F + (1 - f)/N ) ( 2 )
If 1% of a parallel program involves serial code, the maximum speedup that can be attained is 9 using 10 processors, but only 91 using 1024 processors.
Therefore Amdahls Law tells us that the serial fracrion F places a severe constraint on the speedup as the number of processors increase.
Since most parallel programs contain a certain amount of sequential code, a possible conclusion of Amdahls Law is that it is not cost effective to build systems with large numbers of processors because sufficient speedup will never be produced.
However most of the important applications that need to be parallelised contain very small fractions ( < 0.001)
i.e. Tseq = S + P and if F = S / Tseq is the serial fraction.
So if we solve the same problem on a parallel machine i.e. the problem size is fixed, then the speedup can be predicted by Amdahls Law as
However we could argue that run time is fixed i.e. problem size is not fixed but increases in proportion to N
AND
we could argue that serial fraction is not constant but decreases as the problem size increases i.e. S is constant.
Therefore Amdahls Law tells us that the serial fraction F places a severe constraint on the speedup as the number of processors increase
Since most parallel programs contain a certain amount of sequential code the conclusion of Amdahls Law is that it is not cost effective to build systems with large numbers of processors because sufficient speedup will never be produced.
Amdahls Law is valid for problems in which the serial fraction F does not vary with the problem size
i.e. as the problem increases the time Tseq and S increase keeping F = S / Tseq constant.
NOTE
Applications that parallelise well are ones with very small serial fractions.
i.e. speedup = 1 / F + (1 - F) / N
So if we run the program and find the actual speedup from Tseq / Tpar we can rearrange ( 2 ) to find the actual serial fraction F
i.e. F = 1/Speedup - 1/N ------------------ 1 - 1/NThe value of F is useful because equation ( 0 ) i.e. Tpar = S + P/N is idealised.
Load balancing effects are likely to result in an irregular change in F as N increases.
Say we have 12 pieces of work each taking the same amount of time.
We have perfect load balancing for N = 2,3,4,6 & 12 processors but not for N = 5,7,8,9,10,11Since a larger load imbalance results in a larger F, problems can be identified not apparent from speedup or efficiency.
The overhead of communication & synchronization tends to increase as N increases.
Since increasing the overhead decreases speedup the value of F increases smoothly as N increases. So a smoothly increasing F is a warning that the grain size is too small.
Consider the following resultsN Speedup Efficiency F 2 1.95 97 0.024 3 2.88 96 0.021 4 3.76 94 0.021 8 6.96 87 0.021Without looking at the serial fraction we cannot tell whether the results are good or not
e.g. Why does efficiency decrease ? Since F is almost constant, we can conclude it is due to limited parallelism of the program.
![]()
![]()