This page contains a few examples of distributed computing done in Linda. Our focus is not on time or space complexity, but on the more qualitative issues. The solutions can easily be improved. A few times I chose to 'describe' in English rather than write pseudo-code in order to explain the ideas. I am sure you can all fill in the details and code them in whatever-Linda. Not all solutions are complete.
Most of the difficulty is in discovering a solution outline suitable for Linda, i.e., one with many processes and almost no synchronization. Message passing and shared memory are easy in Linda. Since this is your first exposure to Linda, efficiency is relatively unimportant, but correctness must be preserved at all costs.
Any C code can be embedded in C-Linda. So, it is important that you don't just write a C program using Linda primitives just to claim that you did. Also, recall the CEG 7370 restrictions mentioned in the slides.
Typography: Tuples are preferably written in parentheses; some examples below use angle brackets.
There are several prime number generators referred to in CEG 7370. Recall the Eratosthenes Sieve done in CSP.
Result parallelism version from Linda book
In a bag B of nb integers, it is known that each element appears exactly d times. We wish to delete the duplicates resulting in a set of only nb/d distinct integers that occurred in the bag. It is guaranteed that d > 0, nb > 0. Unfortunately, the value d is unavailable and needs to be computed by examining the bag B. Design and implement a solution in C-Linda. Assume that the tuple space already contains ("B", xi) for all xi in B, and the size of the bag ("BSZ", nb). You lose 5 points for each use of inp or rdp.
Solution Outline only.
Bag Element == ("B", e). Set Element == ("S", e).
1. The number of duplicates d is not given. We must compute it. Do in("B", ?x) once in the main process. Then in a loop do inp("B", x) until it fails. The initial in plus the number of successful inp calls gives the count d. We will out("S", x) at the end -- not now. Lost 5 points :-(
2. Number of workers to use is not specified. We choose p. E.g., let p be about nb/1000 assuming nb is in the millions. Each worker will process (nb - d) / p bag elements. We make sure that this number is rounded so that all elements are processed.
3. Main process does an out("S", x).
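The three steps above can be simulated sequentially. The sketch below is plain Python, not C-Linda, and is only an illustration under stated assumptions: a `collections.Counter` stands in for the ("B", x) tuples, the name `remove_duplicates` is hypothetical, and a single loop plays the part of all p workers. Once d is known, each remaining value can be withdrawn with exactly d blocking in operations, so no further inp is needed.

```python
from collections import Counter

def remove_duplicates(bag):
    """Return (d, S) for a bag in which every value occurs exactly d times."""
    ts = Counter(bag)              # stand-in for the ("B", x) tuples
    nb = len(bag)                  # stand-in for ("BSZ", nb)
    # Step 1: in("B", ?x) once, then withdraw the remaining copies of x.
    x = next(iter(ts))
    d = ts.pop(x)                  # the initial in plus the successful inp calls
    # Step 2: the other nb - d elements form (nb - d) / d groups of d copies.
    # Real workers would split these groups among p processes.
    found = set()
    for _ in range((nb - d) // d):
        y = next(iter(ts))         # in("B", ?y)
        ts.pop(y)                  # in("B", y) repeated d - 1 more times
        found.add(y)               # out("S", y)
    # Step 3: the main process deposits the value it used for counting.
    found.add(x)                   # out("S", x)
    return d, found
```

For example, `remove_duplicates([3, 5, 3, 7, 5, 7])` yields d = 2 and the set {3, 5, 7}.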
Given a finite bag B of numbers and the size nb of the bag B, find the sum of all numbers in B.
Linda problems should first try result-parallelism. If we have p processes, each should sum up nb/p numbers and leave the result in the TS.
The p was to be our choice. Choosing p=nb/2 seems good, but what about the next "iteration"? That is, after nb/2 processes have each added a pair of numbers, what should these nb/2 processes do? Terminate? Half of them should. The other half (now nb/4 in number) repeat this computation. Eventually we should have one process adding 2 numbers.
What if nb is not a power of 2? Pad the bag with extra zeros.
How do we decide which processes terminate in each iteration? Give them PIDs, and use the bit pattern of the PIDs: in the first iteration, all odd numbered processes die, and so on.
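The pairwise scheme can be sketched as a sequential simulation. This is plain Python rather than C-Linda, and it rests on assumptions not in the text above: the name `tree_sum` is hypothetical, a dictionary keyed by PID stands in for the partial sums left in the TS, and a terminating process hands its partial sum to the neighbor whose PID differs only in the bit examined in that round.

```python
def tree_sum(bag):
    """Sum a bag by pairwise additions, halving the live processes each round."""
    # Pad with zeros up to a power of two (at least two elements).
    size = 2
    while size < len(bag):
        size *= 2
    values = list(bag) + [0] * (size - len(bag))
    # Process pid starts by adding values[2*pid] and values[2*pid + 1].
    partial = {pid: values[2 * pid] + values[2 * pid + 1]
               for pid in range(size // 2)}
    rounds = 0
    stride = 1                       # bit of the PID examined this round
    while len(partial) > 1:
        for pid in sorted(partial):  # sorted() copies the keys, so pop is safe
            if pid & stride:         # bit set: hand over the result and die
                partial[pid - stride] += partial.pop(pid)
        stride *= 2
        rounds += 1
    return partial[0], rounds
```

With the bag [1, 2, 3, 4, 5], the bag is padded to 8 elements, 4 processes form the first partial sums, and two further rounds leave the total 15 with process 0.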
Using a "lock" as you did, in general, will reduce concurrency. We should have as many autonomous processes as we can. And, they should "communicate/synchronize/mutex" as little as possible.
Problem: Given a finite bag B of numbers, as well as the size nb of the bag B, produce a set S of numbers so that no two numbers in S divide each other, and it is the largest such set that is a subbag of B.
Linda Solution:
Note that the set S for a given B is not unique. So we are entitled to produce one or more such S as long as each of the sets is a valid answer. I make no claims that the solution outlined below is the "best"; it certainly is a "good" one. It is quite likely that you will find several details missing.
We say that a set s is a partial solution if no two numbers in s divide each other, and s is a subbag of B. But s is perhaps not a largest such set.
We add tuples of the form <"nps", i, 0> for i:1..nb giving us the number of partial solutions of length i constructed so far. These counters are incremented as we build up a solution. When finished, we rd("nps", i, ?j) for i:nb .. 0 until we find a non-zero j. We then read an/the i-long solution from the tuple space TS.
Our computation proceeds in "waves": the i-th wave computes (or tries to) all partial solutions of length i. The last wave is for i == nb.
We take (in) one element b of B, and take (in) one partial solution s of length i-1, and see if we can extend it to a solution of length i. If b divides or is divided by an element of s, s cannot be extended, and no partial solution gets deposited. Otherwise, we add b to s, deposit the resulting (partial) solution into the TS, and increment the c in <"nps", i, c>. After this is done for each element of B, we move on to the next wave.
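The extension test in this step is small enough to write out. The following is a minimal sketch in plain Python, not C-Linda; the name `try_extend` is hypothetical, partial solutions are represented as frozensets, and all numbers are assumed to be positive integers (a zero would make the modulo test fail).

```python
def try_extend(s, b):
    """Return s extended by b, or None if b conflicts with an element of s."""
    for e in s:
        if e % b == 0 or b % e == 0:   # b divides e, or e divides b
            return None                # s cannot be extended; deposit nothing
    return s | {b}                     # the new partial solution of length len(s) + 1
```

A worker would deposit the returned set and increment the matching "nps" counter only when the result is not None.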
Wave 1 starts with s == empty set. So we should have done {out("ps", 0, empty-set) nb times; out("nps", 0, 1);}
Taking one element b of B without getting blocked is not too hard. Maintain a count of items in the current bag, <"nB", i, ni>, where i is the same i as in the waves. Take (in) this counter, verifying that it is > 0, in("B", i, ?b) and then deposit out("nB", i, ni-1).
Generating the bag for the next wave is not too hard. After obtaining b, do {in("nB", i+1, ?n); out("B", i+1, b); out("nB", i+1, n+1)}. So we had better have <"nB", i, 0> tuples for i:1..nb right after beginning.
A worker process will not proceed to the next wave until after the current i-th wave is finished; i.e., a rd("nB", i, ?n) gives n==0. There is a next wave provided the present wave produced >0 solutions. If i == nb, the process terminates, otherwise it increments i, and begins on the next wave.
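The counter-guarded take and the hand-off to the next wave's bag can be tried out on a toy tuple space. Everything in this sketch is an assumption made for illustration: it is sequential Python rather than C-Linda, the `TupleSpace` class and `take_and_forward` are hypothetical names, `None` plays the role of a formal such as ?b, and an `in_` with no match raises an error where a real in would block.

```python
class TupleSpace:
    """Toy, single-threaded stand-in for a Linda tuple space."""

    def __init__(self):
        self.tuples = []

    def out(self, *t):
        self.tuples.append(t)

    def in_(self, *pattern):
        """Withdraw the first tuple matching pattern; None fields are formals."""
        for t in self.tuples:
            if len(t) == len(pattern) and all(
                    p is None or p == f for p, f in zip(pattern, t)):
                self.tuples.remove(t)
                return t
        raise LookupError("a real in() would block here")


def take_and_forward(ts, i):
    """Take one element of wave i's bag and copy it into wave i+1's bag."""
    _, _, ni = ts.in_("nB", i, None)      # holding the counter acts as a lock
    if ni == 0:
        ts.out("nB", i, 0)                # bag is empty: restore the counter
        return None
    _, _, b = ts.in_("B", i, None)        # cannot block, because ni > 0
    ts.out("nB", i, ni - 1)
    _, _, n = ts.in_("nB", i + 1, None)   # requires the initial <"nB", i+1, 0>
    ts.out("B", i + 1, b)
    ts.out("nB", i + 1, n + 1)
    return b
```

Because only the process holding the counter tuple withdraws from the bag, the in on ("B", i, ?b) never waits on an empty bag.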
In all of these exercises: You lose 5 points for each use of inp or rdp.