Dana Vrajitoru
B424 Parallel and Distributed Programming

Divide and Conquer

Parallel prefix computation.
Given an array a of n elements, compute a[0] ∘ a[1] ∘ ... ∘ a[n-1], where ∘ is any operation that is commutative and associative.

Example: finding the minimum in an array.

Sequential algorithm

result = a[0];
for (i = 1; i < n; i++)
  result = result ∘ a[i];
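
For concreteness, a minimal sequential C version of the minimum example might look like this (the array contents are made up for illustration):

#include <stdio.h>

int main(void)
{
  int a[] = {7, 3, 9, 1, 5};        // example data
  int n = sizeof(a) / sizeof(a[0]);
  int result = a[0];                // result = a[0]
  for (int i = 1; i < n; i++)       // result = result ∘ a[i], with ∘ = min
    if (a[i] < result)
      result = a[i];
  printf("The minimum is %d\n", result);
  return 0;
}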

Data division:

// Shared memory mode
Divide_receive_data()
{
  if (id == 0)
    read size, limit;
  Barrier();  // make sure all threads see size and limit before continuing
}

// Distributed mode
Divide_the_data_among_processes(id)
{
  if (id == 0)
    read size, limit;
  Broadcast(size, 0);
  Broadcast(limit, 0);
}
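
One possible MPI rendering of the distributed data division; MPI_Bcast plays the role of Broadcast, and the values of size and limit are hard-coded on process 0 here instead of being read in:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int id, size, limit;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &id);

  if (id == 0) {
    size = 100;     // in the notes these are read in; hard-coded for illustration
    limit = 1000;
  }
  MPI_Bcast(&size, 1, MPI_INT, 0, MPI_COMM_WORLD);   // Broadcast(size, 0)
  MPI_Bcast(&limit, 1, MPI_INT, 0, MPI_COMM_WORLD);  // Broadcast(limit, 0)

  printf("process %d: size = %d, limit = %d\n", id, size, limit);
  MPI_Finalize();
  return 0;
}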

Actual computation example for the problem of finding the minimum.

Computation()
{
  Generate_array(array, size, limit);
  my_min = Find_min(array, size);
}

Process_data(prev_min, my_min)
{
  if (prev_min < my_min)
    my_min = prev_min;
}

Process_final_result(my_min)
{
  Barrier(); // for shared memory models
  output "The global minimum is ", my_min; 
}
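
A plain-C sketch of these helpers for the minimum problem; Generate_array, Find_min, and Process_data follow the pseudocode above, and rand() is just one way to produce test data:

#include <stdio.h>
#include <stdlib.h>

// Fill the local array with random values in [0, limit).
void Generate_array(int *array, int size, int limit)
{
  for (int i = 0; i < size; i++)
    array[i] = rand() % limit;
}

// Sequential minimum of the local array.
int Find_min(const int *array, int size)
{
  int my_min = array[0];
  for (int i = 1; i < size; i++)
    if (array[i] < my_min)
      my_min = array[i];
  return my_min;
}

// Combine a partial result received from another process with the local one.
int Process_data(int prev_min, int my_min)
{
  return (prev_min < my_min) ? prev_min : my_min;
}

int main(void)
{
  int array[100];
  Generate_array(array, 100, 1000);
  printf("local minimum: %d\n", Find_min(array, 100));
  return 0;
}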

Linear communication. Each process receives the data from the next one (except the last process), combines it with its own, then sends the result to the previous one (except the first process). Process 0 (the master) processes the final result.

Master_Slave()
{
  if (id != no_proc-1) {
    Recv(prev_data, id+1);
    Process_data(prev_data, my_data);
  }
  if (id != 0)
    Send(my_data, id-1);
  else
    Process_final_result(my_data);
}
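
Assuming every process has already computed its local minimum my_min (a random stand-in is used below), the linear scheme might translate to MPI as follows; the message tag 0 is an arbitrary choice:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  int id, no_proc, my_min, prev_min;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &id);
  MPI_Comm_size(MPI_COMM_WORLD, &no_proc);

  srand(id + 1);
  my_min = rand() % 1000;              // stand-in for the local minimum

  if (id != no_proc-1) {               // receive from the next process
    MPI_Recv(&prev_min, 1, MPI_INT, id+1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (prev_min < my_min)             // Process_data(prev_min, my_min)
      my_min = prev_min;
  }
  if (id != 0)                         // pass the partial result toward process 0
    MPI_Send(&my_min, 1, MPI_INT, id-1, 0, MPI_COMM_WORLD);
  else
    printf("The global minimum is %d\n", my_min);

  MPI_Finalize();
  return 0;
}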

Ring communication. Each process receives the data from the next process, processes it, then sends it to the previous one. The first process (0) starts the ring by sending its data to the last process, so the last process receives its data from process 0. Process 0 then receives the combined data from process 1 and processes the final result.

Master()
{
  Send(my_data, no_proc-1);
  Recv(prev_data, 1);
  Process_data(prev_data, my_data);
  Process_final_result(my_data);
}

Slave()
{
  if (id == no_proc-1)
    Recv(prev_data, 0);
  else
    Recv(prev_data, id+1);
  // Recv(prev_data, (id+1) % no_proc);
  Process_data(prev_data, my_data);
  Send(my_data, id-1);
}
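
A corresponding MPI sketch of the ring scheme, again with a random stand-in for the local minimum; it assumes at least two processes:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  int id, no_proc, my_min, prev_min;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &id);
  MPI_Comm_size(MPI_COMM_WORLD, &no_proc);

  srand(id + 1);
  my_min = rand() % 1000;              // stand-in for the local minimum

  if (id == 0) {                       // master starts the ring
    MPI_Send(&my_min, 1, MPI_INT, no_proc-1, 0, MPI_COMM_WORLD);
    MPI_Recv(&prev_min, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (prev_min < my_min)
      my_min = prev_min;
    printf("The global minimum is %d\n", my_min);
  } else {                             // slaves: receive, combine, forward
    MPI_Recv(&prev_min, 1, MPI_INT, (id+1) % no_proc, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    if (prev_min < my_min)
      my_min = prev_min;
    MPI_Send(&my_min, 1, MPI_INT, id-1, 0, MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}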

Hierarchical Broadcast

Master_Slave()
{
  if (id > 0)
    Recv(data, floor((id-1)/2));
  child = 2*id+1;
  if (child < no_proc)
    Send(data, child);
  if (child+1 < no_proc)
    Send(data, child+1);
}
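
An MPI sketch of the tree broadcast; the parent of process id is (id-1)/2 (integer division acts as the floor here), and the broadcast value 42 is just an example:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int id, no_proc, data, child;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &id);
  MPI_Comm_size(MPI_COMM_WORLD, &no_proc);

  if (id == 0)
    data = 42;                         // example value to broadcast
  else                                 // receive from the parent
    MPI_Recv(&data, 1, MPI_INT, (id-1)/2, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

  child = 2*id + 1;                    // forward to the children, if any
  if (child < no_proc)
    MPI_Send(&data, 1, MPI_INT, child, 0, MPI_COMM_WORLD);
  if (child+1 < no_proc)
    MPI_Send(&data, 1, MPI_INT, child+1, 0, MPI_COMM_WORLD);

  printf("process %d received %d\n", id, data);
  MPI_Finalize();
  return 0;
}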

Hierarchical Data Collection

Master_Slave()
{
  child = 2*id+1;
  if (child < no_proc) {
    Recv(recv_data, child);
    Process_data(recv_data, my_data);
  }
  if (child+1 < no_proc) {
    Recv(recv_data, child+1);
    Process_data(recv_data, my_data);
  }
  if (id > 0)
    Send(my_data, floor((id-1)/2));
  else
    Process_final_result(my_data);
}
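
The matching MPI sketch of the tree collection for the minimum, with a random stand-in for each local minimum:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  int id, no_proc, my_min, recv_min, child;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &id);
  MPI_Comm_size(MPI_COMM_WORLD, &no_proc);

  srand(id + 1);
  my_min = rand() % 1000;              // stand-in for the local minimum

  child = 2*id + 1;                    // collect from the children, if any
  if (child < no_proc) {
    MPI_Recv(&recv_min, 1, MPI_INT, child, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (recv_min < my_min)
      my_min = recv_min;
  }
  if (child+1 < no_proc) {
    MPI_Recv(&recv_min, 1, MPI_INT, child+1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (recv_min < my_min)
      my_min = recv_min;
  }

  if (id > 0)                          // send the partial result to the parent
    MPI_Send(&my_min, 1, MPI_INT, (id-1)/2, 0, MPI_COMM_WORLD);
  else
    printf("The global minimum is %d\n", my_min);

  MPI_Finalize();
  return 0;
}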

Variation on linear communication. Each process receives the data from the previous one, processes it, then sends the result to the next one. The first process doesn't receive any data. The last process doesn't send any data; it processes the final result.

Master_Slave()
{
  if (id != 0) {
    Receive(prev_data, id-1);
    Process_data(prev_data, my_data);
  }
  if (id != no_proc-1)
    Send(my_data, id+1);
  else
    Process_final_result(my_data);
}
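
Assuming the data travels from process 0 toward the last process, which then produces the final result, an MPI sketch of this variation might be:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  int id, no_proc, my_min, prev_min;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &id);
  MPI_Comm_size(MPI_COMM_WORLD, &no_proc);

  srand(id + 1);
  my_min = rand() % 1000;              // stand-in for the local minimum

  if (id != 0) {                       // receive from the previous process
    MPI_Recv(&prev_min, 1, MPI_INT, id-1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (prev_min < my_min)
      my_min = prev_min;
  }
  if (id != no_proc-1)                 // pass the partial result toward the last process
    MPI_Send(&my_min, 1, MPI_INT, id+1, 0, MPI_COMM_WORLD);
  else
    printf("The global minimum is %d\n", my_min);

  MPI_Finalize();
  return 0;
}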

Variation on the ring communication. Each process receives the data from the previous one, processes it, then sends the result to the next one. The first process sends its data first, then receives the combined data from the last process and processes the final result. The last process sends its data back to the first one.

Master_Slave()
{
  if (id != 0) {
    Receive(prev_data, id-1);
    Process_data(prev_data, my_data);
  }
  if (id != no_proc-1)
    Send(my_data, id+1);
  else
    Send(my_data, 0);
  if (id == 0) {
    Receive(prev_data, no_proc-1);
    Process_data(prev_data, my_data);
    Process_final_result(my_data);
  }
}
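
An MPI sketch of this reversed ring, with the same assumptions as before (random local minimum, at least two processes):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  int id, no_proc, my_min, prev_min;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &id);
  MPI_Comm_size(MPI_COMM_WORLD, &no_proc);

  srand(id + 1);
  my_min = rand() % 1000;              // stand-in for the local minimum

  if (id != 0) {                       // receive from the previous process
    MPI_Recv(&prev_min, 1, MPI_INT, id-1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (prev_min < my_min)
      my_min = prev_min;
  }
  if (id != no_proc-1)                 // forward along the ring
    MPI_Send(&my_min, 1, MPI_INT, id+1, 0, MPI_COMM_WORLD);
  else                                 // last process closes the ring
    MPI_Send(&my_min, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
  if (id == 0) {                       // master receives the combined result
    MPI_Recv(&prev_min, 1, MPI_INT, no_proc-1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (prev_min < my_min)
      my_min = prev_min;
    printf("The global minimum is %d\n", my_min);
  }

  MPI_Finalize();
  return 0;
}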