CS333 Lab 1: Distributed Minimum Spanning Trees
Assigned: Tuesday, September 24
Due: Tuesday, October 8
Goals:
By the end of this lab, you should...
- understand the Gallager-Humblet-Spira minimum spanning tree algorithm.
- refresh your memory on how to open and communicate over sockets.
- have a new appreciation for the difficulty of designing and implementing
correct algorithms in a distributed asynchronous setting.
- have experience writing software that uses a common protocol to
interact with software written by others.
Before starting:
Review the Gallager-Humblet-Spira minimum spanning tree algorithm
described in class on September 12 and 17. For your convenience,
the pseudocode presented on September 17 is linked from the
examples page.
Study the file Message.txt that
defines the message type that you will use. These messages will be
serialized for communication across ObjectInputStreams and
ObjectOutputStreams. On October 8, the class will go to the lab and
you will run your program as one node in a large distributed
computation. Therefore, avoid making changes to the message
file so your program will be able to communicate with the
solutions developed by other students. You have also been provided
with a Message.class file that everyone will use. Do not
recompile the Message source code. (The source file is named
"Message.txt" instead of "Message.java" to help prevent
recompilation.)
Study the file MSTviewer.java
that will be the communication server for this lab. Note that all
messages will be sent through the server to be forwarded to other
nodes in the computation. In "real life," the nodes would send
their messages directly to each other, but having the server allows
us to see the communication pattern, which is useful for both
understanding and debugging the algorithm implementation.
Download this zip file that contains the
files described above.
Collaboration: In general in CS333, collaboration is
encouraged without restriction, provided that you credit sources.
However, for this lab, you can discuss anything with anyone,
but you should write your own code. The reason for this is
that part of the experience is seeing what it's like to get different
versions of the software to work together. (If you all write the
same code, it kind of defeats the purpose!)
Part I. Setting Up
Before going too far, it's a good idea to create the skeleton of
your MST code to make sure you can connect with
the server. The server expects you to connect at port 12345.
For now, just run the server and your MST nodes on the same machine,
using "localhost" as the IP address in the Socket constructor.
The server expects you to send a REGISTRATION message with your name as
the serverData.
After you click the "start" button, the server will reply to each node
with a REGISTRATION message that contains the node's unique identifier
(UID) as the destination, and a TreeMap as the serverData.
The TreeMap describes the edges incident to that
node in the graph. In particular, it maps Integer objects (edge costs)
to other Integer objects (the UID of the node at the other end of that edge).
Write a class, say MST.java, that connects to the server in
its constructor as described above.
In your "main" method, write a loop that creates several instances of
your MST class in order to populate the graph with nodes. Each will
connect with the server.
Retrieve the UID and TreeMap, and print their contents
to make sure you got them. (Remember to click the "start" button after
all the nodes show up in the server window.)
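The connection steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class RegistrationSketch, the fake server thread, the String sent in place of a REGISTRATION message, and the edge data are all made up so that the example runs by itself. In the lab you would connect to the MSTviewer on port 12345 and exchange the provided Message objects instead. The one detail worth carrying over is the stream order: the ObjectInputStream constructor blocks until it has read the stream header from the other side, so it is safest to create and flush your ObjectOutputStream first.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.TreeMap;

// Hypothetical stand-in for the registration exchange. The real lab uses the
// provided Message class and the MSTviewer server instead of these pieces.
class RegistrationSketch {

    // Pretend server: accepts one client, reads its name, replies with edge data.
    static Thread startFakeServer(ServerSocket listener) {
        Thread t = new Thread(() -> {
            try (Socket s = listener.accept();
                 ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream())) {
                out.flush(); // send the stream header so the client's input stream can open
                ObjectInputStream in = new ObjectInputStream(s.getInputStream());
                in.readObject(); // the client's name (unused in this sketch)
                TreeMap<Integer, Integer> edges = new TreeMap<>();
                edges.put(7, 2); // cost 7 -> neighbor UID 2 (made-up data)
                edges.put(3, 5); // cost 3 -> neighbor UID 5 (made-up data)
                out.writeObject(edges);
                out.flush();
            } catch (IOException | ClassNotFoundException e) {
                throw new RuntimeException(e);
            }
        });
        t.start();
        return t;
    }

    // Client side: roughly what an MST constructor might do.
    @SuppressWarnings("unchecked")
    static TreeMap<Integer, Integer> register(String host, int port, String name)
            throws IOException, ClassNotFoundException {
        try (Socket socket = new Socket(host, port)) {
            // Output stream first, then flush, then input stream.
            ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
            out.flush();
            ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
            out.writeObject(name); // stands in for the REGISTRATION message
            out.flush();
            return (TreeMap<Integer, Integer>) in.readObject();
        }
    }

    // Runs the fake server and one client on an arbitrary free local port.
    static TreeMap<Integer, Integer> demo() {
        try (ServerSocket listener = new ServerSocket(0)) { // 0 = any free port
            Thread server = startFakeServer(listener);
            TreeMap<Integer, Integer> edges =
                    register("localhost", listener.getLocalPort(), "node-A");
            server.join();
            return edges;
        } catch (IOException | ClassNotFoundException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints the cost -> neighbor UID map
    }
}
```

In your own MST class you would keep the socket and both streams open as fields rather than closing them after registration, since the same streams carry all later messages.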
Part II. Implementing the Algorithm
Your main job, of course, is to correctly implement the
Gallager-Humblet-Spira minimum spanning tree algorithm.
After the REGISTRATION exchange with the server,
your MST node should go into a loop receiving messages
and processing them. I strongly recommend using
a separate reactToMessage method that takes a Message as
a parameter and contains a switch statement to process it
according to message type.
You can use the pseudocode from class
to help you get started, but it is not a complete,
correct implementation.
You will need to think about it
carefully to make sure that everything works correctly...
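One possible shape for the reactToMessage method is sketched below. The enum Kind and the class Msg are hypothetical stand-ins invented for this example; the real type constants and fields are the ones defined in Message.txt, and the set of message kinds listed here is an assumption. The handlers only record what they would do, so the sketch shows the dispatch structure, not the algorithm itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message kinds; the real constants come from the provided Message class.
enum Kind { WAKEUP, CONNECT, INITIATE, INFORM, TEST, ACCEPT, REJECT, REPORT, CHANGE_ROOT }

// Hypothetical stand-in for Message, carrying only what the dispatch needs.
class Msg {
    final Kind kind;
    final int source;

    Msg(Kind kind, int source) {
        this.kind = kind;
        this.source = source;
    }
}

class DispatchSketch {
    boolean sleeping = true;
    final List<String> log = new ArrayList<>(); // records which handler ran

    void reactToMessage(Msg m) {
        // A sleeping node runs its wakeup procedure before handling anything else.
        if (sleeping) {
            wakeup();
        }
        switch (m.kind) {
            case WAKEUP:
                break; // nothing more to do: the wakeup procedure has already run
            case CONNECT:
                log.add("connect from " + m.source);
                break;
            case INITIATE:
            case INFORM:
                log.add("initiate/inform from " + m.source);
                break;
            case TEST:
                log.add("test from " + m.source);
                break;
            case ACCEPT:
            case REJECT:
                log.add("test reply from " + m.source);
                break;
            default:
                log.add("unhandled " + m.kind + " from " + m.source);
        }
    }

    void wakeup() {
        sleeping = false;
        log.add("wakeup");
    }
}
```

The receive loop then reduces to reading an object from the input stream and passing it to reactToMessage, and a message addressed to yourself can be handled by calling the same method directly.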
Important notes:
- In response to messages, you will often
need to send out other messages. Be sure that the destination of
each message is correct. If your node needs to send a message
to itself, you can just call your reactToMessage method recursively,
rather than sending it through the server.
- Wakeup: If you click on a node, the server will send it
a Wakeup message. In response, you'll want to change your
state to FOUND and send a CONNECT message along the lowest-cost outgoing edge
(if any). Note, however, that you shouldn't yet make the edge a
branch edge... the other endpoint needs to send you an INITIATE, INFORM,
or CONNECT message before you adopt the edge as a branch. (See below.)
- Being woken up by other nodes: Nodes do not send
wakeup messages, but if a sleeping node receives some other
message from another node, it should do the wakeup procedure (above)
before processing the message.
- Merge:
To avoid interfering MWOE computations, two fragments should
merge across an edge only after both endpoints have sent a
CONNECT message to each other. Therefore, you'll need some extra
state (perhaps a list) to keep track of who you have already sent
CONNECT messages to, and those you have received CONNECT messages
from, so that the endpoint with the higher UID knows when to
initiate the MWOE computation. Although the pseudocode says
otherwise, don't actually add the edge to your
branch list until the second connect message is detected. Otherwise,
you may "leak" a MWOE computation out of the fragment.
- Absorb: In this case, you go ahead and add the branch
edge immediately upon receiving the CONNECT that causes the
absorb. This is because there is not a new MWOE computation (only
perhaps an old one that is continued into the absorbed fragment).
However, for the reasons described in "merge" above, you don't want to
add an edge as a branch when first sending the CONNECT
message. The node in the lower-level fragment
finds out that an "absorb" is taking place when the INITIATE or INFORM
message arrives. So, whenever an INITIATE or INFORM message is
received along an edge, that edge should be added to the branch set if
it isn't there already.
- ChangeRoot: Again, don't add the edge as a branch immediately.
Make sure a CONNECT has gone in both directions (or that the other endpoint
is absorbing you).
- Another note on CONNECT messages: Sometimes you'll get a
CONNECT message while you're in the middle of your MWOE computation.
Once your level number goes up, you may be able to absorb those
fragments, rather than waiting for your MWOE to be that edge, so it
can be helpful to check your list of CONNECT messages received each
time your level number increases.
- Waiting for responses: The pseudocode indicates that you
need to wait for a response to each test before proceeding to the next
test. This is true, but you can't actually "wait" there because you
need to process other messages that may arrive. Therefore, you'll
need to find a way for the receipt of the ACCEPT or REJECT message to
trigger the next steps in the computation. In other words, you won't
actually write it as a loop, but as a separate method that gets called
during processing of various messages, whenever there's a chance that
the loop would have gone to the next step or that its termination
code would need to run. (Hint: Use a boolean "testing" to keep track
of whether you have already sent out a test message and have not yet
received a response. The "waitingForReport" state is part of this as well.)
- Making your code robust: Since your code will need to
work with others who may have made some different implementation decisions,
you want to guard against things that could corrupt your state.
For example, you'll want to ignore the following kinds of messages:
- INITIATE messages with level numbers lower than your own.
- CONNECT messages for edges that are already branch edges.
- REJECT messages for edges that are already rejected.
- Broadcasting: It's helpful to have a method that will
go through a list and send a message to everyone in the list.
Be sure to create a new message for each destination, substituting
the appropriate destination UID. It also helps to have this method
take as a parameter the UID of a node to skip,
particularly when forwarding a
message received from one neighbor on to the rest of your neighbors.
- Making the code readable: To simplify the "send" calls,
it helps to have several "send" methods that take various parameters
corresponding to the various Message constructors. The methods could
then create the message object and send it on the ObjectOutputStream
to the server.
- Data structures: The form in which you get the data from
the server may not be the easiest for you to work with.
When getting the edge data from the server,
I'd suggest creating a HashMap from neighbor UIDs to edge costs.
In addition, create a list of neighbor UIDs, in increasing order of edge cost,
to use as the initial state of your "basic" edges.
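The bookkeeping described in the Merge note above might look something like the following sketch. The class ConnectTracker and its method names are hypothetical. It covers only the "CONNECT in both directions" check and deliberately ignores level comparison, so the absorb case still has to be handled separately.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: remembers CONNECT traffic per neighbor and reports when
// an edge has seen a CONNECT in both directions (the merge condition).
class ConnectTracker {
    final int myUid;
    final Set<Integer> connectSentTo = new HashSet<>();
    final Set<Integer> connectReceivedFrom = new HashSet<>();
    final Set<Integer> branch = new HashSet<>();

    ConnectTracker(int myUid) {
        this.myUid = myUid;
    }

    // Call when you send a CONNECT; returns true if this completes a merge.
    boolean noteSent(int neighbor) {
        connectSentTo.add(neighbor);
        return tryMerge(neighbor);
    }

    // Call when a CONNECT arrives; returns true if this completes a merge.
    boolean noteReceived(int neighbor) {
        connectReceivedFrom.add(neighbor);
        return tryMerge(neighbor);
    }

    private boolean tryMerge(int neighbor) {
        if (branch.contains(neighbor)) {
            return false; // already a branch edge: ignore repeated CONNECTs
        }
        if (connectSentTo.contains(neighbor) && connectReceivedFrom.contains(neighbor)) {
            branch.add(neighbor); // only now does the edge become a branch
            return true;
        }
        return false;
    }

    // After a merge, the endpoint with the higher UID starts the MWOE computation.
    boolean shouldInitiate(int neighbor) {
        return branch.contains(neighbor) && myUid > neighbor;
    }
}
```

Keeping the received-CONNECT set around also makes it easy to rescan for fragments you can absorb whenever your level number increases.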
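For the "Waiting for responses" note above, here is a minimal sketch of turning the test loop into a method driven by incoming messages. Everything in it is an assumption made for illustration: the class TestLoopSketch, the method names, and the list of strings standing in for the output stream. It covers only a node's local tests; combining the result with reports from children (the "waitingForReport" part) is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: the "for each basic edge, test and wait" loop rewritten
// as a step method that is called again whenever a reply arrives.
class TestLoopSketch {
    final Deque<Integer> basic = new ArrayDeque<>(); // untested neighbors, cheapest edge first
    final List<Integer> rejected = new ArrayList<>();
    final List<String> sent = new ArrayList<>();     // stands in for the output stream
    boolean searching = false;     // a local MWOE search is in progress
    boolean testing = false;       // a TEST is outstanding with no reply yet
    Integer localCandidate = null; // neighbor across the cheapest accepted edge

    void startSearch() {
        searching = true;
        localCandidate = null;
        advance();
    }

    // One step of the former loop; safe to call from any handler.
    void advance() {
        if (!searching || testing) {
            return; // nothing to do, or still waiting for a reply
        }
        if (basic.isEmpty()) {
            finish(); // no basic edges left: no local candidate
            return;
        }
        testing = true;
        sent.add("TEST->" + basic.peekFirst());
    }

    void onAccept(int from) {
        if (!testing || basic.isEmpty() || basic.peekFirst() != from) {
            return; // not the reply we are waiting for
        }
        testing = false;
        localCandidate = from; // the edge stays basic until it becomes a branch
        finish();
    }

    void onReject(int from) {
        if (!testing || basic.isEmpty() || basic.peekFirst() != from) {
            return; // not the reply we are waiting for
        }
        testing = false;
        rejected.add(basic.pollFirst());
        advance(); // move on to the next cheapest basic edge
    }

    // The former loop's termination code.
    void finish() {
        searching = false;
        sent.add("LOCAL-DONE " + localCandidate);
    }
}
```

Because advance() returns immediately while a TEST is outstanding, it is harmless to call it from every handler that might have unblocked the search.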
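The Data structures and Broadcasting notes above could be combined along these lines. The names NeighborTables, OutMsg, and broadcast are hypothetical, and OutMsg is only a placeholder for the provided Message class. The sketch relies on the fact that a TreeMap iterates in ascending key order, which here means increasing edge cost.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical placeholder for an outgoing Message.
class OutMsg {
    final String type;
    final int destination;

    OutMsg(String type, int destination) {
        this.type = type;
        this.destination = destination;
    }
}

class NeighborTables {
    final Map<Integer, Integer> costOf = new HashMap<>(); // neighbor UID -> edge cost
    final List<Integer> basic = new ArrayList<>();        // neighbor UIDs, cheapest edge first

    // costToNeighbor is the map received from the server: edge cost -> neighbor UID.
    NeighborTables(TreeMap<Integer, Integer> costToNeighbor) {
        // TreeMap iteration is in ascending key order, i.e. increasing edge cost.
        for (Map.Entry<Integer, Integer> e : costToNeighbor.entrySet()) {
            costOf.put(e.getValue(), e.getKey());
            basic.add(e.getValue());
        }
    }

    // Builds one fresh message per destination, skipping skipUid (null skips nobody).
    // A real node would write each message to its ObjectOutputStream instead.
    static List<OutMsg> broadcast(String type, List<Integer> neighbors, Integer skipUid) {
        List<OutMsg> out = new ArrayList<>();
        for (Integer uid : neighbors) {
            if (uid.equals(skipUid)) {
                continue; // e.g. the neighbor the message was received from
            }
            out.add(new OutMsg(type, uid));
        }
        return out;
    }
}
```

For example, a node with edges of cost 7 to UID 2, cost 3 to UID 5, and cost 10 to UID 9 would start with basic edges in the order 5, 2, 9.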
Part III. Testing
Careful reasoning is the most important thing,
but thorough testing is also key.
Here are a few suggestions.
- Start with just a few nodes (3). The MSTviewer won't work with
fewer than three. Also, be sure to set the DEGREE constant in the MSTviewer
to an appropriate value. For example, with only 3 nodes, the maximum
degree is 2.
- Pause the execution and look at the message trace to spot problems.
It often takes some careful thought. Often,
you won't see things "at a glance."
- Vary the number of nodes and the degree.
- Test interoperability with other implementations.
Once you're confident about your own implementation,
try testing with a friend.
Start up one server and each of you can run your code, connecting with
several nodes each.
- Start early and be patient with yourself!
What To Turn In:
On October 8, we will meet in Lopata 401 for live individual and
combined demos at the regularly scheduled
class time. Bring a printed copy of the code for your MST
node to turn in. (Don't print out the message class or the server.)
Be sure your name is on it, and that you credit
any help you may have received.
Be prepared to run your code and
to make modifications to it during class.
Kenneth J. Goldman (kjg@cs.wustl.edu)