Communications#
UPSTAGE provides a built-in classes for passing communications between actors. There is point-to-point communication which is highly simplified and flexible for message-passing when you want to abstract away any routing or other concerns. There is also a communications manager based on routing tables that allows pre- defined routing to be followed. That same manager is subclassable for more complicated routing schemes and behaviors.
Point to Point Communications#
The PointToPointCommsManager
class
allows actors to send messages while allowing for simplified retry attempts
and timeouts. It also allows for communications blocking to be turned on
and off on a point to point basis.
The Message
class is used to
describe a message, although strings and dictionaries can also be passed as
messages, and UPSTAGE will convert them into the Message
class. The
message will include information about the sender, mode, and other data.
The communications manager needs to be instantiated and run, and any number of them can be run to represent different modes of communication. For simplicity, communications stores on an actor can have multiple modes to receive communications on. Each mode needs its own manager which can then determine the right store to send the message to.
The following code shows how create an actor class that has two communication interfaces, and then start the necessary comms managers.
import upstage_des.api as UP
class Worker(UP.Actor):
walkie = UP.CommunicationStore(modes=["UHF"])
intercom = UP.CommunicationStore(modes=["loudspeaker"])
with UP.EnvironmentContext() as env:
w1 = Worker(name="worker1")
w2 = Worker(name="worker2")
uhf_comms = PointToPointCommsManager(name="Walkies", mode="UHF")
loudspeaker_comms = PointToPointCommsManager(name="Overhead", mode="loudspeaker")
UP.add_stage_variable("uhf", uhf_comms)
UP.add_stage_variable("loudspeaker", loudspeaker_comms)
uhf_comms.run()
loudspeaker_comms.run()
The PointToPointCommsManager
class allows for explicitly connecting actors and the
store that will receive messages, but using the CommunicationStore
lets the manager auto-discover the proper store for a communications mode,
letting the simulation designer only need to pass the source actor, destination
actor, and message information to the manager.
To send a message, use the comm manager’s make_put
method to return an UPSTAGE
event to yield on to send the message.
class Talk(UP.Task):
def task(self, *, actor: Worker):
uhf = self.stage.uhf
friend = self.get_actor_knowledge(actor, "friend", must_exist=True)
msg_evt = uhf.make_put("Hello worker", actor, friend)
yield msg_evt
class GetMessage(UP.Task):
def task(self, *, actor: Worker):
get_uhf = UP.Get(actor.walkie)
get_loud = UP.Get(actor.loudspeaker)
yield UP.Any(get_uhf, get_loud)
if get_uhf.is_complete():
msg = get_uhf.get_value()
print(f"{msg.sender} sent '{msg.message}' at {msg.time_sent}")
else:
get_uhf.cancel()
...
Stopping Communications#
Communications can be halted for all transmissions of a single manager by setting
comms_degraded
to be True
at any time. Setting it back to False will allow
comms to pass again, and any retries that are waiting (and didn’t exceed a timeout)
will go through.
Additionally, specific links can be stopped by adding/removing from blocked_links
with a tuple of (sender_actor, destination_actor)
links to shut down. The same
timeout rules will apply. There is a blocked_nodes
list that can have single
Actors
added to it if you want to block all paths to and from that actor rather
than just one link.
Routing Table Communications#
UPSTAGE has a RoutingTableCommsManager
that
routes comms according to a pre-defined network. Nodes (which are Actors
) must be
explicitly connected, and this manager will route through shortest number of hops.
An example creation of the manager is given below:
class CommNode(Actor):
messages = CommunicationStore(modes=None)
with EnvironmentContext() as env:
nodes = {
name: CommNode(name=name, messages={"modes":["cup-and-string"]})
for name in "ABCDEFGH"
}
mgr = RoutingTableCommsManager(
name="StaticManager",
mode="cup-and-string",
send_time=1/3600.,
retry_max_time=20/3600.,
retry_rate=4/3600.,
global_ignore=False,
)
for u, v in ["AB", "BC", "AD", "DE", "EF", "FG", "GH", "HC", "EB"]:
mgr.connect_nodes(nodes[u], nodes[v], two_way=False)
Note how this manager uses connect_nodes()
to define explicit edges in the
routing graph. You can optionally set two_way
to True
if you want the
edge to go back and forth.
The manager is still invoked the same by any actor wanting to send a message.
Use make_put
and yield on the returned event to put the message into the
network.
The reason this is called a routing table method, even though it uses a graph,
is because the underlying select_hop
method only tells the current node
where to send the message once. Once the message is passed along, it’ll re-check
what node to go to next.
This manager allows for degraded comms and comms retry like the point-to-point manager. If a link is degraded, after the retry fails the network will re-plan a route assuming the intermediate destination node is no longer available.
Note
Message routing does not depend on the Actors used for routing. No messages are sent to the stores on the Actors connected, and the actors do not need tasks or processes to handle message routing. The routing manager moves the messages in time only until it reaches the desired destination.
The behavior is:
Ask for transmit from SOURCE to DEST
Set CURRENT to SOURCE
Find the NEXT in the shortest path from CURRENT to DEST
If there is no path, stop trying to send and end.
Attempt to send to NEXT (this is the degraded comms/retry step)
If it can send, do so. Set CURRENT = NEXT. If NEXT is DEST, Goto 8. Otherwise, Goto 3.
If it can’t send, drop NEXT from the route options. Goto 3
Place message in DEST and end.
Since this is time-based, a link can re-open during transmission. If the network has paths:
A -> B -> C
A -> D -> E -> F -> G -> H -> C
E -> B -> C
and we want to send a message from A to C, but B is blocked, a retry will have the network eventually take the long way through ADEFGHC. If B comes back online after the message gets to E, the routing will choose ADEBC instead.
If B does not come back online, the router will still try to go to B from E
since that is shorter. If B is still down, it will take longer due to the
retry. Set the input global_ignore
to True
to ignore a bad node
for the entire routing and avoid this behavior.
Stopping Communications#
The same two options for stopping a message link exist for this manager. The
blocked_links
list and the blocked_nodes
list on the manager can be
updated to prevent comms along a link or to/from a specific node. Note that
if global_ignore
is True
that a blocked link will result in effectively
blocking any comms to the destination node even if you intended only one link
to go down.
Routing With Multiple Modes#
The routing table manager does not allow for hops across different nodes, even if in practice you could radio someone and that person makes an announcement on an intercom. This may become a future feature of UPSTAGE. For now, see the section below for how to make your own router.
Make Your Own#
The intent of the RoutingTableCommsManager
class
is to provide an example of a more dynamic comms routing feature. It is based on the
RoutingCommsManagerBase
class, which holds most
of the work the manager does. This includes managing retries and the behavior steps described
above. The RoutingTableCommsManager
only implements enough features to build, store, and
call the network to determine the next hop.
To make your own, you only have to implement the select_hop
method, which returns the next
actor to send a message to. Currently, there are not placeholders in the base class for
running other processes (such as acknowledgment, network discovery, etc.) on failures in the
built-in message transmission process. Acknowledgment is implied through the retry features,
but anything more advanced is not built-in.
It is possible to create a network discovery protocol in effect by creating a process that
determines which links should exist at a given time step. Then, as long as the data structure
you modify there is used in select_hop
, you can have more complicated network behaviors
approximated without large amounts of explicit message passing.