The Power Of Self-Organizing Systems

Pegasus-I was a system with the overall goal of broadcasting telemetry and controlling flight operations in real-time. The system was a combination of organized and self-organized sub-systems that interacted with Pegasus-I during the flight. Let’s examine how we structured this and the role of self-organizing systems in the mission, as well as how we will leverage them for Pegasus-II and Pegasus-III in the near future.

A system is an organized set of things that form a complex whole designed to achieve a goal. The system designed and used for Pegasus-I included seven (7) interconnected actors: Pegasus-I, the Ground Station, the Chase Vehicle, the Web site, and three (3) Azure Blob Storage containers. The nodes could transmit, receive, or transmit and receive information depending on their function. The overall system was composed of nine (9) different sub-systems that managed telemetry and flight operations. These sub-systems communicated with the actors to execute specific tasks within a sub-system.

Three (3) sub-systems were organized, meaning we had predetermined and configured paths for the information to flow. These sub-systems were associated with storing telemetry received from Pegasus-I by the Ground Station and the Chase Vehicle, as well as storing the location of the Chase Vehicle itself. These sub-systems were required to be organized because our storage containers in Azure Blob Storage do not connect to Piraeus, and Piraeus requires a durable subscription to be configured for any passive receiver. [Figure 1] below shows the system actors and the organized sub-system graphs for telemetry and location. You will notice that the organized sub-systems simply ingest telemetry and location from either the Ground Station or the Chase Vehicle into Azure Blob Storage. The Web site does not send or receive any information through these sub-systems.
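One way to picture an organized sub-system is as a routing table that is fixed before the flight. The Python sketch below is only an illustration: the topic and container names are hypothetical placeholders, not the identifiers configured for Pegasus-I, and the lookup stands in for what Piraeus does internally.

```python
# Hypothetical names throughout; the post does not give the real
# topic or container identifiers.
ORGANIZED_PATHS = {
    "ground-station/telemetry": "telemetry-ground-container",
    "chase-vehicle/telemetry": "telemetry-chase-container",
    "chase-vehicle/location": "location-chase-container",
}


def route(topic: str) -> str:
    """Return the storage container configured in advance for a topic."""
    try:
        return ORGANIZED_PATHS[topic]
    except KeyError:
        # An organized sub-system has no path unless one was configured beforehand.
        raise LookupError(f"no organized path configured for {topic!r}") from None
```

Nothing in this table can change while the system is running, which is exactly the property the self-organizing sub-systems relax.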

Figure 1

Four (4) of the sub-systems were self-organizing, which means they organized themselves on-the-fly to create sub-systems that did not exist before. If these new sub-systems create ephemeral subscriptions to receive information, then those subscriptions last only for the duration of the caller’s connection; they are disposed upon disconnect. We made this choice for Pegasus-I because certain sub-systems were only concerned with “now”, not any past history of events to or from the in-flight craft. This also reduced the complexity of the system design because we could depend on Piraeus to create the sub-system graphs in Orleans on demand and connect them to the appropriate parties immediately.

When the Web site connected to the Piraeus gateway, it was entitled to subscribe to the same topic grains in Orleans as the Azure Blob Storage containers. This allowed the Web site to create new sub-systems on-the-fly to receive telemetry and location, shown in [Figure 2]. If the Web site disconnected, both the new subscriptions and their respective observers would be disposed, and upon reconnection the sub-systems would be recreated by Piraeus within Orleans.
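The connect, disconnect, and reconnect lifecycle described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Piraeus or Orleans API: the class `EphemeralGateway`, its methods, and the connection and topic names are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict

Observer = Callable[[str], None]


class EphemeralGateway:
    """Toy stand-in for a gateway that manages ephemeral subscriptions."""

    def __init__(self) -> None:
        # topic -> {connection id -> observer}
        self._subs: Dict[str, Dict[str, Observer]] = defaultdict(dict)

    def subscribe(self, conn_id: str, topic: str, observer: Observer) -> None:
        # The subscription is keyed by the connection, so it cannot outlive it.
        self._subs[topic][conn_id] = observer

    def disconnect(self, conn_id: str) -> None:
        # Dropping the connection disposes every subscription it created.
        for observers in self._subs.values():
            observers.pop(conn_id, None)

    def publish(self, topic: str, message: str) -> int:
        """Push the message to current subscribers; return how many received it."""
        observers = list(self._subs[topic].values())
        for observer in observers:
            observer(message)
        return len(observers)


gateway = EphemeralGateway()
seen = []
gateway.subscribe("web-1", "telemetry", seen.append)
first = gateway.publish("telemetry", "alt=1000")      # delivered to web-1
gateway.disconnect("web-1")                           # subscription disposed
missed = gateway.publish("telemetry", "alt=2000")     # nobody is subscribed
gateway.subscribe("web-2", "telemetry", seen.append)  # reconnect recreates it
third = gateway.publish("telemetry", "alt=3000")
```

The message published while the Web site was disconnected is never seen, which matches the "only concerned with now" behavior of these sub-systems.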

Figure 2

This takes care of storing and displaying telemetry without requiring complex configuration or maintenance, but what about those critical flight operations for delivery system release and parachute deployment? We used the same ability for self-organization for those as well. We only needed to configure topics [Figure 3] for the Web site to send specific commands to the responsible parties, the Ground Station and the Chase Vehicle.

Figure 3

Once the Ground Station and Chase Vehicle connected to the Piraeus gateway, the sub-systems were created [Figure 4] and communications were enabled from the Web site to these parties.

Figure 4

When you look at the entire system and its sub-systems [Figure 5], it is a complex system. However, by using self-organizing systems within Piraeus, we were able to create the sub-systems without needing to configure all of them in advance. We only configured seven (7) topic grains and three (3) subscription grains in Orleans to design the entire system and enable communications and storage, so users could view the flight in real-time and Mark and I could control flight operations.

Figure 5

Instead of describing the various uses of self-organizing systems, which are numerous, I want to communicate how the planning for Pegasus-II and Pegasus-III will use them. We want to bring the excitement of high altitude science to people in a very personal way and allow people to actively participate in the experiment in real-time. In our own way, it is like being onboard the Calypso with Jacques Cousteau. Doing this requires that we leverage not only a Web site, but also phone apps to broadcast real-time telemetry, maps, and streaming video to users through Pegasus’s eye-in-the-sky. These phone apps will receive telemetry and location and update rapidly, from 100K feet in the upper atmosphere to the user’s eye. Additionally, we are working on concepts to get some user-defined, personalized information onboard the craft during flight, such that users can communicate directly with Pegasus and see personal messages during flight. However, we cannot control the number of people using the phone apps, when users choose to turn the apps on or off, or when a phone simply drops a connection due to poor reception. Therefore, we need a self-organizing system for these users, and Piraeus supports just that for this type of experience.

-Matt Long

Orleans Above the Cloud – Piraeus Overview

Disclaimer

The “Internet of Things”, IOT, is a big topic and involves many important concepts about systems, e.g., open vs closed, organized, self-organizing, and closed and open loops. All are worthy topics to understand in order to put a foundation under IOT. However, for the sake of brevity I will resist these topics and only discuss Piraeus and the role that Orleans has in the architecture.

Piraeus is a multi-channel, multi-protocol, in-memory event broker. It enables edge devices or services to connect and transmit or receive information from other system entities without any coupling and without system entities having direct knowledge of each other. It is a high throughput, low latency, and linearly scalable Operational Technology that simplifies the ability for an open-system to achieve its goal.

Diagram of the physical architecture used in Pegasus-I, i.e., how we got the real-time out of Piraeus and Orleans.

The Piraeus Architecture (Operational Technology) that was used for Pegasus-I

A system can be modeled as a directed graph, and as such Piraeus uses this simple construct to enable communications through its gateway to components within a system or sub-system. Information enters Piraeus through its gateway, then is distributed throughout the graph. The leaves of the graph are connected to either active or passive system components that receive the information.
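To make the directed-graph idea concrete, here is a minimal Python sketch that walks a small graph from its entry point and collects the leaves where information would be handed to receivers. The node names and the `leaves_reached` helper are hypothetical; the sketch says nothing about how Piraeus stores or traverses its graphs.

```python
from collections import deque

# Hypothetical graph: each key feeds the nodes listed as its value.
GRAPH = {
    "gateway": ["topic"],
    "topic": ["subscription-a", "subscription-b"],
    "subscription-a": [],
    "subscription-b": [],
}


def leaves_reached(graph, entry):
    """Walk the directed graph breadth-first from entry and return its leaves."""
    seen = {entry}
    queue = deque([entry])
    leaves = []
    while queue:
        node = queue.popleft()
        children = graph.get(node, [])
        if not children:
            # A node with no outgoing edges is where a receiver is attached.
            leaves.append(node)
        for child in children:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return leaves


delivered_to = leaves_reached(GRAPH, "gateway")
```

Here `delivered_to` holds the two subscription nodes, each of which would be connected to an active or passive receiver.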

The components of Piraeus are relatively simple. A gateway exists that can receive information through a variety of channels and protocols. Once this information is received by the Piraeus gateway, it is fed into the appropriate graph for that specific information topic. The node that receives this information is a virtual actor, called a grain, inside the Orleans host process. This type of grain is considered a “topic” within Piraeus, and topics are graphed to “subscription” grains, with which Orleans can communicate. Once the information enters the topic grain, it is fanned out to all subscription grains associated with the topic. The subscription grains are observable and feed information directly and immediately to the specific channel and protocol that is associated with the subscription. There is no polling in the entire Piraeus architecture…ever.
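The fan-out from a topic to its subscriptions, with each subscription pushing in its own receiver's format, might be sketched as follows. This is plain Python rather than Orleans code: the `TopicGrain` and `SubscriptionGrain` classes, the `on_next` method, and the JSON and CSV encoders are all assumptions made for illustration.

```python
import json
from typing import Callable, List


class SubscriptionGrain:
    """Toy subscription: encodes for one receiver and pushes immediately."""

    def __init__(self, encode: Callable[[dict], str], send: Callable[[str], None]) -> None:
        self._encode = encode  # receiver-specific wire format (assumed)
        self._send = send      # receiver-specific channel (assumed)

    def on_next(self, reading: dict) -> None:
        # Pushed as soon as the topic receives it; the receiver never polls.
        self._send(self._encode(reading))


class TopicGrain:
    """Toy topic: fans each message out to every attached subscription."""

    def __init__(self) -> None:
        self._subscriptions: List[SubscriptionGrain] = []

    def attach(self, subscription: SubscriptionGrain) -> None:
        self._subscriptions.append(subscription)

    def publish(self, reading: dict) -> None:
        for subscription in self._subscriptions:
            subscription.on_next(reading)


def as_json(reading: dict) -> str:
    return json.dumps(reading, sort_keys=True)


def as_csv(reading: dict) -> str:
    return ",".join(str(reading[key]) for key in sorted(reading))


web_inbox, storage_inbox = [], []
telemetry = TopicGrain()
telemetry.attach(SubscriptionGrain(as_json, web_inbox.append))
telemetry.attach(SubscriptionGrain(as_csv, storage_inbox.append))
telemetry.publish({"altitude_ft": 85000, "temp_f": -60})
```

The transmitter publishes once and knows nothing about the two receivers or their formats, which is the decoupling the gateway is meant to provide.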

That is a high-level summary, so let’s dig deeper into what is happening under the covers. Topic grains are always provisioned prior to communications being established. They describe the head node, or ingress resource, of a sub-graph that is part of a system or sub-system. The topic grain’s job is to act as a resource to which subscription grains attach, completing the sub-graph and enabling transmitter and receiver to communicate unidirectionally. This gives us consistency with how a generalized system would behave.

The subscription grains can be either durable or ephemeral. Durable subscription grains are configured with metadata and attached to a topic grain. The metadata is part of the subscription grain-state within Orleans and therefore requires no orthogonal lookup or database that would impact performance. Durable subscriptions are used by either active or passive receivers. An active receiver is one that creates a connection to the Piraeus gateway. Once that active connection is established, the identity of the actor is used to create and associate “observables” with the specific subscription grain(s) for that identity. When information flows into the subscription grain from the topic grain, the observable for the subscription is already established and associated with the channel and protocol of the receiver. This creates a direct pipeline between transmitter and receiver. A passive receiver is a service that does not connect directly to the Piraeus gateway. When information is passed to these durable subscriptions, the subscription grain will forward the information immediately to the service. Piraeus supports RESTful Web services, Event Hubs, Azure Blob Storage, and Service Bus for these passive subscriptions. This means it is possible to create, e.g., a Web service and start receiving information from Piraeus without doing anything else.
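The difference between active and passive receivers on a durable subscription can be sketched like this. Everything here is hypothetical Python, not Piraeus code: the `DurableSubscription` fields, the `on_connect` helper, and the identities are invented for illustration. In particular, returning `False` when an active receiver is not yet connected is an assumption of the sketch, since the post does not say what Piraeus does in that case.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Sink = Callable[[str], None]


@dataclass
class DurableSubscription:
    identity: str                          # metadata configured in advance
    passive_forward: Optional[Sink] = None  # set only for passive receivers
    observer: Optional[Sink] = None         # set while an active receiver is connected

    def deliver(self, message: str) -> bool:
        if self.passive_forward is not None:
            self.passive_forward(message)   # forwarded to the service immediately
            return True
        if self.observer is not None:
            self.observer(message)          # direct pipeline to the connected receiver
            return True
        return False                        # assumption: nothing to deliver to yet


def on_connect(subscriptions: List[DurableSubscription], identity: str, observer: Sink) -> int:
    """Attach an observer to every active durable subscription for this identity."""
    matched = [s for s in subscriptions
               if s.identity == identity and s.passive_forward is None]
    for subscription in matched:
        subscription.observer = observer
    return len(matched)


stored, shown = [], []
subs = [
    DurableSubscription("blob-storage", passive_forward=stored.append),
    DurableSubscription("field-device"),
]
before = subs[1].deliver("early")                       # active receiver not connected
attached = on_connect(subs, "field-device", shown.append)
results = [s.deliver("telemetry") for s in subs]
```

The passive subscription works from the moment it is configured, while the active one starts delivering only after its owner connects and is matched by identity.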

Ephemeral subscriptions can be used with self-organizing systems. These are systems that can change organically, where system actors can enter and exit arbitrarily. Ephemeral subscriptions only exist for the duration of the active channel. Once the channel is dropped, the subscription grain and its observable are removed and resources are disposed. This enables a system to self-organize without prior knowledge of edge system components. Of course, access control plays a major role in what is allowed, which is not a topic for this post.

Orleans is a radically different type of technology, and it was a key enabler for the Pegasus-I mission to near space. It makes possible some of the key concepts around flowing information at low latency and organizing graphs through its concept of virtual in-memory actors, i.e., grains. Observables in Orleans also give Piraeus a simple way to mate specific receivers to their respective subscription grains without needing any store-and-forward technologies, which makes for a remarkably elegant end-to-end communications story that is simple to use. We were able to get telemetry from Pegasus-I in flight to the Web site for a user to view in about 20 milliseconds on average, going from a radio transmission from Pegasus to our field gateways and over MiFi on our phones to Piraeus. Of course, we also proved that it works at 85,000 feet, 2.2% atmosphere, and -60 degrees Fahrenheit.

Notable Technical Details

  • Orleans is a high throughput, low latency, and linearly scalable technology that enables Piraeus to easily manage a system graph or sub-graph to enable communications between system components.
  • Subscription grains are either durable or ephemeral in the context of Piraeus.
  • Subscription grains can be associated with either active or passive receivers.
  • Piraeus enables systems to be either organized or self-organizing.
  • Information flow is in-memory by default, but can be persisted if required.
  • Piraeus uses no databases or storage accounts to perform its operations.
  • Orleans grain-state is persisted in a customized Orleans Storage Provider that leverages Redis.
  • Piraeus has a multi-channel and multi-protocol gateway that does not couple channel and protocol between transmitter and receiver.
  • There is no intrinsic limitation to message size within Piraeus.

There is a tremendous amount more technical detail, which is far too overwhelming for this overview post. I will try to post more fine-grained detail at a later date for those interested.

Best Wishes,

-Matt Long