I suppose everyone has heard a lot about streaming in Akka: its importance and the possible areas of application, such as banking transactions, IoT, exchange trading, and in general any domain with fast-changing or fast-generated data or events.
Last year I had a chance to work for a short time in a company that uses Akka Streams a lot in production. Some microservices looked like this:
A lot of different streams flowing in different directions, mixing and creating new streams. Everything implemented in Scala with untyped Actors and Akka-Http. Effective and intimidating :)
Almost every article I’ve seen about Akka Streams uses as its source some external service or some part of an existing project that can be hard to reproduce (Docker relaxes the situation somewhat). So for some of my experiments with Akka Streams I’ve created a small Scala application that emulates Conway’s Game of Life (https://github.com/fedor-malyshkin/conway-life-stream-server), with actors as cells that are able to generate a stable flow of events: a local, autonomous source of a stream with configurable speed and volume.
It’s a basic application that in fact consists of two parts: a group of actors which implement Conway’s Game of Life, and an Akka Stream that publishes the current field’s state through a WebSocket or an HTTP stream. At the moment it’s deployed on a small Kubernetes cluster as a single pod (mostly for experiments with Kubernetes and GitHub Actions’ ability to replace some mature CI/CD solutions).
At the beginning it wasn’t very informative to look at the application’s stream in textual form, trying to understand whether it works correctly or not. So apart from writing some tests, I’ve created a primitive React app (https://github.com/fedor-malyshkin/conway-life-stream-web-client), published on Heroku (http://secret-retreat-65283.herokuapp.com/), to observe the current state of the game’s field, and to get a fancy picture as well :). You can press “Start” and get some statistics from it (if something breaks, please let me know).
A couple of words about the cells in the picture: I’ve seen this idea somewhere else and found it pretty interesting. Each cell goes through a series of states after its death (states of decay, perhaps).
This creates a fancy trail on the field, like this one:
Some technicalities about implementation:
- Everything can be compiled, built and run with Gradle (https://gradle.org/); as I use the Gradle wrapper, nothing needs to be downloaded or set up in advance;
- Used libraries (downloaded automatically): Akka, Akka Http, Akka Streams.
Further on, I will describe the steps to work with Akka Streams; however, if you are looking for information about Akka Actors themselves (especially typed ones), I can recommend an excellent series by Manuel Bernhardt, “Tour of Akka Typed: Protocols and Behaviors”.
The first and most important thing for us is the ability to distribute our messages to a dynamic list of clients (future connections): each client (connection) should get the same information (the same events).
It can be achieved with code like this:
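A minimal sketch of such a broadcaster; the names and buffer sizes here are my own assumptions, not the project’s original code:

```scala
import akka.actor.ActorSystem
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.{BroadcastHub, Keep, Source}

object BroadcasterSketch extends App {
  implicit val system: ActorSystem = ActorSystem("stream-demo")

  // A queue-backed source: every element offered to the queue is
  // fanned out by the BroadcastHub to all attached consumers.
  val (queue, broadcastSource) =
    Source
      .queue[String](bufferSize = 256, OverflowStrategy.dropHead)
      .toMat(BroadcastHub.sink(bufferSize = 64))(Keep.both)
      .run()

  // Any number of independent consumers can now attach:
  broadcastSource.runForeach(msg => println(s"client A: $msg"))
  broadcastSource.runForeach(msg => println(s"client B: $msg"))

  // Elements offered to the queue are fanned out to attached clients.
  queue.offer("field state #1")
}
```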
Here we have:
- a Source for our broadcaster that will be populated by a special kind of queue we will create shortly;
- the mentioned queue itself;
- and a source of type Source[String, NotUsed] to which we can connect as many times as we want.
An important note about our BroadcastHub: it always has to have some connected consumer. Otherwise, it will “backpressure the upstream producer until subscribers arrive”. That’s why I’ve attached a draining consumer to it right away.
The next step will be a bridge between our HTTP layer (Akka Http) and our streams:
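A sketch of that bridge, assuming the broadcast source from the previous step is available; the route paths, port and the `broadcastSource` stand-in are my assumptions, not the project’s code:

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{ContentTypes, HttpEntity}
import akka.http.scaladsl.model.ws.{Message, TextMessage}
import akka.http.scaladsl.server.Directives._
import akka.stream.scaladsl.{Flow, Sink, Source}
import akka.util.ByteString

object HttpBridgeSketch {
  implicit val system: ActorSystem = ActorSystem("http-bridge")

  // Stand-in for the broadcast source created earlier (assumption).
  def broadcastSource: Source[String, NotUsed] = Source.repeat("field state")

  val route =
    path("ws") {
      // WebSocket endpoint: ignore client input, push our events out.
      val wsFlow: Flow[Message, Message, NotUsed] =
        Flow.fromSinkAndSource(Sink.ignore, broadcastSource.map(TextMessage(_)))
      handleWebSocketMessages(wsFlow)
    } ~
    path("stream") {
      // Chunked HTTP response: each event as a line of text.
      complete(HttpEntity(
        ContentTypes.`text/plain(UTF-8)`,
        broadcastSource.map(s => ByteString(s + "\n"))
      ))
    }

  def start(): Unit = Http().newServerAt("0.0.0.0", 8080).bind(route)
}
```

Note that the route only holds blueprints: a fresh copy of the stream is materialized for every incoming connection.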
As the official documentation says, the code you write when composing a pipeline in Akka Streams forms a kind of architect’s blueprint that can be reused later (possibly many times). This separation of stream composition from its materialization allows us to create different types of streams for different clients: for some we can publish our stream as JSON, for others as XML.
One more aspect of streaming: a client may want a digest at the beginning (right after connecting to our stream); here, that is the state of the field with all its creatures. In this situation we are helped by the fact that each stream is a separate entity that is built dynamically at connection time, has its own lifecycle, and can be configured differently per client.
So in this scenario, valve #1 is open at the beginning, a digest message is sent through that pipeline, then that pipeline is closed forever, and eventually valve #2 is opened for the further flow of messages.
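One way to sketch this “two valves” idea is to prepend a one-element digest source to the live flow; the helper names here are hypothetical, the real project may wire it differently:

```scala
import akka.NotUsed
import akka.stream.scaladsl.Source

object DigestSketch {
  // Hypothetical helpers: a snapshot of the whole field, and the live events.
  def currentFieldDigest(): String = "digest: full field state"
  def liveEvents: Source[String, NotUsed] = Source(List("evt-1", "evt-2"))

  // “Valve #1”: a one-element source that emits the digest and completes.
  // “Valve #2”: the live stream, which takes over once the digest is done.
  // lazySingle defers the call, so the digest is computed per connection,
  // at materialization time.
  val perClientStream: Source[String, NotUsed] =
    Source.lazySingle(() => currentFieldDigest()).concat(liveEvents)
}
```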
A couple of words about slow clients (with low bandwidth, for instance) that may consume your stream much more slowly than you produce it.
It can easily be emulated with a command like:
curl --limit-rate 1k http://IP:PORT/stream
It can also be easily fixed with a modification like this:
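A sketch of such a modification; the concrete operators and buffer size are my assumptions, but both are standard Akka Streams ways to shield a producer from slow consumers:

```scala
import akka.NotUsed
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.Flow

object SlowClientSketch {
  // When a slow client backpressures, drop the oldest buffered elements,
  // so the client always gets the freshest state (with some gaps).
  val dropOldest: Flow[String, String, NotUsed] =
    Flow[String].buffer(16, OverflowStrategy.dropHead)

  // Alternative: collapse all pending elements into just the latest one.
  val keepLatest: Flow[String, String, NotUsed] =
    Flow[String].conflate((_, latest) => latest)
}
```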
Of course, we are not going to create something that provides the full picture for such clients (in this case); we only send them the freshest info we are able to send (with some gaps).
Last but not least is the ability to transform messages on the fly according to different conditions (i.e. content and time window). Here we want to reduce the client’s overhead of processing all possible messages one by one: we can combine similar messages into a batch, and start a new batch if a new type of message arrives or we have been aggregating for too long.
It can be achieved with something like this: we use the operator groupWithin, which packs our flow “into groups of elements received within a time window, or limited by the given number of elements”, and mapConcat, which flattens the aggregated sequences back into individual messages according to our batching logic.
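A sketch of this batching step; the merge function here is hypothetical (the real project’s grouping logic may differ):

```scala
import akka.NotUsed
import akka.stream.scaladsl.Flow
import scala.concurrent.duration._

object BatchingSketch {
  // Hypothetical: merge a batch of similar messages (grouped by the prefix
  // before ':') into fewer, combined ones.
  def mergeSimilar(batch: Seq[String]): List[String] =
    batch.groupBy(_.takeWhile(_ != ':')).map { case (kind, msgs) =>
      s"$kind: ${msgs.size} event(s)"
    }.toList

  val batching: Flow[String, String, NotUsed] =
    Flow[String]
      .groupWithin(100, 1.second) // at most 100 elements or 1 second per batch
      .mapConcat(mergeSimilar)    // flatten merged batches back into the stream
}
```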
OK, I think this is enough of a description of the Akka Streams pipeline for such a basic stream generator.