I’d like to start a discussion around deploying stateful applications to Istio. Typically, applications deployed to Istio are stateless: any state they work with is coordinated through a database or some other technology, not by the application itself. Stateful applications, in contrast, hold and manage state themselves. An example of a technology used to build stateful applications is Akka, which allows managing state resiliently and at scale using CRDTs, sharding and a number of other techniques.
Stateful applications expect to communicate with other services, both inbound and outbound, like any other application: through the service mesh. There are no special requirements there. However, to manage state, stateful applications have an additional requirement: the pods in a single replica set must be able to form a cluster, and doing this requires making addressed connections to other pods in that set. In Akka, for example, this cluster and its connections are used to replicate the state of CRDTs across all pods in the replica set, to route messages to and from sharded entities, and to implement other communication such as scalable pubsub (scalable to millions of active subscribers on millions of topics). This requirement for communication between pods makes stateful applications similar to distributed databases and queuing technologies (like Kafka), but with a few significant differences. First, stateful applications can be elastically scaled up and down on demand in response to load (you can’t do that with something like Cassandra or Kafka, where scaling requires careful planning and forethought). Second, stateful applications are developed by an organisation's own developers and hence need to support a deployment workflow like that of a regular service, which is quite different to how you would manage the operations of a database. Hence, what works for these databases (e.g. stateful sets) doesn’t work so well for stateful applications.
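To make the "addressed connections" point concrete, here is a minimal sketch of why a sharded entity has to be reached on one specific pod. It is not Akka's actual shard allocation strategy; `NUM_SHARDS`, the pod names, and the functions `shard_for`, `assign_shards` and `pod_for` are all hypothetical and only illustrate the idea.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count, chosen only for illustration


def shard_for(entity_id: str) -> int:
    """Map an entity ID to a shard using a stable hash."""
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def assign_shards(pods: list[str]) -> dict[int, str]:
    """Assign every shard to exactly one pod (simple round-robin assumption)."""
    ordered = sorted(pods)  # sort so every pod computes the same assignment
    return {shard: ordered[shard % len(ordered)] for shard in range(NUM_SHARDS)}


def pod_for(entity_id: str, pods: list[str]) -> str:
    """Return the one pod that currently holds the state for this entity."""
    return assign_shards(pods)[shard_for(entity_id)]


if __name__ == "__main__":
    # Hypothetical pod names within a single replica set.
    pods = ["shopping-cart-0", "shopping-cart-1", "shopping-cart-2"]
    print(pod_for("cart-1234", pods))
```

A message for `cart-1234` is only useful if it arrives at the pod returned by `pod_for`; if the mesh load balances it to an arbitrary pod in the replica set, it lands somewhere that does not hold the entity's state.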
The requirement for pods in a single replica set to communicate with each other through addressed communication (that is, communicating with a specific pod, rather than with whatever pod the service mesh routes you to) is at odds with Istio's current feature set, since Istio expects applications not to care which particular pod they talk to and to let the service mesh handle that instead. There are ways to make it work, but they are quite hacky and require creating service entries that are overly permissive in what communication is allowed. I don’t believe there is any fundamental reason why Istio couldn’t or shouldn’t support this use case “natively”, allowing pods in the same replica set to communicate with each other, potentially with full authentication/authorization applied (for example, using mTLS to enforce that only pods of the same replica set are allowed to connect to each other).
Has there been any discussion before about supporting this class of application? Would the Istio maintainers be willing to either implement or accept contributions for improvements that make it straightforward to configure and deploy stateful applications?