Circuit Breaker: Client vs. Server Side?

Hi all,

I have a question regarding the design decision for Istio’s current circuit breaking strategy.

As far as I can see in the docs, the concrete circuit breaking behavior can be expressed as trafficPolicy inside the DestinationRule resource. This implies that properties such as maxConnections are expressed on the server side (the destination). However, in most library-based, white-box approaches such as Netflix Hystrix, the conditions for opening the breaker are expressed on the client side. On the one hand, a consequence of Istio’s strategy is that it is not possible to specify conditions for an individual client consuming the service. On the other hand, the server-side strategy makes it possible to reason about maxConnections from the service’s point of view.
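To make the client-side variant concrete, here is a minimal sketch of a library-style breaker in which each client picks its own thresholds. This is not Hystrix's API; the class name `ClientCircuitBreaker`, its parameters, and the injectable `clock` are assumptions made up for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected locally."""


class ClientCircuitBreaker:
    """Hypothetical client-side breaker; every client configures its own instance."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable so it can be tested
        self.failures = 0
        self.opened_at = None                       # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: allow one trial call (half-open).
            # A single failure of that trial call re-opens the breaker.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

With this shape, a batch client could use `ClientCircuitBreaker(failure_threshold=10)` while a latency-sensitive client uses `ClientCircuitBreaker(failure_threshold=2)` against the same service, which is the per-client flexibility I am missing.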

The pattern described above applies not only to circuit breaking but also, for example, to timeouts, which are expressed in the VirtualService (again on the server/service side).

Based on those observations, would it make sense to extend Istio to express properties or constraints not only for destinations (DestinationRule) but also for sources (SourceRule?) or even a more powerful ruleset where I could have selectors based on destination AND source? What do you think?

Thanks,

Benjamin

From playing around with it in the past: if you set max connections: 5 and have 10 clients, your server can get up to 50 connections in total.
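A toy model of that arithmetic, assuming each client enforces the limit independently in its own proxy. `ClientSidecar` and `total_server_connections` are hypothetical names for illustration, not Istio or Envoy APIs.

```python
from dataclasses import dataclass


@dataclass
class ClientSidecar:
    """Hypothetical stand-in for one client's proxy with its own pool limit."""
    max_connections: int
    open_connections: int = 0

    def try_connect(self) -> bool:
        if self.open_connections >= self.max_connections:
            return False  # this client's pool is full; other clients are unaffected
        self.open_connections += 1
        return True


def total_server_connections(num_clients, max_connections, attempts_per_client):
    """Count connections the server sees when every client enforces the limit locally."""
    sidecars = [ClientSidecar(max_connections) for _ in range(num_clients)]
    for sidecar in sidecars:
        for _ in range(attempts_per_client):
            sidecar.try_connect()
    return sum(s.open_connections for s in sidecars)


# 10 clients, limit of 5 each, every client tries to open 8 connections.
print(total_server_connections(10, 5, 8))  # 50
```

Each client is capped at 5, but nothing caps the sum, so the server-wide total grows with the number of clients.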

The DestinationRule works on the client side, @Benjamin_Schmeling

@hzxuzhonghu With SDK usage, each client’s config is different, for example the circuit breaker or connection pool parameters. Although the DestinationRule works on the client side, it is server-oriented and designed to be a generic configuration. If clients need different configs, one subset has to be created for each client, even though the labels are all the same. So I don’t think DestinationRule is designed for this scenario.

IMHO, client-side circuit breaking is not a good approach. It doesn’t fully protect the service. If we implement the CB on each endpoint (on the server side) independently and have metrics on the current load, it will be much better. All CBs I know of use errors and/or probably response time. If the server returns an error or the connection is closed unexpectedly, it is quite probable that our own request has just caused the server problem (overloaded it, consumed too much memory, or even crashed it).
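As a rough sketch of what I mean by a server-side, load-based breaker: the endpoint tracks its own in-flight requests and rejects new ones before it gets overloaded, instead of waiting for errors to show up at the clients. The names `EndpointLoadBreaker` and `Overloaded` are made up for this example, and in-flight count is just one possible load metric.

```python
import threading


class Overloaded(Exception):
    """Raised when the endpoint sheds a request because it is at capacity."""


class EndpointLoadBreaker:
    """Hypothetical per-endpoint breaker driven by current load, not by past errors."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def handle(self, handler, *args, **kwargs):
        # Admission check: reject before doing any work if we are at capacity.
        with self._lock:
            if self._in_flight >= self.max_in_flight:
                raise Overloaded("endpoint at capacity; request rejected")
            self._in_flight += 1
        try:
            return handler(*args, **kwargs)
        finally:
            # Always release the slot, even if the handler raised.
            with self._lock:
                self._in_flight -= 1
```

Because the limit lives in the endpoint itself, it holds regardless of how many clients there are or how each of them is configured.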