How to: Horizontally scale subscribers with StatefulSets

Learn how to subscribe with StatefulSet and scale horizontally with consistent consumer IDs

Unlike Deployments, where Pods are ephemeral, StatefulSets allows deployment of stateful applications on Kubernetes by keeping a sticky identity for each Pod.

Below is an example of a StatefulSet with Dapr:

  1. apiVersion: apps/v1
  2. kind: StatefulSet
  3. metadata:
  4. name: python-subscriber
  5. spec:
  6. selector:
  7. matchLabels:
  8. app: python-subscriber # has to match .spec.template.metadata.labels
  9. serviceName: "python-subscriber"
  10. replicas: 3
  11. template:
  12. metadata:
  13. labels:
  14. app: python-subscriber # has to match .spec.selector.matchLabels
  15. annotations:
  16. dapr.io/enabled: "true"
  17. dapr.io/app-id: "python-subscriber"
  18. dapr.io/app-port: "5001"
  19. spec:
  20. containers:
  21. - name: python-subscriber
  22. image: ghcr.io/dapr/samples/pubsub-python-subscriber:latest
  23. ports:
  24. - containerPort: 5001
  25. imagePullPolicy: Always

When subscribing to a pub/sub topic via Dapr, the application can define the consumerID, which determines the subscriber’s position in the queue or topic. With the StatefulSets sticky identity of Pods, you can have a unique consumerID per Pod, allowing each horizontal scale of the subscriber application. Dapr keeps track of the name of each Pod, which can be used when declaring components using the {podName} marker.

On scaling the number of subscribers of a given topic, each Dapr component has unique settings that determine the behavior. Usually, there are two options for multiple consumers:

  • Broadcast: each message published to the topic will be consumed by all subscribers.
  • Shared: a message is consumed by any subscriber (but not all).

Kafka isolates each subscriber by consumerID with its own position in the topic. When an instance restarts, it reuses the same consumerID and continues from its last known position, without skipping messages. The component below demonstrates how a Kafka component can be used by multiple Pods:

  1. apiVersion: dapr.io/v1alpha1
  2. kind: Component
  3. metadata:
  4. name: pubsub
  5. spec:
  6. type: pubsub.kafka
  7. version: v1
  8. metadata:
  9. - name: brokers
  10. value: my-cluster-kafka-bootstrap.kafka.svc.cluster.local:9092
  11. - name: consumerID
  12. value: "{podName}"
  13. - name: authRequired
  14. value: "false"

The MQTT3 protocol has shared topics, allowing multiple subscribers to “compete” for messages from the topic, meaning a message is only processed by one of them. For example:

  1. apiVersion: dapr.io/v1alpha1
  2. kind: Component
  3. metadata:
  4. name: mqtt-pubsub
  5. spec:
  6. type: pubsub.mqtt3
  7. version: v1
  8. metadata:
  9. - name: consumerID
  10. value: "{podName}"
  11. - name: cleanSession
  12. value: "true"
  13. - name: url
  14. value: "tcp://admin:public@localhost:1883"
  15. - name: qos
  16. value: 1
  17. - name: retain
  18. value: "false"

Next steps

Last modified October 11, 2024: Fixed typo (#4389) (fe17926)