Prometheus
Prometheus is an open source monitoring system and time series database. You can use Prometheus with Istio to record metrics that track the health of Istio and of applications within the service mesh. You can visualize metrics using tools like Grafana and Kiali.
Installation
Option 1: Quick start
Istio provides a basic sample installation to quickly get Prometheus up and running:
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.8/samples/addons/prometheus.yaml
This will deploy Prometheus into your cluster. This is intended for demonstration only, and is not tuned for performance or security.
Option 2: Customizable install
Consult the Prometheus documentation to get started deploying Prometheus into your environment. See Configuration for more information on configuring Prometheus to scrape Istio deployments.
Configuration
In an Istio mesh, each component exposes an endpoint that emits metrics. Prometheus works by scraping these endpoints and collecting the results. Scraping is configured through the Prometheus configuration file, which controls which endpoints to query, the port and path to query, TLS settings, and more.
To gather metrics for the entire mesh, configure Prometheus to scrape:
- The control plane (istiod deployment)
- Ingress and Egress gateways
- The Envoy sidecar
- The user applications (if they expose Prometheus metrics)
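As a rough illustration of the targets listed above, the following sketch shows Prometheus scrape jobs for the control plane and for Envoy proxies, using Kubernetes service discovery. It assumes that Istio is installed in the istio-system namespace, that the istiod service exposes metrics on a port named http-monitoring, and that Envoy metrics ports have names ending in -envoy-prom. Adjust these names if your installation differs.

```yaml
scrape_configs:
# Control plane: keep only the istiod endpoints on the monitoring port.
- job_name: 'istiod'
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - istio-system            # assumed installation namespace
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: istiod;http-monitoring   # assumed service and port names

# Envoy proxies (sidecars and gateways): keep only the Envoy stats port.
- job_name: 'envoy-stats'
  metrics_path: /stats/prometheus
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_container_port_name]
    action: keep
    regex: '.*-envoy-prom'          # assumed port naming convention
```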
To simplify the configuration of metrics, Istio offers two modes of operation.
Option 1: Metrics merging
To simplify configuration, Istio can control scraping entirely through prometheus.io annotations. This allows Istio scraping to work out of the box with standard configurations such as the ones provided by the Helm stable/prometheus charts.
While prometheus.io annotations are not a core part of Prometheus, they have become the de facto standard to configure scraping.
This option is enabled by default but can be disabled by passing --set meshConfig.enablePrometheusMerge=false during installation. When enabled, appropriate prometheus.io annotations will be added to all data plane pods to set up scraping. If these annotations already exist, they will be overwritten. With this option, the Envoy sidecar will merge Istio’s metrics with the application metrics. The merged metrics will be scraped from :15020/stats/prometheus.
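For orientation, a pod with metrics merging enabled would be expected to carry annotations along the following lines after injection. This is a sketch of the result, not something you apply yourself; the values shown assume the merged endpoint is served on port 15020 at the /stats/prometheus path.

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"            # pod is eligible for scraping
    prometheus.io/port: "15020"             # port serving the merged metrics
    prometheus.io/path: /stats/prometheus   # path serving the merged metrics
```

Any prometheus.io annotations the application originally declared are replaced on the pod, so an annotation-based Prometheus job scrapes a single merged endpoint per pod.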
This option exposes all the metrics in plain text.
This feature may not suit your needs in the following situations:
- You need to scrape metrics using TLS.
- Your application exposes metrics with the same names as Istio metrics. For example, your application metrics expose an istio_requests_total metric. This might happen if the application is itself running Envoy.
- Your Prometheus deployment is not configured to scrape based on standard prometheus.io annotations.
If required, this feature can be disabled per workload by adding a prometheus.istio.io/merge-metrics: "false" annotation on a pod.
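A minimal sketch of the per-workload opt-out, assuming the workload is a Deployment so that the annotation belongs on the pod template rather than on the Deployment itself:

```yaml
spec:
  template:
    metadata:
      annotations:
        prometheus.istio.io/merge-metrics: "false"   # keep this workload out of metrics merging
```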
Option 2: Customized scraping configurations
The built-in demo installation of Prometheus contains all the required scraping configuration. To deploy this instance of Prometheus, follow the steps in Customizable Install with Istioctl to install Istio and pass --set values.prometheus.enabled=true during installation.
This built-in deployment of Prometheus is intended to help new users get started quickly. However, it does not offer advanced customization, such as persistence or authentication, and as such should not be considered production ready. To use an existing Prometheus instance, add the scraping configurations in prometheus/configmap.yaml to your configuration.
This configuration will add scrape job configurations for the control plane, as well as for all Envoy sidecars. Additionally, a job is configured to scrape application metrics for all data plane pods with relevant prometheus.io annotations:
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"   # determines if a pod should be scraped. Set to "true" to enable scraping.
        prometheus.io/path: /metrics   # determines the path to scrape metrics at. Defaults to /metrics.
        prometheus.io/port: "80"       # determines the port to scrape metrics at. Defaults to 80.
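On the Prometheus side, a job that honors these annotations typically relies on relabeling rules. The following is a sketch of the commonly used pattern, not necessarily the exact job shipped in prometheus/configmap.yaml; the job name is illustrative.

```yaml
scrape_configs:
- job_name: 'kubernetes-pods'        # illustrative job name
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Scrape only pods annotated with prometheus.io/scrape: "true".
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
  # Use prometheus.io/path as the metrics path when it is set.
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # Rewrite the target address to use the port from prometheus.io/port.
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
```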
TLS settings
The control plane, gateway, and Envoy sidecar metrics will all be scraped over plaintext. However, the application metrics will follow whatever Istio configuration applies to the workload. In particular, if Strict mTLS is enabled, then Prometheus will need to be configured to scrape using Istio certificates.
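For reference, Strict mTLS is usually turned on with a PeerAuthentication resource. The sketch below assumes a mesh-wide policy placed in the istio-system root namespace; a policy in your mesh may instead be scoped to a single namespace or workload.

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # assumed root namespace, which makes the policy mesh-wide
spec:
  mtls:
    mode: STRICT            # workloads accept only mutual TLS traffic
```

With a policy like this in effect, plaintext scrapes of application ports are rejected, which is why Prometheus needs Istio certificates.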
One way to provision Istio certificates for Prometheus is by injecting a sidecar which will rotate SDS certificates and output them to a volume that can be shared with Prometheus. However, the sidecar should not intercept requests for Prometheus because Prometheus’s model of direct endpoint access is incompatible with Istio’s sidecar proxy model.
Add the following annotations to the Prometheus deployment to inject a sidecar that will write a certificate to a shared volume, but without configuring traffic redirection:
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
        traffic.sidecar.istio.io/includeInboundPorts: ""   # do not intercept any inbound ports
        traffic.sidecar.istio.io/includeOutboundIPRanges: ""   # do not intercept any outbound traffic
        proxy.istio.io/config: |   # configure an env variable `OUTPUT_CERTS` to write certificates to the given folder
          proxyMetadata:
            OUTPUT_CERTS: /etc/istio-output-certs
        sidecar.istio.io/userVolume: '[{"name": "istio-certs", "emptyDir": {"medium":"Memory"}}]'   # mount the shared volume
        sidecar.istio.io/userVolumeMount: '[{"name": "istio-certs", "mountPath": "/etc/istio-output-certs"}]'
To use the provisioned certificate, mount the shared volume for the Prometheus container and set the scraping job TLS context as follows:
volumeMounts:
- mountPath: /etc/prom-certs/
  name: istio-certs

scheme: https
tls_config:
  ca_file: /etc/prom-certs/root-cert.pem
  cert_file: /etc/prom-certs/cert-chain.pem
  key_file: /etc/prom-certs/key.pem
  insecure_skip_verify: true  # Prometheus does not support Istio security naming, thus skip verifying target pod certificate
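To show where the TLS settings sit within a complete scrape job, here is a sketch of an annotation-driven job that scrapes application pods over mutual TLS. The job name is hypothetical, and the certificate paths assume the shared volume is mounted at /etc/prom-certs/ in the Prometheus container.

```yaml
scrape_configs:
- job_name: 'kubernetes-pods-istio-secure'   # hypothetical job name
  scheme: https
  tls_config:
    ca_file: /etc/prom-certs/root-cert.pem
    cert_file: /etc/prom-certs/cert-chain.pem
    key_file: /etc/prom-certs/key.pem
    insecure_skip_verify: true   # target identity is not verified by Prometheus
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Scrape only pods annotated with prometheus.io/scrape: "true".
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
```

In practice a job like this may also need relabeling rules so that it targets only pods that have a sidecar, leaving pods outside the mesh to a plaintext job.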
Best practices
For larger meshes, advanced configuration might help Prometheus scale. See Using Prometheus for production-scale monitoring for more information.
See also
- Reworking our Addon Integrations: A new way to manage installation of telemetry addons.
- Information on how to integrate with Grafana to set up Istio dashboards.
- How to integrate with Jaeger.
- Information on how to integrate with Kiali.
- Remotely Accessing Telemetry Addons: This task shows you how to configure external access to the set of Istio telemetry addons.
- How to integrate with Zipkin.