Custom Tracing span tags on Istio 1.10

Hello all:

We’re upgrading our K8s clusters on GCP, and we have also been upgrading the Istio version.

We have finished our rollout on the development environment, where we moved from K8s 1.16 with Istio 1.4.5 to K8s 1.19 with Istio 1.10.1. So far, so good.

Apart from that, we use GCP Trace. Previously, we had to customise Istio Mixer to enable Stackdriver on Istio and to add our custom span tags. Here is what the configuration looked like when we were configuring it with Mixer:


apiVersion: "config.istio.io/v1alpha2"
kind: handler
metadata:
  name: stackdriver-tracing-handler-example
  namespace: istio-system
spec:
  compiledAdapter: stackdriver
  params:
    trace:
      sampleProbability: 1
---
apiVersion: "config.istio.io/v1alpha2"
kind: instance
metadata:
  name: stackdriver-span-example
  namespace: istio-system
spec:
  compiledTemplate: tracespan
  params:
    traceId: request.headers["x-b3-traceid"]
    spanId: request.headers["x-b3-spanid"] | ""
    parentSpanId: request.headers["x-b3-parentspanid"] | ""
    spanName: destination.service.name + request.path
    startTime: request.time
    endTime: response.time
    clientSpan: (context.reporter.kind | "inbound") == "outbound"
    spanTags:
      http.method: request.method | ""
      http.status_code: response.code | 200
      http.url: request.path | ""
      destination_service_name: destination.service.name | "unknown"
      destination_service_namespace: destination.service.namespace | "unknown"
      destination_port: destination.port | 0
      request_operation: conditional((context.protocol | "unknown") == "grpc", request.path | "unknown", request.method | "unknown")
      request_protocol: context.protocol | "unknown"
      api_version: api.version | "unknown"
      api_name: api.service | "unknown"
      response_code: response.code | 0
      service_authentication_policy: conditional((context.reporter.kind | "inbound") == "outbound", "unknown", conditional(connection.mtls | false, "mutual_tls", "none"))
      source_workload_namespace: source.workload.namespace | "unknown"
      source_workload_name: source.workload.name | "unknown"
      source_owner: source.owner | "unknown"
      destination_workload_namespace: destination.workload.namespace | "unknown"
      destination_workload_name: destination.workload.name | "unknown"
      destination_owner: destination.owner | "unknown"
      http_url: request.path | ""
      request_size: request.size | 0
      response_size: response.size | 0
      source_ip: source.ip | ip("0.0.0.0")
---
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: stackdriver-tracing-rule-example
  namespace: istio-system
spec:
  match: (context.protocol == "http" || context.protocol == "grpc") && conditional(match(request.path, "*.php"), false, true) && destination.service.namespace == "example" && request.path != "/heartbeat" && request.path != "/"  && request.path != "/metrics"
  actions:
    - handler: stackdriver-tracing-handler-example
      instances:
        - stackdriver-span-example
---

Mixer support was dropped in Istio 1.8, and some configurations that used Mixer CRDs such as rule now have to be done on top of Envoy using the EnvoyFilter CRD. I didn’t think I would need to port those resources to Istio 1.10.
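
For context, my rough understanding (untested, and only a sketch) is that porting the tagging part of that rule to an EnvoyFilter would look something like the following, where the header-based and literal tags are just placeholders and nowhere near the full Mixer attribute set above:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: tracing-custom-tags-example
  namespace: istio-system
spec:
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      context: ANY
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
    patch:
      operation: MERGE
      value:
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          tracing:
            # Envoy-level custom tags are limited to literals,
            # environment variables and request headers
            custom_tags:
            - tag: http_host
              request_header:
                name: ":authority"
                default_value: "unknown"
            - tag: request_protocol
              literal:
                value: "http"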

However, looking at the GCP Trace dashboard, I can see some differences: in the old cluster the tags are correctly attached to the spans, but in the newer cluster I don’t see them in the same format as before. Here is a screenshot showing what it looks like:

The effects, in short, are the following:

  • Previously our tracing spans were labelled “Sent.SOMETHING” and “Recv.SOMETHING”. Now they all start with an FQDN such as “example.default.svc.cluster.local”.
  • There are duplications in the traces. You can see that “example.default.svc.cluster.local” appears twice. I guess that’s where we should be seeing “Sent.SOMETHING” and “Recv.SOMETHING”.
  • We no longer have the custom span tags we used to have.

I tried to find a good example of how to bring those tags back on this link or this link, but nothing seemed to be very clear on how to accomplish that.

Can you please advise me on how I can get those tags back?

Thank you,
Willian

Willian,

Unfortunately, you’ve discovered one of the main drawbacks to the migration away from Mixer-based telemetry. While we have support for custom tags in Istio (doc), including with the new pre-alpha Telemetry API (doc), the customization does not allow access to the full set of attributes that Mixer once provided.

As the Telemetry API matures, we plan on providing CEL-based access to attributes for custom tags. That work must first be supported in upstream Envoy, but I believe @kuat is motivated to work on adding that support.

Which of the now missing attributes is most critical for your environment? Request Header-based custom tags via the new Telemetry API may get you a good bit of the way there. Additionally, the istio.* tags may cover the need for workload metadata.
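
As a rough sketch of the header-based route (assuming a build where the pre-alpha Telemetry API is enabled; the tag names and header choices below are only placeholders), a mesh-wide resource would look something like this:

apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  tracing:
  - randomSamplingPercentage: 100.0
    customTags:
      # header-based tag, copied from the incoming request
      http_host:
        header:
          name: host
          defaultValue: "unknown"
      # literal tag, a fixed value stamped on every span
      deploy_env:
        literal:
          value: "development"

Placing it in the root namespace (istio-system) applies it mesh-wide; environment-variable-based tags are the third variant the API supports.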

Hope that helps,
Doug.

Hi Douglas.

One of the things I noticed is that the first span named ‘example.default.svc.cluster.local’ in the duplicated trace is actually from the ingress gateway, and the second one is from the Envoy proxy at the pod level. If I had taken a closer look at each tag value, I would have known why there are two spans like that.

In short, what we would like is to keep the same format that we previously had with Istio Mixer.

But regarding my attempts to customise the tags, one of the configurations I tried was adding the following to the installed-state IstioOperator resource (under the install.istio.io K8s API):

  meshConfig:
    enableTracing: true
    defaultConfig:
      tracing:
        custom_tags:
          my_tag_header:
            header:
              name: host
        max_path_tag_length: 256

I couldn’t see this config being pushed to the Envoy proxies using the istioctl proxy-config command. However, setting it at the Deployment level does make it push, but then I lose part of the trace because I also have to implement it at the first Envoy proxy, the Envoy ingress gateway. I wish I didn’t have to configure it in several different places though.
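
For reference, the deployment-level setting I mean is the proxy.istio.io/config pod annotation. A minimal sketch of what I applied (the Deployment and tag names here are only illustrative) looks like the following, and it has to be repeated on the ingress gateway Deployment as well:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
      annotations:
        # per-pod ProxyConfig override, picked up at sidecar injection time
        proxy.istio.io/config: |
          tracing:
            custom_tags:
              my_tag_header:
                header:
                  name: host
    spec:
      containers:
      - name: example
        image: example:latest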

Since we couldn’t customise our tags, we’re moving away from implementing this with Stackdriver and we will use Jaeger instead. Unfortunately, I no longer have the same scenario that I had when I was attempting to customise the tags.