Building an Istio Service Mesh Observability Platform: Distributed Tracing

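Posted by 可以交个朋友 on 2023/12/15 16:03:37
[Abstract] Distributed tracing records and correlates the details of every stage of a distributed call, helping operators quickly locate where a failure occurred or find the cause of a performance degradation. Istio builds on Envoy's distributed tracing capability to provide out-of-the-box tracing integration: the Envoy proxies can be configured to send trace spans automatically to a distributed tracing backend.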

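I. Background & Overview

Why is distributed tracing needed?
Distributed tracing records and correlates the details of every stage of a distributed call, helping operators quickly locate where a failure occurred or find the cause of a performance degradation.
In this setup Istio uses Jaeger as the tracing backend. Jaeger is an open-source distributed tracing system that originated at Uber and includes components for storing, visualizing, and tracing calls.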
image.png


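II. Notes & Constraints

Istio has always described itself as a non-intrusive service mesh, meaning users do not have to modify business code. For distributed tracing, however, there is a caveat: although users do not need to instrument their business code, the application still has to pick up the tracing request headers and forward them.
The official documentation explains why: https://istio.io/latest/zh/docs/tasks/observability/distributed-tracing/overview/
Although Istio proxies can send spans automatically, they need some additional information to join those spans into a single trace. When the proxies send span information, the application must therefore attach the appropriate HTTP request headers so that multiple spans can be joined into the same trace. To do this, each application must collect the headers from every incoming request and forward them to all outgoing requests triggered by that incoming request. Which headers to forward depends on the configured tracing backend. The headers to forward are described in the task page for each tracing system.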
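
As a minimal sketch of that forwarding duty, assuming the B3 header set described in this section, an application could filter the incoming headers as follows. The names `B3_FORWARD_HEADERS` and `forward_trace_headers` are made up for illustration and are not part of Istio or Bookinfo.

```python
# Headers an application is assumed to forward when the backend uses B3
# (Zipkin/Jaeger) propagation. The list mirrors the headers named in this article.
B3_FORWARD_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def forward_trace_headers(incoming):
    """Return only the tracing headers found in an incoming request.

    HTTP header names are case-insensitive, so names are lower-cased first.
    """
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in B3_FORWARD_HEADERS if name in lowered}


# Example input: the values are taken from the Envoy log excerpt in this article.
incoming = {
    "X-Request-Id": "8bb99618-1793-9fbd-a84b-9118c7f82e92",
    "X-B3-TraceId": "235aa67f7b07cc91bbdd05f23bcd7bab",
    "X-B3-SpanId": "bbdd05f23bcd7bab",
    "X-B3-Sampled": "1",
    "Accept": "*/*",
}
outgoing = forward_trace_headers(incoming)
print(outgoing)
```

The `outgoing` dictionary is what the application would attach to every downstream call made while handling this request; unrelated headers such as `Accept` are left out.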


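About the request headers:

  • x-request-id: an Envoy-specific header that uniquely identifies a request and keeps log and trace sampling decisions consistent
  • x-b3-traceid: the trace ID. It is generated when the first span is created and is then passed along for the rest of the request. This field is what links the spans of one request together
  • x-b3-spanid: the ID assigned to a span when it is created
  • x-b3-parentspanid: the parent span ID, that is, the span one level above the current call
  • x-b3-sampled: the sampling decision. 1 means the span is reported, 0 means it is not. The decision is normally made at the root span and is then passed down to every subsequent call, so that all spans of a trace are either reported together or dropped together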
  • x-b3-flags

With this information, once all spans have been pushed to the backend, the backend can reassemble them and display the result in the UI.
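
The reassembly step can be illustrated with a small sketch. It assumes each reported span carries its own ID and its parent's ID; the span IDs and names below are hypothetical and only mirror the shape of the Bookinfo call chain used later in this article. A real backend such as Jaeger stores far more per span (timestamps, tags, and so on).

```python
from collections import defaultdict

# Hypothetical span records: "parent" plays the role of x-b3-parentspanid.
spans = [
    {"id": "a1", "parent": None, "name": "productpage inbound"},
    {"id": "b1", "parent": "a1", "name": "productpage -> details"},
    {"id": "b2", "parent": "b1", "name": "details inbound"},
    {"id": "c1", "parent": "a1", "name": "productpage -> reviews"},
    {"id": "c2", "parent": "c1", "name": "reviews inbound"},
    {"id": "d1", "parent": "c2", "name": "reviews -> ratings"},
    {"id": "d2", "parent": "d1", "name": "ratings inbound"},
]


def build_tree(spans):
    """Group spans by parent ID; the root span is stored under the key None."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent"]].append(span)
    return children


def depth(children, parent=None):
    """Return the number of levels below `parent` in the span tree."""
    return max(
        (1 + depth(children, span["id"]) for span in children.get(parent, [])),
        default=0,
    )


def render(children, parent=None, indent=0):
    """Return the tree as indented lines, one line per span."""
    lines = []
    for span in children.get(parent, []):
        lines.append("  " * indent + span["name"])
        lines.extend(render(children, span["id"], indent + 1))
    return lines


tree = build_tree(spans)
print("\n".join(render(tree)))
print("spans:", len(spans), "depth:", depth(tree))
```

Because every span points at its parent, the backend needs no knowledge of the services themselves: the parent links alone are enough to recover the call hierarchy.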


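How does istio-proxy generate span information?

  • Inbound traffic: traffic that passes through the sidecar into the application. If the headers carry no tracing information, Envoy creates a root span, builds the trace ID from that span's ID, and then passes the request to the application container. If the headers already contain trace information, the sidecar extracts the trace context from them and passes it on to the application container

  • Outbound traffic: traffic that leaves through the sidecar. If the headers carry no tracing information, Envoy creates a root span and places that span's context in the request headers for the next service in the call. If the headers already contain trace information, the sidecar extracts the span information from them, creates a child span based on that span, and adds the new span's information to the request headers before passing the request on.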
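
The two cases above can be modelled in a few lines. This is a simplified sketch of the decision only, not Envoy's implementation: the function names are invented, the root span is assumed to always be sampled, and the trace ID layout (random high 64 bits followed by the root span ID) simply mirrors the IDs seen in the log excerpt later in this article.

```python
import secrets


def new_span_id():
    """Return a random 64-bit ID as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def sidecar_span_headers(headers):
    """Model the B3 headers a sidecar attaches to a request.

    No trace context  -> start a root span with a new trace ID.
    Existing context  -> create a child span under the incoming span.
    """
    ctx = {name.lower(): value for name, value in headers.items()}
    span_id = new_span_id()
    if "x-b3-traceid" not in ctx:
        # Root span: 128-bit trace ID whose low 64 bits equal the span ID.
        return {
            "x-b3-traceid": secrets.token_hex(8) + span_id,
            "x-b3-spanid": span_id,
            "x-b3-sampled": "1",  # assumption: 100% sampling
        }
    child = {
        "x-b3-traceid": ctx["x-b3-traceid"],  # unchanged along the whole chain
        "x-b3-spanid": span_id,
        "x-b3-sampled": ctx.get("x-b3-sampled", "1"),
    }
    if "x-b3-spanid" in ctx:
        # The incoming span becomes the parent of the new span.
        child["x-b3-parentspanid"] = ctx["x-b3-spanid"]
    return child


root = sidecar_span_headers({})
child = sidecar_span_headers(root)
print(root)
print(child)
```

The child keeps the trace ID of the root and records the root's span ID as its parent, which is exactly the link the backend later uses to join the spans.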

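Note: the description above keeps referring to reading information from, or writing information into, the request headers. The calls between services therefore have to carry headers, which in practice means the services must communicate over HTTP (for example through RESTful APIs) for tracing to work.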


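III. Deploying Jaeger & the Test Demo

A Kubernetes and Istio environment must be prepared in advance. See the official guides for setup instructions.
This article uses Kubernetes 1.25 and Istio 1.19.

  1. Istio does not start Jaeger by default, so it has to be created manually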
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: jaeger
      namespace: istio-system
      labels:
        app: jaeger
    spec:
      selector:
        matchLabels:
          app: jaeger
      template:
        metadata:
          labels:
            app: jaeger
            sidecar.istio.io/inject: "false"
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "14269"
        spec:
          containers:
            - name: jaeger
              image: "docker.io/jaegertracing/all-in-one:1.46"
              env:
                - name: BADGER_EPHEMERAL
                  value: "false"
                - name: SPAN_STORAGE_TYPE
                  value: "badger"
                - name: BADGER_DIRECTORY_VALUE
                  value: "/badger/data"
                - name: BADGER_DIRECTORY_KEY
                  value: "/badger/key"
                - name: COLLECTOR_ZIPKIN_HOST_PORT
                  value: ":9411"
                - name: MEMORY_MAX_TRACES
                  value: "50000"
                - name: QUERY_BASE_PATH
                  value: /jaeger
              livenessProbe:
                httpGet:
                  path: /
                  port: 14269
              readinessProbe:
                httpGet:
                  path: /
                  port: 14269
              volumeMounts:
                - name: data
                  mountPath: /badger
              resources:
                requests:
                  cpu: 10m
          volumes:
            - name: data
              emptyDir: {}
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: tracing
      namespace: istio-system
      labels:
        app: jaeger
    spec:
      type: ClusterIP
      ports:
        - name: http-query
          port: 80
          protocol: TCP
          targetPort: 16686
        # Note: Change port name if you add '--query.grpc.tls.enabled=true'
        - name: grpc-query
          port: 16685
          protocol: TCP
          targetPort: 16685
      selector:
        app: jaeger
    ---
    # Jaeger implements the Zipkin API. To support swapping out the tracing backend, we use a Service named Zipkin.
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        name: zipkin
      name: zipkin
      namespace: istio-system
    spec:
      ports:
        - port: 9411
          targetPort: 9411
          name: http-query
      selector:
        app: jaeger
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: jaeger-collector
      namespace: istio-system
      labels:
        app: jaeger
    spec:
      type: ClusterIP
      ports:
      - name: jaeger-collector-http
        port: 14268
        targetPort: 14268
        protocol: TCP
      - name: jaeger-collector-grpc
        port: 14250
        targetPort: 14250
        protocol: TCP
      - port: 9411
        targetPort: 9411
        name: http-zipkin
      - port: 4317
        name: grpc-otel
      - port: 4318
        name: http-otel
      selector:
        app: jaeger
    
    Deployment succeeded:
    image.png

  2. Istio's default sampling rate is only 1%. To make traces easy to capture, it is best to raise it to 100%
    To change it: kubectl edit iop installed-state -n istio-system
    spec:
      values:
        pilot:
          traceSampling: 100  # default is 1; change to 100
    

  3. Deploy the test application Bookinfo
    This demo is Istio's official sample microservice application
    # Copyright Istio Authors
    apiVersion: v1
    kind: Service
    metadata:
      name: details
      labels:
        app: details
        service: details
    spec:
      ports:
      - port: 9080
        name: http
      selector:
        app: details
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: bookinfo-details
      labels:
        account: details
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: details-v1
      labels:
        app: details
        version: v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: details
          version: v1
      template:
        metadata:
          labels:
            app: details
            version: v1
        spec:
          serviceAccountName: bookinfo-details
          containers:
          - name: details
            image: docker.io/istio/examples-bookinfo-details-v1:1.18.0
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 9080
    ---
    ##################################################################################################
    # Ratings service
    ##################################################################################################
    apiVersion: v1
    kind: Service
    metadata:
      name: ratings
      labels:
        app: ratings
        service: ratings
    spec:
      ports:
      - port: 9080
        name: http
      selector:
        app: ratings
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: bookinfo-ratings
      labels:
        account: ratings
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: ratings-v1
      labels:
        app: ratings
        version: v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: ratings
          version: v1
      template:
        metadata:
          labels:
            app: ratings
            version: v1
        spec:
          serviceAccountName: bookinfo-ratings
          containers:
          - name: ratings
            image: docker.io/istio/examples-bookinfo-ratings-v1:1.18.0
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 9080
    ---
    ##################################################################################################
    # Reviews service
    ##################################################################################################
    apiVersion: v1
    kind: Service
    metadata:
      name: reviews
      labels:
        app: reviews
        service: reviews
    spec:
      ports:
      - port: 9080
        name: http
      selector:
        app: reviews
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: bookinfo-reviews
      labels:
        account: reviews
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: reviews-v1
      labels:
        app: reviews
        version: v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: reviews
          version: v1
      template:
        metadata:
          labels:
            app: reviews
            version: v1
        spec:
          serviceAccountName: bookinfo-reviews
          containers:
          - name: reviews
            image: docker.io/istio/examples-bookinfo-reviews-v1:1.18.0
            imagePullPolicy: IfNotPresent
            env:
            - name: LOG_DIR
              value: "/tmp/logs"
            ports:
            - containerPort: 9080
            volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: wlp-output
              mountPath: /opt/ibm/wlp/output
          volumes:
          - name: wlp-output
            emptyDir: {}
          - name: tmp
            emptyDir: {}
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: reviews-v2
      labels:
        app: reviews
        version: v2
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: reviews
          version: v2
      template:
        metadata:
          labels:
            app: reviews
            version: v2
        spec:
          serviceAccountName: bookinfo-reviews
          containers:
          - name: reviews
            image: docker.io/istio/examples-bookinfo-reviews-v2:1.18.0
            imagePullPolicy: IfNotPresent
            env:
            - name: LOG_DIR
              value: "/tmp/logs"
            ports:
            - containerPort: 9080
            volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: wlp-output
              mountPath: /opt/ibm/wlp/output
          volumes:
          - name: wlp-output
            emptyDir: {}
          - name: tmp
            emptyDir: {}
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: reviews-v3
      labels:
        app: reviews
        version: v3
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: reviews
          version: v3
      template:
        metadata:
          labels:
            app: reviews
            version: v3
        spec:
          serviceAccountName: bookinfo-reviews
          containers:
          - name: reviews
            image: docker.io/istio/examples-bookinfo-reviews-v3:1.18.0
            imagePullPolicy: IfNotPresent
            env:
            - name: LOG_DIR
              value: "/tmp/logs"
            ports:
            - containerPort: 9080
            volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: wlp-output
              mountPath: /opt/ibm/wlp/output
          volumes:
          - name: wlp-output
            emptyDir: {}
          - name: tmp
            emptyDir: {}
    ---
    ##################################################################################################
    # Productpage services
    ##################################################################################################
    apiVersion: v1
    kind: Service
    metadata:
      name: productpage
      labels:
        app: productpage
        service: productpage
    spec:
      ports:
      - port: 9080
        name: http
      selector:
        app: productpage
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: bookinfo-productpage
      labels:
        account: productpage
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: productpage-v1
      labels:
        app: productpage
        version: v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: productpage
          version: v1
      template:
        metadata:
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "9080"
            prometheus.io/path: "/metrics"
          labels:
            app: productpage
            version: v1
        spec:
          serviceAccountName: bookinfo-productpage
          containers:
          - name: productpage
            image: docker.io/istio/examples-bookinfo-productpage-v1:1.18.0
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 9080
            volumeMounts:
            - name: tmp
              mountPath: /tmp
          volumes:
          - name: tmp
            emptyDir: {}
    ---
    
    Deployment succeeded:
    image.png
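
To see why the traceSampling value set earlier in this section matters for a demo, the sketch below models a percentage-based sampling decision made once per trace. It is an illustration only and does not reproduce Envoy's actual sampling logic; `should_sample` and `sampled_header` are hypothetical helper names.

```python
import random


def should_sample(sampling_percent, rng):
    """Decide once, at the root span, whether a trace is reported.

    sampling_percent is a value from 0 to 100, like traceSampling.
    """
    return rng.random() * 100 < sampling_percent


def sampled_header(sampling_percent, rng):
    """Return the x-b3-sampled value that the rest of the call chain inherits."""
    return "1" if should_sample(sampling_percent, rng) else "0"


rng = random.Random(42)  # fixed seed so the sketch is repeatable
kept_at_1 = sum(should_sample(1, rng) for _ in range(10_000))
kept_at_100 = sum(should_sample(100, rng) for _ in range(10_000))
print("traces kept at 1%:", kept_at_1)
print("traces kept at 100%:", kept_at_100)
```

At 1% only a small fraction of requests ever produces a trace, so a handful of manual test requests would most likely show nothing in the Jaeger UI. At 100% every request is reported, which is convenient for testing but usually too costly for production traffic.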

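IV. Access Test & Observations

To keep testing simple, the productpage microservice is accessed directly from inside the cluster; it in turn calls the other microservices. The dependencies are: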
image.png

  1. Get the ClusterIP of the entry service, then trigger a request with curl: the request succeeds
    image.png

  2. Check the Jaeger UI
    The span information has been generated:
    image.png
    Detailed span view: 7 spans in total, 4 services involved, 5 levels deep
    image.png
    Call chain topology graph:
    image.png
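
The curl request from step 1 can also be scripted when several traces are needed. The sketch below uses only the Python standard library. The path /productpage and port 9080 come from the Bookinfo manifests above; the helper names are hypothetical, and the ClusterIP must be looked up in your own cluster.

```python
import urllib.request


def productpage_url(cluster_ip, port=9080):
    """Build the productpage URL; port 9080 matches the Service manifest."""
    return f"http://{cluster_ip}:{port}/productpage"


def send_requests(cluster_ip, count=10, timeout=3.0):
    """Send `count` GET requests and return the HTTP status codes."""
    statuses = []
    for _ in range(count):
        url = productpage_url(cluster_ip)
        with urllib.request.urlopen(url, timeout=timeout) as response:
            statuses.append(response.status)
    return statuses


# Usage from a pod or node that can reach the ClusterIP, for example:
#   send_requests("<productpage ClusterIP>", count=5)
print(productpage_url("<productpage ClusterIP>"))
```

With sampling set to 100%, each request sent this way should show up as one trace in the Jaeger UI.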


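V. Analyzing the Trace

This section explores how the sidecar passes span information along and how the spans are assembled into one complete trace.

  1. Set the log level of the Bookinfo workloads to trace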
    spec:
      template:
        metadata:
          annotations:
            sidecar.istio.io/logLevel: trace
    

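  2. Analyze the span information
    productpage has three spans: one for inbound traffic and two for outbound traffic. reviews has two spans: one inbound and one outbound. ratings has one inbound span, and details has one inbound span. That makes 7 spans in total, which matches the count shown in the Jaeger UI.
    Because the inbound request to productpage carries no trace-related headers, Envoy generates X-B3-TraceId, X-B3-SpanId, and the related headers itself after intercepting the traffic. They can be found in Envoy's trace-level logs: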
    2023-11-09T01:55:02.075102Z     trace   envoy http external/envoy/source/common/http/http1/codec_impl.cc:547    [Tags: "ConnectionId":"101"] completed header: key=X-B3-TraceId value=235aa67f7b07cc91bbdd05f23bcd7bab  thread=23
    2023-11-09T01:55:02.075106Z     trace   envoy http external/envoy/source/common/http/http1/codec_impl.cc:547    [Tags: "ConnectionId":"101"] completed header: key=X-B3-SpanId value=bbdd05f23bcd7bab   thread=23
    2023-11-09T01:55:02.075109Z     trace   envoy http external/envoy/source/common/http/http1/codec_impl.cc:547    [Tags: "ConnectionId":"101"] completed header: key=X-B3-Sampled value=1 thread=23
    2023-11-09T01:55:02.075114Z     trace   envoy http external/envoy/source/common/http/http1/codec_impl.cc:547    [Tags: "ConnectionId":"101"] completed header: key=x-request-id value=8bb99618-1793-9fbd-a84b-9118c7f82e92      thread=23
    
    The logs show the trace information for the productpage span:
    'x-b3-traceid': '235aa67f7b07cc91bbdd05f23bcd7bab' (this trace ID stays the same along the whole call chain) and 'x-b3-spanid': 'bbdd05f23bcd7bab'

  3. Collect the span information
    The trace information gives the following relationships:
    image.png
    The Jaeger visualization backend can also be used to check the duration of each span:
    image.png
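
Picking header values out of trace-level logs by eye gets tedious. As a minimal sketch, assuming the log lines keep the "completed header: key=... value=..." shape shown in the excerpt above, they can be extracted with a regular expression. `extract_headers` is a hypothetical helper, and the sample lines are copied from this article.

```python
import re

# Matches the "completed header" lines that Envoy prints at trace log level.
HEADER_LINE = re.compile(r"completed header: key=(?P<key>\S+) value=(?P<value>\S+)")


def extract_headers(log_text):
    """Collect header name/value pairs from Envoy trace log lines."""
    headers = {}
    for line in log_text.splitlines():
        match = HEADER_LINE.search(line)
        if match:
            headers[match.group("key").lower()] = match.group("value")
    return headers


sample_log = """\
2023-11-09T01:55:02.075102Z     trace   envoy http external/envoy/source/common/http/http1/codec_impl.cc:547    [Tags: "ConnectionId":"101"] completed header: key=X-B3-TraceId value=235aa67f7b07cc91bbdd05f23bcd7bab  thread=23
2023-11-09T01:55:02.075106Z     trace   envoy http external/envoy/source/common/http/http1/codec_impl.cc:547    [Tags: "ConnectionId":"101"] completed header: key=X-B3-SpanId value=bbdd05f23bcd7bab   thread=23
2023-11-09T01:55:02.075109Z     trace   envoy http external/envoy/source/common/http/http1/codec_impl.cc:547    [Tags: "ConnectionId":"101"] completed header: key=X-B3-Sampled value=1 thread=23
"""

headers = extract_headers(sample_log)
print(headers)
# For this root span, the low 64 bits of the trace ID are the span ID.
print(headers["x-b3-traceid"].endswith(headers["x-b3-spanid"]))
```

Running the same function over the logs of each Bookinfo sidecar and grouping the results by x-b3-traceid is one way to collect the spans that belong to a single request.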

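VI. How the Business Code Handles Tracing

The tracing instrumentation itself is done in the sidecar proxy, so the application does not have to deal with complex instrumentation logic. The application does, however, have to cooperate by passing the generated trace information along in the request headers. The service code below confirms this and shows what actually changes in the business code.
Take productpage.py as an example:

  1. The code below shows that the handler for the productpage RESTful endpoint extracts the trace-related headers from the request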

    @app.route('/productpage')
    @trace()
    def front():
        product_id = 0  # TODO: replace default value
        headers = getForwardHeaders(request)
        ......
    
    
  2. The following code obtains the trace information and populates the headers

    
    def getForwardHeaders(request):
        headers = {}
    
        # x-b3-*** headers can be populated using the opentracing span
        span = get_current_span()
        carrier = {}
        tracer.inject(
            span_context=span.context,
            format=Format.HTTP_HEADERS,
            carrier=carrier)
    
        headers.update(carrier)
    
        # We handle other (non x-b3-***) headers manually
        if 'user' in session:
            headers['end-user'] = session['user']
        ......    
    
        return headers
    
    
  3. Finally, a request is built and sent to the /details endpoint of the details service. The request carries the headers collected above.

    def getProductDetails(product_id, headers):
        try:
            url = details['name'] + "/" + details['endpoint'] + "/" + str(product_id)
            res = requests.get(url, headers=headers, timeout=3.0)
    ......
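
To make step 3 concrete without depending on the `requests` package, here is a standard-library sketch that builds the outgoing request and checks that the trace headers are attached. It is not the Bookinfo code: `build_details_request` is a hypothetical helper, the base URL http://details:9080 is an assumption based on the Service manifest above, and the header values are the ones from the log excerpt in section V.

```python
import urllib.request


def build_details_request(base_url, product_id, headers):
    """Build (but do not send) a GET request that carries the trace headers."""
    url = f"{base_url}/details/{product_id}"
    return urllib.request.Request(url, headers=headers, method="GET")


trace_headers = {
    "x-request-id": "8bb99618-1793-9fbd-a84b-9118c7f82e92",
    "x-b3-traceid": "235aa67f7b07cc91bbdd05f23bcd7bab",
    "x-b3-spanid": "bbdd05f23bcd7bab",
    "x-b3-sampled": "1",
}
req = build_details_request("http://details:9080", 0, trace_headers)

# urllib normalizes the capitalization of header names; compare in lower case.
sent = {name.lower(): value for name, value in req.header_items()}
print(req.full_url)
print(sent)
```

The outbound sidecar of productpage sees these headers on the request and, as described in section II, creates a child span under the same trace ID instead of starting a new trace.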
    
    

Now look at the details microservice

  1. Get the header information from the incoming request
    server.mount_proc '/details' do |req, res|
        pathParts = req.path.split('/')
        headers = get_forward_headers(req)
    
  2. Copy the trace headers from the incoming request into headers
    def get_forward_headers(request)
      headers = {}
    
      # Keep this in sync with the headers in productpage and reviews.
      incoming_headers = [
          # All applications should propagate x-request-id. This header is
          # included in access log statements and is used for consistent trace
          # sampling and log sampling decisions in Istio.
          'x-request-id',
    
          # Lightstep tracing header. Propagate this if you use lightstep tracing
          # in Istio (see
          # https://istio.io/latest/docs/tasks/observability/distributed-tracing/lightstep/)
          # Note: this should probably be changed to use B3 or W3C TRACE_CONTEXT.
          # Lightstep recommends using B3 or TRACE_CONTEXT and most application
          # libraries from lightstep do not support x-ot-span-context.
          'x-ot-span-context',
    
          # Datadog tracing header. Propagate these headers if you use Datadog
          # tracing.
          'x-datadog-trace-id',
          'x-datadog-parent-id',
          'x-datadog-sampling-priority',
    
          # b3 trace headers. Compatible with Zipkin, OpenCensusAgent, and
          # Stackdriver Istio configurations.
          'x-b3-traceid',
          'x-b3-spanid',
          'x-b3-parentspanid',
          'x-b3-sampled',
          'x-b3-flags',
    
          # Application-specific headers to forward.
          'end-user',
          'user-agent',
    
          # Context and session specific headers
          'cookie',
          'authorization',
          'jwt'
      ]
    
      request.each do |header, value|
        if incoming_headers.include? header then
          headers[header] = value
        end
      end
    
      return headers
    end
    
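[Copyright notice] This article is original content by a Huawei Cloud Community user. When reposting, you must credit the source (Huawei Cloud Community), the article link, the author, and other basic information; otherwise the author and the community reserve the right to pursue liability. If you find content in this community that is suspected of plagiarism, please report it by email with supporting evidence. Once verified, the community will immediately remove the suspected infringing content. Report email: cloudbbs@huaweicloud.com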