Pattern: Distributed tracing

pattern   observability   service design  

Context

You have applied the Microservice architecture pattern. Requests often span multiple services. Each service handles a request by performing one or more operations, e.g. database queries, publishes messages, etc.

Problem

How to understand the behavior of an application and troubleshoot problems?

Forces

  • External monitoring only tells you the overall response time and number of invocations - no insight into the individual operations
  • Any solution should have minimal runtime overhead
  • Log entries for a request are scattered across numerous logs

Solution

Instrument services with code that

  • Assigns each external request a unique external request id
  • Passes the external request id to all services that are involved in handling the request
  • Includes the external request id in all log messages
  • Records information (e.g. start time, end time) about the requests and operations performed when handling a external request in a centralized service

This instrumentation might be part of the functionality provided by a Microservice Chassis framework.

Examples

The Microservices Example application is an example of an application that uses client-side service discovery. It is written in Scala and uses Spring Boot and Spring Cloud as the Microservice chassis. They provide various capabilities including Spring Cloud Sleuth, which provides support for distributed tracing. It instruments Spring components to gather trace information and can delivers it to a Zipkin Server, which gathers and displays traces.

The following Spring Cloud Sleuth dependencies are configured in build.gradle:

dependencies {
  compile "org.springframework.cloud:spring-cloud-sleuth-stream"
  compile "org.springframework.cloud:spring-cloud-starter-sleuth"
  compile "org.springframework.cloud:spring-cloud-stream-binder-rabbit"

RabbitMQ is used to deliver traces to Zipkin.

The services are deployed with various Spring Cloud Sleuth-related environment variables set in the docker-compose.yml:

environment:
  SPRING_RABBITMQ_HOST: rabbitmq
  SPRING_SLEUTH_ENABLED: "true"
  SPRING_SLEUTH_SAMPLER_PERCENTAGE: 1
  SPRING_SLEUTH_WEB_SKIPPATTERN: "/api-docs.*|/autoconfig|/configprops|/dump|/health|/info|/metrics.*|/mappings|/trace|/swagger.*|.*\\.png|.*\\.css|.*\\.js|/favicon.ico|/hystrix.stream"

This properties enable Spring Cloud Sleuth and configure it to sample all requests. It also tells Spring Cloud Sleuth to deliver traces to Zipkin via RabbitMQ running on the host called rabbitmq.

The Zipkin server is a simple, Spring Boot application:

@SpringBootApplication
@EnableZipkinStreamServer
public class ZipkinServer {

  public static void main(String[] args) {
    SpringApplication.run(ZipkinServer.class, args);
  }

}

It is deployed using Docker:

zipkin:
  image: java:openjdk-8u91-jdk
  working_dir: /app
  volumes:
    - ./zipkin-server/build/libs:/app
  command: java -jar /app/zipkin-server.jar --server.port=9411
  links:
    - rabbitmq
  ports:
    - "9411:9411"
  environment:
    RABBIT_HOST: rabbitmq

Resulting Context

This pattern has the following benefits:

  • It provides useful insight into the behavior of the system including the sources of latency
  • It enables developers to see how an individual request is handled by searching across aggregated logs for its external request id

This pattern has the following issues:

  • Aggregating and storing traces can require significant infrastructure
  • Log aggregation - the external request id is included in each log message

See also

  • Open Zipkin - service for recording and displaying tracing information
  • Open Tracing - standardized API for distributed tracing

pattern   observability   service design  


Copyright © 2024 Chris Richardson • All rights reserved • Supported by Kong.

About www.prc.education

www.prc.education is brought to you by Chris Richardson. Experienced software architect, author of POJOs in Action, the creator of the original CloudFoundry.com, and the author of Microservices patterns.

Upcoming public workshops: Microservices and architecting for fast flow

In-person: Berlin and Milan

DevOps and Team topologies are vital for delivering the fast flow of changes that modern businesses need.

But they are insufficient. You also need an application architecture that supports fast, sustainable flow.

Learn more and register for one of my upcoming public workshops in November.

NEED HELP?

I help organizations improve agility and competitiveness through better software architecture.

Learn more about my consulting engagements, and training workshops.

LEARN about microservices

Chris offers numerous other resources for learning the microservice architecture.

Get the book: Microservices Patterns

Read Chris Richardson's book:

Example microservices applications

Want to see an example? Check out Chris Richardson's example applications. See code

Virtual bootcamp: Distributed data patterns in a microservice architecture

My virtual bootcamp, distributed data patterns in a microservice architecture, is now open for enrollment!

It covers the key distributed data management patterns including Saga, API Composition, and CQRS.

It consists of video lectures, code labs, and a weekly ask-me-anything video conference repeated in multiple timezones.

The regular price is $395/person but use coupon CCMHVSFB to sign up for $95 (valid until November 8th, 2024). There are deeper discounts for buying multiple seats.

Learn more

Learn how to create a service template and microservice chassis

Take a look at my Manning LiveProject that teaches you how to develop a service template and microservice chassis.

Signup for the newsletter


BUILD microservices

Ready to start using the microservice architecture?

Consulting services

Engage Chris to create a microservices adoption roadmap and help you define your microservice architecture,


The Eventuate platform

Use the Eventuate.io platform to tackle distributed data management challenges in your microservices architecture.

Eventuate is Chris's latest startup. It makes it easy to use the Saga pattern to manage transactions and the CQRS pattern to implement queries.


Join the microservices google group