
Jaeger Tracing

Updated: May 8

What is Jaeger?

Jaeger is open source software for tracing transactions between distributed services. It’s used for monitoring and troubleshooting complex microservices environments.

In microservices-based distributed systems, it supports:

  • Distributed context propagation

  • Distributed transaction monitoring

  • Root cause analysis

  • Service dependency analysis

  • Performance / latency optimization






What is distributed tracing?

Distributed tracing is a way to see and understand the whole chain of events in a complex interaction between microservices.

Modern, cloud-native software development relies on microservices: independent services that each provide a different core function. When a user makes a request in an app, many individual services respond to produce a result.

A single call in an app can invoke dozens of different services that interact with each other. How can developers and engineers isolate a problem when something goes wrong or a request is running slow? We need a way to keep track of all the connections.


That’s where distributed tracing comes in. It’s often run as part of a service mesh, which is a way to manage and observe microservices.

Jaeger uses distributed tracing to follow the path of a request through different microservices. Rather than guessing, or relying on metrics or logs, we can see a visual representation of the call flows.
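To make "following the path of a request" concrete, here is a minimal sketch of context propagation written with the Python standard library only. It is not the Jaeger client API. The header name and the colon-separated value layout are modeled on Jaeger's `uber-trace-id` header and should be checked against the Jaeger documentation; every function name here (`start_trace`, `inject`, `extract`, `child_of`) is hypothetical.

```python
import secrets

# Assumption: modeled on Jaeger's native header, whose value looks like
# {trace-id}:{span-id}:{parent-span-id}:{flags}
HEADER = "uber-trace-id"


def new_id() -> str:
    """Return a random 64-bit ID as 16 hex characters."""
    return secrets.token_hex(8)


def start_trace() -> dict:
    """Create a root context: new trace ID, new span ID, no parent."""
    return {"trace_id": new_id(), "span_id": new_id(), "parent_id": "0", "flags": "1"}


def inject(ctx: dict, headers: dict) -> None:
    """Write the context into outgoing request headers."""
    headers[HEADER] = ":".join(
        [ctx["trace_id"], ctx["span_id"], ctx["parent_id"], ctx["flags"]]
    )


def extract(headers: dict) -> dict:
    """Read the context back out of incoming request headers."""
    trace_id, span_id, parent_id, flags = headers[HEADER].split(":")
    return {"trace_id": trace_id, "span_id": span_id, "parent_id": parent_id, "flags": flags}


def child_of(parent: dict) -> dict:
    """Start a new span in the same trace, pointing back at its parent."""
    return {
        "trace_id": parent["trace_id"],
        "span_id": new_id(),
        "parent_id": parent["span_id"],
        "flags": parent["flags"],
    }


# Service A starts the trace and calls service B.
root = start_trace()
outgoing_headers = {}
inject(root, outgoing_headers)

# Service B receives the headers and continues the same trace.
incoming = extract(outgoing_headers)
db_span = child_of(incoming)
print(db_span["trace_id"] == root["trace_id"])  # True
```

Because every service copies the same trace ID forward and records which span called it, the tracing backend can later stitch the individual spans back into one call flow.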

Organized information about transactions is useful for debugging and optimization. Jaeger includes tools to monitor distributed transactions, optimize performance and latency, and perform root cause analysis (RCA), a method of problem solving.


Spans and traces


In distributed tracing, a trace is a view into a request as it moves through a distributed system. A span is a named, timed operation that represents one piece of the workflow, and multiple spans are pieced together to create a trace.

Each span describes an action or operation in our system that takes place over a period of time, for example an HTTP request or a database operation.

A trace is a tree (or list) of spans representing the progression of a request as it is handled by the different services and components in our system. For example, an API call to user-service may result in a DB query to users-db.

Traces show us how long each request took, which components and services it interacted with, and the latency introduced during each step, giving us a complete, end-to-end picture.
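The relationship between spans and traces can be sketched as a small data structure. This is an illustration only, not Jaeger's internal data model: the `Span` class, its field names, and the timings are all made up for the example, which reuses the user-service and users-db scenario from above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """A named, timed operation; hypothetical fields for illustration."""
    span_id: str
    operation: str
    service: str
    start_ms: int
    end_ms: int
    parent_id: Optional[str] = None
    children: List["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def build_trace(spans: List[Span]) -> Span:
    """Link spans by parent_id and return the root span of the trace."""
    by_id = {s.span_id: s for s in spans}
    root = None
    for s in spans:
        if s.parent_id is None:
            root = s
        else:
            by_id[s.parent_id].children.append(s)
    if root is None:
        raise ValueError("trace has no root span")
    return root


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, children ordered by start time."""
    lines = [f"{'  ' * depth}{span.service}: {span.operation} ({span.duration_ms} ms)"]
    for child in sorted(span.children, key=lambda c: c.start_ms):
        lines.extend(render(child, depth + 1))
    return lines


# Made-up timings: an API call to user-service that queries users-db.
spans = [
    Span("a", "GET /users/42", "user-service", start_ms=0, end_ms=120),
    Span("b", "SELECT users", "users-db", start_ms=20, end_ms=95, parent_id="a"),
]
trace = build_trace(spans)
print("\n".join(render(trace)))
# user-service: GET /users/42 (120 ms)
#   users-db: SELECT users (75 ms)
```

Reading the output top to bottom shows where the time went: 75 of the 120 ms were spent in the database query, which is the kind of per-step latency a trace view makes visible.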









Major Components of Jaeger


Jaeger Client Libraries: Jaeger clients are language-specific implementations of the OpenTracing API. They can be used manually or with a variety of open source frameworks.

Jaeger Agent: The Jaeger agent is a network daemon that listens for spans sent over UDP, batches them, and sends them to the collector. It is designed to be deployed to all hosts as an infrastructure component. The agent abstracts the routing and discovery of the collectors away from the client.

Jaeger Collector: The Jaeger collector receives traces from Jaeger agents and runs them through a processing pipeline. Currently, the pipeline validates traces, indexes them, performs transformations, and finally stores them. Jaeger’s storage is a pluggable component which currently supports Cassandra, Elasticsearch, and Kafka.

Query: Query is a service that retrieves traces from storage and hosts a UI to display them.

Ingester: Ingester is a service that reads from a Kafka topic and writes to another storage backend (Cassandra, Elasticsearch).

Jaeger Console: The user interface that lets you visualize your distributed tracing data.
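The way these components hand data to each other can be sketched as a toy in-process pipeline. This is a simplified model under stated assumptions, not Jaeger's implementation: there is no UDP, no Kafka, and no real database, and the class names, method names, and span fields are all hypothetical.

```python
from collections import defaultdict


class Storage:
    """In-memory stand-in for a pluggable backend such as Cassandra or Elasticsearch."""

    def __init__(self):
        self.traces = defaultdict(list)     # trace_id -> list of spans
        self.by_service = defaultdict(set)  # service name -> trace_ids (a tiny index)


class Collector:
    """Validates, indexes, and stores the span batches it receives."""

    REQUIRED = ("trace_id", "service", "operation")

    def __init__(self, storage: Storage):
        self.storage = storage

    def receive(self, batch: list) -> None:
        for span in batch:
            if not all(key in span for key in self.REQUIRED):
                continue  # validation: drop malformed spans
            self.storage.by_service[span["service"]].add(span["trace_id"])  # index
            self.storage.traces[span["trace_id"]].append(span)              # store


class Agent:
    """Buffers spans from the local client and forwards them in batches."""

    def __init__(self, collector: Collector, batch_size: int = 2):
        self.collector = collector
        self.batch_size = batch_size
        self.buffer = []

    def report(self, span: dict) -> None:
        self.buffer.append(span)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.collector.receive(self.buffer)
            self.buffer = []


class Query:
    """Retrieves stored traces, here looked up by service name."""

    def __init__(self, storage: Storage):
        self.storage = storage

    def find_traces(self, service: str) -> dict:
        trace_ids = sorted(self.storage.by_service.get(service, ()))
        return {tid: self.storage.traces[tid] for tid in trace_ids}


storage = Storage()
agent = Agent(Collector(storage), batch_size=2)
agent.report({"trace_id": "t1", "service": "user-service", "operation": "GET /users/42"})
agent.report({"trace_id": "t1", "service": "users-db", "operation": "SELECT users"})
agent.report({"service": "broken"})  # missing fields, dropped by the collector
agent.flush()

found = Query(storage).find_traces("users-db")
print(len(found["t1"]))  # 2
```

The client only ever talks to the agent, which is the point made above about the agent hiding collector routing and discovery from the application.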



Pros and cons


Pros:

  • Very easy to install

  • Easy to configure with a datastore back end of your choice

  • Open Source

  • Feature Rich UI

  • CNCF Project

What Jaeger lacks in maturity compared with its older rival, Zipkin, it makes up for in speed and flexibility, and in its newer, more distributed architecture. It’s also more performant and easier to scale. Jaeger has better official language support than Zipkin, and you can also look at its CNCF support as a badge of approval.


Cons:

Jaeger’s relative immaturity is a disadvantage. Its choice of Go as its main language illustrates this point. Although the Gopher community is growing fast, Go developers are still far less common than Java developers. If you’re not familiar with Go, this can make your learning process longer.


Another area that is both a blessing and a curse for Jaeger is its more modern architecture. This architecture offers benefits in terms of performance, reliability and scalability, but it’s also far more complex and harder to maintain.



Alternatives


Jaeger VS DataDog


DataDog is one of the major SaaS vendors in the APM space. Jaeger, on the other hand, is a popular open-source distributed tracing tool and a graduated project of the Cloud Native Computing Foundation (CNCF). The differences between the tools arise from these origins.

Some of the key differences between DataDog and Jaeger are:

1. Correlation of trace data

DataDog lets you connect your trace data to a lot of other performance metrics like infrastructure and host metrics, as it is not limited to distributed tracing. Jaeger collects trace data which can give you insights on latencies of requests. You can't use Jaeger for collecting metrics for hosts, networks, etc.

2. Code Instrumentation

Instrumentation is the process of generating telemetry data from your application. Jaeger uses OpenTracing APIs for code instrumentation. The telemetry data it generates is in a vendor-neutral format, so you can also use other back-end analysis tools. DataDog provides DataDog agents which run on your host to collect events and metrics. With proprietary instrumentation agents, your monitoring stack can quickly become locked into a vendor. DataDog also supports ingestion from open-source standards like OpenTelemetry, but it's not a first-class citizen.

3. Data Storage

Jaeger supports two popular open-source databases for storing trace data: Cassandra and Elasticsearch. DataDog is a third-party cloud vendor, so your data is stored on DataDog's servers.

4. Web UI

DataDog is a SaaS tool that offers a much smoother and more elaborate dashboarding experience, including many customizations. Jaeger's web UI is limited, although it can serve the purpose of distributed tracing.

The decision between DataDog and Jaeger comes down to whether your organization has the budget for a paid SaaS tool like DataDog or the engineering bandwidth to run an open-source tool like Jaeger. In addition, as Jaeger is limited to distributed tracing, your decision also needs to account for whether you need to monitor other components of your application.

Open-source tools have long lacked a great user experience. But what if there were an open-source tool that could provide the breadth of experience of a great SaaS tool like DataDog?

Jaeger VS Zipkin



Both tools are good options for collecting and managing distributed tracing data. They are remarkably similar and evenly matched, and either will do the job. Both work with the distributed tracing libraries OpenCensus, OpenTracing and OpenTelemetry. In addition, Zipkin and Jaeger have a wide range of extensibility options and tool integrations, and both support virtualization and containerization. Both can also run on in-memory storage, in which case they face similar issues with data loss.

For those who don’t want to live on the bleeding edge, Zipkin is the better choice. It’s more mature and has a bigger and more mature community. Zipkin has wide industry support, and its Java roots make it suitable for the world of enterprise IT (where Java still rules).

What Jaeger lacks in maturity, it makes up for in speed and flexibility, and in its newer, more distributed architecture. It’s also more performant and easier to scale. Jaeger has better official language support than its older rival, and you can also look at its CNCF support as a badge of approval.

Both Jaeger and Zipkin are strong contenders when it comes to choosing a distributed tracing tool. But are traces enough to solve all the performance issues of a modern distributed application? The answer is no. You also need metrics, and a way to correlate metrics with traces in a single dashboard. Most SaaS vendors provide both metrics and traces under a single pane of glass. But the beauty of Jaeger and Zipkin is that they are open-source.

