Summary of "Beyond Line Charts • Yao Yue • YOW! 2025"
Main ideas / lessons
-
Line charts dominate observability dashboards, but they are often a poor match for the questions people need to answer.
- Line charts connect points with straight segments, which implicitly interpolates between samples and creates visual detail that may not reflect real underlying behavior.
- When many instances/series are shown together, the noise compounds, making it hard to see meaningful signal (“squinting-based analysis”).
-
Good data visualization should make the insight obvious.
- The talk uses three motivating examples to show that, with common line-chart setups, it’s often unclear whether behavior changed or whether a deployment affected latency.
-
Data visualization should follow “form follows function.”
- Start from:
- Data shape (how noisy/busy the values are)
- Telemetry data type (counter, gauge, histogram)
- The question being answered (what you’re trying to decide; often not time-based)
-
Apply appropriate transformations or visualization choices to reduce misleading artifacts
- Windowed averages / smoothing: reduce noise for slowly changing metrics.
- Aggregate across instances (mean/max, etc.): reduce clutter and reveal trends.
- Local smoothing / regression (especially for tail quantiles like P99): reveal differences that were hidden by raw noise.
- Show raw data points instead of lines when lines imply interpolation that isn’t justified (e.g., before/after deploy comparisons).
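The first two transformations above can be sketched in plain Python. This is a minimal illustration rather than the speaker's tooling: the helper names `rolling_mean` and `aggregate_instances` and the sample values are hypothetical, and all series are assumed to share the same timestamps.

```python
from statistics import mean

def rolling_mean(values, window):
    """Trailing windowed average; early outputs use the samples available so far."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(mean(values[start:i + 1]))
    return out

def aggregate_instances(series_by_instance, fn=max):
    """Collapse per-instance series into one by applying fn at each timestamp index."""
    return [fn(samples) for samples in zip(*series_by_instance.values())]

# Hypothetical CPU utilisation (%) for three instances, sampled at the same times.
cpu = {
    "host-a": [10, 50, 12, 48, 14, 52],
    "host-b": [20, 22, 21, 23, 22, 24],
    "host-c": [30, 10, 32, 12, 34, 14],
}

fleet_max = aggregate_instances(cpu, max)    # one line instead of three
fleet_mean = aggregate_instances(cpu, mean)
smoothed = rolling_mean(cpu["host-a"], window=2)

print(fleet_max)
print(smoothed)
```

A wider window removes more noise but also delays and flattens real changes, so it suits slowly changing metrics best.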
-
Raw data can be more truthful than line interpolation
- The speaker argues for the “power of raw data,” emphasizing that line charts can “butcher” the underlying pattern by adding decorative interpolation.
-
Telemetry types imply better starting visualization semantics
- Counters (monotonic cumulative measurements; e.g., bytes written)
- Line charts are reasonable for counters themselves, but most monitoring uses delta counters (rates).
- For delta counters, the more semantically correct representation is:
- Step functions (since increments/rates accumulate discretely)
- Optionally include uncertainty (e.g., standard deviation shown as boxes).
- Gauges (spot measurements; e.g., queue depth at an instant)
- Connecting gauge readings with lines is misleading because the intermediate transition is unknown.
- Prefer dots over lines to avoid implying smooth transitions.
- Histograms (distribution metrics; e.g., latency quantiles)
- A “single latency number” is misleading.
- A histogram tracks many quantiles simultaneously (P50, P99, P99.9, etc.).
- Better representations include:
- Showing multiple quantile curves with less noise (since each quantile behaves like a gauge reading)
- Heatmap-style views: use color for value and y-axis for quantiles to visualize distributions across quantiles at once.
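For the counter case above, a small sketch shows why a delta counter reads naturally as a step function: each delta describes a whole interval, not a single instant. The function names and sample values are hypothetical, and treating a negative difference as a restart from zero is an assumption made for this example.

```python
def counter_deltas(samples):
    """Turn cumulative (timestamp, value) samples into (start, end, delta) steps."""
    steps = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        delta = v1 - v0
        if delta < 0:
            # Assumption: a drop means the counter restarted from zero.
            delta = v1
        steps.append((t0, t1, delta))
    return steps

def step_value_at(steps, t):
    """Value of the step function at time t, or None outside the covered range."""
    for t0, t1, delta in steps:
        if t0 <= t < t1:
            return delta
    return None

# Hypothetical cumulative bytes-written counter, sampled every 10 seconds.
bytes_written = [(0, 100), (10, 160), (20, 160), (30, 20)]
steps = counter_deltas(bytes_written)

print(steps)
print(step_value_at(steps, 5))
```

The delta is constant across each interval, so drawing it as a flat step avoids suggesting a gradual ramp between two samples.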
-
Many operational/business questions are not “time series plots”; they’re comparisons and groupings
- Example decisions:
- Can we increase load without impacting SLO?
- Has a new version regressed?
- Can we deploy safely without harming reliability?
- Which SKU/VM type provides the best ROI?
- To answer these, you often need to transform data into a table / panel-data shape rather than a time-axis chart.
- Proposed framing:
- Transform telemetry into panel data / data frames (rows as entities/conditions, columns for variables).
- Time becomes just another column when needed, rather than always being the x-axis.
- This enables direct matching/grouping such as:
- Group latency values by load level (removes time, makes x-axis “load”).
- Group latency distributions by software version (compare distributions side-by-side).
- Combine throughput + pricing + SKU attributes to compute ROI.
-
Call to action
- Telemetry data should not be “imprisoned” in the TSDB: it should be extractable and transformable for deeper analysis and visualization.
- Tooling gaps to address:
- UIs should provide better guidance for visualization based on telemetry data type.
- Exploring data should require less friction (e.g., simpler statistical summaries without hand-written queries).
- Observability systems should support easier data egress to other analysis environments.
Methodology / instructions presented
A) Choose visualization based on a “data-to-question” workflow
-
Identify the data shape
- If the time series is overly busy/noisy:
- Apply windowed averaging (smoothing) and re-check whether the trend/uptick becomes interpretable.
- If still noisy:
- Apply local smoothing / regression to the noisy tail quantile (e.g., P99) specifically.
- If interpolation is misleading:
- Prefer dots (raw points) over connected lines.
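The local smoothing step can be sketched as a LOESS-style fit: for each point, fit a distance-weighted straight line to its neighbours and evaluate it at that point. This is a simplified illustration (tricube weights, no robustness iterations), not necessarily the exact method used in the talk; the function name, the `frac` parameter, and the P99 values are hypothetical.

```python
def local_linear_smooth(xs, ys, frac=0.5):
    """For each x0, fit a tricube-weighted line to nearby points and evaluate at x0."""
    n = len(xs)
    k = min(n, max(3, int(frac * n)))  # number of neighbours that set the bandwidth
    smoothed = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0        # bandwidth: distance to the k-th nearest point
        w = [(1 - min(1.0, abs(x - x0) / h) ** 3) ** 3 for x in xs]
        sw = sum(w)                    # always > 0: the point at x0 has weight 1
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

# Hypothetical noisy P99 latency (ms) at ten consecutive timestamps.
xs = list(range(10))
p99_ms = [20, 35, 18, 40, 22, 38, 45, 30, 52, 41]
trend = local_linear_smooth(xs, p99_ms, frac=0.5)

# Sanity check: data that is already a straight line is returned unchanged.
line = [2 * x + 1 for x in xs]
fitted = local_linear_smooth(xs, line, frac=0.5)
```

Unlike a plain windowed average, the local line fit follows a steady upward or downward trend without lagging behind it.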
-
Identify the telemetry type
- If counter:
- If using cumulative counter: line charts may work.
- If converting to delta counter / rate: use a step-function style representation (and optionally show uncertainty like boxes for standard deviation).
- If gauge:
- Use dots (avoid implying continuity/smooth transition with line segments).
- If histogram/distribution:
- Don’t summarize as a single number.
- Visualize multiple quantiles and use:
- Less noisy quantile displays and/or
- Heatmaps (quantiles on one axis, value encoded by color).
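The data behind a quantile heatmap can be sketched as a grid with one row per quantile and one column per time window, where each cell holds the latency that a renderer would map to a color. The nearest-rank quantile definition, the helper names, and the sample latencies are assumptions for illustration; a real histogram metric would typically derive quantiles from bucket counts instead of raw samples.

```python
import math

def nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def quantile_grid(latency_by_window, pcts=(50, 90, 99)):
    """Return (windows, grid) where grid[pct] lists that quantile for each window."""
    windows = sorted(latency_by_window)
    grid = {
        p: [nearest_rank(latency_by_window[w], p) for w in windows]
        for p in pcts
    }
    return windows, grid

# Hypothetical latency samples (ms) for two one-minute windows.
latency_by_window = {
    "12:00": list(range(1, 101)),
    "12:01": [2 * v for v in range(1, 101)],
}
windows, grid = quantile_grid(latency_by_window)

for pct, row in grid.items():
    print(f"P{pct}", row)
```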
-
Identify the question being asked
- If the question is about relationships/decisions (load vs latency, version regression, ROI/SKU choice):
- Transform data away from “time as x-axis”
- Reshape into panel data / tables where grouping is explicit.
B) Transform time-series telemetry into panel data (table/pivot-style)
-
Load vs latency
- Collect metrics for:
- Load variable(s)
- Latency distribution variable(s)
- Align timestamps only to associate load with latency at the same moments.
- Then:
- Group latency values by load level
- Remove time from the final visualization/table
- Result:
- A table/chart where:
- X-axis = load
- Y-axis = latency statistic (e.g., P99)
- Read off the maximum sustainable load meeting an SLO threshold.
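The load-versus-latency steps above can be sketched as follows. Timestamps are used only to pair the two metrics and are then dropped. The bucket size, the SLO threshold, the helper names, and all sample values are hypothetical, and P99 is computed with a simple nearest-rank rule.

```python
import math
from collections import defaultdict

def p99(values):
    """Nearest-rank P99 of a list of latencies."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(99 * len(ordered) / 100)) - 1]

def latency_by_load(load, latency, bucket=100):
    """Pair samples on shared timestamps, then group latency by load bucket."""
    groups = defaultdict(list)
    for ts in load.keys() & latency.keys():  # time is used only for alignment
        groups[(load[ts] // bucket) * bucket].append(latency[ts])
    return {b: p99(v) for b, v in sorted(groups.items())}

def max_sustainable_load(table, slo_ms):
    """Highest load bucket reached before P99 first exceeds the SLO."""
    best = None
    for load_level in sorted(table):
        if table[load_level] > slo_ms:
            break
        best = load_level
    return best

# Hypothetical telemetry keyed by timestamp (seconds).
load_rps = {0: 120, 10: 180, 20: 150, 30: 210, 40: 260, 50: 290,
            60: 310, 70: 350, 80: 390, 90: 410, 100: 450, 110: 500}
latency_ms = {0: 8, 10: 9, 20: 10, 30: 11, 40: 12, 50: 14,
              60: 18, 70: 25, 80: 22, 90: 40, 100: 55}

table = latency_by_load(load_rps, latency_ms)
print(table)
print(max_sustainable_load(table, slo_ms=30))
```

The resulting table has load on one axis and a latency statistic on the other; time no longer appears in it.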
-
Version regression
- Collect latency metrics and the version label (old vs new).
- Then:
- Group/aggregate latency distributions by version
- Result:
- Side-by-side distributions for each version, enabling direct comparison (not “did it change over time?”).
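A minimal sketch of the version comparison: group latency samples by version label and summarize each distribution with the same quantiles. The version names, sample values, and helper names are hypothetical, and a real comparison would use far more samples per version.

```python
import math
from collections import defaultdict

def nearest_rank(values, pct):
    """Nearest-rank percentile of a list of values."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(pct * len(ordered) / 100)) - 1]

def summarize_by_version(rows, pcts=(50, 99)):
    """Group (version, latency) rows by version and compute quantiles per group."""
    groups = defaultdict(list)
    for version, latency in rows:
        groups[version].append(latency)
    return {
        version: {p: nearest_rank(vals, p) for p in pcts}
        for version, vals in sorted(groups.items())
    }

# Hypothetical (version, latency_ms) rows; time is not part of the table.
rows = [
    ("v1", 10), ("v2", 10), ("v1", 11), ("v2", 11), ("v1", 12),
    ("v2", 12), ("v1", 13), ("v2", 14), ("v1", 30), ("v2", 60),
]
summary = summarize_by_version(rows)

for version, stats in summary.items():
    print(version, stats)
```

In this made-up data the median is identical for both versions while the tail doubles, which is the kind of difference a single latency number would hide.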
-
Best ROI / SKU selection
- Collect:
- Throughput-related metrics (from telemetry)
- SKU/instance attributes (from telemetry labels)
- Pricing data (from external sources)
- Then:
- Join the tables into a combined dataset (price, SKU, throughput, etc.)
- Result:
- A table where each cell answers “best option” directly under the chosen criterion (e.g., token rate per dollar).
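The join-and-rank step can be sketched with plain dictionaries keyed by SKU. The SKU names, throughput figures, and prices are invented for illustration, and tokens per dollar is just one possible criterion.

```python
def tokens_per_dollar(throughput_tokens_per_s, price_usd_per_hour):
    """Inner-join throughput and price on SKU, then compute tokens per dollar."""
    table = {}
    for sku in throughput_tokens_per_s.keys() & price_usd_per_hour.keys():
        tokens_per_hour = throughput_tokens_per_s[sku] * 3600
        table[sku] = tokens_per_hour / price_usd_per_hour[sku]
    return table

# Hypothetical throughput from telemetry (tokens/second per instance).
throughput = {"sku-small": 1000, "sku-medium": 2400, "sku-large": 4000, "sku-xl": 9000}
# Hypothetical pricing from an external source (USD/hour); sku-xl has no price yet.
prices = {"sku-small": 1.0, "sku-medium": 2.0, "sku-large": 4.0}

roi = tokens_per_dollar(throughput, prices)
best = max(roi, key=roi.get)

for sku in sorted(roi, key=roi.get, reverse=True):
    print(sku, roi[sku])
print("best:", best)
```

SKUs missing from either table are dropped by the join, so an option without pricing data never wins by default.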
Speakers / sources featured
-
Speaker: Yao Yue (primary presenter)
-
Referenced sources / quotes (not additional speakers)
- Grace Hopper: “The most dangerous phrase in the language is: ‘We’ve always done it this way.’”
- A meme about reliability engineers (referenced as a cultural idea, not a specific named source)
- Statistics background (local smoothing / regression described as a technique used for decades)
- Economics / pandas reference:
- Panel-data concept attributed to economics; the pandas library takes its name from “panel data,” the table-like shape behind its data frames
-
No other distinct speaker identities are present in the subtitles.
Category
Educational