Summary of "Beyond Line Charts • Yao Yue • YOW! 2025"

Main ideas / lessons

Line charts dominate observability dashboards, but they are often a poor match for the questions people need to answer.
- Line charts connect points with straight segments, which implicitly interpolates between samples and creates visual detail that may not reflect real underlying behavior.
- When many instances/series are shown together, noise multiplies, making it hard to see meaningful signal (“squinting-based analysis”).
Good data visualization should make the insight obvious.
- The talk uses three motivating examples to show that, with common line-chart setups, it’s often unclear whether behavior changed or whether a deployment affected latency.
Data visualization should follow “form follows function.”
- Start from:
  1. Data shape (how noisy/busy the values are)
  2. Telemetry data type (counter, gauge, histogram)
  3. The question being answered (what you’re trying to decide; often not time-based)
Apply appropriate transformations or visualization choices to reduce misleading artifacts
- Windowed averages / smoothing: reduce noise for slowly changing metrics.
- Aggregate across instances (mean/max, etc.): reduce clutter and reveal trends.
- Local smoothing / regression (especially for tail quantiles like P99): reveal differences that were hidden by raw noise.
- Show raw data points instead of lines when lines imply interpolation that isn’t justified (e.g., before/after deploy comparisons).
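
The first two transformations above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the tooling shown in the talk; the host names and CPU values are made up, and the trailing window is one of several reasonable window definitions.

```python
from statistics import mean

def windowed_mean(values, window):
    """Trailing moving average; the first window-1 points use a shorter window."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(mean(values[start:i + 1]))
    return out

def aggregate_instances(series_by_instance, func=max):
    """Collapse per-instance series into one by applying func at each timestamp."""
    return [func(points) for points in zip(*series_by_instance.values())]

# Hypothetical CPU utilisation (%) for three instances sampled at the same times.
cpu = {
    "host-a": [10, 50, 12, 48, 14, 52],
    "host-b": [20, 22, 60, 24, 26, 62],
    "host-c": [30, 28, 32, 70, 34, 30],
}

fleet_max = aggregate_instances(cpu, max)    # one series instead of three
fleet_mean = aggregate_instances(cpu, mean)
smoothed = windowed_mean(fleet_mean, window=3)

print("max :", fleet_max)
print("mean:", [round(v, 1) for v in smoothed])
```

Aggregating first and smoothing second keeps the number of plotted series small; which aggregate (mean, max, or something else) is appropriate depends on the question being asked.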
Raw data can be more truthful than line interpolation
- The speaker argues for the “power of raw data,” emphasizing that line charts can “butcher” the underlying pattern by adding decorative interpolation.
Telemetry types imply better starting visualization semantics
- Counters (monotonic cumulative measurements; e.g., bytes written)
  - Line charts are reasonable for counters themselves, but most monitoring uses delta counters (rates).
  - For delta counters, the more semantically correct representation is:
    - Step functions (each delta/rate describes a whole sampling interval, so the value is held flat across that interval rather than sloped toward the next point)
    - Optionally include uncertainty (e.g., standard deviation shown as boxes).
- Gauges (spot measurements; e.g., queue depth at an instant)
  - Connecting gauge readings with lines is misleading because the intermediate transition is unknown.
  - Prefer dots over lines to avoid implying smooth transitions.
- Histograms (distribution metrics; e.g., latency quantiles)
  - A “single latency number” is misleading.
  - A histogram tracks many quantiles simultaneously (P50, P99, P99.9, etc.).
  - Better representations include:
    - Showing multiple quantile curves with less noise (since each quantile behaves like a gauge reading)
    - Heatmap-style views: use color for value and y-axis for quantiles to visualize distributions across quantiles at once.
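
As a rough sketch of the delta-counter case above, the snippet below turns cumulative counter readings into per-interval step segments. The sample data is invented, and treating a decrease as a counter reset is an assumption of this example, not something stated in the talk.

```python
def counter_to_steps(samples):
    """Turn cumulative counter samples [(t, value), ...] into step segments.

    Each segment is (t_start, t_end, rate): the average rate over that
    interval, which is all a pair of counter readings can tell us.
    """
    steps = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        delta = v1 - v0
        if delta < 0:
            # Assumption: a decrease means the counter was reset, so the
            # new reading is counted from zero.
            delta = v1
        steps.append((t0, t1, delta / (t1 - t0)))
    return steps

# Hypothetical "bytes written" counter scraped every 10 seconds.
bytes_written = [(0, 0), (10, 500), (20, 1500), (30, 1500), (40, 200)]

for t_start, t_end, rate in counter_to_steps(bytes_written):
    print(f"[{t_start:>2}s, {t_end:>2}s): {rate:.1f} bytes/s")
```

Each tuple maps directly onto one flat segment of a step chart: the rate is drawn as constant from `t_start` to `t_end` instead of being connected diagonally to the next value.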
Many operational/business questions are not “time series plots”; they’re comparisons and groupings
- Example decisions:
  - Can we increase load without impacting SLO?
  - Has a new version regressed?
  - Can we deploy safely without harming reliability?
  - Which SKU/VM type provides the best ROI?
- To answer these, you often need to transform data into a table / panel-data shape rather than a time-axis chart.
- Proposed framing:
  - Transform telemetry into panel data / data frames (rows as entities/conditions, columns for variables).
  - Time becomes just another column when needed, rather than always being the x-axis.
  - This enables direct matching/grouping such as:
    - Group latency values by load level (removes time, makes x-axis “load”).
    - Group latency distributions by software version (compare distributions side-by-side).
    - Combine throughput + pricing + SKU attributes to compute ROI.
Call to action
- Telemetry data should not be “imprisoned” in the TSDB: it should be extractable and transformable for deeper analysis and visualization.
- Tooling gaps to address:
  - UIs should provide better guidance for visualization based on telemetry data type.
  - Exploring data should require less friction (e.g., simpler statistical summaries without hand-written queries).
  - Observability systems should support easier data egress to other analysis environments.

Methodology / instructions presented

A) Choose visualization based on a “data-to-question” workflow

Identify the data shape
- If the time series is overly busy/noisy:
  - Apply windowed averaging (smoothing) and re-check whether the trend/uptick becomes interpretable.
  - If still noisy:
    - Apply local smoothing / regression to the noisy tail quantile (e.g., P99) specifically.
  - If interpolation is misleading:
    - Prefer dots (raw points) over connected lines.
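
One way to read "local smoothing / regression" is a locally weighted linear fit. The sketch below is a simplified LOWESS (tricube weights, no robustness iterations) written from scratch for illustration; it is not necessarily the exact method used in the talk, and the P99 values and the `frac` parameter are hypothetical.

```python
def local_linear_smooth(xs, ys, frac=0.5):
    """Locally weighted linear regression (LOWESS without robustness passes)."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # neighbours considered for each local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest point (the point itself included).
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        # Tricube weights: nearby points count most, points at or beyond h count zero.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        var = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        cov = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = cov / var if var > 0 else 0.0
        # Evaluate the weighted least-squares line at x0.
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Hypothetical noisy P99 latency (ms): level shifts from about 100 to about 120 at t=10.
ts = list(range(20))
p99 = [(100 if t < 10 else 120) + (15 if t % 2 else -15) for t in ts]
smooth = local_linear_smooth(ts, p99, frac=0.5)

for t, raw, fit in zip(ts, p99, smooth):
    print(f"t={t:>2}  raw={raw:>3}  smoothed={fit:6.1f}")
```

A larger `frac` gives a smoother curve but blurs abrupt changes such as a deploy boundary, so the bandwidth is itself a judgment call.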
Identify the telemetry type
- If counter:
  - If using cumulative counter: line charts may work.
  - If converting to delta counter / rate: use a step-function style representation (and optionally show uncertainty like boxes for standard deviation).
- If gauge:
  - Use dots (avoid implying continuity/smooth transition with line segments).
- If histogram/distribution:
  - Don’t summarize as a single number.
  - Visualize multiple quantiles and use:
    - Less noisy quantile displays and/or
    - Heatmaps (quantiles on one axis, value encoded by color).
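
The heatmap idea can be prototyped by building the underlying grid first. The sketch below assumes raw latency samples are available per time window (real systems often store pre-bucketed histograms instead) and uses a nearest-rank percentile; the sample data is synthetic.

```python
import math

def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, q in (0, 100]."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def quantile_grid(windows, quantiles=(50, 90, 99)):
    """Rows = quantiles, columns = time windows, cell = latency value."""
    sorted_windows = [sorted(w) for w in windows]
    return {q: [percentile(w, q) for w in sorted_windows] for q in quantiles}

# Synthetic latency samples (ms) for two consecutive time windows.
window_1 = list(range(1, 101))            # 1..100 ms
window_2 = [2 * v for v in window_1]      # every request twice as slow

grid = quantile_grid([window_1, window_2])
for q, row in grid.items():
    print(f"P{q:<3}", row)
```

Feeding this grid to any heatmap renderer, with the cell value mapped to color, gives the quantiles-by-time view described above.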
Identify the question being asked
- If the question is about relationships/decisions (load vs latency, version regression, ROI/SKU choice):
  - Transform data away from “time as x-axis”
  - Reshape into panel data / tables where grouping is explicit.

B) Transform time-series telemetry into panel data (table/pivot-style)

Load vs latency
- Collect metrics for:
  - Load variable(s)
  - Latency distribution variable(s)
- Align timestamps only to associate load with latency at the same moments.
- Then:
  - Group latency values by load level
  - Remove time from the final visualization/table
- Result:
  - A table/chart where:
    - X-axis = load
    - Y-axis = latency statistic (e.g., P99)
  - Read off the maximum sustainable load meeting an SLO threshold.
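
The load-vs-latency steps above can be sketched as follows. The timestamps, load values, and latencies are invented; bucketing load into fixed-size bands and taking the worst P99 reading per band are assumptions of this example, chosen to err on the conservative side.

```python
from collections import defaultdict

def latency_by_load(load, latency, bucket_size):
    """Join two {timestamp: value} series on timestamp, then group latency
    readings by load bucket. Time is dropped from the result."""
    groups = defaultdict(list)
    for ts, load_value in load.items():
        if ts in latency:  # keep only moments where both metrics exist
            bucket = int(load_value // bucket_size) * bucket_size
            groups[bucket].append(latency[ts])
    # Assumption: report the worst P99 reading seen in each load bucket.
    return {bucket: max(vals) for bucket, vals in sorted(groups.items())}

def max_sustainable_load(table, slo_ms):
    """Highest load bucket reachable before any bucket violates the SLO."""
    best = None
    for bucket, worst_latency in table.items():  # ascending load order
        if worst_latency > slo_ms:
            break
        best = bucket
    return best

# Hypothetical per-minute readings: load in requests/s, P99 latency in ms.
load = {0: 120, 1: 180, 2: 260, 3: 240, 4: 310, 5: 390, 6: 330, 7: 150}
p99_ms = {0: 20, 1: 22, 2: 35, 3: 31, 4: 48, 5: 95, 6: 55, 7: 21}

table = latency_by_load(load, p99_ms, bucket_size=100)
print(table)
print("sustainable load bucket:", max_sustainable_load(table, slo_ms=50))
```

The resulting table has load on one axis and latency on the other, so the SLO question is answered by a lookup rather than by reading two time series side by side.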
Version regression
- Collect latency metrics and the version label (old vs new).
- Then:
  - Group/aggregate latency distributions by version
- Result:
  - Side-by-side distributions for each version, enabling direct comparison (not “did it change over time?”).
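
A minimal sketch of the version comparison: rows of panel data are grouped by a version label and each group is summarized with a few distribution statistics. The version names, latency values, and the choice of statistics are illustrative assumptions.

```python
import math
from collections import defaultdict
from statistics import median

def nearest_rank(sorted_values, q):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def summarise_by(rows, key, value):
    """Group rows by `key` and summarise the distribution of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    summary = {}
    for name, vals in groups.items():
        vals.sort()
        summary[name] = {
            "n": len(vals),
            "median": median(vals),
            "p90": nearest_rank(vals, 90),
            "max": vals[-1],
        }
    return summary

# Hypothetical request latencies (ms), labelled with the serving version.
old = [10, 12, 11, 13, 12, 40, 11, 12, 10, 13]
new = [11, 12, 12, 14, 13, 90, 12, 13, 11, 95]
rows = [{"version": "v1", "latency_ms": v} for v in old]
rows += [{"version": "v2", "latency_ms": v} for v in new]

summary = summarise_by(rows, key="version", value="latency_ms")
for version, stats in summary.items():
    print(version, stats)
```

In this made-up data the medians are nearly identical while the tail differs sharply, which is the kind of difference a single averaged line over time tends to hide.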
Best ROI / SKU selection
- Collect:
  - Throughput-related metrics (from telemetry)
  - SKU/instance attributes (from telemetry labels)
  - Pricing data (from external sources)
- Then:
  - Mash/join tables into a combined dataset (price, SKU, throughput, etc.)
- Result:
  - A table that answers “which option is best?” directly under the chosen criterion (e.g., token rate per dollar).
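
The join-and-rank step might look like the sketch below. The SKU names, throughput samples, and hourly prices are all hypothetical, and "tokens per dollar" is computed from mean throughput, which is one possible criterion among many.

```python
from statistics import mean

def tokens_per_dollar(throughput_by_sku, price_per_hour):
    """Join telemetry throughput with external pricing and rank SKUs.

    throughput_by_sku: {sku: [tokens/s samples]}  (from telemetry)
    price_per_hour:    {sku: USD per hour}        (from an external source)
    SKUs without a known price are left out (an inner join).
    """
    table = []
    for sku, samples in throughput_by_sku.items():
        if sku not in price_per_hour:
            continue
        rate = mean(samples)
        table.append({
            "sku": sku,
            "mean_tokens_per_s": rate,
            "usd_per_hour": price_per_hour[sku],
            "tokens_per_dollar": rate * 3600 / price_per_hour[sku],
        })
    # Best value first.
    return sorted(table, key=lambda r: r["tokens_per_dollar"], reverse=True)

# Hypothetical inputs.
throughput = {
    "sku-a": [900, 1000, 1100],
    "sku-b": [1900, 2000, 2100],
    "sku-c": [2900, 3000, 3100],
    "sku-d": [500, 600, 700],   # no pricing data available
}
pricing = {"sku-a": 1.0, "sku-b": 2.5, "sku-c": 4.0}

ranked = tokens_per_dollar(throughput, pricing)
for row in ranked:
    print(row)
```

Here the fastest SKU is not the best value once price is joined in, which is only visible after the telemetry leaves the time-series view and is combined with external data.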

Speakers / sources featured

Speaker: Yao Yue (primary presenter)
Referenced sources / quotes (not additional speakers)
- Grace Hopper: “The most dangerous phrase in the language is: ‘We’ve always done it this way.’”
- A meme about reliability engineers (referenced as a cultural idea, not a specific named source)
- Statistics background (local smoothing / regression described as a technique used for decades)
- Economics / pandas reference:
  - Panel-data concept attributed to economics, with the pandas project named for the data-frame/panel-data shape
No other distinct speaker identities are present in the subtitles.