Labels, Scales, and Customization
Learn how to customize plots with clear labels and titles. Control scales and axes. Format plots for professional presentations and publications.
- Add clear, informative labels to plots
- Customize axis titles and legends
- Control scale transformations and limits
- Format tick labels and numbers
- Create publication-ready visualizations
- Apply accessibility best practices
- How do I add titles and labels to my plots?
- How do I control axis scales and limits?
- How do I format numbers and dates on axes?
- How do I make my plots presentation-ready?
The Importance of Clear Labels
A visualization without proper labels is like a research paper without citations: the data might be there, but the message is unclear. Consider your audience:
- Policymakers need context and clear units
- Community members need plain language
- Colleagues need technical precision
- Everyone needs to understand what they’re looking at
Setting Up
import seaborn as sns
import seaborn.objects as so
import pandas as pd
import numpy as np
# Load data
penguins = sns.load_dataset("penguins").dropna()
Adding Labels with .label()
The .label() method adds titles and axis labels to your plot:
# Basic plot without labels - not ready to share!
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g", color="species")
.add(so.Dot())
)
# Now with proper labels - ready for presentation!
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g", color="species")
.add(so.Dot())
.label(
title="Relationship Between Flipper Length and Body Mass in Penguins",
x="Flipper Length (mm)",
y="Body Mass (g)",
color="Species"
)
)
Components of .label()
The .label() method accepts several arguments:
- title: Main plot title
- x: X-axis label
- y: Y-axis label
- color: Legend title for the color aesthetic
- pointsize: Legend title for the size aesthetic
- marker: Legend title for the shape aesthetic
# Complex plot with multiple aesthetics labeled
(
so.Plot(
penguins,
x="bill_length_mm",
y="bill_depth_mm",
color="species",
pointsize="body_mass_g"
)
.add(so.Dot(alpha=0.6))
.label(
title="Penguin Bill Dimensions by Species and Body Mass",
x="Bill Length (mm)",
y="Bill Depth (mm)",
color="Penguin Species",
pointsize="Body Mass (g)"
)
)
Multi-line Titles and Labels
For longer titles, use \n for line breaks:
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g", color="species")
.add(so.Dot())
.label(
title="Impact of Flipper Length on Body Mass:\nAnalysis of Three Penguin Species in Antarctica",
x="Flipper Length (mm)",
y="Body Mass (g)"
)
)
Controlling Scales with .scale()
The .scale() method controls how data values map to visual properties.
Continuous Scales
# Default linear scale
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g")
.add(so.Dot())
.label(title="Linear Scale (Default)")
)
# Logarithmic scale (useful for data spanning orders of magnitude)
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g")
.add(so.Dot())
.scale(y="log")
.label(title="Logarithmic Y-axis")
)
Setting Axis Limits
Control the range of your axes. In seaborn.objects, axis limits are set with the .limit() method rather than .scale():
# Zoom in on a specific range
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g", color="species")
.add(so.Dot())
.limit(
x=(170, 230), # Only show flipper lengths between 170-230mm
y=(2500, 6500) # Only show body mass between 2500-6500g
)
.label(
title="Penguin Measurements (Zoomed)",
x="Flipper Length (mm)",
y="Body Mass (g)"
)
)
Cutting off parts of your data range can be misleading! Always:
- Indicate if you’ve zoomed in
- Ensure you’re not hiding important patterns
- Consider starting axes at zero for bar charts
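One way to act on the first two points is to count how many observations the chosen limits hide and to say so in the title. This is a minimal sketch: the data are hypothetical, and count_outside is an illustrative helper, not part of seaborn or pandas.

```python
import pandas as pd


def count_outside(df, column, limits):
    """Return how many non-missing values in `column` fall outside (low, high)."""
    low, high = limits
    values = df[column].dropna()
    return int(((values < low) | (values > high)).sum())


# Hypothetical measurements, used only for illustration
demo = pd.DataFrame({"flipper_length_mm": [165, 172, 181, 195, 210, 229, 231]})

n_hidden = count_outside(demo, "flipper_length_mm", (170, 230))
title = f"Penguin Measurements (Zoomed; {n_hidden} points outside x-range not shown)"
print(title)
```

If the count is large, the zoom is probably hiding a pattern that readers should see.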
Color Scales
Control color palettes:
# Using a different color palette
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g", color="species")
.add(so.Dot(pointsize=8))
.scale(color="colorblind") # Colorblind-friendly palette
.label(
title="Colorblind-Friendly Palette",
x="Flipper Length (mm)",
y="Body Mass (g)",
color="Species"
)
)
Common palettes for categorical data:
- "colorblind": Safe for colorblind viewers
- "deep": Seaborn default
- "pastel": Softer colors
- "dark": Darker tones
For continuous data:
- "viridis": Perceptually uniform
- "rocket": Sequential
- "coolwarm": Diverging
Formatting Tick Labels
Custom Tick Locations
# Specify exactly which ticks to show
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g")
.add(so.Dot())
.scale(
x=so.Continuous().tick(at=[180, 200, 220]),
y=so.Continuous().tick(at=[3000, 4000, 5000, 6000])
)
)
Number Formatting
For research data with different units:
# Create income data example
np.random.seed(42)
income_data = pd.DataFrame({
'education_years': np.random.uniform(0, 20, 100),
'annual_income': np.random.uniform(50000, 500000, 100),
'county': np.random.choice(['Nairobi', 'Kisumu', 'Mombasa'], 100)
})
# Format as thousands with 'K' suffix
from matplotlib.ticker import FuncFormatter
(
so.Plot(income_data, x="education_years", y="annual_income", color="county")
.add(so.Dot(alpha=0.6))
.scale(
# Show y ticks as thousands, e.g. 200000 -> 200K
y=so.Continuous().label(formatter=FuncFormatter(lambda value, pos: f"{value / 1000:.0f}K"))
)
.label(
title="Income by Education Level Across Counties",
x="Years of Education",
y="Annual Income (KSh)",
color="County"
)
)
Real Research Example: Program Impact
Let’s create a complete, publication-ready visualization:
# Create realistic program evaluation data
np.random.seed(123)
periods = [0, 6, 12, 18, 24] # Months
treatment_baseline = 45
control_baseline = 46
program_data = pd.DataFrame({
'month': periods * 2,
'outcome': (
[control_baseline, 47, 48, 49, 50] + # Control group
[treatment_baseline, 50, 55, 60, 65] # Treatment group
),
'lower_ci': (
[43, 44, 45, 46, 47] +
[43, 47, 52, 57, 62]
),
'upper_ci': (
[47, 50, 51, 52, 53] +
[47, 53, 58, 63, 68]
),
'group': ['Control'] * 5 + ['Treatment'] * 5
})
# Create publication-ready plot
(
so.Plot(program_data, x="month", color="group")
.add(so.Band(alpha=0.2), ymin="lower_ci", ymax="upper_ci")
.add(so.Line(linewidth=2.5), y="outcome")
.add(so.Dot(pointsize=10), y="outcome")
.scale(
x=so.Continuous().tick(at=[0, 6, 12, 18, 24]),
color=so.Nominal(["#E69F00", "#56B4E9"]) # Custom colors
)
.label(
title="Impact of Agricultural Training Program on Crop Yields\nRandomized Controlled Trial: 2021-2023",
x="Months Since Baseline",
y="Average Yield (bags per acre)",
color="Group"
)
)
This plot is ready for:
- Stakeholder presentations
- Reports
- Academic papers
- Policy briefs
Best Practices for Labels
1. Be Specific with Units
❌ Bad: x="Income"
✅ Good: x="Monthly Income (KSh)"
❌ Bad: y="Distance"
✅ Good: y="Distance to Health Facility (km)"
2. Use Plain Language
❌ Bad: title="DV regressed on IV controlling for confounds"
✅ Good: title="Relationship Between Education and Income, Controlling for Age"
3. Provide Context
❌ Bad: title="Survey Results"
✅ Good: title="Household Food Security Survey Results: Kisumu County, 2023"
4. Capitalize Appropriately
- Title case for titles: “Impact of Agricultural Program”
- Sentence case for axes: “Years of education”
- Be consistent throughout
Accessibility Guidelines
Color Contrast
# Check your palette is colorblind-friendly
(
so.Plot(penguins, x="flipper_length_mm", y="body_mass_g", color="species")
.add(so.Dot(pointsize=8))
.scale(color="colorblind")
.label(
title="Colorblind-Safe Visualization",
x="Flipper Length (mm)",
y="Body Mass (g)"
)
)
Combine Visual Cues
Use multiple aesthetics for critical distinctions:
# Color + shape for maximum accessibility
(
so.Plot(
penguins,
x="bill_length_mm",
y="bill_depth_mm",
color="species",
marker="species" # Shape also encodes species
)
.add(so.Dot(pointsize=8))
.label(
title="Penguin Bill Measurements (Accessible Design)",
x="Bill Length (mm)",
y="Bill Depth (mm)",
color="Species"
)
)
Exercises
Create a plot with the penguins data that:
- Shows flipper length vs. body mass
- Uses color for species and size for bill length
- Has clear, descriptive labels for ALL aesthetics
- Has an informative title
# Your code here
(
so.Plot(
penguins,
x="flipper_length_mm",
y="body_mass_g",
color="species",
pointsize="bill_length_mm"
)
.add(so.Dot(alpha=0.6))
.label(
title="Penguin Morphology: Flipper Length, Body Mass, and Bill Length by Species",
x="Flipper Length (mm)",
y="Body Mass (g)",
color="Species",
pointsize="Bill Length (mm)"
)
)
Create sample data for household income (which often follows a log-normal distribution):
income_data = pd.DataFrame({
'households': range(100),
'income': np.random.lognormal(10, 1, 100)
})
Create two plots:
- One with a linear y-axis
- One with a logarithmic y-axis
Which is more effective for showing the distribution?
# Linear scale
(
so.Plot(income_data, x="households", y="income")
.add(so.Dot())
.label(
title="Household Income (Linear Scale)",
x="Household ID",
y="Annual Income (KSh)"
)
)
# Log scale
(
so.Plot(income_data, x="households", y="income")
.add(so.Dot())
.scale(y="log")
.label(
title="Household Income (Log Scale)",
x="Household ID",
y="Annual Income (KSh, log scale)"
)
)
The log scale is often more effective for income data because it:
- Shows relative differences more clearly
- Prevents a few high values from compressing the rest
- Makes the distribution easier to interpret
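The compression effect can be checked numerically. This sketch measures how much of the axis span the bottom 90% of households occupy on each scale; axis_share_below is an illustrative helper, and the seed and sample are assumptions rather than the exercise's own data.

```python
import numpy as np

# Log-normal sample similar in shape to the exercise data (seed chosen arbitrarily)
rng = np.random.default_rng(42)
income = rng.lognormal(mean=10, sigma=1, size=100)


def axis_share_below(values, q=0.9, log=False):
    """Fraction of the axis span (min to max) covered by values up to quantile q."""
    v = np.log10(values) if log else values
    return (np.quantile(v, q) - v.min()) / (v.max() - v.min())


linear_share = axis_share_below(income)
log_share = axis_share_below(income, log=True)
print(f"Linear axis: bottom 90% of households use {linear_share:.0%} of the axis")
print(f"Log axis:    bottom 90% of households use {log_share:.0%} of the axis")
```

On the linear axis most households are squeezed into a small part of the plot, while the log axis spreads them across more of the available space.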
Imagine you’re preparing a plot for a policy brief on education outcomes. Create a plot that shows test scores across different schools with these requirements:
- Create sample data for 5 schools with 20 students each
- Use appropriate labels with units
- Use a colorblind-friendly palette
- Add an informative title that includes location and year
- Make sure legend labels are clear
# Create data
np.random.seed(42)
schools = ['School A', 'School B', 'School C', 'School D', 'School E']
test_data = pd.DataFrame({
'school': np.repeat(schools, 20),
'score': np.concatenate([
np.random.normal(75, 10, 20), # School A
np.random.normal(68, 12, 20), # School B
np.random.normal(82, 8, 20), # School C
np.random.normal(71, 11, 20), # School D
np.random.normal(79, 9, 20) # School E
])
})
# Create publication-ready plot
(
so.Plot(test_data, x="school", y="score", color="school")
.add(so.Bar(alpha=0.7), so.Agg()) # Bar height is the mean score per school
.add(so.Dot(alpha=0.3), so.Jitter(0.3)) # Individual student scores
.scale(color="colorblind")
.limit(y=(0, 100)) # Test scores range from 0-100
.label(
title="Primary School Mathematics Test Scores: Nairobi County, 2023",
x="School",
y="Test Score (out of 100)",
color="School"
)
)
Quick Reference: Customization Methods
| Method | Purpose | Example |
|---|---|---|
| .label() | Add titles and labels | .label(title="My Title", x="X Label") |
| .scale() | Control scales | .scale(y="log", color="colorblind") |
| .limit(x=(min, max)) | Set axis limits | .limit(x=(0, 100)) |
| .scale(color=palette) | Set color palette | .scale(color="viridis") |
- Always label your plots with title, axis labels, and legend titles
- Include units in axis labels (mm, KSh, %, etc.)
- Use .label() to add all text elements
- Use .scale() to control transformations and palettes, and .limit() to set axis limits
- Consider logarithmic scales for data spanning orders of magnitude
- Use colorblind-friendly palettes for accessibility
- Combine multiple visual cues (color + shape) for critical distinctions
- Write labels for your audience - use plain language
- A good plot should be understandable without additional explanation
In the next lesson, we’ll learn about faceting - creating multiple subplots to compare across categories. We’ll also explore how to layer multiple marks and create complex multi-panel figures for comprehensive data stories.