last updated 12/10/19
These principles are attributed to Edward Tufte.
The notes below show examples of both excellent and problematic graphics, along with summaries of the design goals.
Most visualization tools today demonstrate good design, but they can be abused, so these principles should be understood.
Tell the truth about the data -- above all else show the data
Show data variation, not design variation.
Make large data sets coherent
Reveal the data at several levels of detail from broad overview to the fine structure
Look closely at the baselines of the three charts.
Another similar graphic, found in October 2008:
Numbers have magnitude, and values that are close together should look close together in the graphic.
Perception of area versus magnitude varies from person to person. The perceived area of a circle typically grows more slowly than the actual area:
perceived area = (actual area)^x, where x = 0.8 ± 0.3. So if the actual area is 4, the perceived area ranges from 2 (= 4^0.5) to about 4.6 (= 4^1.1), and if the circle's area grows to 8, the perceived area ranges from about 2.8 (= 8^0.5) to about 9.8 (= 8^1.1). That is, when the area doubles, some viewers see only about a 41% increase while others see about a 114% increase in the same visual.
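The power law above can be checked numerically. The following is a minimal sketch; the function name perceived_area_range and its parameters are illustrative assumptions, not part of Tufte's text.

```python
def perceived_area_range(actual_area, exponent=0.8, spread=0.3):
    """Return (low, high) perceived areas for the power law
    perceived = actual ** x, with x between exponent - spread and exponent + spread."""
    low = actual_area ** (exponent - spread)
    high = actual_area ** (exponent + spread)
    return low, high


for area in (4, 8):
    low, high = perceived_area_range(area)
    print(f"actual area {area}: perceived between {low:.1f} and {high:.1f}")

# Ratio of perceived areas when the actual area doubles, at each extreme exponent.
print(f"doubling looks like x{2 ** 0.5:.2f} to x{2 ** 1.1:.2f}")
```

The last line shows why two viewers can disagree so much about the same pair of circles: the perceived growth depends only on the viewer's exponent, not on the starting area.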
Using two-dimensional objects to represent scalars is inherently misleading, especially if the diameter is made proportional to the scalar. Growth in the scalar value is then perceived as squared, because area grows with the square of the diameter. If you use such objects, be sure the areas are proportional to the values you represent.
Top row: number is proportional to the diameter
Bottom row: number is proportional to the area
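The difference between the two rows can be sketched in a few lines. This is an illustration under assumed data: the values list and the helper names circle_area and diameter_for_value are hypothetical.

```python
import math


def circle_area(diameter):
    """Area of a circle with the given diameter."""
    return math.pi * (diameter / 2.0) ** 2


def diameter_for_value(value, area_per_unit=1.0):
    """Diameter that makes the circle's AREA equal to value * area_per_unit."""
    return 2.0 * math.sqrt(value * area_per_unit / math.pi)


values = [1, 2, 4]  # hypothetical data

# Top-row style: diameter proportional to the number, so area grows with its square.
diameter_scaled = [circle_area(v) for v in values]

# Bottom-row style: area proportional to the number.
area_scaled = [circle_area(diameter_for_value(v)) for v in values]

for v, a_diam, a_area in zip(values, diameter_scaled, area_scaled):
    print(f"value {v}: area {a_diam:.2f} (diameter-scaled), {a_area:.2f} (area-scaled)")
```

With diameter scaling, a value four times larger is drawn with sixteen times the area; with area scaling it is drawn with four times the area.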
Lie factor = size of effect shown in graphic / size of effect in data
So there's a 53% increase in fuel economy, but the line drawn has a 783% increase. The lie factor is 783/53 = 14.8.
The lie factor should be close to 1; a value near 1 means the graphic tells the truth.
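The lie factor calculation is simple enough to write down directly. A minimal sketch, using the two percentages quoted above; the function name lie_factor is an assumption for illustration.

```python
def lie_factor(effect_in_graphic, effect_in_data):
    """Size of effect shown in the graphic divided by size of effect in the data.
    Both arguments must be in the same units (here, percent change)."""
    if effect_in_data == 0:
        raise ValueError("effect in data must be non-zero")
    return effect_in_graphic / effect_in_data


# Fuel economy example: 783% increase drawn, 53% increase in the data.
print(round(lie_factor(783, 53), 1))  # 14.8
```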
There should be consistency across the entire graphic. Expectations are set where you start looking.
Bad example from NY Times.
Keep the axis the same.
Lie factor is 59.4, considering the volume.
Too much texture and vibration in this graphic: a moiré vibration.
Some experts say it is eye-catching and therefore good.
An unintentional Necker illusion: the back planes optically flip to the front.
Cross-hatching is easy to produce, as in these samples:
The background grid is generally classified as junk.
Train schedule in France (the cover of one of Tufte's books).
Here is the French train schedule with lightened grid lines.
...ink = non-white data-pixels. We want a high ratio of data presentation to the pixel/ink used. (White or background pixels are not counted.)
Data-ink ratio = data-ink (pixels used directly for data) / total ink used in the graphic
= proportion of ink devoted to the non-redundant display of information
= 1 - proportion of graphic that can be erased without loss of information
We want a ratio close to 1, as in the electroencephalogram: every pixel represents data, except the labels.
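The ratio can be made concrete with a toy pixel grid. This is only a sketch: the character encoding ('D' for data ink, 'N' for non-data ink, '.' for background) and the sample chart are assumptions made up for illustration, and deciding which pixels count as data in a real graphic is a judgment call.

```python
def data_ink_ratio(pixels):
    """Data-ink ratio for a grid given as a list of strings.
    'D' = data ink, 'N' = non-data ink (axes, frames, decoration), '.' = background.
    Background pixels are not counted."""
    data = sum(row.count("D") for row in pixels)
    non_data = sum(row.count("N") for row in pixels)
    total_ink = data + non_data
    if total_ink == 0:
        raise ValueError("graphic contains no ink")
    return data / total_ink


# Hypothetical 5x4 chart: a heavy axis frame and only two data points.
chart = [
    "N....",
    "N.D..",
    "N..D.",
    "NNNNN",
]
print(data_ink_ratio(chart))  # 0.2
```

Erasing part of the frame (turning 'N' pixels into '.') raises the ratio without losing any information.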
Bad ratio, prediction of voter registration:
Better representation, same data as above.
Tufte principles: maximize the data-ink ratio, and erase non-data ink, within reason.
Some ink needs to be used for labeling and explanation.
A shaded, vertical, labeled bar chart displays the same number up to six times:
An erasure example
Consider the repetition in the train schedule (the red line indicates the repeated portion). Here the wraparound helps the reader see the continuity from late night through morning.
Minimizing the amount of ink or pixels uses the pre-attentive capabilities of our eyes.
The left hand scale in this example serves as the y-axis and the points on the graph serve as the x-axis scale as well as the curve.
Example: a stem and leaf graph where the points themselves again are data.
With a stem and leaf, you get a histogram, and finding medians, quartiles, or percentiles is quickly possible.
A stem and leaf is a good hand-drawn graphic. It works when the digits are fixed-width, because the length of each line then encodes a count.
This particular graphic might be better drawn from high to low, to mimic a typical y-axis orientation.
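A stem and leaf display is easy to generate as text. The sketch below assumes non-negative integers, uses the tens as the stem and the units as the leaf, and relies on a fixed-width font for the output; the function name and sample values are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values, high_to_low=False):
    """Return the lines of a stem and leaf display for non-negative integers.
    Stem = value // 10, leaf = value % 10. Empty stems are kept so that
    gaps in the data stay visible."""
    if not values:
        return []
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    stems = list(range(min(groups), max(groups) + 1))
    if high_to_low:
        stems.reverse()  # largest stem on top, like a y-axis
    return [
        f"{stem:>2} | " + "".join(str(leaf) for leaf in groups.get(stem, []))
        for stem in stems
    ]


sample = [12, 15, 21, 23, 23, 38, 41]  # hypothetical data
print("\n".join(stem_and_leaf(sample, high_to_low=True)))
```

Because each leaf is one character wide, the length of each row doubles as a histogram bar, and the sorted leaves make medians and quartiles easy to count off.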
Devoid of marks:
Varying degrees of marks
Grid spacings: illogical and logical
Data ranges: illogical and logical
Iris example visualizations from XmdvTool (scatterplot matrix, star plot, and parallel coordinates plot)
A graphic should have at least three levels of viewing:
Consider the population densities from "http://upload.wikimedia.org/wikipedia/commons/thumb/9/90/USA-2000-population-density.gif/450px-USA-2000-population-density.gif"
Another graphic showing the GDP percentage over a decade for various countries.
Read it: what do you see?
Here time is removed from the axis so that the relationship between two other dimensions can be seen.
Consider the relationship between the inflation rate and the unemployment rate, with the time axis (z) projected or collapsed.
There are some comparison issues. What can you note?
A common type of graphic in which the x-axis is in units of time.
New York City weather for 1980
Notice the level of detail:
the overall view:
Parallel time-series of three separate measures attempting to show relationships: Playfair's graphic of wheat prices, wages and reigns of royalty.
Minard's map of Napoleon's army march to Russia, again as the classic example
Of course interactive 3D graphics would be appropriate.
Consider collecting together lots of little statistics and breakdowns into a supertable.
Organize number-laden text into tables, and use graphics when the information to be summarized and revealed warrants it.
|Friendly||Unfriendly|
|words are spelled out, with no mysterious encodings||many abbreviations that the reader must decode|
|words run left to right as normal||words run vertically or in several directions|
|little messages help explain the data (use terse phrases)||graphic requires repeated reference to scattered text in a narrative some distance from the graphic|
|labels are placed on the graphic itself, eliminating a separate legend; any legend follows a logical pattern||obscure codings require consulting the legend repeatedly, e.g. elaborately encoded shadings, cross-hatching, and color codes|
|graphic attracts the viewer and provokes curiosity; every visual characteristic has meaning||graphic is full of chartjunk|
|colors are chosen so that viewers with color blindness can make sense of the graphic (blue is best)||design is insensitive to color blindness (red and green)|
|type is clear, using upper and lower case with serifs||type is all caps, sans serif|
Some final suggestions: