
Data Visualisation

Aakash Pahwa edited this page Dec 2, 2018 · 2 revisions

Introduction

A good visualization is not one that tries to convey as much data as possible, but one that is simple. We follow a few principles so that the data is communicated effectively.

Data-Ink Ratio

" Remove to Improve "


One of Tufte’s key principles is that good graphics present their message as simply as possible. To turn this notion of simplicity into something practical, he defined the “data-ink ratio”.

Definition 1:

Data-ink: the non-erasable core of a graphic.

Definition 2:

Data-ink ratio =

1. data-ink divided by the total ink used to print the graphic.

2. the proportion of a graphic’s ink devoted to the non-redundant display of data information.

3. One minus the proportion of a graphic that can be erased without loss of data information.
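The first and third formulations can be checked against each other with a small calculation. The sketch below assumes the ink in a graphic can be measured in some unit (for example, a count of drawn pixels). The function names and the numbers are hypothetical and only illustrate the arithmetic.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Formulation 1: data-ink divided by the total ink in the graphic."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


def data_ink_ratio_from_erasable(erasable_ink: float, total_ink: float) -> float:
    """Formulation 3: one minus the proportion of ink that can be erased."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= erasable_ink <= total_ink:
        raise ValueError("erasable_ink must lie between 0 and total_ink")
    return 1 - erasable_ink / total_ink


# Hypothetical measurements: 1000 units of ink in total, 250 of which carry data.
total = 1000
data = 250

print(data_ink_ratio(data, total))                        # 0.25
print(data_ink_ratio_from_erasable(total - data, total))  # 0.25
```

Both calls describe the same graphic, so they give the same ratio. Erasing non-data ink lowers the total and moves the ratio towards 1.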

Because what counts as the non-erasable portion of a graph is subjective, Tufte follows up his principle of data-ink with the Five Laws of Data-Ink:

  • Above all else show the data.
  • Maximize the data-ink ratio.
  • Erase non-data-ink.
  • Erase redundant data-ink.
  • Revise and edit.

Maximize Data-Ink Ratio

Qualities of a Great Visualization

  • Truthfulness - Be aware of the choices you make when cleaning, summarizing and manipulating data. Make sure that you are neither misleading yourself (self-deception) nor your audience (lie factor).
  • Functionality - Functional graphs with interactive features are a major plus.
  • Beauty - An attractive, self-explanatory graphic draws an audience.
  • Insightful
  • Enlightening - A combination of the previous four, but with a social and ethical responsibility.

Graphical Integrity

Visual representations of data must tell the truth.

Tufte shows a whole range of graphs that either overstate or understate the effects in the data.

He quantifies this with a graph’s Lie Factor: the size of the effect shown in the graphic divided by the size of the effect in the data.

If the Lie Factor is greater than 1, the graph overstates the effect; if it is less than 1, the graph understates it.
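As a minimal sketch of the Lie Factor calculation, assume the size of an effect is measured as the relative change between two values. The function names and the chart measurements below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(graphic_first: float, graphic_second: float,
               data_first: float, data_second: float) -> float:
    """Size of the effect shown in the graphic divided by the size of the effect in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data show no effect, so the Lie Factor is undefined")
    return effect_size(graphic_first, graphic_second) / data_effect


# Hypothetical chart: the data rise from 100 to 150 (a 50% increase),
# but the bars drawn for them grow from 2 cm to 6 cm (a 200% increase).
factor = lie_factor(graphic_first=2, graphic_second=6,
                    data_first=100, data_second=150)
print(factor)  # 4.0
```

A factor of 4 means the graphic shows an effect four times larger than the one in the data. A graphic drawn in proportion to the data would give a factor of 1.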

Tufte goes on to list the following principles of graphical integrity:

  • The representation of numbers, as physically measured on the surface of the graph itself, should be directly proportional to the numerical quantities represented.
  • Clear, detailed and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graph itself. Label important events in the data.
  • Show data variation, not design variation.
  • In time-series displays of money, deflated and standardized units of monetary measurement are nearly always better than nominal units.
  • The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.
  • Graphics must not quote data out of context.
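The principle on monetary time series can be illustrated with a short calculation. The sketch below assumes a price index is available for every year in the series and rescales nominal amounts to the prices of a chosen base year. The function name, the spending figures and the index values are all hypothetical.

```python
def to_real_units(nominal: dict, price_index: dict, base_year: int) -> dict:
    """Convert nominal amounts into constant base-year monetary units."""
    base = price_index[base_year]
    return {
        year: amount * base / price_index[year]
        for year, amount in nominal.items()
    }


# Hypothetical data: nominal spending grows by 50%,
# but prices rise by 50% over the same period.
nominal_spending = {2000: 100.0, 2010: 150.0}
price_index = {2000: 80.0, 2010: 120.0}

real_spending = to_real_units(nominal_spending, price_index, base_year=2010)
print(real_spending)  # {2000: 150.0, 2010: 150.0}
```

Plotted in nominal units, this series would suggest strong growth. In deflated units it is flat, which is the effect actually present in the data.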

Further Reading