Box and Whisker Plot|Definition & Meaning

Definition

A box and whisker plot is a figure that represents the first, second, and third quartiles of a data set as a box, with lines extending from its sides to the lowest and highest values.

A box and whisker plot is a graph that summarizes the distribution of numerical data with a box and two lines (also known as whiskers) extending from it. It shows how the values in a data set vary. A histogram can also depict a distribution, but a box and whisker plot marks the median and quartiles explicitly and allows multiple data sets to be displayed on the same graph. An example is shown below:

Figure 1: Example of Box and Whisker Plot

Box and whisker plots are very effective at visually summarizing data from various sources on a single graph. As such, these plots allow you to compare data from different categories easily, leading to efficient decision-making.

Some Real World Applications

When you have many data sets from different sources that are connected in some manner, consider box and whisker graphs. Here are several examples from the real world where they may prove helpful:

(a) Compiling the results of students from different institutions or for different courses.

(b) Suppose you suggest a modification to an industrial plant or process. Box and whisker plots can be used to compare production before and after the change.

(c) Comparing different characteristics of a mechanical system.

(d) Comparing data from comparable devices that are expected to yield similar results.

There are many other such applications that can be listed.

Statistical Information Inside a Box and Whisker Plot

The box and whisker plot shows the five summary statistics of the given numerical data.

(a) Lowest Value (Minimum)

(b) Median

(c) Highest Value (Maximum)

(d) Lower Quartile

(e) Upper Quartile

Consequently, a box and whisker plot can be constructed from the five statistics listed above. A thorough understanding of these statistics is a prerequisite for learning box and whisker plots. Let's go through them one by one.

(a) Lowest Value (Minimum)

The numerically smallest value in the given data set or population. It is simply the minimum of the data.

(b) Median

If the given data is sorted in ascending order of numerical magnitude, then the median is the value in the center of the sorted list. For an odd number of samples, it is the single middle value. For an even number of samples, it is the arithmetic mean of the two middle values.
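The odd and even cases can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name `median` and the sample inputs are chosen here for the example only.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # the median is defined on sorted data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: arithmetic mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))      # 2
print(median([4, 1, 3, 2]))   # 2.5
```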

(c) Highest Value (Maximum)

The numerically largest value in the given data set or population. It is simply the maximum of the data.

(d) Lower Quartile

If the given data is sorted in ascending order of numerical magnitude, then the lower quartile is the value below which the lowest 25% of the data fall. This lowest 25% of the data is also called the lower tail. These values are not necessarily outliers.

(e) Upper Quartile

If the given data is sorted in ascending order of numerical magnitude, then the upper quartile is the value above which the highest 25% of the data fall. This highest 25% of the data is also called the upper tail. These values are not necessarily outliers.

Construction of Box and Whisker Plot

Constructing a box and whisker plot seems simple and intuitive at first sight, but it can be confusing for students who are not familiar with statistics or not comfortable with graphs. The following paragraphs explain how to construct a box and whisker plot from given data, using the example data below:

Given Data = { 20, 50, 40, 30, 60, 90, 80, 70, 10 }

The first step is to sort all the data points in ascending order of numerical magnitude. The resulting data sequence looks as follows:

Sorted Data = { 10, 20, 30, 40, 50, 60, 70, 80, 90 }

The second step is to find the Lowest Value (Minimum), Median, Highest Value (Maximum), Lower Quartile, and Upper Quartile. Here the Lower Quartile and Upper Quartile are the medians of the values below and above the median, { 10, 20, 30, 40 } and { 60, 70, 80, 90 } respectively. For the data sequence above, these values are listed below:

Lowest Value (Minimum) = 10

Median = 50

Highest Value (Maximum) = 90

Lower Quartile = 25

Upper Quartile = 75
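The values above can be reproduced with a short Python sketch. It assumes that each quartile is the median of one half of the sorted data, with the overall median left out of both halves when the number of values is odd. That convention is inferred from the numbers in this example; other quartile conventions exist and can give different results. The variable names are illustrative.

```python
from statistics import median

data = sorted([20, 50, 40, 30, 60, 90, 80, 70, 10])
half = len(data) // 2

# Assumed convention: with an odd number of values, the median itself
# is left out of both halves.
lower_half = data[:half]
upper_half = data[half + 1:] if len(data) % 2 else data[half:]

summary = {
    "minimum": data[0],
    "lower_quartile": median(lower_half),   # median of 10, 20, 30, 40
    "median": median(data),
    "upper_quartile": median(upper_half),   # median of 60, 70, 80, 90
    "maximum": data[-1],
}
print(summary)
```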

The third step is to mark the Lowest Value (Minimum), Median, Highest Value (Maximum), Lower Quartile, and Upper Quartile on a chart as vertical bars (for a horizontal box and whisker plot), as shown in the figure below:

Figure 2: Marking the Lowest Value (Minimum), Median, Highest Value (Maximum), Lower Quartile and Upper Quartile on the chart

The fourth step is to construct the box by joining the Lower Quartile and Upper Quartile bars, as shown in the figure below:

Figure 3: Constructing the Box using Lower Quartile and Upper Quartile Bars

The fifth and final step is to construct the whiskers by joining the centers of the minimum and maximum value bars to the lower and upper quartile bars respectively, as shown in the figure below:

Figure 4: Constructing the Whiskers
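Steps three to five can also be imitated in plain text. The following Python sketch is only a rough illustration, not the method used to draw the figures: the function name `ascii_box_plot`, the `width` parameter, and the characters used for the bars, box, and whiskers are all choices made for this example.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, width=41):
    """Render a horizontal box and whisker plot as one line of text."""
    span = maximum - minimum
    if span <= 0:
        raise ValueError("maximum must be greater than minimum")

    def col(value):
        # Map a data value to a column index between 0 and width - 1.
        return round((value - minimum) / span * (width - 1))

    row = [" "] * width
    # Whiskers: a line running from the minimum to the maximum.
    for i in range(col(minimum), col(maximum) + 1):
        row[i] = "-"
    # Box: drawn over the whisker line between the two quartiles.
    for i in range(col(q1), col(q3) + 1):
        row[i] = "="
    # Vertical bars at the five summary statistics.
    for value in (minimum, q1, median, q3, maximum):
        row[col(value)] = "|"
    return "".join(row)


print(ascii_box_plot(10, 25, 50, 75, 90))
# |-------|===========|===========|-------|
```

The code draws the whisker line first and the box on top of it, which is the reverse of the order in the steps above but gives the same picture.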

This five-step process is a complete way of constructing a box and whisker plot. A numerical problem follows for further understanding.

Numerical Problems Related to Box and Whisker Plot

Construct a box and whisker plot for the following datasets containing marks of nine students in two different subjects:

Science = { 80, 50, 54, 70, 60, 82, 87, 75, 55 }

Mathematics = { 70, 80, 95, 80, 55, 80, 66, 88, 60 }

Solution

Sorting the given data sets:

Science = { 50, 54, 55, 60, 70, 75, 80, 82, 87 }

Mathematics = { 55, 60, 66, 70, 80, 80, 80, 88, 95 }

Calculating the statistical values for the Science data:

Lowest Value (Minimum) = 50

Median = 70

Highest Value (Maximum) = 87

Lower Quartile = 54.5

Upper Quartile = 81

Calculating the statistical values for the Mathematics data:

Lowest Value (Minimum) = 55

Median = 80

Highest Value (Maximum) = 95

Lower Quartile = 63

Upper Quartile = 84
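These results can be cross-checked with Python's standard library. The sketch below uses `statistics.quantiles` with `method="exclusive"` (available in Python 3.8 and later). For these nine-value data sets that method happens to give the same quartiles as the values listed above, although it is not guaranteed to agree for every data set size. The dictionary names are illustrative.

```python
from statistics import quantiles

marks = {
    "Science": [80, 50, 54, 70, 60, 82, 87, 75, 55],
    "Mathematics": [70, 80, 95, 80, 55, 80, 66, 88, 60],
}

summaries = {}
for subject, values in marks.items():
    # quantiles() sorts the data itself and returns the three cut points
    # that divide it into four groups.
    q1, med, q3 = quantiles(values, n=4, method="exclusive")
    summaries[subject] = (min(values), q1, med, q3, max(values))
    print(subject, summaries[subject])
```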

Constructing the box and whisker plots for the students' marks in Mathematics and Science:

Figure 5: Box and Whisker Plot of Students’ Marks in Mathematics and Science Subjects

All mathematical drawings and images were created with GeoGebra.
