A histogram is similar to a bar graph, but instead of categories it displays how frequently the values of a quantitative variable fall into each interval.
The "modality" of a histogram describes where its peaks occur and is described in 3 main categories:
The shape of a histogram describes how its "mass" falls. Generally, the shape can be described as one of the following three categories: left skewed, symmetric, or right skewed.
Numerically, when a histogram is "left skewed", its median will typically be greater than its mean; conversely, if it is "right skewed", its median will typically be smaller than its mean. If the median and mean are approximately the same, the histogram is approximately symmetric.
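The comparison between median and mean can be sketched in a few lines of Python. This is only an illustration: the sample below is hypothetical (it is not data from this page) and was chosen to have one large value forming a long right tail.

```python
from statistics import mean, median

# Hypothetical sample with a long right tail (one unusually large value).
sample = [1, 2, 2, 3, 3, 3, 4, 20]

sample_mean = mean(sample)
sample_median = median(sample)

# Compare the two measures of centre to describe the shape.
if sample_median < sample_mean:
    shape = "right skewed"
elif sample_median > sample_mean:
    shape = "left skewed"
else:
    shape = "symmetric"

print(sample_mean, sample_median, shape)
```

The single large value pulls the mean upward while the median barely moves, which is why the median ends up smaller than the mean here.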
The centre of a histogram is the location where the values usually "cluster".
The mean of a distribution is its long-run average value. Formally, the mean is defined as the expected value, which is calculated using the following formulae.
For a discrete random variable $X$ (e.g. the roll of a die, which can only take the integer values 1 to 6):
$$E[X] = \sum_{x} x \, P(X = x)$$
where the sum is taken over all possible values $x$ of $X$.
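As a rough illustration of the discrete case, the sketch below computes the expected value of a die roll. It assumes a fair six-sided die, so every face has probability 1/6; the names `pmf` and `expected_value` are illustrative only.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expected_value(pmf):
    """Sum each possible value multiplied by its probability."""
    return sum(value * prob for value, prob in pmf.items())

mean_roll = expected_value(pmf)
print(mean_roll)  # 7/2
```

Exact fractions are used instead of floats so the result, 7/2 = 3.5, is not affected by rounding.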
For a continuous random variable $X$ with probability density function $f(x)$:
$$E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$$
The mean of a distribution can be estimated when the exact probabilities of its values are unknown.
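Given a sample survey with $n$ observations $x_1, x_2, \dots, x_n$, the mean can be estimated by the sample mean:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$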
The numbers of hours spent studying by a sample of students are 4, 6, 8, 7, 5. Estimate the mean number of hours students spend studying.
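One way to work this example is to add up the observations and divide by how many there are. A minimal sketch, with `hours` and `sample_mean` as illustrative names:

```python
# Observed hours of study for the five sampled students.
hours = [4, 6, 8, 7, 5]

# Sample mean: sum of the observations divided by the number of observations.
sample_mean = sum(hours) / len(hours)
print(sample_mean)  # 6.0
```

The observations sum to 30, and 30 / 5 = 6, so the estimated mean is 6 hours.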
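The median of a distribution is another method of determining the centre of the given distribution. Given a distribution where the $n$ observations are sorted in ascending order, $x_1 \le x_2 \le \dots \le x_n$, the median is the middle value.
If $n$ is odd, the median is $x_{(n+1)/2}$; if $n$ is even, the median is the average of the two middle values, $\frac{1}{2}\left(x_{n/2} + x_{n/2+1}\right)$.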
Given the numbers 12, 14, 15, 17, 20, 24, 24, 27, 29, find the median.
Given the numbers 12, 14, 15, 17, 20, 24, 24, 27, 29, 30, find the median.
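Both median examples can be checked with a short sketch. The helper `median` below is an illustrative implementation of the usual rule (middle value for an odd count, average of the two middle values for an even count), not a function taken from this page.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two values around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2

nine_values = [12, 14, 15, 17, 20, 24, 24, 27, 29]
ten_values = nine_values + [30]

print(median(nine_values))  # 20
print(median(ten_values))   # 22.0
```

With nine values the fifth one, 20, is the median. With ten values the two middle values are 20 and 24, giving (20 + 24) / 2 = 22.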
The spread of a histogram describes how far most points usually are from the centre of the histogram.
An outlier is a single observation in a histogram that is visibly removed from the main "mass" of observations; in other words, it is unusually far from the centre of the histogram.
Construct a histogram of the following numbers:
175, 192, 207, 212, 213, 214, 218, 225, 229, 230, 231, 235, 235, 237, 240, 240, 242, 248, 250, 253, 257, 260, 265, 265
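A possible way to build this histogram is to group the numbers into equal-width intervals and count how many fall into each. The sketch below assumes an interval width of 10 (the exercise does not specify one), and `BIN_WIDTH` and `bin_counts` are illustrative names. It prints a simple text histogram rather than drawing a chart.

```python
from collections import Counter

values = [175, 192, 207, 212, 213, 214, 218, 225, 229, 230, 231, 235,
          235, 237, 240, 240, 242, 248, 250, 253, 257, 260, 265, 265]

BIN_WIDTH = 10  # Assumption: the exercise does not fix an interval width.

def bin_counts(data, width):
    """Count how many values fall in each interval [start, start + width)."""
    counts = Counter((v // width) * width for v in data)
    first, last = min(counts), max(counts)
    # Include empty intervals so gaps in the data stay visible.
    return {start: counts.get(start, 0)
            for start in range(first, last + width, width)}

histogram = bin_counts(values, BIN_WIDTH)
for start, count in histogram.items():
    # One asterisk per observation in the interval.
    print(f"{start}-{start + BIN_WIDTH - 1}: {'*' * count}")
```

A different interval width would give a coarser or finer picture of the same data, so the choice of 10 here is a judgment call rather than the only valid answer.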