



Research Basic: Mean

What is the Mean?

The mean is the sum of all values in a data set divided by the number of values.


How to find the mean

Suppose we have a sample of 10 individuals, and we've recorded their annual purchase quantities of product brand X as follows:

Individual:        A   B   C   D   E   F   G   H   I   J
Units purchased:   3   4   6   4   5   3   2   4   5   6

The average annual purchase quantity for product brand X can be calculated using the following formula:

Average = (3+4+6+4+5+3+2+4+5+6) / 10 = 4.2

In Excel, this can be calculated with the AVERAGE function: "=AVERAGE(range)".
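The same calculation can be sketched in Python. This is a minimal illustration using only the standard library; the variable names are chosen for this example and do not come from the original report.

```python
from statistics import mean

# Annual purchase quantities of brand X for individuals A through J.
purchases = [3, 4, 6, 4, 5, 3, 2, 4, 5, 6]

# Mean by definition: sum of the values divided by the number of values.
manual_mean = sum(purchases) / len(purchases)

# The standard library's mean() gives the same result.
library_mean = mean(purchases)

print(manual_mean, library_mean)
```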


Notes on the use of averages

While the mean is a useful tool for summarizing data, it is sensitive to extreme values, or outliers. For instance, in the previous example, suppose individual J had purchased an unusually high quantity of 40 units in a year instead of 6.


Individual:        A   B   C   D   E   F   G   H   I   J
Units purchased:   3   4   6   4   5   3   2   4   5  40

 

In this case, the average annual purchase quantity of product brand X would be calculated as follows:

Average = (3+4+6+4+5+3+2+4+5+40) / 10 = 7.6 units

However, if we look at the data again, only individual J purchased 7 or more units annually. The other 9 individuals purchased between 2 and 6 units, so 7.6 units is not representative of a typical member of this group. In such cases, a different measure of central tendency, such as the median or mode, may summarize the data better. Alternatively, if we suspect an error or anomaly in the data, we could use a trimmed mean, which is calculated after excluding a certain percentage of the highest and lowest values.
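To make the comparison concrete, the sketch below computes the mean, the median, and a trimmed mean for the data that includes the 40-unit purchase. The `trimmed_mean` helper is a hypothetical function written for this illustration, and the 10% trimming proportion is an assumption, not a recommendation from the report.

```python
from statistics import mean, median

# Purchase quantities with individual J's unusually high value of 40.
purchases_with_outlier = [3, 4, 6, 4, 5, 3, 2, 4, 5, 40]


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end.

    Hypothetical helper for illustration only.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


plain_mean = mean(purchases_with_outlier)                 # pulled up by the 40
middle_value = median(purchases_with_outlier)             # unaffected by the 40
trimmed = trimmed_mean(purchases_with_outlier, 0.10)      # drops the 2 and the 40

print(plain_mean, middle_value, trimmed)
```

With one value removed from each end, the trimmed mean is based on the remaining eight purchases and lands close to the median, while the plain mean stays at 7.6.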

As shown in the figure below, when the data distribution is skewed, the mean is pulled toward the longer tail, so it should be interpreted with caution.


 

Additionally, when data has a bimodal distribution (one with two distinct peaks), as shown in the figure below, the mean alone summarizes it poorly, because the mean can fall between the two peaks where few observations lie.
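A small numeric sketch of this effect follows. The sample is hypothetical (it is not taken from the report) and was constructed so that values cluster around 2 and around 9.

```python
from statistics import mean, multimode

# Hypothetical bimodal sample: one cluster near 2, another near 9.
bimodal = [1, 1, 2, 2, 2, 3, 8, 9, 9, 9, 10, 10]

sample_mean = mean(bimodal)

# multimode() returns every value tied for the highest count. It recovers
# both peaks here only because 2 and 9 occur equally often in this sample.
peaks = multimode(bimodal)

# Distance from the mean to the nearest actual observation.
closest_gap = min(abs(x - sample_mean) for x in bimodal)

print(sample_mean, peaks, closest_gap)
```

The mean of this sample is 5.5, yet no observation lies within 2.5 units of it, so the mean describes neither cluster.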

 

 

Representative values other than the mean

In addition to the mean, there are other measures of central tendency.

Median: When data is arranged in ascending or descending order, the median is the middle value. If there is an even number of data points, the median is the average of the two middle values.

Mode: The mode is the value that appears most frequently in a data set. 
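Both definitions can be written out directly. The sketch below is one possible implementation; `median_of` and `modes_of` are hypothetical helper names introduced for this example, and Python's standard `statistics` module offers `median` and `multimode` for the same purpose.

```python
from collections import Counter


def median_of(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes_of(values):
    """All values tied for the highest frequency (a data set can have more than one mode)."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


# Original purchase quantities for individuals A through J.
purchases = [3, 4, 6, 4, 5, 3, 2, 4, 5, 6]

purchases_median = median_of(purchases)  # even count: average of the 5th and 6th sorted values
purchases_modes = modes_of(purchases)    # 4 appears three times, more than any other value

print(purchases_median, purchases_modes)
```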



