Let's consider an example.1
A manufacturer of metal alloy parts wants to
address customer complaints about non-uniformity in
the melting point of an alloy that the company produces and markets.
Suppose we are given this problem as consultants.
How can we get quantitative measures of the variability (dispersion)
as well as the central tendency (location)
in the melting point?
Collect a sample representative of the population.
First of all, we need to collect data for
the melting point of the alloy parts produced in different batches.
The statistical information we seek concerns the entire
population of alloy parts that are distributed and sold
to customers. However, it is impractical
to measure the melting point of every alloy part
the company produces. Hence, we need to collect a sample
which is representative of the entire population. In order for
the sample to faithfully represent the statistical
properties of the population,
the sample must be selected carefully,
with as little bias as possible. Let's denote such a sample
by X, its size by n, and its
elements by x_1, x_2, ..., x_n.
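One common way to limit selection bias is simple random sampling, in which every part has the same chance of being chosen. The following is a minimal sketch of that idea, not the procedure used to obtain the data in this example. The list population_ids, its size of 10,000 and the fixed seed are hypothetical and serve only as illustration.

```python
import random

# Hypothetical population: serial numbers standing in for the alloy parts.
# The population size (10,000) is an assumption made for illustration only.
population_ids = list(range(1, 10_001))

n = 50  # sample size

# A fixed seed makes the sketch reproducible; it is not required in practice.
rng = random.Random(0)

# Simple random sampling without replacement: each part is equally likely
# to be selected, and no part is selected twice.
sample_ids = rng.sample(population_ids, n)

print(len(sample_ids), "parts selected for measurement")
```

The parts whose serial numbers appear in sample_ids would then be measured, which gives the sample elements x_1, x_2, ..., x_n.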
The following data form a sample of size 50: the
melting points of 50 alloy parts sampled randomly from the production line.
The measurements are rounded to the nearest integer
to match the accuracy of the measuring procedure.
320 326 325 318 322 320 329 317 316 331 320 320 317 329 316
308 321 319 322 335 318 313 327 314 329 323 327 323 324 314
308 305 328 330 322 310 324 314 312 318 313 320 324 311 317
325 328 319 310 324
We will use this example to illustrate the key ideas and concepts of elementary statistical analysis.
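As a first look at the question posed at the start, the sample can be loaded into a short Python script and summarized. The sketch below uses only the standard library. The choice of summaries (mean and median for location; sample variance, standard deviation and range for dispersion) is an assumption about which measures are of interest, and the variable names are illustrative.

```python
import statistics

# Melting points of the 50 sampled alloy parts, as listed in the text.
melting_points = [
    320, 326, 325, 318, 322, 320, 329, 317, 316, 331, 320, 320, 317, 329, 316,
    308, 321, 319, 322, 335, 318, 313, 327, 314, 329, 323, 327, 323, 324, 314,
    308, 305, 328, 330, 322, 310, 324, 314, 312, 318, 313, 320, 324, 311, 317,
    325, 328, 319, 310, 324,
]

n = len(melting_points)  # sample size

# Measures of central tendency (location).
mean = statistics.mean(melting_points)
median = statistics.median(melting_points)

# Measures of variability (dispersion).
# statistics.variance and statistics.stdev use the n - 1 denominator.
variance = statistics.variance(melting_points)
std_dev = statistics.stdev(melting_points)
data_range = max(melting_points) - min(melting_points)

print(f"n = {n}")
print(f"mean = {mean:.2f}, median = {median}")
print(f"variance = {variance:.2f}, std dev = {std_dev:.2f}, range = {data_range}")
```

Using the n - 1 denominator treats the 50 values as a sample from a larger population, which matches the setting described above. statistics.pvariance would instead treat them as the whole population.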