LESSON 1 INTRODUCTION


Objectives:



At the end of the lesson, the students should be able to:
   1. Define the terms used in statistics.
   2. Determine the sample size from a given population.
   3. Categorize variables into nominal, ordinal, interval and ratio.

For many centuries, records of people's possessions and belongings were written mostly on tree bark, cave walls, and animal skins. Leaders of the past, such as kings and queens, also kept records of the manpower and other resources available in their kingdoms.

The Bible mentions many records of the Israelites, including the number of people in each tribe. For example, Numbers 1:21 gives the count for the tribe of Reuben: "...the tribe of Reuben, were counted forty and six thousand and five hundred."

In government, lists of deaths, births, and marriages, and of the number of military personnel, were also recorded. These data provided the information needed to run the affairs of the government.

Today, statistics is used not only in government but also in organizations all over the world.

For example:

  1. The census of births, deaths, and marriages is kept by the NSO.
  2. In hospitals, the records of patients who are admitted and discharged are tallied.
  3. In laboratories, results such as blood sugar, cholesterol, and hemoglobin levels are kept.
  4. In the academe, registrars keep the records of students.
  5. The PRC keeps a listing of the number of board passers in every profession.
  6. In sports, records of all the players are kept in order to identify the best team or the best player of the day and of the year.
  7. In industry, companies interpret their own sales data and records to tell whether their products are doing well in the market.
  8. In sports like basketball, there are tallies such as assists per game (apg), points per game (ppg), and rebounds per game (rpg).
  9. In supermarkets, there are records such as the sales of the day, existing stocks, and new stocks.

STATISTICS is defined as a method to collect, organize, and analyze data and to draw conclusions from them. As defined by Ferguson (1989), statistics deals with the collection, classification, description, and interpretation of data obtained by the conduct of surveys and experiments. Its essential purpose is to describe and draw inferences about the numerical properties of populations.

The collection, organization, and analysis of data and the drawing of conclusions follow definite steps and formulas. This makes statistics a science.

There are two branches of statistics: descriptive statistics and inferential statistics.
1. Descriptive statistics refers to the methods used to collect, organize, summarize, and present data, usually to make the data easier to understand.
2. Inferential statistics consists of generalizing from samples to populations, performing hypothesis tests, determining relationships among variables, and making predictions.
Statistics is a very important subject for the following reasons.
1. STATISTICS provides us with ways and means of expressing our thoughts in the most definite and exact way feasible. Through statistics, we can compare which of two quantities is smaller and which is bigger.
2. STATISTICS can help us describe the size or magnitude of a given item. For example, the height of the tallest student is 5 feet 7 inches, and the height of the shortest student is exactly 5 feet.
3. STATISTICS enables us to express the results of research activities in a meaningful way. Data can be presented in graphs or figures so that their meaning is better understood.
4. STATISTICS allows us to draw inferences or conclusions. We can reject or accept a null hypothesis through statistics. For example, we can accept that there is no significant difference between the lecture method and the demonstration method.
5. STATISTICS enables us to predict the consequences of a certain phenomenon. There are statistical formulas that can be used to predict the occurrence of future phenomena.
6. STATISTICS enables us to determine the probable causes or reasons of an outcome. Through statistics, we can identify the reasons for one's success or failure.

In statistics, accuracy is very important. Computed answers must be close to the true value so that the message the data convey is properly interpreted and applied. That is why we need to understand the meaning of each formula and how it is applied in computation.

The following terms are very important in studying statistics.
Variable. A characteristic that can take different values from one individual or observation to another.
Data are the values (measurements or observations) that variables take.
Qualitative data are data that can be placed in categories, like gender.
Quantitative data are numerical data that can be ordered and ranked.

Random sampling is a sampling technique where we select a group of subjects (a sample) for study from a larger group (a population). Each individual is chosen entirely by chance and each member of the population has a known, but possibly non-equal, chance of being included in the sample.

An experiment is any process or study that results in the collection of data, the outcome of which is unknown. In statistics, the term is usually restricted to situations in which the researcher has control over some of the conditions under which the experiment takes place.

A population is any entire collection of people, animals, plants or things from which we may collect data. It is the entire group we are interested in, which we wish to describe or draw conclusions about.

A sample is a group of units selected from a larger group (the population). By studying the sample it is hoped to draw valid conclusions about the larger group.

Now let us look at the four levels of measurement to have a better understanding of our subject. These are:
1. Nominal. This is the type of measurement where the data can be classified into groups but no order or rank can be established. For example: gender, marital status, blood type.

In the nominal level of measurement, 1 is not higher than 0 and 0 is not higher than 1. The numbers serve only as labels for the categories. For the variable gender, 1 is usually given to males and 0 to females.
2. Ordinal. This is the type of measurement where the data can be ordered or ranked, but a precise difference between the levels cannot be determined. Items that can be ranked belong to this level of measurement. For example: socio-economic status, academic rank, salary grade, etc.

Socio-Economic Status   Rank      Academic Rank      Rank
Above Average           4         Professor          4
Average                 3         Asst. Professor    3
Below Average           2         Instructor         2
Poor                    1         Asst. Instructor   1

3. Interval. The data can be ordered and there is an exact difference between any two units, but there is no true zero point. For example: IQ test results, grade in science, temperature reading, etc.
4. Ratio. This is the highest level of measurement. The data at this level can be ordered, have an exact difference between units, and have a true zero point, so that ratios between values are meaningful.

Things that are counted are usually at the ratio level. Examples for this level of measurement include area, speed, velocity, weight, age, etc.
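The four levels can be sketched in code as an ordered scale, where each higher level allows everything the lower levels allow. This is only an illustration: the names Level, LEVEL_OF, can_be_ranked, and has_exact_differences are made up for this sketch, and the variables are examples taken from the lesson.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four levels of measurement, from lowest to highest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Example variables from the lesson and the level each belongs to.
LEVEL_OF = {
    "gender": Level.NOMINAL,
    "blood type": Level.NOMINAL,
    "academic rank": Level.ORDINAL,
    "temperature reading": Level.INTERVAL,
    "weight": Level.RATIO,
}


def can_be_ranked(variable):
    """Ordering is meaningful from the ordinal level upward."""
    return LEVEL_OF[variable] >= Level.ORDINAL


def has_exact_differences(variable):
    """Exact differences between units exist from the interval level upward."""
    return LEVEL_OF[variable] >= Level.INTERVAL


for name, level in LEVEL_OF.items():
    print(name, level.name.lower(),
          "ranked:", can_be_ranked(name),
          "exact differences:", has_exact_differences(name))
```

For instance, academic rank can be ranked but has no exact differences between ranks, while weight has both properties.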

The mean or average is one of several indices of central tendency that statisticians use to indicate the point on the scale of measures where the population is centered.

The mean is the average of the scores in the population. Numerically, it equals the sum of the scores divided by the number of scores.
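The computation of the mean can be sketched in a few lines of Python. The scores below are invented for illustration, and the function name mean is only a label for this sketch; the standard library function statistics.fmean is shown alongside as a cross-check.

```python
from statistics import fmean


def mean(scores):
    """Return the sum of the scores divided by the number of scores."""
    if not scores:
        raise ValueError("the mean needs at least one score")
    return sum(scores) / len(scores)


# Hypothetical quiz scores, invented for this illustration.
scores = [78, 85, 90, 72, 95]

print(mean(scores))   # sum is 420, number of scores is 5, so the mean is 84.0
print(fmean(scores))  # the standard library gives the same value
```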

Sample. A sample is a subset of a population. It consists of the data obtained from selected members of the population. The sample is carefully selected in order to obtain data that are reliable and valid, and it should represent the total population. A commonly used formula for determining the sample size is Slovin's formula.


n = N / (1 + N(e)²)


     Where N is the total population
     n is the sample size
     e is the margin of error
The margin of error can be
     0.001 or 0.1% (maximum precision)
     0.01 or 1%
     0.05 or 5% (minimum precision)
Example 1
In a certain study, the population of interest is composed of 25,000 individuals. At a 5% margin of error, what is the acceptable sample size?

     Total population (N) = 25000
     Margin of error (e) = 0.05

     n = 25000 / (1 + 25000(0.05)²)
     n = 25000 / (1 + 62.5)
     n = 25000 / 63.5
     n = 393.7 or 394
(Always round your answer up to the next whole number.)
      394 is the acceptable sample size.

  Example 2
      Total population (N) = 5000
      Margin of error (e) = 0.05

      n = 5000 / (1 + 5000(0.05)²)
      n = 5000 / (1 + 12.5)
      n = 5000 / 13.5
      n = 370.4 or 371

A sample of 371 can represent the total population of 5000.
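The two examples can be checked with a short Python sketch of Slovin's formula. The function name slovin_sample_size is made up for this illustration, and rounding to six decimal places before rounding up is an added safeguard against floating-point noise, not part of the formula itself.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Sample size n = N / (1 + N * e^2), rounded up to the next whole number."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin of error must be between 0 and 1")
    n = population / (1 + population * margin_of_error ** 2)
    # Round off tiny floating-point noise first, then round up.
    return math.ceil(round(n, 6))


print(slovin_sample_size(25000, 0.05))  # Example 1: 394
print(slovin_sample_size(5000, 0.05))   # Example 2: 371
```

A smaller margin of error gives a larger required sample for the same population.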

Advantages of the sampling method in gathering data over the use of the total population:
     1. Less expensive
     2. Time efficient
     3. Easier retrieval of data

The following are the methods that can be used in obtaining the desired number of samples.

1. Random Sampling. This is a method of selecting a sample of the required size from the total population. Each member of the population has an equal chance of being included in the sample.
a. Table of random numbers.
b. Stratified sampling is a strategy for selecting samples in such a way that specific subgroups or strata will have a sufficient number of representatives within the sample.

Example

Strata         Population   Sample
Fourth Year    1500
Third Year     1226
Second Year    1220
First Year     1011
Total          4957

The sample size can be obtained using Slovin's formula:

n = 4957 / (1 + 4957(0.05)²)
n = 4957 / 13.3925 = 370.1325 or 371
371 is the acceptable sample size.

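Strata         Population   Sample
Fourth Year    1500         113
Third Year     1226         92
Second Year    1220         92
First Year     1011         76
Total          4957         373


(113 is obtained by 1500 / 4957 x 371 = 112.27, rounded up)
(92 is obtained by 1226 / 4957 x 371 = 91.76, rounded up)
(92 is obtained by 1220 / 4957 x 371 = 91.31, rounded up)
(76 is obtained by 1011 / 4957 x 371 = 75.67, rounded up)
Because each stratum is rounded up to the next whole number, the total of 373 is slightly larger than 371.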

This result shows that, from the total population of 4957 students, the sample will include:
     113 from the fourth year;
     92 from the third year;
     92 from the second year; and
     76 from the first year.
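The allocation for each year level can be reproduced with a short Python sketch. It assumes proportional allocation with each stratum rounded up to the next whole number, as in the worked figures; the function name proportional_allocation is made up for this illustration, and the sample size of 371 is entered directly.

```python
import math


def proportional_allocation(strata, sample_size):
    """Give each stratum its share of the sample, rounded up."""
    total = sum(strata.values())
    return {
        name: math.ceil(round(size / total * sample_size, 6))
        for name, size in strata.items()
    }


# Population of each year level, from the table in the lesson.
strata = {
    "Fourth Year": 1500,
    "Third Year": 1226,
    "Second Year": 1220,
    "First Year": 1011,
}

allocation = proportional_allocation(strata, 371)
for name, count in allocation.items():
    print(name, count)
print("Total", sum(allocation.values()))
```

Rounding every stratum up makes the total (373) slightly larger than the computed sample size (371).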



c. Cluster sampling refers to the selection of groups (clusters) of members rather than separate individuals. If the respondents will come from the Philippines, cluster sampling can be done by dividing the area into regions, then into cities, barangays, and localities.

d. Fish bowl technique. This lottery sampling method requires a listing of the subjects, whose names are placed in a bowl. The required number of samples is then drawn from the bowl.

e. Systematic sampling
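The fish bowl technique described in item d can be imitated on a computer. The sketch below uses Python's random.sample, which draws without replacement just as slips are drawn from a bowl. The function name fish_bowl_draw, the seed parameter, and the student names are all invented for this illustration.

```python
import random


def fish_bowl_draw(names, sample_size, seed=None):
    """Draw sample_size names from the bowl, without replacement."""
    if sample_size > len(names):
        raise ValueError("cannot draw more names than the bowl contains")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(names, sample_size)


# Hypothetical list of subjects placed in the bowl.
names = ["Ana", "Ben", "Carla", "Dino", "Elsa", "Felix", "Gina", "Hugo"]

drawn = fish_bowl_draw(names, 3, seed=1)
print(drawn)
```

Every name in the bowl has the same chance of being drawn, which matches the equal-chance requirement of random sampling.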
