Basic statistics
Gao Zhanying (郜战莹), master's student, Foreign Linguistics and Applied Linguistics

Contents
A brief description of statistics
Descriptive statistics
Inferential statistics

1. A brief description of statistics
1.1 Definitions
1.2 Two important terms

1.1 Definitions: descriptive statistics, inferential statistics

1.1.1 Descriptive statistics
If a researcher is interested only in describing the group from which the data are gathered, the statistics involved are called descriptive statistics.
Note: In many cases, data in the field of applied linguistics are primarily descriptive in nature.

1.1.2 Inferential statistics
If his or her interest goes beyond
describing the group from which the data are collected, and s/he tries to draw conclusions about the population from which the group was selected, the statistics needed in this case are inferential statistics.
Note: Their advantage is that they allow the researcher to make decisions about the population without studying the entire population.

1.2 Two terms: parameter, statistic

A parameter is a descriptive measure of the population, which is denoted by a Greek letter (see the examples on page 180), e.g. population mean, population variance, population standard deviation.

A statistic is a descriptive measure of a sample, which
is denoted by a Roman letter (see the examples on page 180), e.g. sample mean, sample variance, sample standard deviation.

Note
Differentiating between parameter and statistic is important only in the use of inferential statistics. The statistician often wants to estimate the value of a parameter. However, some factors constrain how the values can be obtained, especially time and money.

An example: Suppose you are an English teacher at Nanjing University and you want to know the relationship between the use of strategies and the English proficiency of all Nanjing University students. There are more than 10,000 students on two campuses. Obviously, you cannot afford the money and time to take a census. What you may do is survey 400 students as a sample with a questionnaire. Then what do you do?
Descriptive statistics: You use descriptive statistics to find out which strategies are more frequently used by these 400 students. The results you obtain describe the sample.
Inferential statistics: You may use inferential statistics, based on the notion of probability, to examine the relation between the use of strategies and English achievement, or to see whether the male students differ from the female students in their use of strategies.

Conclusion
Descriptive analysis or statistics describes the general pattern or tendency emerging from the data collected. Inferential analysis or statistics, more complicated than descriptive, aims at predictions beyond the sample data.

Frequencies
The simplest way to organize the data is to describe their frequency distribution, which can reduce and summarize data effectively and efficiently.
frequency distribution, bar graph, polygon, pie chart

If the total number of possible values is very limited, for example if the students' responses to questionnaire items are no more than
five (1, 2, 3, 4, 5), the frequency distribution can be obtained simply by tallying up all the responses, as shown in Table 1.

Table 1: Frequency distribution

If the possible values are numerous, such as people's ages and students' test scores, we need to group the raw data into classes. In this case, the frequency distribution is concerned with classes rather than with each individual score. How do we describe the frequency distribution in terms of classes? There are no hard-and-fast rules to follow.

Generally speaking, we may construct the frequency distribution of these data in three steps:
1. Determine the range of the raw data
2. Determine the number of classes
3. Determine the width of the class interval

58, 65, 84, 70, 90, 75, 86, 76, 80, 82, 83, 84, 69, 84, 85, 86, 72, 89, 75, 92

Attention
1. The range of the raw data: defined as the difference between the largest and smallest numbers. Here, 92 - 58 = 34.
2. The number of classes:
too many: cannot achieve generalization
too few: cannot see the important differences
the appropriate number of classes: between 5 and 10

Attention
3. The width of the class interval:
$\text{width of the class interval} = \dfrac{\text{range}}{\text{number of classes}}$
Usually it is 5, 10, 15 and so on.
4. The frequency distribution must include all the data given. Therefore, the frequency distribution should
start: at a value equal to or lower than the lowest number of the ungrouped data
end: at a value equal to or higher than the highest number.

Table 2: Frequency distribution in terms of five classes
Table 3: Frequency distribution in terms of three classes

If you want to know the relative standing of any particular score in a group of scores, you can show it by a percentile score.
Formula:
$\text{Percentile} = (100)\,\dfrac{\text{Cumulative } F}{N}$
1. Cumulative frequency (F): refers to the frequency of the score or the scores within the class just below.
2. N: represents the total number of scores.
3. Percentile: defined as a number which represents the percent of scores that a particular raw score exceeds.

Table 4: Frequency distribution with cumulative frequencies and percenta
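
The grouping steps and the percentile formula above can be sketched in Python on the 20 raw scores from the slides. This is only an illustration under stated assumptions: the contents of Tables 2 to 4 are not preserved here, so the class boundaries (starting at 50 with a width of 10, which happens to give five classes) are assumed rather than taken from the tables. The function names `grouped_frequencies` and `percentile` are hypothetical, and the percentile is computed under one reading of the definition, namely the percent of scores that lie below the given raw score.

```python
from collections import Counter

# The 20 raw scores listed in the slides.
scores = [58, 65, 84, 70, 90, 75, 86, 76, 80, 82,
          83, 84, 69, 84, 85, 86, 72, 89, 75, 92]


def grouped_frequencies(data, start, width):
    """Return rows of (low, high, frequency, cumulative frequency).

    `start` must be equal to or lower than the lowest score, so that
    the distribution includes all the data. Assumes integer scores.
    """
    # Class index of each score, counted per class.
    counts = Counter((x - start) // width for x in data)
    # Enough classes to reach the highest score.
    n_classes = (max(data) - start) // width + 1
    rows, cumulative = [], 0
    for k in range(n_classes):
        low = start + k * width
        cumulative += counts[k]
        rows.append((low, low + width - 1, counts[k], cumulative))
    return rows


def percentile(data, score):
    """Percent of scores that `score` exceeds (assumed reading)."""
    below = sum(1 for x in data if x < score)
    return 100 * below / len(data)


# Step 1: range of the raw data (largest minus smallest).
data_range = max(scores) - min(scores)

# Steps 2 and 3: assumed choice of start=50 and width=10.
table = grouped_frequencies(scores, start=50, width=10)

print("range:", data_range)
for low, high, f, cum_f in table:
    print(f"{low}-{high}: f={f}, cumulative F={cum_f}")
print("percentile of 84:", percentile(scores, 84))
```

Calling `grouped_frequencies(scores, start=50, width=15)` instead groups the same scores into three classes, which shows how a wider interval hides differences that the five-class version keeps visible.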