What is a contingency table in statistics? What is the function of a contingency table? What is a contingency table in a chi-square test? Explain with an example.
A contingency table is a type of table used in statistics. It is also known as a cross tabulation, a crosstab or a two-way table. The table is in matrix format, meaning it is a rectangular array of entries arranged in rows and columns, and it displays the frequency distribution of the variables. Contingency tables are widely used in business intelligence, survey research, scientific research and engineering. They show the relationship between two variables and the interactions between them.
Karl Pearson was the first to use the term contingency table. He used it in "On the Theory of Contingency and Its Relation to Association and Normal Correlation", part of the Drapers' Company Research Memoirs, Biometric Series I, published in 1904.
Multivariate statistics is a branch of statistics that involves the simultaneous observation and analysis of more than one outcome variable; its application is called multivariate analysis. An important problem in multivariate statistics is finding the dependence structure underlying the variables in high-dimensional contingency tables. If some of the conditional independences are revealed, the data can be stored more efficiently. Concepts from information theory can be used for this, since they draw only on the probability distribution, which is easily estimated from the contingency table by the relative frequencies.
Contingency tables can be created in spreadsheet software by using a pivot table. The entries in the cells can be frequency counts or relative frequencies. A two-way table is shown below. It shows the favorite leisure activities of 50 adults, 20 men and 30 women. Since the entries in the table are frequency counts, the table is a frequency table.
| | Dance | Sports | TV | Total |
|---|---|---|---|---|
| Men | 2 | 10 | 8 | 20 |
| Women | 16 | 6 | 8 | 30 |
| Total | 18 | 16 | 16 | 50 |
Suppose there are two variables, gender and preferred leisure activity, and that 50 individuals are randomly sampled from a large population as part of a study of leisure activities. A contingency table can be created to display the numbers of men and women who prefer to dance, play sports or watch TV in their leisure time. Such a contingency table is shown above. The totals for men and for women (the row totals) and the totals for each activity (the column totals) are called marginal totals. The grand total, the number in the bottom right corner, is the total number of individuals represented in the table.
The table lets users see at a glance that the proportion of men who dance in their free time (2 of 20, or 10%) is much lower than the proportion of women who do (16 of 30, or about 53%). The strength of the association can be measured by the odds ratio, with the population odds ratio estimated by the sample odds ratio. The significance of the difference between the two proportions can be assessed with a variety of statistical tests, including Pearson's chi-squared test, the G-test, Fisher's exact test and Barnard's test, provided the entries in the table represent individuals randomly sampled from the population about which conclusions are to be drawn. If the proportions of individuals in the different columns vary significantly between rows, there is said to be a contingency between the two variables; in other words, the two variables are not independent. If there is no contingency, the two variables are said to be independent.
In principle, any number of rows and columns may be used, and there may be more than two variables, but higher-order contingency tables are difficult to represent visually. The relationship between ordinal variables, or between ordinal and categorical variables, can also be represented in contingency tables.
Contingency tables usually share some standard contents. They have multiple columns. Where each row refers to a specific sub-group in the population (in this table, men and women), the columns are sometimes referred to as banner points or cuts, and the rows as stubs. They often include significance tests, typically either column comparisons or cell comparisons. Column comparisons test for differences between columns and display the results using letters. Cell comparisons use color or arrows to identify a cell that stands out in some way. They may include nets (also spelled netts), which are sub-totals. Finally, they contain one or more of the following: averages, percentages, row percentages, column percentages and indexes.
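The marginal totals and row proportions described above can be reproduced with a few lines of Python. This is only an illustrative sketch: the variable names (`counts`, `row_totals`, `row_props` and so on) are made up for this example, and the counts are those of the leisure-activity table.

```python
# Observed counts from the leisure-activity table
# (rows: gender, columns: activity).
activities = ["Dance", "Sports", "TV"]
counts = {
    "Men": [2, 10, 8],
    "Women": [16, 6, 8],
}

# Marginal totals: one per row, one per column, plus the grand total.
row_totals = {group: sum(row) for group, row in counts.items()}
col_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(row_totals.values())

# Row proportions: the share of each group that prefers each activity.
row_props = {
    group: [n / row_totals[group] for n in row]
    for group, row in counts.items()
}

for group, props in row_props.items():
    shares = ", ".join(f"{a}: {p:.0%}" for a, p in zip(activities, props))
    print(f"{group} (n={row_totals[group]}): {shares}")
print("Column totals:", dict(zip(activities, col_totals)))
print("Grand total:", grand_total)
```

Comparing the row proportions rather than the raw counts is what makes the two groups comparable, since the sample contains more women than men.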
During the 19th century, statistical methods were mainly applied to the analysis of biological data, and it was customary for researchers to assume that observations followed a normal distribution. Among them were Sir George Airy and Professor Merriman, whose works Karl Pearson criticized in his 1900 paper. Toward the end of the 19th century, Pearson noticed significant skewness in some biological observations. From 1893 to 1916, he published a series of articles in which he devised the Pearson distribution, a family of continuous probability distributions that includes the normal distribution and many skewed distributions, in order to model observations whether they were normal or skewed. He proposed a method of statistical analysis that uses the Pearson distribution to model the observations and then performs a goodness-of-fit test to determine how well the model fits them.
A chi-squared test is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. Without other qualification, "chi-squared test" is often used as shorthand for Pearson's chi-squared test. Other chi-squared tests include the Cochran-Mantel-Haenszel chi-squared test, while the binomial test and Fisher's exact test are exact tests sometimes used in place of a chi-squared test. The chi-squared test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories. In standard applications of the test, the observations are classified into mutually exclusive classes, and a null hypothesis gives the probability that any observation falls into each class. The purpose of the test is to evaluate how likely the observations would be if the null hypothesis were true.
Contingency Table
| Diet | Cancers | Fatal heart disease | Non-fatal heart disease | Healthy | Total |
|---|---|---|---|---|---|
| AHA | 15 | 24 | 25 | 239 | 303 |
| Mediterranean | 7 | 14 | 8 | 273 | 302 |
| Total | 22 | 38 | 33 | 512 | 605 |
Here the chi-square test is used to test the relationship between nominal variables for significance: we need to find out whether there is a significant relationship between diet and outcome. The first step is to compute the expected frequency for each cell under the assumption that there is no relationship. These expected frequencies are computed from the totals. Take the AHA diet / Cancers combination as an example. Of the 605 people, 22 developed cancer, a proportion of 22/605 = 0.0364. If there were no relationship between diet and outcome, 0.0364 of those on the AHA diet would be expected to develop cancer. Since 303 people are on the AHA diet, (0.0364)(303) = 11.02 cancers are expected on the AHA diet (using the unrounded proportion). In the same way, since 302 people are on the Mediterranean diet, (0.0364)(302) = 10.98 cancers are expected on the Mediterranean diet. In general, the expected frequency for a cell in the ith row and the jth column is equal to
$$E_{i,j} = \frac{T_i \, T_j}{T}$$
where $E_{i,j}$ is the expected frequency for cell $(i, j)$, $T_i$ is the total for the ith row, $T_j$ is the total for the jth column, and $T$ is the total number of observations. For the AHA diet / Cancers cell, $i = 1$, $j = 1$, $T_i = 303$, $T_j = 22$ and $T = 605$.
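As a quick check of the formula, the expected frequency of every cell can be computed from the row and column totals. The following is a minimal sketch using only the counts from the diet table; the names `observed`, `row_totals`, `col_totals` and `expected` are chosen for this example.

```python
# Observed counts from the diet table; columns are, in order:
# Cancers, Fatal heart disease, Non-fatal heart disease, Healthy.
observed = {
    "AHA": [15, 24, 25, 239],
    "Mediterranean": [7, 14, 8, 273],
}

row_totals = {diet: sum(row) for diet, row in observed.items()}  # T_i
col_totals = [sum(col) for col in zip(*observed.values())]       # T_j
total = sum(row_totals.values())                                 # T

# Expected frequency of each cell: E_ij = T_i * T_j / T.
expected = {
    diet: [row_totals[diet] * t_j / total for t_j in col_totals]
    for diet in observed
}

for diet, row in expected.items():
    print(diet, [round(e, 2) for e in row])
```

The first value printed for the AHA row is the 11.02 worked out by hand in the text.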
| Diet | Cancers | Fatal heart disease | Non-fatal heart disease | Healthy | Total |
|---|---|---|---|---|---|
| AHA | 15 (11.02) | 24 (19.03) | 25 (16.53) | 239 (256.42) | 303 |
| Mediterranean | 7 (10.98) | 14 (18.97) | 8 (16.47) | 273 (255.58) | 302 |
| Total | 22 | 38 | 33 | 512 | 605 |
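The table shows the observed frequencies, with the expected frequencies in parentheses.
The significance test is conducted by computing chi-square as:
$$\chi^2_3 = \sum \frac{(E - O)^2}{E} = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = 16.55$$
where $E$ is the expected frequency and $O$ the observed frequency of a cell, and the sum runs over all cells of the table.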
The number of degrees of freedom is equal to (r - 1)(c - 1) = (number of rows - 1)(number of columns - 1) = (2 - 1)(4 - 1) = 3.
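The statistic and its degrees of freedom can be recomputed directly from the observed counts. The sketch below assumes nothing beyond the diet table itself; the variable names are illustrative.

```python
# Observed counts from the diet table.
observed = [
    [15, 24, 25, 239],  # AHA
    [7, 14, 8, 273],    # Mediterranean
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / total
        chi_square += (obs - exp) ** 2 / exp

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square = {chi_square:.2f}, degrees of freedom = {dof}")
```

The expected frequencies are kept unrounded inside the loop, so the result agrees with the 16.55 in the text only after rounding to two decimals.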
The probability value for a chi-square of 16.55 with three degrees of freedom is 0.0009. Hence, the null hypothesis of no relationship between diet and outcome can be rejected. A vital assumption of the chi-square test is that each subject contributes data to only one cell. Hence, the sum of all cell frequencies in the table must equal the number of subjects in the experiment.
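The probability value quoted above can be checked with the Python standard library alone, because the upper-tail probability of a chi-square variable with exactly three degrees of freedom has a closed form in terms of the complementary error function. The helper name `chi2_sf_3dof` is hypothetical, and the formula holds only for three degrees of freedom; for other cases a statistics library (for example `scipy.stats.chi2.sf`) would normally be used instead.

```python
import math


def chi2_sf_3dof(x: float) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable
    with 3 degrees of freedom (closed form, valid for 3 df only)."""
    return (
        math.erfc(math.sqrt(x / 2))
        + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    )


p_value = chi2_sf_3dof(16.55)
print(f"p = {p_value:.4f}")
```

A p-value this far below the conventional 0.05 threshold is what justifies rejecting the null hypothesis of no relationship between diet and outcome.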