MULTIVARIATE STATISTICS. SECOND EDITION. Barbara G. Tabachnick and Linda S. Fidell, California State University, Northridge. HarperCollins Publishers. have been designated as possible outliers (Hair et al., ; Tabachnick & Fidell, ). Using multivariate statistics, 5th ed. Citation: Tabachnick, B. G., & Fidell, L. S. ( ). Using multivariate statistics (5th ed.). Boston, MA: Allyn & Bacon/Pearson.
(Tabachnick & Fidell, ). For this study, maximum and minimum extreme values for all the study variables were produced using SPSS. A visual inspection of. , vol. 3 (2), p. Understanding Power and Rules of Thumb .. Comrey and Lee () (see Tabachnick & Fidell, ) give the following guide. schools. As another example, consider the analysis reported by Tabachnick and Fidell (, pp. ), using data described in the article by Fidell et al.
Popular multivariate statistics texts recommend D2 (the Mahalanobis distance) for multivariate outlier detection, although, as described below, several alternatives may prove more effective than this standard approach.
Prior to discussing these methods, however, it is important to briefly discuss the general qualities that make for an effective outlier detection method. Readers interested in a more detailed treatment are referred to two excellent texts by Wilcox. When thinking about the impact of outliers, perhaps the key consideration is the breakdown point of the statistical analysis in question.
The breakdown point can be thought of as the smallest proportion of a sample that must consist of outliers before they have a notable impact on the statistic of interest.
In other words, if a statistic has a breakdown point of 0. Comparatively, a statistic with a breakdown point of 0. Of course, it should be remembered that the degree of this impact is dependent on the magnitude of the outlying observation, such that more extreme outliers would have a greater impact on the statistic than would a less extreme value.
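The contrast between a low and a high breakdown point can be illustrated with a small sketch. It is not taken from the text: it assumes a toy sample of ten values, replaces some of them with an arbitrary outlying value, and compares the sample mean with the sample median. The helper name `contaminate` is hypothetical.

```python
import statistics


def contaminate(sample, n_outliers, outlier_value):
    """Return a copy of sample with its last n_outliers values replaced."""
    kept = list(sample[: len(sample) - n_outliers])
    return kept + [outlier_value] * n_outliers


# Toy sample (an assumption for illustration): 1.0, 2.0, ..., 10.0
clean = [float(v) for v in range(1, 11)]

for n_outliers in (0, 1, 4, 5):
    data = contaminate(clean, n_outliers, 1000.0)
    print(
        f"{n_outliers} outliers: "
        f"mean = {statistics.mean(data):.1f}, "
        f"median = {statistics.median(data):.1f}"
    )

# A more extreme outlier moves the mean further than a milder one.
mild = statistics.mean(contaminate(clean, 1, 100.0))
extreme = statistics.mean(contaminate(clean, 1, 1000.0))
print(f"one outlier of 100: mean = {mild:.1f}; of 1000: mean = {extreme:.1f}")
```

In this toy sample a single outlier is enough to shift the mean, whereas the median stays at its original value until half of the observations have been replaced.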
A high breakdown point is generally considered to be a positive attribute. While the breakdown point is typically thought of as a characteristic of a statistic, it can also be a characteristic of a statistic in conjunction with a particular method of outlier detection. Thus, if a researcher calculates the sample mean after removing outliers using a method such as D2, the breakdown point of the combination of mean and outlier detection method will differ from that of the mean by itself.
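Combining a statistic with an outlier screen can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular text: the data are a made-up bivariate sample, the significance criterion `alpha = 0.001` is an assumed convention, and the function names are hypothetical. Because only two variables are used, the chi-square cutoff has the closed form -2 ln(alpha) and the 2 x 2 covariance matrix can be inverted by hand.

```python
import math


def centroid(points):
    """Mean vector of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def mahalanobis_d2(points):
    """Squared Mahalanobis distance of each bivariate point from the centroid."""
    n = len(points)
    mx, my = centroid(points)
    # Unbiased sample variances and covariance.
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form d' S^-1 d with the 2x2 inverse written out.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


# Toy data (assumed): a 5 x 5 grid of clean points plus one gross outlier.
points = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
points.append((20.0, -20.0))

# With 2 variables, the chi-square quantile has the closed form -2 * ln(alpha).
alpha = 0.001  # assumed significance criterion
cutoff = -2.0 * math.log(alpha)

d2 = mahalanobis_d2(points)
flagged = [i for i, d in enumerate(d2) if d > cutoff]
retained = [p for i, p in enumerate(points) if i not in flagged]

print("cutoff:", round(cutoff, 3))
print("flagged indices:", flagged)
print("mean of all points:", centroid(points))
print("mean after removal:", centroid(retained))
```

Note that the distances here are computed from the ordinary mean and covariance matrix, which are themselves pulled toward the outlier, so the behavior of the combined procedure depends on how well the screen works as well as on the mean.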
Finally, although having a high breakdown point is generally desirable, it is also true that statistics with higher breakdown points e. Another important property for a statistical measure of location e. Location equivariance means that if a constant is added to each observation in the data set, the measure of location will be increased by that constant value.
Scale equivariance means that multiplying each observation in the data set by a constant multiplies the measure of location by that same constant.
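Both properties are easy to check numerically. The sketch below uses an assumed toy sample and assumed constants, and verifies location and scale equivariance for the sample mean and the sample median.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 7.0, 13.0]  # toy sample (assumed)
shift = 10.0                        # constant added to every observation
scale = 3.0                         # constant multiplying every observation

for name, location in (("mean", statistics.mean), ("median", statistics.median)):
    original = location(data)
    shifted = location([x + shift for x in data])
    scaled = location([x * scale for x in data])
    # Location equivariance: T(x + c) = T(x) + c
    assert math.isclose(shifted, original + shift)
    # Scale equivariance: T(c * x) = c * T(x)
    assert math.isclose(scaled, original * scale)
    print(f"{name}: original={original}, shifted={shifted}, scaled={scaled}")
```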
In other words, the scale of measurement should not influence relative comparisons of individuals within the sample or relative comparisons of group measures of location such as the mean. In the context of multivariate data, these properties for measures of location are referred to as affine equivariance. Affine equivariance extends the notion of equivariance beyond changes in location and scale to measures of multivariate dispersion. Covariance matrices are affine equivariant, for example, though they are not particularly robust to the presence of outliers (Wilcox, ). A viable approach to dealing with multivariate outliers must maintain affine equivariance.
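For the covariance matrix, affine equivariance means that transforming every observation x to Ax + b, with A nonsingular, changes the covariance matrix S to A S A'. The following sketch checks this for bivariate data. The data, the matrix `A`, the vector `b`, and all helper names are assumptions chosen for illustration.

```python
import math


def covariance(points):
    """Unbiased 2x2 sample covariance matrix of bivariate points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def affine(points, a, b):
    """Apply the map (x, y) -> A (x, y)' + b to every point."""
    return [(a[0][0] * x + a[0][1] * y + b[0],
             a[1][0] * x + a[1][1] * y + b[1]) for x, y in points]


points = [(1.0, 2.0), (2.0, 1.0), (4.0, 5.0), (6.0, 4.0), (7.0, 9.0)]  # toy data
A = [[2.0, 1.0], [-1.0, 3.0]]  # nonsingular matrix (determinant 7), assumed
b = (5.0, -4.0)                # shift vector, assumed

lhs = covariance(affine(points, A, b))
rhs = matmul(matmul(A, covariance(points)), transpose(A))

for i in range(2):
    for j in range(2):
        assert math.isclose(lhs[i][j], rhs[i][j], rel_tol=1e-9, abs_tol=1e-9)
print("Cov(AX + b):", lhs)
print("A Cov(X) A':", rhs)
```

The shift vector drops out entirely, and the matrix A enters on both sides of S, which is the multivariate counterpart of a variance being multiplied by the square of a scaling constant.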
Following is a description of several approaches for outlier detection. For the most part, these descriptions are presented conceptually, including technical details only when they are vital to understanding how the methods work.
Using Multivariate Statistics, 6th Edition. Barbara G. Tabachnick and Linda S. Fidell, California State University - Northridge. Part of the MyPsychLab Series. The book provides hands-on guidelines for conducting numerous types of multivariate statistical analyses. It maintains a practical approach, still focusing on the benefits and limitations of applications of a technique to a data set: when, why, and how to do it. It presents a comprehensive introduction to today's most commonly encountered statistical and multivariate techniques, while assuming only a limited knowledge of higher-level mathematics.
Datasets available at www. New to This Edition. Added commonality analysis to Multiple Regression chapter. Updated sample size considerations in Multiple Regression chapter.
Updated sample size considerations in Factor analysis chapter. Complete example of Factor Analysis redone.
Expanded discussion of classification issues in Logistic Regression, including receiver operating characteristics.
Univariate and multivariate skewness and kurtosis. Different formulations for skewness and kurtosis exist in the literature. Joanes and Gill summarize three common formulations for univariate skewness and kurtosis that they refer to as g1 and g2, G1 and G2, and b1 and b2. Minitab reports b1 and b2, and the R package e Meyer et al.
The sample skewness G1 can take any value between negative infinity and positive infinity.
For a symmetric distribution such as a normal distribution, the expectation of skewness is 0. Distributions with positive skewness have a longer right tail in the positive direction, and those with negative skewness have a longer left tail in the negative direction. The one in the middle is a normal distribution and its skewness is 0.
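The three skewness formulations can be sketched in a few lines. The formulas below are the definitions commonly attributed to g1, G1, and b1; they are an assumption here, since the text does not reproduce them, and the data and function names are hypothetical.

```python
import math


def central_moment(data, order):
    """Sample central moment m_r = (1/n) * sum((x - mean)^r)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** order for x in data) / n


def skewness_g1(data):
    """g1 = m3 / m2^(3/2)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def skewness_G1(data):
    """G1 = g1 * sqrt(n * (n - 1)) / (n - 2)."""
    n = len(data)
    return skewness_g1(data) * math.sqrt(n * (n - 1)) / (n - 2)


def skewness_b1(data):
    """b1 = g1 * ((n - 1) / n)^(3/2), i.e. m3 divided by the cubed unbiased SD."""
    n = len(data)
    return skewness_g1(data) * ((n - 1) / n) ** 1.5


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]            # toy samples (assumed)
right_tailed = [1.0, 1.0, 2.0, 2.0, 3.0, 10.0]
left_tailed = [-x for x in right_tailed]

for name, sample in (("symmetric", symmetric),
                     ("right-tailed", right_tailed),
                     ("left-tailed", left_tailed)):
    print(f"{name}: g1={skewness_g1(sample):.3f}, "
          f"G1={skewness_G1(sample):.3f}, b1={skewness_b1(sample):.3f}")
```

All three versions agree in sign and differ only by a factor that depends on the sample size, so they give the same answer about the direction of the longer tail.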
In psychology, response time data typically show positive skewness because much longer response times are less common (Palmer et al., ). The distribution on the right in Fig. For example, high school GPA of students who apply for colleges often shows such a distribution because students with lower GPA are less likely to seek a college degree.
In psychological research, scores on easy cognitive tasks tend to be negatively skewed because the majority of participants can complete most tasks successfully (Wang et al., ). Kurtosis is associated with the tails, shoulders, and peakedness of a distribution.
Generally, kurtosis increases with peakedness and decreases with flatness. However, as DeCarlo b explains, it has as much to do with the shoulder and tails of a distribution as it does with the peakedness. This is because peakedness can be masked by variance. Normal distributions with low variance have high peaks and light tails as in Fig.
Hence, peakedness alone is not indicative of kurtosis, but rather it is the overall shape that is important.
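The point that variance can mask peakedness can be checked numerically: rescaling a sample changes its variance, and therefore how tall and narrow its histogram looks, but leaves kurtosis unchanged. The sketch below assumes the moment-based excess kurtosis g2 = m4 / m2^2 - 3 and uses made-up samples and hypothetical function names.

```python
import math


def variance(data):
    """Second central moment with divisor n."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


def excess_kurtosis_g2(data):
    """g2 = m4 / m2^2 - 3, using central moments with divisor n."""
    n = len(data)
    mean = sum(data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / variance(data) ** 2 - 3.0


wide = [-3.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 3.0]  # toy sample (assumed)
narrow = [x / 4.0 for x in wide]     # same shape, variance 16 times smaller
flat = [-2.0, -1.0, 0.0, 1.0, 2.0]   # evenly spread values

print("variance:", variance(wide), "vs", variance(narrow))
print("g2:", excess_kurtosis_g2(wide), "vs", excess_kurtosis_g2(narrow))
print("g2 of the evenly spread sample:", excess_kurtosis_g2(flat))

# Rescaling changes the variance but not the kurtosis.
assert math.isclose(excess_kurtosis_g2(wide), excess_kurtosis_g2(narrow))
```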