Test Construction
Terms
- Item analysis is a procedure used to evaluate the:
- quantitative and/or qualitative properties of a test's items, such as their difficulty and discriminability
- Criterion contamination occurs when:
- A rater knows how a ratee did on a predictor test and this knowledge affects the rating.
- A test that has high validity always has...
- High reliability
- What is the preferred method of calculating the reliability of a test?
- Alternate forms reliability - which is calculated by correlating two forms of the same test.
- Shrinkage refers to:
- ???
- For correlation coefficients, the more heterogeneous the group (i.e. the wider the variability) the _______ the coefficient will be.
- Higher
- Confidence intervals are used to:
- estimate true scores from obtained scores
- R-squared is used as an indicator of:
- how much your ability to predict is improved using the regression line
- Coefficient eta is an indicator of:
- the relationship between two variables that have a nonlinear relationship
- Degrees of freedom is an indicator of:
- the number of values that are free to vary in a statistical calculation
- When conducting a factor analysis, an oblique rotation is preferred when:
- the underlying traits are believed to be dependent
- Adding more easy to moderately easy items to a difficult test will...
- increase the test's floor
- The important measure of validity for the licensing exam is:
- content validity
- Convergent & divergent validity are types of _______ validity
- Construct
- In computing test-retest reliability, to control for practice effects one would use:
- alternative forms reliability coefficient or split-half reliability coefficient
- Kappa coefficient is used to:
- Evaluate inter-rater reliability (low .90s is high)
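To make the kappa card concrete, here is a minimal sketch of Cohen's kappa for two raters. Choosing Cohen's two-rater version is an assumption (the card only says "Kappa coefficient"), and the function name and sample ratings are hypothetical. Kappa compares observed agreement with the agreement expected by chance from each rater's category proportions.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who each rated the same cases (hypothetical helper)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    categories = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Illustrative ratings only: the raters agree on 3 of 4 cases.
rater_1 = ["yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no"]
kappa = cohen_kappa(rater_1, rater_2)
```

With these made-up ratings, observed agreement is .75 and chance agreement is .50, so kappa works out to .50.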
- Raising the cutoff on a predictor test will result in:
- decreasing false positives
- Items with which level of difficulty maximize discrimination amongst test takers?
- .50
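One way to see why .50 works best: for an item scored right/wrong, the item variance is p(1 - p), where p is the proportion passing, and that product peaks at p = .50. The sketch below checks this over a few illustrative difficulty levels; the function name and the list of levels are assumptions made for the example.

```python
def item_variance(p):
    """Variance of a right/wrong (0/1) item whose proportion passing is p."""
    return p * (1 - p)


# Illustrative difficulty levels (p-values), not data from any real test.
levels = [0.1, 0.3, 0.5, 0.7, 0.9]
best_level = max(levels, key=item_variance)
```

Here `best_level` comes out as 0.5, where the variance reaches its maximum of .25.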
- Ways to assess Reliability
-
1)Test-Retest - give same test twice to same group
2)Alternate Forms - giving two equivalent forms to the same group (many consider this to be the best)
3)Internal Consistency - includes split-half (split test in 2), Kuder-Richardson (for T/F tests), Cronbach's alpha (multiple response)
- Spearman-Brown formula
- Used to remove the artificial lowering of reliability due to split-half, which shortens the test (& therefore lowers reliability)
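The Spearman-Brown correction can be sketched as follows. The general form predicts reliability when a test is lengthened by a factor n; for a split-half coefficient, n = 2. The function name and the example coefficient are hypothetical.

```python
def spearman_brown(r, length_factor=2.0):
    """Predicted reliability when test length is multiplied by length_factor.

    r is the reliability of the shorter test (e.g. the half-test correlation).
    """
    return (length_factor * r) / (1 + (length_factor - 1) * r)


# Illustrative value: the two halves of a test correlate .60.
full_test_reliability = spearman_brown(0.60)
```

With a half-test correlation of .60, the estimated full-length reliability is .75.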
- Factors affecting Reliability
-
1)Length - longer = more reliable
2)Homogeneity - decreases reliability (except for Kuder & Cronbach)
3)Moderate difficulty - too easy or hard lowers reliability
4)
- In regards to reliability coefficient & SD, the standard error of measurement is:
- Inversely related to reliability & positively related to SD
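The relationship on this card follows from the usual formula SEM = SD * sqrt(1 - reliability). The sketch below computes it and also builds a confidence interval around an obtained score, which is how confidence intervals are used to estimate true scores. The function names, the 1.96 multiplier (a 95% interval under a normal model), and the sample values are assumptions for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM grows with the test's SD and shrinks as reliability approaches 1."""
    return sd * math.sqrt(1 - reliability)


def confidence_interval(obtained_score, sd, reliability, z=1.96):
    """Band around an obtained score; z=1.96 assumes a 95% interval."""
    sem = standard_error_of_measurement(sd, reliability)
    return obtained_score - z * sem, obtained_score + z * sem


# Illustrative values: SD = 15, reliability = .91, obtained score = 100.
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_interval(100, 15, 0.91)
```

With these values the SEM is about 4.5, so the 95% band runs from roughly 91.2 to 108.8.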
- Factors affecting Validity
-
1)Heterogeneity of examinees - restricted range = lower validity coefficient
2)Reliability of predictor & criterion - if not reliable, not valid
3)Moderator variables - variable (like gender) that influences the relationship between 2 other variables (when present, have differential validity)
4)Cross-Validation - given to another sample
- Types of Rotation
-
Orthogonal = independent factors
Oblique = correlated (dependent) factors
- Principal Components Analysis
- Reduces variables into factors like factor analysis, but differs by having only explained & error variance, & the factors are always UNCORRELATED
- Cluster Analysis
- A statistical procedure whereby people or items are grouped according to their similarity on measures of interest to the researcher. (Differs from FA because it can use nominal or ordinal data, its clusters are categories rather than latent variables, & it is used to develop taxonomies)
- Correction for Attenuation
- Formula used to estimate how much more valid a predictor test would be if it had perfect reliability
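As a rough sketch of the correction, the observed validity coefficient is divided by the square root of the predictor's reliability. The optional criterion-reliability argument reflects the fuller version of the formula, which also corrects for an unreliable criterion. The function name and all sample values are hypothetical.

```python
import math


def correct_for_attenuation(r_xy, predictor_reliability, criterion_reliability=1.0):
    """Estimated validity if the predictor (and optionally the criterion) were perfectly reliable."""
    return r_xy / math.sqrt(predictor_reliability * criterion_reliability)


# Illustrative values: observed validity .30, predictor reliability .64.
corrected_predictor_only = correct_for_attenuation(0.30, 0.64)
# Same case, also correcting for a criterion reliability of .81.
corrected_both = correct_for_attenuation(0.30, 0.64, 0.81)
```

Here the observed .30 rises to about .375 when only the predictor is corrected, and to about .417 when both measures are corrected.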
- P-level (item difficulty index) is represented in what type of scale?
- Ordinal (Anastasi, 1982)
- Standard Score
-
Expresses a raw score's distance from the mean in SD units (norm-referenced score)
1)Z-score - 1.0 = one SD from mean
2)T-score - Mean = 50, SD = 10
3)Stanine - scores range from 1-9, mean = 5, SD = 2
4)Deviation IQ - Mean = 100, SD = 15
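The four standard scores above are all linear transformations of the z-score, which the sketch below makes explicit. The stanine line uses the common linear approximation (2z + 5, rounded and clipped to 1-9), which is an assumption here: stanines are formally defined by percentile bands. The function name and sample values are hypothetical.

```python
import math


def standard_scores(raw, mean, sd):
    """Convert a raw score to z, T, stanine (approximate), and deviation IQ."""
    z = (raw - mean) / sd
    # Round 2z + 5 to the nearest whole number, then clip to the 1-9 range.
    stanine = min(9, max(1, math.floor(2 * z + 5.5)))
    return {
        "z": z,
        "T": 50 + 10 * z,
        "stanine": stanine,
        "deviation_iq": 100 + 15 * z,
    }


# Illustrative values: raw score 65 on a test with mean 50 and SD 10.
scores = standard_scores(65, 50, 10)
```

A raw score 1.5 SDs above the mean gives z = 1.5, T = 65, stanine 8, and a deviation IQ of 122.5.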