EPPP Test Construction 2


Terms

Reliability Coefficient: Proportion of obtained-score variance that reflects true ability
-Interpret directly (0.70 means 70% of the variance is true ability, 30% is error)
-A good test should have a coefficient of at least 0.70
Classical Test Theory: An obtained score is made up of:
1. True Score (Ability)
-true variance
2. Some Error (Fatigue...)
-error variance
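A minimal sketch of how the true-variance and error-variance split relates to the reliability coefficient, assuming reliability is read as true variance divided by total (true plus error) variance. The function name and the numbers are hypothetical and chosen only for illustration.

```python
def reliability_from_variances(true_variance, error_variance):
    """Reliability as the share of total variance that is true variance."""
    total_variance = true_variance + error_variance
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical numbers: 70 units of true variance, 30 units of error variance.
r_xx = reliability_from_variances(70.0, 30.0)
print(round(r_xx, 2))
```

With these made-up values, 70% of the variance is true ability and 30% is error, which is the direct interpretation described on the Reliability Coefficient card.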
Reliability: -Establish reliability first (a test can be reliable but not valid)
-Consistency
Validity: -Accuracy
Validity cannot exceed...: the square root of reliability
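The ceiling on validity can be checked with one line of arithmetic. This is a small sketch, assuming the reliability coefficient is a value between 0 and 1; the function name is hypothetical.

```python
import math


def max_validity(reliability):
    """Upper bound on a validity coefficient: the square root of reliability."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return math.sqrt(reliability)


# Hypothetical reliability of 0.81.
print(round(max_validity(0.81), 2))
```

So a test with a reliability of 0.81 could not have a validity coefficient above about 0.90, however good the criterion is.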
Types of Reliability: 1. Test-retest reliability (Coefficient of stability)
2. Alternate Forms (Considered the best but least used)
3. Internal Consistency (Compares test against itself)
Types of Internal Consistency Reliability: 1. Split-Half (split the test into two halves and correlate them; the problem is that each half is shorter than the full test, which lowers the estimate)
-can use the Spearman-Brown Prophecy Formula to estimate the reliability of the full-length test
2. Inter-Item Consistency (compare the items on one test against each other in a systematic way)
-can use Cronbach's Alpha (compares each item against all the others systematically) or Kuder-Richardson Formula 20 (special case of Cronbach's Alpha, used when items are dichotomous, e.g. true/false or yes/no)
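The two internal consistency tools named above can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the function names and the tiny data set are hypothetical, and population variance is used throughout so that 0/1 items give the same result as Kuder-Richardson Formula 20.

```python
from statistics import pvariance


def spearman_brown(half_test_r, length_factor=2.0):
    """Project reliability for a test that is length_factor times as long."""
    return (length_factor * half_test_r) / (1.0 + (length_factor - 1.0) * half_test_r)


def cronbach_alpha(item_scores):
    """item_scores: one row per test taker, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    columns = list(zip(*item_scores))
    item_variance_sum = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1.0 - item_variance_sum / total_variance)


# Hypothetical split-half correlation of 0.60, stepped up to full length.
full_length_r = spearman_brown(0.60)

# Hypothetical yes/no (1/0) answers from four test takers on three items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(answers)
print(round(full_length_r, 2), round(alpha, 2))
```

Because the hypothetical items are dichotomous, the alpha computed here is also the KR-20 value for the same data.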
Kappa Coefficient: Inter-rater reliability (agreement between raters, corrected for chance)
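A short sketch of the kappa calculation for two raters, assuming the usual Cohen's kappa form: observed agreement minus chance agreement, divided by one minus chance agreement. The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected by chance, from each rater's category proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses from two raters on four cases.
rater_1 = ["yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no"]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

The raters agree on 3 of 4 cases, but half of that agreement would be expected by chance, so kappa comes out lower than the raw 75% agreement.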
Standard Error of Measurement: -Based on the reliability coefficient
-Used to estimate the range in which a person's true ability falls
-Based on a person's single score but has the properties of a normal curve
-the more reliable the test, the smaller the standard error of measurement
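The points above can be worked through numerically. This sketch assumes the standard formula, SEM = SD x sqrt(1 - reliability), and a normal-curve band around the obtained score; the function names and the scores are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    if sd < 0 or not 0.0 <= reliability <= 1.0:
        raise ValueError("sd must be >= 0 and reliability between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def confidence_band(observed_score, sem, z=1.96):
    """Range around one obtained score; z=1.96 gives roughly a 95% band."""
    return (observed_score - z * sem, observed_score + z * sem)


# Hypothetical test: SD of 10, reliability of 0.84, obtained score of 100.
sem = standard_error_of_measurement(10.0, 0.84)
low, high = confidence_band(100.0, sem)
print(round(sem, 2), round(low, 2), round(high, 2))
```

Raising the reliability in the call shrinks the SEM and narrows the band, which is the relationship stated in the last bullet.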
Standard Error of the Mean: -How well does the sample mean represent the population mean?
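For comparison with the standard error of measurement, here is a small sketch of the standard error of the mean, assuming the usual formula of the standard deviation divided by the square root of the sample size. The function name and values are hypothetical.

```python
import math


def standard_error_of_mean(sd, sample_size):
    """SE of the mean = SD / sqrt(N)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return sd / math.sqrt(sample_size)


# Hypothetical sample: SD of 15 with 25 test takers.
print(standard_error_of_mean(15.0, 25))
```

Larger samples give a smaller standard error, meaning the sample mean is a closer estimate of the population mean.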
It is best to have ___________ items and _____________ test takers for a test to be most reliable.: Homogeneous items, heterogeneous test takers
Content Validity: Based on expert judgment
-used for academic tests
Criterion-Related Validity: Outcome
-looks at the relationship between a predictor and an outcome (criterion)
-used most often in personnel psychology (predicting job performance, etc.)
-two types: predictive validity (predicts future behavior, e.g. who will become schizophrenic?) and concurrent validity (test results reflect status NOW, e.g. who is schizophrenic now?)
Construct Validity: The construct cannot be directly observed or defined
-Two types: convergent (compare the new test with an established test that measures the same construct) and divergent, also called discriminant (you want your test to have nothing in common with a test of a different construct)
Multitrait-Multimethod Matrix: Same trait measured by different methods: need a HIGH monotrait-heteromethod coefficient to establish convergent validity
-Different traits: need a LOW heterotrait coefficient to establish divergent validity
Face Validity: Does the test make sense to the people who are taking it?
Cross Validation: Re-check the test's validity by giving the instrument to a new sample
-Shrinkage may occur (the validity coefficient tends to come out somewhat smaller on cross-validation than in the original sample)
Incremental Validity: Can the test increase the number of correct decisions we are already making?
Three things to establish Incremental Validity: 1. base rate - moderate (proportion of decisions you are already making correctly)
2. selection ratio - need a low selection ratio (number of jobs available relative to number of applicants)
3. validity coefficient - high correlation between predictor and criterion
Criterion-Referenced Scores: -Score is not compared to anyone else's; it only shows whether a standard was met
Norm-Referenced Scores: -Score is compared to other individuals
-Two types: percentile ranks (not used as much now) and standard scores (transformed scores that allow you to compare)
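A brief sketch of the two kinds of norm-referenced scores, assuming z-scores and T-scores (mean 50, SD 10) as the standard scores and a normally distributed norm group for the percentile rank. The function names and the norm-group values are hypothetical.

```python
from statistics import NormalDist


def z_score(raw_score, norm_mean, norm_sd):
    """Distance from the norm-group mean in standard deviation units."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw_score - norm_mean) / norm_sd


def t_score(z):
    """Standard score rescaled to a mean of 50 and an SD of 10."""
    return 50.0 + 10.0 * z


def percentile_rank_normal(z):
    """Percentile rank, assuming the norm group is normally distributed."""
    return 100.0 * NormalDist().cdf(z)


# Hypothetical norm group: mean 100, SD 15; raw score of 115.
z = z_score(115.0, 100.0, 15.0)
print(z, t_score(z), round(percentile_rank_normal(z), 1))
```

Both outputs place the same raw score relative to other test takers; the standard score keeps equal-interval units, while the percentile rank does not.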
Floor Effect: -test takers bunch up at the bottom of the score range
-need to have enough easy items
Ceiling Effect: -need to have enough difficult items to discriminate among the best test takers

Deck Info

Number of cards 23