
Testing and Assessment

Terms

Assessment is
A general term that includes the full range of procedures used to gain information about student performance
The differences between assessment and measurement
*Measurement is the quantitative (numerical) part of assessment. It tells how much, not how well.
*Assessment includes both measurement and a judgment of how well
*Assessment ends in a value judgment
Principles of Assessment
*Clearly specifying what is to be assessed has priority in the assessment process
*An assessment procedure should be selected because of its relevance to the characteristics or performance to be measured
*Comprehensive assessment requires a variety of procedures.
*Proper use of assessment procedures requires an awareness of their limitations
*Assessment is a means to an end, not an end in itself
5 steps in the instructional process
*Identify objectives
*Preassess learners' needs
*Provide relevant instruction
*Assess the intended learning outcomes
*Use the results
Type of testing & assessment procedures
Maximum performance - what an individual can do when they put forth their best effort.

Typical performance - what an individual will do under normal conditions.
Difference between fixed choice and complex performance.
Fixed choice - the student selects an answer from available options. (May include true/false, multiple choice, and matching.)

Complex performance assessment - requires students to solve problems of importance outside the classroom, such as written essays.
Advantages of fixed choice tests
*Objective scoring
*Students can respond to a large number of items in a short time.
*Can be machine scored
*High reliability
*Cost effective
Limitations of fixed choice tests
*Tend to overemphasize factual knowledge and low-level skills
*Tend to drive instruction in ways that are inconsistent with current understandings of cognition and learning, which emphasize the importance of engaging students in the construction of knowledge rather than in discrete facts and procedural skills.
Advantages of complex performance tests
*Assess performance while students are engaged in problem solving and learning experiences that are valued in their own right.
*Useful in measuring student achievement
*May assess at higher levels of the taxonomy of objectives.
The advantages of one type of test are
the disadvantages of the other
Neither testing type
does it all
Best type of test depends
on the objective
Limitations of complex performance tests
*difficult scoring
*more time consuming to administer and score
Classification of tests based upon uses of the test results
*Placement
*Formative
*Diagnostic
*Summative
Placement test
Determine student performance at the beginning of instruction
Formative testing
Monitor learning progress during instruction
Diagnostic testing
Diagnose learning difficulties during instruction
Summative
Assess achievement at the end of instruction. Certifies mastery and grades are given.
Differences between norm-referenced and criterion referenced test interpretation
*Norm - describes performance in terms of relative position in some known group. It is not how much you know but where you stand in relation to others. It is popular, and there is no absolute pass or fail, only relative standing.
*Criterion - describes the specific performance demonstrated in terms of a clearly defined and delimited domain of learning tasks (e.g., classroom tests, the FCAT, a driver's license exam).
Common characteristics of norm-referenced and criterion-referenced tests
Both:
*require specification of the achievement domain to be measured
*require a relevant and representative sample of test items.
*use the same types of test items
*use the same rules for item writing
(except for item difficulty)
*are judged by the same qualities of goodness (validity and reliability)
*are useful in educational assessment
Differences between norm-referenced and criterion-referenced tests
NRT - Typically covers a large domain of learning tasks, with just a few items measuring each specific task.
CRT - Focuses on a delimited domain with large numbers of items measuring each specific task.
NRT - Emphasizes discrimination among individuals in terms of relative learning
CRT - Emphasizes description of what learning task individuals can/cannot perform
NRT - Favors items of average difficulty and omits very easy and very difficult items.
CRT - Matches item difficulty to the learning tasks, without altering item difficulty or omitting easy or difficult items.
NRT - Interpretation requires a clearly defined group
CRT - Interpretation requires a clearly defined and delimited achievement domain.
Identifies other classifications of tests
*Informal v. Standardized
(teacher-made, flexible conditions and time - uniform procedures and fixed time limits)
*Individual v. Group
(administered to one student at a time by an examiner - administered to many students at once)
*Mastery v. Survey
(specific, limited domain - broad range of content)
*Supply v. Selection
(short answer, the student supplies the response - fixed choices, the student selects the answer)
*Speed v. Power
(many easy items under tight time limits - increasingly difficult items with generous time; classroom tests are usually power tests)
3 domains of the taxonomy of educational objectives
*Cognitive (where most assessment occurs) - knowledge outcomes and intellectual abilities and skills
*Affective - Attitudes, interests, appreciation, and modes of adjustment
- if done a lot is harmful
*Psycho-motor - Perceptual and motor skills
Gives examples of sources for lists of objectives
*Bloom's taxonomy
*Professional association standards
*State content standards
*Methods books
*Yearbooks and subject matter
*Encyclopedia of Educational Research
*Curriculum frameworks and guides
*Test manuals
*Banks of Objectives
4 criteria for selecting appropriate objectives
*Do the objectives include all important outcomes of the course?
*Are the objectives in harmony with the content standards of the state or district and with general goals of the school?
*Are the objectives in harmony with sound principles of learning?
*Are the objectives realistic in terms of the abilities of the students and the time and facilities available?
Define validity
Validity refers to the adequacy and appropriateness of the interpretation made from assessments with regard to a particular use. It is always concerned with the specific use of assessment results and the soundness of interpretations of these results.
*Validity does not reside in numbers or scores; it resides in uses and interpretations. The way the test is used makes it valid or not valid.
*it only validates objectives
Describe how validity relates to assessment
*refers to the appropriateness of the interpretation of the results of an assessment procedure for a given group of individuals, not to the procedure itself
*Validity is a matter of degree, not a matter of "valid or invalid"
*validity is always specific to some particular use or interpretation.
Validity involves an overall evaluative judgment - cannot talk about validity without making a value judgment.
Validity tells us
Does the test measure what we want it to measure
Content
*Most important*
How well the sample of questions represents everything that has been learned. Checked by comparing ("eyeballing") the items against the objectives in a table of specifications.
Test Criterion
*Second most important*
How well the performance on the assessment predicts future performance or estimates current performance on some valued measure other than the test itself (called the criterion), e.g., SAT scores predicting later college performance, or an office-skills test estimating current job skills.
Construct
Exists in head only.
How well performance on the assessment can be interpreted as a meaningful measure of some characteristic or quality.
Not directly measurable.
Consequences
How well use of assessment results accomplishes intended purposes and avoids unintended results.
For administrators
Change in funding.
Defines content validation
The extent to which a set of assessment tasks provides a relevant and representative sample of the domain of tasks about which interpretations of assessment results are made.
2 types of test criterion
Predictive - Future
Concurrent - Now
What is a table of specifications
Tool for ensuring content validity
Aids in obtaining a sample of tasks that represents both the subject-matter content and the instructional objectives.
The more closely the test items correspond to the specified sample, the greater the likelihood of obtaining a valid measure of student learning.
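For illustration only (the content areas, objectives, and item counts below are hypothetical, not from the deck), a table of specifications crosses the content to be covered with the instructional objectives and allocates test items to each cell:

Content area    Knows terms    Comprehends principles    Applies principles    Total items
Measurement          3                    4                       3                 10
Validity             2                    4                       4                 10
Reliability          2                    3                       5                 10
Total                7                   11                      12                 30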
Defines 2 types of test-criterion relationships
Test criterion relationships involve the degree to which test scores are related to some other valued measure other than the test itself (the "criterion"). They may involve studies of how well scores predict future performance (predictive validation) or estimate current performance (concurrent validation).
Explains the meaning of a correlation coefficient
It is a numerical summary of the degree of relationship between the scores on two tests. The value falls between -1.00 and +1.00. It is not a percent, and it is reported to two decimal places. A value of 1.00 is a perfect correlation, which does not occur in testing and behavior.
0.00 indicates no relationship
.30 is a weak correlation (close to nothing)
.60 is a moderate correlation
.90 is a strong correlation
When the coefficient relates test scores to a criterion measure, it is called a validity coefficient (a criterion score is required).
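A minimal computational sketch of such a coefficient, using invented scores (the function name pearson_r and the data are assumptions for illustration only): the Pearson correlation between test scores and scores on a criterion measure serves as the validity coefficient described above.

# Hypothetical sketch: Pearson correlation between test scores and a criterion.
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Invented data: aptitude test scores and later course grades (the criterion).
test_scores = [52, 60, 68, 75, 81, 90]
criterion = [2.1, 2.4, 2.9, 3.0, 3.4, 3.8]

print(round(pearson_r(test_scores, criterion), 2))  # about .99 for this nearly linear made-up data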
State factors which influence the size of correlation coefficients
These factors can cause a correlation coefficient to underestimate the true relationship, but they do not inflate it above what it really is.
*Characteristics measured - The more alike the characteristics measured, the higher the correlation.
*Spread of scores - The wider the spread of scores, the higher the correlation.
*Stability of scores - The higher the stability of the scores, the higher the correlation.
*Time span between measures - The shorter the time span between measures, the higher the correlation.
What is the purpose of an expectancy table?
An expectancy table indicates how well a test predicts future performance or estimates current performance on some criterion measure. Expectancy tables may be used to show the relationship between various types of measures, and they provide a simple and direct means of indicating the predictive value of test results.
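A hypothetical expectancy table (all figures below are invented for illustration) might report, for each band of aptitude-test scores, the percentage of students who later earned each course grade:

Aptitude score    % earning A or B    % earning C    % earning D or F
120 and above            70                 25                5
100-119                  45                 40               15
Below 100                20                 45               35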
Define regression equation
A regression equation is a mathematical formula for converting scores obtained on a test into predicted criterion scores.
-predict one variable from the other.
-SAT score to predict college GPA
Y = a + bx
Define predicted criterion score
A predicted criterion score is an estimate of expected performance on some valued measure other than the test whose score was used to make the estimate.
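A minimal sketch, using invented SAT and GPA figures, of fitting the line Y = a + bX and using it to produce a predicted criterion score (the function fit_line and the numbers are assumptions for illustration, not part of the deck):

# Hypothetical sketch: fit Y = a + bX and predict a criterion score.
def fit_line(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = (sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))  # slope
    a_intercept = mean_y - b * mean_x          # intercept
    return a_intercept, b

# Invented data: SAT scores and the college GPAs later earned by the same students.
sat = [900, 1000, 1100, 1200, 1300]
gpa = [2.2, 2.5, 2.9, 3.1, 3.5]

a, b = fit_line(sat, gpa)
predicted_gpa = a + b * 1150  # predicted criterion score for a new SAT score of 1150
print(round(predicted_gpa, 2))  # about 3.00 for this made-up data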
Lists steps in the general process for construct validation.
*Identifying and describing by means of a theoretical framework the meaning of the construct to be measured.
*Deriving hypotheses regarding performance on an assessment from the theory underlying the construct.
*Verifying the hypotheses by logical and empirical means.
Describes factors in the test or assessment which may affect the validity of the test results.
*Unclear directions
*Reading vocabulary and sentence structure too difficult
*Ambiguity
*Inadequate time limits
*Overemphasis of easy-to-assess aspects of the domain at the expense of important, but hard-to-assess, aspects.
*Poorly constructed test items
*Test items inappropriate for the outcome being measured
*Test too short.
*Improper arrangement of items
*Identifiable patterns of answers
Names factors in task function and teacher procedures which may affect validity.
*Previous teaching of solution results in measures of only memorized knowledge
*Teaching of mechanical steps for obtaining solutions
Define reliability
Reliability refers to the consistency of measurement, how consistent test scores or other assessment results are from one measurement to another.
Describe the general points of reliability in testing and assessment
*Reliability refers to the results obtained with an assessment, not to the instrument itself
*An estimate of reliability always refers to a particular type of consistency.
***Reliability is a necessary but not sufficient condition for validity.
*Reliability is primarily statistical.
States the relationship between reliability and validity
***Reliability is necessary but not sufficient for validity - if test results are valid, they are reliable, but the inverse is not true.
*Can't have validity without reliability
Can have reliability without validity
Define correlation coefficient
A number between -1.00 and +1.00
A statistic that indicates the degree of relationship between any two sets of scores obtained from the same group of individuals (e.g., correlation between height and weight)
Define Validity coefficient
A correlation coefficient that indicates the degree to which a measure predicts or estimates performance on some criterion measure (e.g., correlation between scholastic aptitude scores and grades in school.)
Define reliability coefficient
A correlation that indicates the degree of relationship between two sets of scores intended to be measures of the same characteristic (e.g., correlations between scores assigned by two different raters or scores obtained from administrations of two forms of a test)
Both scores come from the same test.
Describes procedures used to estimate reliability
*Test-retest method
*Equivalent forms method
*Test-retest with equivalent forms
*Split-half method (see the sketch after this list)
*Kuder-Richardson method and coefficient Alpha
*Interrater method
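As a rough sketch of one of these procedures, the split-half method with the Spearman-Brown correction (the item scores and function names below are invented for illustration): scores on the odd-numbered and even-numbered items are correlated, and the half-test correlation is then stepped up to full-test length.

# Hypothetical sketch: split-half reliability with the Spearman-Brown formula.
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mx) ** 2 for a in x))
    sd_y = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def split_half_reliability(item_scores):
    odd = [sum(row[0::2]) for row in item_scores]   # score on odd-numbered items
    even = [sum(row[1::2]) for row in item_scores]  # score on even-numbered items
    r_half = pearson_r(odd, even)
    return (2 * r_half) / (1 + r_half)              # Spearman-Brown step-up

# Invented 1/0 item scores for five students on a six-item test.
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))  # about .88 for this made-up data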
