Psychology 859: Seminar in Quantitative Psychology---Spring, 2010
(as of 2/18/2010)
Selected Topics in Item Response Theory
Time, Place: 9:00-11:30 Fridays, 347 Davie
Instructor: David Thissen
Tentative Schedule:
|
Friday |
Topic/Readings |
Additional potential readings* |
|
January 15 |
Background & Overview Thissen, D. & Steinberg, L.
(2009). Item
response theory. In R. Millsap
& A. Maydeu-Olivares, The Sage handbook of quantitative methods in psychology. London:
Sage Publications. Bock,
R.D. (1997). A brief history of item response
theory. Educational Measurement: Issues and Practice.
16,
21-33. van der Linden, W.
& Ronald K. Hambleton. (1997). Item response theory: Brief history, common
models, and extensions. In W.J. van der Linden & Ronald K. Hambleton
(Eds), Handbook
of item response theory. New York: Springer-Verlag, (Ch. 1). Wainer, H., Bradlow,
E.T., and Wang, X., Testlet response theory and its applications. New York,
NY: Cambridge University Press, 2007. (Chapters 1-3) |
Lazarsfeld, P. F. (1950). The
logical and mathematical foundation of latent structure analysis. In S. A. Stouffer, L. Guttman, E. A. Suchman, P. F.
Lazarsfeld, S. A. Star, & J. A. Clausen, Measurement and Prediction (pp.
362-412). New York: Wiley. Lord,
F.M. (1952). A theory of test scores. Psychometric Monographs, Whole No. 7. Lord, F.M. (1953). The
relation of test score to the trait underlying the test. Educational and Psychological Measurement, 13, 517-548. |
|
Links to download software and
documents that may be useful throughout the course |
Download (and try to
install) the IRTPRO beta, R (if you don't already have it), and the R-IRT
graphics functions; links are
provided on the page on downloading software.
|
Useful
throughout the course: Yen,
W.M. & Fitzpatrick, Anne R. (2006). Item
response theory. In R.L. Brennan (Ed.) Educational Measurement (Fourth
Edition). Westport, CT: Praeger. |
|
January 22 |
Models for Binary Responses van der Linden, W.
& Ronald K. Hambleton. (1997). Item response theory: Brief history,
common models, and extensions. In W.J. van der Linden & Ronald K.
Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch.
1). Thissen, D., & Orlando, M.
(2001). Item response theory for items scored in two categories. In D.
Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum
Associates. (Ch. 3) Download ArmyDataFilesForHomework1.zip
to do the homework! |
Hambleton,
R.K. & Swaminathan, H. (1985). Item response theory: Principles and applications.
Boston: Kluwer-Nijhoff. Hulin, C.L., Drasgow, F., & Parsons, C.K. (1983). Item response
theory: Application to psychological measurement. Homewood, IL: Dow-Jones
Irwin. Lord, F.M. (1980). Applications of item response theory to practical
testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates. Lord, F. M. & Novick, M. R. (1968). Statistical
Theories of Mental Test Scores. Reading, MA: Addison-Wesley. |
|
January 29 |
Samejima's Graded Model Samejima,
F. (1997). Graded response model. In W.J. van der Linden & Ronald K.
Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch.
5). Thissen,
D., Nelson, L., Rosa, K., & McLeod, L.D. (2001). Item response theory for
items scored in more than two categories. In D. Thissen & H. Wainer
(Eds), Test
Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 4) |
Samejima, F. (1969). Estimation
of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, No. 17, 34, Part 2. Likert, R. (1932). A
technique for the measurement of attitudes. Archives of Psychology, (Whole No. 140). Hill, C.D., Edwards, M.C., Thissen,
D., Langer, M.M., Wirth, R.J., Burwinkle, T.M., & Varni, J.W. (2007). Practical issues in the application of item response
theory: A demonstration using items from the Pediatric Quality of Life
Inventory™ (PedsQL™) 4.0 Generic Core Scales. Medical Care, 45, S39-47. Thissen, D., Pommerich, M., Billeaud,
K., & Williams, V.S.L. (1995). Item
response theory for scores on tests including polytomous items with ordered
responses.
Applied
Psychological Measurement, 19, 39-49. |
|
February 5 |
Class-as-lab? (You
should have done this already, but, before class, download (and
try to install) Multilog and documentation files using links provided on the page on
downloading software.) In-class
help with IRTPRO and the R graphics package? |
|
|
February 12 |
Multidimensional Item Response Theory Edwards, M.C. & Edelen, Maria
Orlando. (2009). Special
Topics in Item response theory. In R.
Millsap & A. Maydeu-Olivares, The Sage handbook of quantitative methods in
psychology. London: Sage Publications. (MIRT section) Wirth,
R.J., & Edwards, M.C. (2007). Item
Factor Analysis: Current Approaches and Future Directions. Psychological Methods, 12, 58-79. McLeod,
L.D., Swygert, K., & Thissen, D. (2001). Factor analysis for items scored
in two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence
Erlbaum Associates. (Ch. 5) Swygert,
K., McLeod, L.D., & Thissen, D. (2001). Factor analysis for items scored
in more than two categories. In D. Thissen & H. Wainer (Eds), Test Scoring.
Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 6) Reckase,
M.D. (1997). A linear logistic multidimensional model
for dichotomous item response data. In W.J. van der Linden & Ronald K.
Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch.
16). |
Reckase,
M. (2009). Multidimensional
item response theory. New York, NY: Springer. [This book is available as
downloadable chapter pdf files through the UNC library.] Bock,
R.D., Gibbons, R. & Muraki, E. (1988). Full-information
item factor analysis. Applied
Psychological Measurement, 12, 261-280. Yung,
Y.F., McLeod, L.D., & Thissen, D. (1999). On
the relationship between the higher-order factor model and the hierarchical
factor model. Psychometrika,
64,
113-128. Steinberg, L. & Jorgensen, R.S.
(1996). Assessing the MMPI-based Cook-Medley
Hostility Scale: The implications of dimensionality. Journal of Personality and Social Psychology, 70, 1281-1287. Download form2_emotion.zip to do the homework! |
|
February 19 |
Estimation Algorithms and Goodness of
Fit Statistics Bolt,
D.M. (2005). Limited- and full-information
estimation of item response theory models. In A. Maydeu-Olivares & J.J. McArdle (Eds), Contemporary
Psychometrics (Pp. 27-72). Mahwah, NJ: Erlbaum. (Ch. 2) Orlando, M., & Thissen, D. (2000). Likelihood-based item fit indices for dichotomous item
response theory models. Applied
Psychological Measurement, 24, 50-64. Orlando, M. & Thissen, D. (2003). Further investigation of the performance of S-X²: An item
fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 27, 289-298. Maydeu-Olivares, A.,
& Joe, H. (2005). Limited and full
information estimation and testing in 2ⁿ contingency tables: A
unified framework. Journal of the American Statistical Association, 100,
1009–1020. Maydeu-Olivares, A. & Joe, H. (2006). Limited information goodness-of-fit
testing in multidimensional contingency tables. Psychometrika, 71, 713-732. |
Bock,
R.D., & Aitkin, M. (1981). Marginal maximum
likelihood estimation of item parameters: An application of the EM algorithm.
Psychometrika,
46,
443-459. Bock, R.D. & Lieberman, M. (1970). Fitting a response model for n
dichotomously scored items. Psychometrika, 35, 179-197. Thissen, D. (1982). Marginal
maximum likelihood estimation for the one-parameter logistic model. Psychometrika,
47,
201-214. Cai, L.
(2008). SEM of another flavour: Two new applications
of the supplemented EM algorithm. British Journal of Mathematical and Statistical
Psychology, 61, 309-329. Gibbons, R. D., & Hedeker, D. (1992). Full-information item bi-factor analysis.
Psychometrika, 57, 423-436. Gibbons, R.D., Bock, R.D., Hedeker, D., Weiss,
D.J., Segawa, E., Bhaumik, D.K., Kupfer, D.J., Frank, E., Grochocinski, V.J.,
& Stover, A. (2007). Full-information
item bifactor analysis of graded response data. Applied Psychological
Measurement, 31, 4-19. Schilling, S., & Bock, R. D. (2005). High-dimensional maximum marginal likelihood
item factor analysis by adaptive quadrature. Psychometrika, 70, 533–555. Cai, L.
(in press). High-dimensional exploratory item factor
analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika. Cai, L. (in
press). Metropolis-Hastings
Robbins-Monro Algorithm for Confirmatory Item Factor Analysis. Journal of
Educational and Behavioral Statistics. Albert,
J.H. (1992). Bayesian estimation of normal ogive
item response curves using Gibbs sampling. Journal of Educational Statistics, 17,
251-269. Patz,
R.J. & Junker, B.W. (1999). A straightforward
approach to Markov chain Monte Carlo methods for item response theory. Journal of
Educational and Behavioral Statistics, 24, 146-178. Edwards,
M.C. (in press). A Markov chain Monte Carlo
approach to confirmatory item factor analysis. Psychometrika. Baker,
F.B. & Kim, S.-H. (2004). Item response theory: Parameter estimation
techniques (2nd Edition, Revised and Expanded). New York:
Marcel Dekker. |
|
February 26 |
Bock's Nominal Model and Variations Thissen, D., Cai,
L., & Bock, R.D. (in press). The
nominal categories item response model. In M. Nering & R. Ostini
(Eds.), Handbook
of polytomous item response theory models: Developments and applications.
Bock,
R.D. (1997). The nominal categories model. In W.J. van der Linden &
Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch.
2). Andersen,
E.B. (1997). The rating scale model. In W.J. van der Linden & Ronald K.
Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch.
4). Masters,
G.N., & Wright, B.D. (1997). The partial credit model. In W.J. van der
Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York:
Springer-Verlag, (Ch. 6). Muraki,
E. (1997). A generalized partial credit model. In W.J. van der Linden &
Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch.
9). |
Bock, R. D. (1972). Estimating
item parameters and latent ability when responses are scored in two or more
nominal categories. Psychometrika, 37,
29-51. Masters, G.N. (1982). A Rasch
model for partial credit scoring.
Psychometrika,
47, 149‑174. Masters, G.N., & Wright, B.D. (1984). The essential process in a family of measurement models. Psychometrika, 49, 529-544. Thissen, D. & Steinberg, L.
(1986). A taxonomy of item response models. Psychometrika, 51, 567-577. Thissen,
D. & Steinberg, L. (1988). Data
analysis using item response theory.
Psychological
Bulletin, 104,
385-395. |
|
March 5 |
Special Guest Lecture: PROMIS The
website for the Patient Reported Outcomes
Measurement Information System has information on all things PROMIS. Irwin, D.E., Stucky, B.D., Thissen, D.,
DeWitt, E.M., Lai, J.S., Yeatts, K., Varni, J.W.,
DeWalt, D.A. (in press). Sampling Plan
and Patient Characteristics of the PROMIS Pediatrics Large-Scale Survey. Quality of Life
Research. Yeatts, K., Stucky, B.D., Thissen, D.,
Irwin, D., Varni, J., DeWitt, E.M., Lai, J.S., & DeWalt, D.A. (in press).
Construction of the Pediatric Asthma Impact
Scale (PAIS) for the Patient Reported Outcomes Measurement Information System
(PROMIS). Journal
of Asthma. Varni, J., Stucky, B.D., Thissen, D.,
DeWitt, E.M., Irwin, D., Lai, J.S., Yeatts, K., & DeWalt, D.A. (in press). PROMIS Pediatric Pain Interference
Scale: An Item Response Theory Analysis of the Pediatric Pain Item Bank. Journal of Pain. Irwin, D.E., Stucky, B.D., Thissen, D.,
DeWitt, E.M., Lai, J.S., Varni, J.W., Yeatts, K., DeWalt, D.A. (in press). An Item Response Analysis of the
Pediatric PROMIS Anxiety and Depressive Symptoms Scales. Quality of Life
Research. DeWitt, E.M., Stucky, B.D., Thissen,
D., Irwin, D.E., Langer, M.M., Varni, J.W., Lai, J.S., Yeatts, K., DeWalt,
D.A. (under review). Construction
of the PROMIS Pediatric Physical Function Scales: Built using Item Response Theory. |
Reeve, B.B., Hays, R.D., Bjorner, J.B.,
Cook, K.F., Crane, P.K., Teresi, J.A., Thissen, D., Revicki, D.A., Weiss,
D.J., Hambleton, R.K., Liu, H., Gershon, R., Reise, S.P., & Cella, D.
(2007). Psychometric evaluation and calibration
of health-related quality of life item banks: Plans for the patient-reported
outcome measurement information system (PROMIS). Medical Care, 45, S22-31. Langer, M.M., Hill,
C.D., Thissen, D., Burwinkle, T.M., Varni, J.W., & DeWalt, D.A. (2008). Item response theory detected differential item
functioning between healthy and ill children in quality-of-life measures.
Journal of
Clinical Epidemiology, 61, 268-276. Revicki, D. A., Chen, W-H.,
Harnam, N., Cook, K., Amtmann, D., Callahan, L. F., Jensen, M. P., &
Keefe, F. J. (2009). Development and
psychometric analysis of the PROMIS pain behavior item bank. Pain,
146(1-2), 158-69. |
|
March 12 |
Spring Break—no class |
|
|
March 19 |
Local Dependence and Testlets Thissen,
D. & Steinberg, L. (2010). Using
item response theory to disentangle constructs at different levels of
generality. In
S. Embretson (Ed.), Measuring psychological constructs: Advances in model-based approaches
(pp. 123-144). Washington, DC: American Psychological Association. Wainer, H., Bradlow, E.T., and Wang, X., Testlet Response Theory
and Its Applications. New York, NY: Cambridge University Press, 2007.
(Chapters 4-12) Steinberg,
L. & Thissen, D. (1996). Uses
of item response theory and the testlet concept in the measurement of
psychopathology. Psychological
Methods, 1,
81‑97. |
Hoskens, M. & De Boeck, P. (1997). A parametric model for local dependence among test items. Psychological Methods, 2, 261-277. Chen, W.H. & Thissen, D. (1997). Local
dependence indices for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22,
265-289. Wang, X., Bradlow, E. T., &
Wainer, H. (2004). A user's guide for SCORIGHT (Version 3.0): A computer program for scoring tests built of testlets
including a module for covariate analysis (Research Report RR 04-49). Princeton, NJ: Educational Testing Service. Thissen,
D., Steinberg, L. & Mooney, J.A. (1989). Trace lines for testlets: A use of
multiple-categorical-response models.
Journal of
Educational Measurement, 26, 247-260. |
|
March 26 |
Developmental Scales and Differential
Item Functioning Williams, V.S.L., Pommerich, M.,
& Thissen, D. (1998). A comparison of developmental
scales based on Thurstone methods and item response theory. Journal of
Educational Measurement, 35, 93-107. Edwards, M.C. & Edelen, Maria
Orlando. (2009). Special
Topics in Item response theory. In R.
Millsap & A. Maydeu-Olivares, The Sage handbook of quantitative methods in
psychology. London: Sage Publications. (DIF section) Steinberg, L., & Thissen, D.
(2006). Using Effect Sizes for Research
Reporting: Examples using Item Response Theory to Analyze Differential Item
Functioning. Psychological Methods, 11, 402-415. Bock, R.D. & Zimowski, M.F.
(1997). Multiple Group IRT. In W.J. van der Linden & Ronald K. Hambleton
(Eds), Handbook
of item response theory. New York: Springer-Verlag, (Ch. 25). Thissen,
D., Steinberg, L. & Wainer, H. (1993). Detection of differential item functioning using
the parameters of item response models. In P.W.
Holland & H. Wainer (Eds.), Differential item functioning. Hillsdale, NJ:
Lawrence Erlbaum Associates, 67-113. |
Camilli, G. & Shepard, L.A.
(1994). Methods
for identifying biased test items. Thousand Oaks, CA: Sage. Steinberg,
L. (1994). Context and serial-order effects in
personality measurement: Limits on the generality of measuring changes the
measure. Journal of Personality and Social Psychology,
66,
341-349. Steinberg,
L. (2001). The consequences of pairing questions:
Context effects in personality measurement. Journal of Personality and Social Psychology, 81, 332-342. Edelen, M.O., Thissen, D., Teresi,
J.A., Kleinman, M., & Ocepek-Welikson, K. (2006). Identification of differential item
functioning using item response theory and the likelihood-based model
comparison approach: application to the Mini-Mental Status Examination. Medical Care, 44, S134-142. |
|
April 2 |
Holiday—no class |
|
|
April 9 |
Score Combination & Sub-scores Rosa, K., Swygert, K., Nelson, L.,
& Thissen, D. (2001). Item response theory applied to combinations of
multiple-choice and constructed-response items-scale scores for patterns of
summed scores. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence
Erlbaum Associates. (Ch. 7) Thissen, D., Nelson, L., &
Swygert, K. (2001). Item response theory applied to combinations of
multiple-choice and constructed-response items-approximation methods for
scale scores. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence
Erlbaum Associates. (Ch. 8) Wainer, H., Vevea, J.L., Camacho, F.,
Reeve, B.B., Rosa, K., Nelson, L., Swygert, K., & Thissen, D.
(2001). Augmented scores---"Borrowing strength" to compute scores
based on small numbers of items. In D. Thissen & H. Wainer (Eds), Test Scoring.
Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 9) CAT Edwards, M.C. & Edelen, Maria
Orlando. (2009). Special
Topics in Item response theory. In R.
Millsap & A. Maydeu-Olivares, The Sage handbook of quantitative methods in
psychology. London: Sage Publications. (CAT section) Choi, S. W., Reise, S.P., Pilkonis,
P.A., Hays, R.D., & Cella, D. (2010). Efficiency
of static and computer adaptive short forms compared to full-length measures
of depressive symptoms. Quality of Life Research, 19, 125-136. |
Thissen, D., Wainer, H., & Wang,
X.B. (1994). Are tests comprising both multiple-choice and free-response
items necessarily less unidimensional than multiple-choice tests? An analysis
of two tests. Journal of
Educational Measurement, 31, 113-123. Lukhele, R., Thissen, D., &
Wainer, H. (1994). On the relative value of
multiple-choice, constructed-response, and examinee-selected items on two
achievement tests. Journal of
Educational Measurement, 31, 234-250. Flora, D.B., & Thissen, D.
(2002). User's
guide for IRTSCORE: Item response theory score approximation Software.
Electronic Research Memorandum #2002-1. Chapel Hill, NC: University of North
Carolina, L.L. Thurstone Psychometric Laboratory. Obtain IRTScore and its documentation
from links in the PlotIRT section of this
page. Thissen, D., Reeve, B.B., Bjorner,
J.B., & Chang, C.-H. (2007). Methodological issues for building item banks and
computerized adaptive scales.
Quality of Life Research, 16, 109-116. Bjorner, J.B., Chang, C.-H., Thissen, D., & Reeve, B.B. (2007). Developing tailored instruments: item banking and
computerized adaptive assessment.
Quality of Life Research, 16, 95-108. |
|
April 16, April 23 |
Your presentations |
|
*Full-length books are not available electronically.
Libraries and (used) bookstores are good sources for such documents.
Requirements, grading, and stuff: There will be no tests.
There will be homework assignments; most (or all) of these will involve data analysis using computers and a 2-4 page written report (typed and printed, please, thank you). The report must present the results in readable English, as well as numbers and (maybe) graphics. These homework assignments will appear in class every couple of weeks, and will be part of the basis for your grade.
A reasonably substantial paper (e.g., 10-20 pages) on some topic in IRT, or describing the (hypothetical or real) item analysis and/or construction of an instrument for some sort of psychological measurement, will be a major part of your grade; this paper will be due April 23. These papers/projects may be done individually or in pairs, on topics of your choosing, with brief oral presentations on April 16 or 23. We will discuss this aspect of the course in more detail in February.
Class participation: For each week, the readings listed above will serve as the topical focus. Everybody is to read the readings during the week (before class), and write (type) two questions that can be the focus of clarifying discussion during class. Those questions are to be emailed to me (dthissen@email.unc.edu), or handed in (to my mailbox), by noon on the Thursday preceding each class. During class on each Friday morning, we will (jointly) do our best to deal with the questions, either through discussion or additional reading material.