Psychology 859: Seminar in Quantitative Psychology---Spring, 2010
(as of 2/18/2010)

Selected Topics in Item Response Theory

Time, Place: 9:00-11:30 Fridays, 347 Davie

Instructor: David Thissen

Tentative Schedule:

Friday

Topic/Readings

Additional potential readings*

January 15

 

Background & Overview

Thissen, D. & Steinberg, L. (2009). Item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology. London: Sage Publications.

Bock, R.D. (1997). A brief history of item response theory. Educational Measurement: Issues and Practice, 16, 21-33.

van der Linden, W.J. & Hambleton, R.K. (1997). Item response theory: Brief history, common models, and extensions. In W.J. van der Linden & R.K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 1).

Wainer, H., Bradlow, E.T., & Wang, X. (2007). Testlet response theory and its applications. New York, NY: Cambridge University Press. (Chapters 1-3)

Lazarsfeld, P. F. (1950). The logical and mathematical foundation of latent structure analysis. In S. A. Stouffer, L. Guttman, E. A. Suchman, P. F. Lazarsfeld, S. A. Star, & J. A. Clausen, Measurement and Prediction (pp. 362-412). New York: Wiley.

Lord, F.M. (1952). A theory of test scores. Psychometric Monographs, Whole No. 7.

Lord, F.M. (1953). The relation of test score to the trait underlying the test. Educational and Psychological Measurement, 13, 517-548.

 

Links to download software and documents that may be useful throughout the course

Download (and try to install) the IRTPRO beta, R (if you don't already have it), and the R-IRT graphics functions; links are provided on the page on downloading software.

 

Useful throughout the course:

Yen, W.M. & Fitzpatrick, A.R. (2006). Item response theory. In R.L. Brennan (Ed.), Educational Measurement (Fourth Edition). Westport, CT: Praeger.

January 22

Models for Binary Responses

van der Linden, W.J. & Hambleton, R.K. (1997). Item response theory: Brief history, common models, and extensions. In W.J. van der Linden & R.K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 1).

Thissen, D., & Orlando, M. (2001). Item response theory for items scored in two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 3)

 

Download ArmyDataFilesForHomework1.zip to do the homework!

Hambleton, R.K. & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff.

Hulin, C.L., Drasgow, F., & Parsons, C.K. (1983). Item response theory: Application to psychological measurement. Homewood, IL: Dow-Jones Irwin.

Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F. M. & Novick, M. R. (1968). Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley.

 

January 29

Samejima's Graded Model

Samejima, F. (1997). Graded response model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 5).

Thissen, D., Nelson, L., Rosa, K., & McLeod, L.D. (2001). Item response theory for items scored in more than two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 4)

 

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometric Monograph, No. 17, 34, Part 2.

Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, (Whole No. 140).

Hill, C.D., Edwards, M.C., Thissen, D., Langer, M.M., Wirth, R.J., Burwinkle, T.M., & Varni, J.W. (2007). Practical issues in the application of item response theory: A demonstration using items from the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 Generic Core Scales. Medical Care, 45, S39-47.

Thissen, D., Pommerich, M., Billeaud, K., & Williams, V.S.L. (1995). Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19, 39-49.

February 5

Class-as-lab?

(You should have done this already, but, before class, download (and try to install) Multilog and documentation files using links provided on the page on downloading software.)

In-class help with IRTPRO and the R graphics package?

 

 

February 12

Multidimensional Item Response Theory

Edwards, M.C. & Edelen, M.O. (2009). Special topics in item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology. London: Sage Publications. (MIRT section)

Wirth, R.J., & Edwards, M.C. (2007). Item Factor Analysis: Current Approaches and Future Directions. Psychological Methods, 12, 58-79.

McLeod, L.D., Swygert, K., & Thissen, D. (2001). Factor analysis for items scored in two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 5)

Swygert, K., McLeod, L.D., & Thissen, D. (2001). Factor analysis for items scored in more than two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 6)

Reckase, M.D. (1997). A linear logistic multidimensional model for dichotomous item response data. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 16).

 

Reckase, M. (2009). Multidimensional item response theory. New York, NY: Springer. [This book is available as downloadable chapter pdf files through the UNC library.]

Bock, R.D., Gibbons, R. & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261-280.

Yung, Y.F., McLeod, L.D., & Thissen, D. (1999). On the relationship between the higher-order factor model and the hierarchical factor model. Psychometrika, 64, 113-128.

Steinberg, L. & Jorgensen, R.S. (1996). Assessing the MMPI-based Cook-Medley Hostility Scale: The implications of dimensionality. Journal of Personality and Social Psychology, 70, 1281-1287.

 

 

Download form2_emotion.zip to do the homework!

 

February 19

Estimation Algorithms and Goodness of Fit Statistics

Bolt, D.M. (2005). Limited- and full-information estimation of item response theory models. In A. Maydeu-Olivares & J.J. McArdle (Eds), Contemporary Psychometrics (pp. 27-72). Mahwah, NJ: Erlbaum. (Ch. 2)

 

Orlando, M., & Thissen, D. (2000). Likelihood-based item fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24, 50-64.

Orlando, M. & Thissen, D. (2003). Further investigation of the performance of S-X^2: An item fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 27, 289-298.

 

Maydeu-Olivares, A., & Joe, H. (2005). Limited and full information estimation and testing in 2^n contingency tables: A unified framework. Journal of the American Statistical Association, 100, 1009-1020.

Maydeu-Olivares, A. & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71, 713-732.

 

 

Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of the EM algorithm. Psychometrika, 46, 443-459.

Bock, R.D. & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.

Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186.

Cai, L. (2008). SEM of another flavour: Two new applications of the supplemented EM algorithm. British Journal of Mathematical and Statistical Psychology, 61, 309-329.

Gibbons, R. D., & Hedeker, D. (1992). Full-information item bi-factor analysis. Psychometrika, 57, 423-436.

Gibbons, R.D., Bock, R.D., Hedeker, D., Weiss, D.J., Segawa, E., Bhaumik, D.K., Kupfer, D.J., Frank, E., Grochocinski, V.J., & Stover, A. (2007). Full-information item bifactor analysis of graded response data. Applied Psychological Measurement, 31, 4-19.

Schilling, S., & Bock, R. D. (2005). High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. Psychometrika, 70, 533–555.

Cai, L. (in press). High-dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika.

Cai, L. (in press). Metropolis-Hastings Robbins-Monro Algorithm for Confirmatory Item Factor Analysis. Journal of Educational and Behavioral Statistics.

Albert, J.H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251-269.

Patz, R.J. & Junker, B.W. (1999). A straightforward approach to Markov chain Monte Carlo methods for item response theory. Journal of Educational and Behavioral Statistics, 24, 146-178.

Edwards, M.C. (in press). A Markov chain Monte Carlo approach to confirmatory item factor analysis. Psychometrika.

Baker, F.B. & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques (2nd Edition, Revised and Expanded). New York: Marcel Dekker.

February 26

 

Bock's Nominal Model and Variations

Thissen, D., Cai, L., & Bock, R.D. (in press). The nominal categories item response model. In M. Nering & R. Ostini (Eds.), Handbook of polytomous item response theory models: Developments and applications.

Bock, R.D. (1997). The nominal categories model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 2).

Andersen, E.B. (1997). The rating scale model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 4).

Masters, G.N., & Wright, B.D. (1997). The partial credit model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 6).

Muraki, E. (1997). A generalized partial credit model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 9).

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29-51.

Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Masters, G.N., & Wright, B.D. (1984). The essential process in a family of measurement models. Psychometrika, 49, 529-544.

Thissen, D. & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577.

Thissen, D. & Steinberg, L. (1988). Data analysis using item response theory. Psychological Bulletin, 104, 385-395.

 

March 5

Special Guest Lecture: PROMIS

 

The website for the Patient-Reported Outcomes Measurement Information System (PROMIS) has information on all things PROMIS.

 

Irwin, D.E., Stucky, B.D., Thissen, D., DeWitt, E.M., Lai, J.S., Yeatts, K., Varni, J.W., & DeWalt, D.A. (in press). Sampling Plan and Patient Characteristics of the PROMIS Pediatrics Large-Scale Survey. Quality of Life Research.

 

Yeatts, K., Stucky, B.D., Thissen, D., Irwin, D., Varni, J., DeWitt, E.M., Lai, J.S., & DeWalt, D.A. (in press). Construction of the Pediatric Asthma Impact Scale (PAIS) for the Patient Reported Outcomes Measurement Information System (PROMIS). Journal of Asthma.

Varni, J., Stucky, B.D., Thissen, D., DeWitt, E.M., Irwin, D., Lai, J.S., Yeatts, K., & DeWalt, D.A. (in press). PROMIS Pediatric Pain Interference Scale: An Item Response Theory Analysis of the Pediatric Pain Item Bank. Journal of Pain.

Irwin, D.E., Stucky, B.D., Thissen, D., DeWitt, E.M., Lai, J.S., Varni, J.W., Yeatts, K., & DeWalt, D.A. (in press). An Item Response Analysis of the Pediatric PROMIS Anxiety and Depressive Symptoms Scales. Quality of Life Research.

DeWitt, E.M., Stucky, B.D., Thissen, D., Irwin, D.E., Langer, M.M., Varni, J.W., Lai, J.S., Yeatts, K., & DeWalt, D.A. (under review). Construction of the PROMIS Pediatric Physical Function Scales: Built using Item Response Theory.

Reeve, B.B., Hays, R.D., Bjorner, J.B., Cook, K.F., Crane, P.K., Teresi, J.A., Thissen, D., Revicki, D.A., Weiss, D.J., Hambleton, R.K., Liu, H., Gershon, R., Reise, S.P., & Cella, D. (2007). Psychometric evaluation and calibration of health-related quality of life item banks: Plans for the patient-reported outcome measurement information system (PROMIS). Medical Care, 45, S22-31.

 

Langer, M.M., Hill, C.D., Thissen, D., Burwinkle, T.M., Varni, J.W., & DeWalt, D.A. (2008). Item response theory detected differential item functioning between healthy and ill children in quality-of-life measures. Journal of Clinical Epidemiology, 61, 268-276.

 

Revicki, D. A., Chen, W-H., Harnam, N., Cook, K., Amtmann, D., Callahan, L. F., Jensen, M. P., & Keefe, F. J. (2009). Development and psychometric analysis of the PROMIS pain behavior item bank. Pain, 146(1-2), 158-69.

March 12

Spring Break—no class

 

March 19

Local Dependence and Testlets

Thissen, D. & Steinberg, L. (2010). Using item response theory to disentangle constructs at different levels of generality. In S. Embretson (Ed.), Measuring psychological constructs: Advances in model-based approaches (pp. 123-144). Washington, DC: American Psychological Association.

Wainer, H., Bradlow, E.T., & Wang, X. (2007). Testlet response theory and its applications. New York, NY: Cambridge University Press. (Chapters 4-12)

Steinberg, L. & Thissen, D. (1996). Uses of item response theory and the testlet concept in the measurement of psychopathology. Psychological Methods, 1, 81-97.

 

Hoskens, M. & De Boeck, P. (1997). A parametric model for local dependence among test items. Psychological Methods, 2, 261-277.

Chen, W.H. & Thissen, D. (1997). Local dependence indices for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.

Wang, X., Bradlow, E. T., & Wainer, H. (2004). A user's guide for SCORIGHT (Version 3.0): A computer program for scoring tests built of testlets including a module for covariate analysis (Research Report RR 04-49). Princeton, NJ: Educational Testing Service.

Thissen, D., Steinberg, L. & Mooney, J.A. (1989). Trace lines for testlets: A use of multiple-categorical-response models. Journal of Educational Measurement, 26, 247-260.

 

March 26

Developmental Scales and Differential Item Functioning

Williams, V.S.L., Pommerich, M., & Thissen, D. (1998). A comparison of developmental scales based on Thurstone methods and item response theory. Journal of Educational Measurement, 35, 93-107.

Edwards, M.C. & Edelen, M.O. (2009). Special topics in item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology. London: Sage Publications. (DIF section)

Steinberg, L., & Thissen, D. (2006). Using Effect Sizes for Research Reporting: Examples using Item Response Theory to Analyze Differential Item Functioning. Psychological Methods, 11, 402-415.

Bock, R.D. & Zimowski, M.F. (1997). Multiple Group IRT. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 25).

Thissen, D., Steinberg, L. & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P.W. Holland & H. Wainer (Eds.), Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum Associates, 67-113.

 

 

Camilli, G. & Shepard, L.A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.

Steinberg, L. (1994). Context and serial-order effects in personality measurement: Limits on the generality of measuring changes the measure. Journal of Personality and Social Psychology, 66, 341-349.

Steinberg, L. (2001). The consequences of pairing questions: Context effects in personality measurement. Journal of Personality and Social Psychology, 81, 332-342.

Edelen, M.O., Thissen, D., Teresi, J.A., Kleinman, M., & Ocepek-Welikson, K. (2006). Identification of differential item functioning using item response theory and the likelihood-based model comparison approach: application to the Mini-Mental Status Examination. Medical Care, 44, S134-142.

 

April 2

Holiday—no class

 

April 9

Score Combination & Sub-scores

Rosa, K., Swygert, K., Nelson, L., & Thissen, D. (2001). Item response theory applied to combinations of multiple-choice and constructed-response items: Scale scores for patterns of summed scores. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 7)

Thissen, D., Nelson, L., & Swygert, K. (2001). Item response theory applied to combinations of multiple-choice and constructed-response items: Approximation methods for scale scores. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 8)

Wainer, H., Vevea, J.L., Camacho, F., Reeve, B.B., Rosa, K., Nelson, L., Swygert, K., & Thissen, D. (2001). Augmented scores: "Borrowing strength" to compute scores based on small numbers of items. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 9)

 

 

CAT

Edwards, M.C. & Edelen, M.O. (2009). Special topics in item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology. London: Sage Publications. (CAT section)

Choi, S. W., Reise, S.P., Pilkonis, P.A., Hays, R.D., & Cella, D. (2010). Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms. Quality of Life Research, 19, 125-136.

 

Thissen, D., Wainer, H., & Wang, X.B. (1994). Are tests comprising both multiple-choice and free-response items necessarily less unidimensional than multiple-choice tests? An analysis of two tests. Journal of Educational Measurement, 31, 113-123.

Lukhele, R., Thissen, D., & Wainer, H. (1994). On the relative value of multiple-choice, constructed-response, and examinee-selected items on two achievement tests. Journal of Educational Measurement, 31, 234-250.

 

Flora, D.B., & Thissen, D. (2002). User's guide for IRTSCORE: Item response theory score approximation software. Electronic Research Memorandum #2002-1. Chapel Hill, NC: University of North Carolina, L.L. Thurstone Psychometric Laboratory.

Obtain IRTScore and its documentation from links in the PlotIRT section of this page.

 

Thissen, D., Reeve, B.B., Bjorner, J.B., & Chang, C.-H. (2007). Methodological issues for building item banks and computerized adaptive scales. Quality of Life Research, 16, 109-116.

Bjorner, J.B., Chang, C.-H., Thissen, D., & Reeve, B.B. (2007). Developing tailored instruments: Item banking and computerized adaptive assessment. Quality of Life Research, 16, 95-108.

 

April 16

April 23

Your presentations

 

*Full-length books are not available electronically. Libraries and (used) bookstores are good sources for such documents.

Requirements, grading, and stuff: There will be no tests. There will be homework assignments; most (or all) of these will involve data analysis using computers, and a 2-4 page written (typed and printed, please, thank you) report. The report must present the results in readable English, as well as numbers and (maybe) graphics. These homework assignments will appear in class every couple of weeks, and will be part of the basis for your grade. A reasonably substantial paper (e.g., 10-20 pages) on some topic in IRT, or describing the (hypothetical or real) item analysis and/or construction of an instrument for some sort of psychological measurement, will be a major part of your grade; this paper will be due April 23. These papers/projects may be done individually, or in pairs, on topics of your choosing, with brief oral presentations on April 16 or 23. We will discuss this aspect of the course in more detail in February.

Class participation: For each week, the readings listed above will serve as the topical focus. Everybody is to read the readings during the week (before class), and write (type) two questions that can be the focus of clarifying discussion during class. Those questions are to be emailed to me (dthissen@email.unc.edu), or handed in (to my mailbox), by noon on the Thursday preceding each class. During class on each Friday morning, we will (jointly) do our best to deal with the questions, either through discussion or additional reading material.