Psychology 859: Seminar in Quantitative Psychology---Spring, 2008
(as of 3/17/2008)

Selected Topics in Item Response Theory

Time, Place:               9:00-11:30 Fridays, 347 Davie

Instructor:                   David Thissen

Tentative Schedule:

Friday

Topic/Readings

Additional potential readings*

January 11

 

Background & Overview

Thissen, D. & Steinberg, L. (in press). Item response theory. In R. Millsap & A. Maydeu-Olivares, Handbook of quantitative methods in psychology. London: Sage Publications.

Bock, R.D. (1997). A brief history of item response theory. Educational Measurement: Issues and Practice, 16, 21-33.

van der Linden, W.J. & Hambleton, R.K. (1997). Item response theory: Brief history, common models, and extensions. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 1).

Wainer, H., Bradlow, E.T., & Wang, X. (2007). Testlet response theory and its applications. New York, NY: Cambridge University Press. (Chapters 1-3)

Lazarsfeld, P. F. (1950). The logical and mathematical foundation of latent structure analysis. In S. A. Stouffer, L. Guttman, E. A. Suchman, P. F. Lazarsfeld, S. A. Star, & J. A. Clausen, Measurement and Prediction (pp. 362-412). New York: Wiley.

Lord, F.M. (1952). A theory of test scores. Psychometric Monographs, Whole No. 7.

Lord, F.M. (1953). The relation of test score to the trait underlying the test. Educational and Psychological Measurement, 13, 517-548.

Bock, R.D. (circa 1988). Concepts of behavioral measurement.

Bock, R.D. (circa 1992). Foundations.

January 18

Class-as-lab

Before class, download (and try to install) Multilog, R (if you don't already have it) and the R-IRT graphics functions; links are provided on the page on downloading software.

During class you may share help, and Michelle (one of the authors of the R graphics functions) will demo those and help with questions.

Useful throughout the course:

Yen, W.M. & Fitzpatrick, A.R. (2006). Item response theory. In R.L. Brennan (Ed.) Educational Measurement (Fourth Edition). Westport, CT: Praeger.

January 25

Models for Binary Responses

van der Linden, W.J. & Hambleton, R.K. (1997). Item response theory: Brief history, common models, and extensions. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 1).

Thissen, D., & Orlando, M. (2001). Item response theory for items scored in two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 3)

Hambleton, R.K. & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff.

Hulin, C.L., Drasgow, F., & Parsons, C.K. (1983). Item response theory: Application to psychological measurement. Homewood, IL: Dow-Jones Irwin.

Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F. M. & Novick, M. R. (1968). Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley.

 

February 1

Samejima's Graded Model & Multilog I

Samejima, F. (1997). Graded response model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 5).

Thissen, D., Nelson, L., Rosa, K., & McLeod, L.D. (2001). Item response theory for items scored in more than two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 4)

 

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometric Monograph, No. 17, 34, Part 2.

Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, (Whole No. 140).

Hill, C.D., Edwards, M.C., Thissen, D., Langer, M.M., Wirth, R.J., Burwinkle, T.M., & Varni, J.W. (2007). Practical issues in the application of item response theory: A demonstration using items from the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 Generic Core Scales. Medical Care, 45, S39-47.

February 8

Scoring & Multilog II

(You should have done this already, but, before class, download (and try to install) Multilog and documentation files using links provided on the page on downloading software.)

 

Thissen, D., Pommerich, M., Billeaud, K., & Williams, V.S.L. (1995). Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19, 39-49.

February 15

Estimation Algorithms

Wainer, H. & Mislevy, R.J. (2000). Item response theory, item calibration, and proficiency estimation. In H. Wainer, N.J. Dorans, R. Flaugher, B.F. Green, R.J. Mislevy, L. Steinberg & D. Thissen, Computerized adaptive testing: A Primer (pp. 61-100). Hillsdale, NJ: Erlbaum. (Ch. 4)

 

Orlando, M., & Thissen, D. (2000). Likelihood-based item fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24, 50-64.

Orlando, M. & Thissen, D. (2003). Further investigation of the performance of S-X²: An item fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 27, 289-298.

Cai, L., Maydeu-Olivares, A., Coffman, D.L., & Thissen, D. (2006). Limited information goodness-of-fit testing of item response theory models for sparse 2^p tables. British Journal of Mathematical and Statistical Psychology, 59, 173-194.

 

Download form2_emotion.zip to do the homework!

 

Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of the EM algorithm. Psychometrika, 46, 443-459.

Bock, R.D. & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.

Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 201-214.

Albert, J.H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251-269.

Patz, R.J. & Junker, B.W. (1999a). A straightforward approach to Markov chain Monte Carlo methods for item response theory. Journal of Educational and Behavioral Statistics, 24, 146-178.

Baker, F.B. & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques (2nd Edition, Revised and Expanded). New York: Marcel Dekker.

February 22

Multidimensional Item Response Theory

Edwards, M.C. & Edelen, M.O. (in press). Special topics in item response theory. In R. Millsap & A. Maydeu-Olivares, Handbook of quantitative methods in psychology. London: Sage Publications. (MIRT section)

Wirth, R.J., & Edwards, M.C. (2007). Item Factor Analysis: Current Approaches and Future Directions. Psychological Methods, 12, 58-79.

McLeod, L.D., Swygert, K., & Thissen, D. (2001). Factor analysis for items scored in two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 5)

Swygert, K., McLeod, L.D., & Thissen, D. (2001). Factor analysis for items scored in more than two categories. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 6)

Reckase, M.D. (1997). A linear logistic multidimensional model for dichotomous item response data. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 16).

 

Bock, R.D., Gibbons, R. & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261-280.

Yung, Y.F., McLeod, L.D., & Thissen, D. (1999). On the relationship between the higher-order factor model and the hierarchical factor model. Psychometrika, 64, 113-128.

Steinberg, L. & Jorgensen, R.S. (1996). Assessing the MMPI-based Cook-Medley Hostility Scale: The implications of dimensionality. Journal of Personality and Social Psychology, 70, 1281-1287.

Hill, C.D., Edwards, M.C., Thissen, D., Langer, M.M., Wirth, R.J., Burwinkle, T.M., & Varni, J.W. (2007). Practical issues in the application of item response theory: A demonstration using items from the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 Generic Core Scales. Medical Care, 45, S39-47.

February 29

 

Bock's Nominal Model and Variations & Multilog III

Thissen, D., Cai, L., & Bock, R.D. (in press). The nominal item response model. In M. Nering & R. Ostini (Eds.), Handbook of polytomous item response theory models: Developments and applications.

Bock, R.D. (1997). The nominal categories model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 2).

Andersen, E.B. (1997). The rating scale model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 4).

Masters, G.N., & Wright, B.D. (1997). The partial credit model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 6).

Muraki, E. (1997). A generalized partial credit model. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch 9).

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29-51.

Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Masters, G.N., & Wright, B.D. (1984). The essential process in a family of measurement models. Psychometrika, 49, 529-544.

Thissen, D. & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577.

Thissen, D. & Steinberg, L. (1988). Data analysis using item response theory. Psychological Bulletin, 104, 385-395.

 

March 7

Local Dependence and Testlets

Thissen, D. & Steinberg, L. (in press). Using item response theory to disentangle constructs at different levels of generality. In S. Embretson & J. Roberts (Eds.), New directions in psychological measurement with model-based approaches.

Wainer, H., Bradlow, E.T., & Wang, X. (2007). Testlet response theory and its applications. New York, NY: Cambridge University Press. (Chapters 4-12)

Steinberg, L. & Thissen, D. (1996). Uses of item response theory and the testlet concept in the measurement of psychopathology. Psychological Methods, 1, 81-97.

 

Hoskens, M. & De Boeck, P. (1997). A parametric model for local dependence among test items. Psychological Methods, 2, 261-277.

Chen, W.H. & Thissen, D. (1997). Local dependence indices for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.

Wang, X., Bradlow, E. T., & Wainer, H. (2004). A user's guide for SCORIGHT (Version 3.0): A computer program for scoring tests built of testlets including a module for covariate analysis (Research Report RR 04-49). Princeton, NJ: Educational Testing Service.

Thissen, D., Steinberg, L. & Mooney, J.A. (1989). Trace lines for testlets: A use of multiple-categorical-response models. Journal of Educational Measurement, 26, 247-260.

 

March 21

"Holiday"! No class

 

March 28

Developmental Scales and Differential Item Functioning

Williams, V.S.L., Pommerich, M., & Thissen, D. (1998). A comparison of developmental scales based on Thurstone methods and item response theory. Journal of Educational Measurement, 35, 93-107.

Edwards, M.C. & Edelen, M.O. (in press). Special topics in item response theory. In R. Millsap & A. Maydeu-Olivares, Handbook of quantitative methods in psychology. London: Sage Publications. (DIF section)

Steinberg, L., & Thissen, D. (2006). Using effect sizes for research reporting: Examples using item response theory to analyze differential item functioning. Psychological Methods, 11, 402-415.

Bock, R.D. & Zimowski, M.F. (1997). Multiple Group IRT. In W.J. van der Linden & Ronald K. Hambleton (Eds), Handbook of item response theory. New York: Springer-Verlag, (Ch. 25).

Thissen, D., Steinberg, L. & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P.W. Holland & H. Wainer (Eds.), Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum Associates, 67-113.

Camilli, G. & Shepard, L.A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.

Steinberg, L. (1994). Context and serial-order effects in personality measurement: Limits on the generality of measuring changes the measure. Journal of Personality and Social Psychology, 66, 341-349.

Steinberg, L. (2001). The consequences of pairing questions: Context effects in personality measurement. Journal of Personality and Social Psychology, 81, 332-342.

Edelen, M.O., Thissen, D., Teresi, J.A., Kleinman, M., & Ocepek-Welikson, K. (2006). Identification of differential item functioning using item response theory and the likelihood-based model comparison approach: application to the Mini-Mental Status Examination. Medical Care, 44, S134-142.

Langer, M.M., Hill, C.D., Thissen, D., Burwinkle, T.M., Varni, J.W., & DeWalt, D.A. (in press). Item response theory detects differential item functioning between healthy and ill children in QoL measures. Journal of Clinical Epidemiology.

April 4

Score Combination & Sub-scores

Rosa, K., Swygert, K., Nelson, L., & Thissen, D. (2001). Item response theory applied to combinations of multiple-choice and constructed-response items-scale scores for patterns of summed scores. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 7)

Thissen, D., Nelson, L., & Swygert, K. (2001). Item response theory applied to combinations of multiple-choice and constructed-response items-approximation methods for scale scores. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 8)

Wainer, H., Vevea, J.L., Camacho, F., Reeve, B.B., Rosa, K., Nelson, L., Swygert, K., & Thissen, D. (2001). Augmented scores---"Borrowing strength" to compute scores based on small numbers of items. In D. Thissen & H. Wainer (Eds), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 9)


Thissen, D., Wainer, H., & Wang, X.B. (1994). Are tests comprising both multiple-choice and free-response items necessarily less unidimensional than multiple-choice tests? An analysis of two tests. Journal of Educational Measurement, 31, 113-123.

Lukhele, R., Thissen, D., & Wainer, H. (1994). On the relative value of multiple-choice, constructed-response, and examinee-selected items on two achievement tests. Journal of Educational Measurement, 31, 234-250.

 

Flora, D.B., & Thissen, D. (2002). User's guide for IRTSCORE: Item response theory score approximation software. Electronic Research Memorandum #2002-1. Chapel Hill, NC: University of North Carolina, L.L. Thurstone Psychometric Laboratory.

Obtain IRTScore and its documentation from links in the PlotIRT section of this page.

 

April 11

CAT

Edwards, M.C. & Edelen, M.O. (in press). Special topics in item response theory. In R. Millsap & A. Maydeu-Olivares, Handbook of quantitative methods in psychology. London: Sage Publications. (CAT section)

 

Thissen, D., Reeve, B.B., Bjorner, J.B., & Chang, C.-H. (2007). Methodological issues for building item banks and computerized adaptive scales. Quality of Life Research, 16, 109-116.

Bjorner, J.B., Chang, C.-H., Thissen, D., & Reeve, B.B. (2007). Developing tailored instruments: Item banking and computerized adaptive assessment. Quality of Life Research, 16, 95-108.

 

April 18

April 25

Your presentations

 

*Full-length books are not available electronically. Libraries and (used) bookstores are good sources for such documents.

Requirements, grading, and stuff: There will be no tests. There will be homework assignments; most (or all) of these will involve data analysis using computers and a 2-4 page written report (typed and printed, please, thank you). The report must present the results in readable English, as well as numbers and (maybe) graphics. These homework assignments will be handed out in class every couple of weeks, and will be part of the basis for your grade. A reasonably substantial paper (e.g., 10-20 pages) on some topic in IRT, or describing the (hypothetical or real) item analysis and/or construction of an instrument for some sort of psychological measurement, will be a major part of your grade; this paper will be due April 25. These papers/projects may be done individually or in pairs, on topics of your choosing, with brief oral presentations on April 18 or 25. We will discuss this aspect of the course in more detail in February.

Class participation: For each week, the readings listed above will serve as the topical focus. Everybody is to read the readings during the week (before class), and write (type) two questions that can be the focus of clarifying discussion during class. Those questions are to be emailed to me (dthissen@email.unc.edu), or handed in (to my mailbox), by noon on the Thursday preceding each class. During class on each Friday morning, we will (jointly) do our best to deal with the questions, either through discussion or additional reading material.