David Thissen

Publications on Testing and Measurement

Liu, Y., Magnus, B., & Thissen, D. (2016). Modeling and testing differential item functioning in unidimensional binary item response models with a single continuous covariate: A functional data analysis approach. Psychometrika, 81, 371-398.

Reeve, B.B., Thissen, D., DeWalt, D.A., Huang, I-C., Liu, Y., Magnus, B., Quinn, H., Gross, H.E., Kisala, P.A., Ni, P., Haley, S.M., Mulcahey, M.J., Charlifue, S., Hanks, R., Slavin, M., Jette, A.M., & Tulsky, D.S. (2016). Linkage between the PROMIS pediatric and adult emotional distress measures. Quality of Life Research, 25, 823-833.

Tassé, M.J., Schalock, R.L., Thissen, D., Balboni, G., Bersani, H., Borthwick-Duffy, S.A., Spreat, S., Widaman, K.F., Zhang, D., & Navas, P. (2016). Development and standardization of the Diagnostic Adaptive Behavior Scale: Application of item response theory to the assessment of adaptive behavior. American Journal on Intellectual and Developmental Disabilities, 121, 79-94.

Thissen, D. (2016). Bad questions: An essay involving item response theory. Journal of Educational and Behavioral Statistics, 41, 81-89.

Thissen, D. (2016). Rejoinder to Commentaries by Ackerman, Ho, and Wainer on “Bad Questions”. Journal of Educational and Behavioral Statistics, 41, 104-108.

Thissen, D., Liu, Y., Magnus, B., Quinn, H., Gipson, D.S., Dampier, C., Huang, I-C., Hinds, P.S., Selewski, D.T., Reeve, B.B., Gross, H.E., and DeWalt, D.A. (2016). Estimating minimally important difference (MID) in PROMIS pediatric measures using the scale-judgment method. Quality of Life Research, 25, 13-23.

DeWalt, D.A., Gross, H.E., Gipson, D.S., Selewski, D., DeWitt, E.M., Dampier, C.D., Hinds, P.S., Huang, I., Thissen, D., & Varni, J.W. (2015). PROMIS® pediatric self report scales distinguish subgroups of children within and across six common pediatric chronic health conditions. Quality of Life Research, 24, 2195-2208.

Varni, J.W., Thissen, D., Stucky, B.D., Liu, Y., Magnus, B., He, J., DeWitt, E.M., Irwin, D.E., Lai, J.-S., Amtmann, D., & DeWalt, D.A. (2015). Item-level informant discrepancies between children and their parents on the PROMIS® Pediatric Scales. Quality of Life Research, 24, 1921-1937.

Thissen, D., Liu, Y., Magnus, B., & Quinn, H. (2015). Extending the Use of Multidimensional IRT Calibration as Projection: Many-to-One Linking and Linear Computation of Projected Scores. In L. A. van der Ark, D. M. Bolt, W-C. Wang, J. A. Douglas, & S-M. Chow (Eds.), Quantitative Psychology Research: The 79th Annual Meeting of the Psychometric Society, Madison, Wisconsin, 2014 (pp. 1-16). New York, NY: Springer.

Thissen, D. (2015). Growth through levels. Measurement: Interdisciplinary Research and Perspectives, 13, 128-131.

Thissen, D. (2015). Psychometrics: Item response theory. In J. Wright (Ed.), International Encyclopedia of the Social and Behavioral Sciences, 2nd edition, Volume 19 (pp. 436-439). Oxford: Elsevier.

Thissen, D. (2015). Failing tests: Commentary on “Adapting educational measurement to the demands of test-based accountability”. Measurement, 13, 49-52.

Cai, L., & Thissen, D. (2015). Modern approaches to parameter estimation in item response theory. In S.P. Reise & D.A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 41-59). New York: Taylor & Francis (Routledge).

Liu, Y., & Thissen, D. (2014). Comparing score tests and other local dependence diagnostics for the graded response model. British Journal of Mathematical and Statistical Psychology, 67, 496-513.

Quinn, H., Thissen, D., Liu, Y., Magnus, B., Lai, J-S., Amtmann, D., Varni, J.W., Gross, H.E., & DeWalt, D.A. (2014). Using item response theory to enrich and expand the PROMIS® Pediatric Self Report Banks. Health and Quality of Life Outcomes, 12:160.

Balboni, G., Tassé, M.J., Schalock, R.L., Borthwick-Duffy, S.A., Spreat, S., Thissen, D.M., Widaman, K.F., Zhang, D., & Navas, P. (2014). The Diagnostic Adaptive Behavior Scale: Evaluating its diagnostic sensitivity and specificity. Research in Developmental Disabilities, 35, 2884-2893.

Varni, J.W., Magnus, B., Stucky, B.D., Liu, Y., Quinn, H., Thissen, D., Gross, H.E., Huang, I., & DeWalt, D.A. (2014). Psychometric Properties of the PROMIS® Pediatric Scales: Precision, Stability, and Comparison of Different Scoring and Administration Options. Quality of Life Research, 23, 1233-1243.

Varni, J.W., Thissen, D., Stucky, B.D., Liu, Y., Magnus, B., Quinn, H., Irwin, D.E., DeWitt, E.M., Lai, J-S., Amtmann, D., Gross, H.E., & DeWalt, D.A. (2014). PROMIS® parent proxy report scales for children ages 5-7 years: An item response theory analysis of differential item functioning across age groups. Quality of Life Research, 23, 349-361.

Thissen, D. (2013). Using the testlet response model as a shortcut to multidimensional item response theory subscore computation. In R.E. Millsap, L.A. van der Ark, D.M. Bolt, & C.M. Woods (Eds.) New developments in quantitative psychology—Presentations from the 77th Annual Psychometric Society Meeting (Pp. 29-40). New York, NY: Springer.

Lai, J.S., Stucky, B.D., Thissen, D., Varni, J.W., DeWitt, E.M., Irwin, D.E., Yeatts, K.B., DeWalt, D.A. (2013). Development and psychometric properties of the PROMIS® Pediatric Fatigue item banks. Quality of Life Research, 22, 2417-2427.

DeWalt, D.A., Thissen, D., Stucky, B.D., Langer, M.M., DeWitt, E.M., Irwin, D.E., Lai, J.S., Yeatts, K.B., Gross, H.E., Taylor, O., Varni, J.W. (2013). PROMIS Pediatric Peer Relationships Scale: Development of a peer relationships item bank as part of social health measurement. Health Psychology, 32, 1093-1103.

Thissen, D., & Norton, S. (2013). What might changes in psychometric approaches to statewide testing mean for NAEP? In F.B. Stancavage & G.W. Bohrnstedt (Eds.), Examining the content and context of the Common Core State Standards: A first look at implications for the National Assessment of Educational Progress. San Mateo, CA: American Institutes for Research, NAEP Validity Studies Panel.

Thissen, D. (2013). The meaning of goodness of fit tests: Commentary on “Goodness-of-fit assessment of item response theory models”. Measurement, 11, 123-126.

Thissen-Roe, A., & Thissen, D. (2013). A two-decision model for responses to Likert-type items. Journal of Educational and Behavioral Statistics, 38, 522-547.

Tian, W., Cai, L., Thissen, D., Xin, T. (2013). Numerical differentiation methods for computing error covariance matrices in item response theory modeling: An evaluation and a new proposal. Educational and Psychological Measurement, 73, 412-439.

Steinberg, L., & Thissen, D. (2013). Item response theory. In J. Comer & P. Kendall (Eds.), The Oxford handbook of research strategies for clinical psychology (Pp. 336-373). New York, NY: Oxford University Press.

Gipson, D.S., Selewski, D.T., Massengill, S.F., Wickman, L., Messer, K.L., Herreshoff, E., Corinna, B., Ferris, M.F., Mahan, J.D., Greenbaum, L.A., MacHardy, J., Kapur, G., Chand, D.H., Goebel, J., Baletta, G.M., Geary, D., Kershaw, D.B., Pan, C.G., Gbadegesin, R., Hidalgo, G., Lane, J.C., Leiser, J.D., Plattner, B.W., Song, P.X., Thissen, D., Liu, Y., Gross, H.M., & DeWalt, D.A. (2013). Gaining the PROMIS perspective from children with nephrotic syndrome: a Midwest Pediatric Nephrology Consortium study. Health and Quality of Life Outcomes, 11, 1-30.

Selewski, D.T., Collier, D.N., MacHardy, J., Gross, H.E., Pickens, E.M., Cooper, A.W., Bullock, S., Earls, M.F., Pratt, K.J., Scanlon, K., McNeill, J.D., Messer, K.L., Lu, Y., Thissen, D., DeWalt, D.A., & Gipson, D.S. (2013). Promising insights into the health related quality of life for children with severe obesity. Health and Quality of Life Outcomes, 11, 1-10.

Stucky, B.D., Thissen, D., & Edelen, M.O. (2013). Using logistic approximations of marginal trace lines to develop short assessments. Applied Psychological Measurement, 37, 41-57.

Liu, Y., & Thissen, D. (2012). Identifying local dependence with a score test statistic based on the bifactor logistic model. Applied Psychological Measurement, 36, 670-688.

Varni, J.W., Thissen, D., Stucky, B.D., Liu, Y., Gorder, H., Irwin, D.E., DeWitt, E.M., Lai, J-S, Amtmann, D., & DeWalt, D.A. (2012). PROMIS® parent proxy report scales: An item response theory analysis of the parent proxy report item banks. Quality of Life Research, 21, 1223-1240.

Irwin, D.E., Stucky, B.D., Langer, M.M., Thissen, D., DeWitt, E.M., Lai, J-S, Yeatts, K.B., Varni, J.W., and DeWalt, D.A. (2012). PROMIS Pediatric Anger Scale: An item response theory analysis. Quality of Life Research, 21, 697-706.

Tassé, M.J., Schalock, R.L., Balboni, G., Bersani, Jr., H., Borthwick-Duffy, S.A., Spreat, S., Thissen, D., Widaman, K.F., Zhang, D. (2012). The construct of adaptive behavior: Its conceptualization, measurement, and use in the field of intellectual disability. American Journal on Intellectual and Developmental Disabilities, 117, 291-303.

Thissen, D. (2012). On the National Assessment of Educational Progress, with emphasis on validity issues. Examinations Research, 2, General No. 31, 66-76.

Irwin, D.E., Gross, H.E., Stucky, B.D., Thissen, D., DeWitt, E.M., Lai, J.S., Amtmann, D., Khastou, L., Varni, J., & DeWalt, D.A. (2012). Development of six PROMIS Pediatrics parent proxy-report item banks. Health and Quality of Life Outcomes, 10:22.

Thissen, D. (2012). Validity issues involved in cross-grade statements about NAEP results. Washington, DC: American Institutes for Research, NAEP Validity Studies Panel.

Edwards, M.C., Flora, D.B., & Thissen, D. (2012). Multi-stage computerized adaptive testing with uniform item exposure. Applied Measurement in Education, 25, 118-141.

Carle, A. C., Cella, D., Cai, L., Choi, S. W., Crane, P. K., Curtis, S. M., Gruhl, J., Lai, J., Mukherjee, S., Reise, S., Teresi, J., Thissen, D., Wu, E. J., & Hays, R. (2011). Advancing PROMIS's methodology: Results of the third PROMIS Psychometric Summit. Expert Review of Pharmacoeconomics & Outcomes Research, 11, 677-684.

DeWitt, E.M., Stucky, B.D., Thissen, D., Irwin, D.E., Langer, M., Varni, J.W., Lai, J-S, Yeatts, K.B., DeWalt, D.A. (2011). Construction of the eight item PROMIS Pediatric Physical Function Scales: Built using item response theory. Journal of Clinical Epidemiology, 64, 794-804.

Thissen, D., Varni, J.W., Stucky, B.D., Liu, Y., Irwin, D.E., & DeWalt, D.A. (2011). Using the PedsQL™ 3.0 Asthma Module to obtain scores comparable with those of the PROMIS Pediatric Asthma Impact Scale (PAIS). Quality of Life Research, 20, 1497-1505.

Varni, J., Stucky, B.D., Thissen, D., DeWitt, E.M., Irwin, D., Lai, J.S., Yeatts, K., & DeWalt, D.A. (2010). PROMIS Pediatric Pain Interference Scale: An Item Response Theory Analysis of the Pediatric Pain Item Bank. Journal of Pain, 11, 1109-1119.

Irwin, D., Stucky, B.D., Thissen, D., DeWitt, E.M., Lai, J.S., Yeatts, K., Varni, J., & DeWalt, D.A. (2010). An item response analysis of the Pediatric PROMIS Anxiety and Depressive Symptoms Scales. Quality of Life Research, 19, 595-607.

Irwin, D.E., Stucky, B.D., Thissen, D., DeWitt, E.M., Lai, J.S., Yeatts, K., Varni, J.W., DeWalt, D.A. (2010). Sampling plan and patient characteristics of the PROMIS pediatrics large-scale survey. Quality of Life Research, 19, 585-594.

Yeatts, K., Stucky, B.D., Thissen, D., Irwin, D., Varni, J., DeWitt, E.M., Lai, J.S., & DeWalt, D.A. (2010). Construction of the Pediatric Asthma Impact Scale (PAIS) for the Patient Reported Outcomes Measurement Information System (PROMIS). Journal of Asthma, 47, 295-302.

Thissen, D., Cai, L., & Bock, R.D. (2010). The nominal categories item response model. In M. Nering & R. Ostini (Eds.), Handbook of polytomous item response theory models (Pp. 43-75). New York, NY: Routledge.

Thissen, D. & Steinberg, L. (2010). Using item response theory to disentangle constructs at different levels of generality. In S. Embretson (Ed.), Measuring psychological constructs: Advances in model-based approaches (Pp. 123-144). Washington, DC: American Psychological Association.

Thissen, D. (2009). The MEDPRO project: An SBIR project for a comprehensive IRT and CAT software system—IRT software. In D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. At www.psych.umn.edu/psylabs/CATCentral/.

Linn, R.L., McLaughlin, D., & Thissen, D. (2009). Utility and validity of NAEP linking efforts. Washington, DC: American Institutes for Research, NAEP Validity Studies Panel.

Thissen, D. (2009). On interpreting the parameters for any item response model. Measurement: Interdisciplinary Research & Perspectives, 7, 104-108.

Thissen, D. & Steinberg, L. (2009). Item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (Pp. 148-177). London: Sage Publications.

Langer, M.M., Hill, C.D., Thissen, D., Burwinkle, T.M., Varni, J.W., & DeWalt, D.A. (2008). Item response theory detects differential item functioning between healthy and ill children in QoL measures. Journal of Clinical Epidemiology, 61, 268-276.

Edwards, M. C. & Thissen, D. (2007). Exploring potential designs for multi-form structure computerized adaptive tests with uniform item exposure. In D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. At www.psych.umn.edu/psylabs/CATCentral/.

Bjorner, J.B., Chang, C.-H., Thissen, D., & Reeve, B.B. (2007). Developing tailored instruments: item banking and computerized adaptive assessment. Quality of Life Research, 16, 95-108.

Thissen, D., Reeve, B.B., Bjorner, J.B., & Chang, C.-H. (2007). Methodological issues for building item banks and computerized adaptive scales. Quality of Life Research, 16, 109-116.

Thissen, D. (2007). Linking assessments based on aggregate reporting: Background and issues. In N.J. Dorans, M. Pommerich, & P.W. Holland (Eds.) Linking and aligning scores and scales (Pp. 287-312). New York, NY: Springer.

Hill, C.D., Edwards, M.C., Thissen, D., Langer, M.M., Wirth, R.J., Burwinkle, T.M., & Varni, J.W. (2007). Practical issues in the application of item response theory: A demonstration using items from the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 Generic Core Scales. Medical Care, 45, S39-47.

Reeve, B.B., Hays, R.D., Bjorner, J.B., Cook, K.F., Crane, P.K., Teresi, J.A., Thissen, D., Revicki, D.A., Weiss, D.J., Hambleton, R.K., Liu, H., Gershon, R., Reise, S.P., & Cella, D. (2007). Psychometric evaluation and calibration of health-related quality of life item banks: Plans for the patient-reported outcome measurement information system (PROMIS). Medical Care, 45, S22-31.

Jones, L.V. & Thissen, D. (2007). A history and overview of psychometrics. In C.R. Rao and S. Sinharay (Eds.), Handbook of Statistics, 26: Psychometrics (Pp. 1-27). Amsterdam: North Holland.

Edelen, M.O., Thissen, D., Teresi, J.A., Kleinman, M., & Ocepek-Welikson, K. (2006). Identification of differential item functioning using item response theory and the likelihood-based model comparison approach: application to the Mini-Mental Status Examination. Medical Care, 44, S134-142.

Steinberg, L., & Thissen, D. (2006). Using Effect Sizes for Research Reporting: Examples using Item Response Theory to Analyze Differential Item Functioning. Psychological Methods, 11, 402-415.

Cai, L., Maydeu-Olivares, A., Coffman, D.L., & Thissen, D. (2006). Limited information goodness-of-fit testing of item response theory models for sparse 2^p tables. British Journal of Mathematical and Statistical Psychology, 59, 173-194.

Woods, C.M. & Thissen, D. (2006). Item response theory with estimation of the latent population distribution using spline-based densities. Psychometrika, 71, 281-301.

Bethke, A., Hill, C., McLeod, L., VanDyk, P., Zhao, L., Zhou, X., & Thissen, D. (2004). North Carolina Computerized Adaptive Testing System: 2003 comparability study results. Research Triangle Park, NC: RTI International.

Rodebaugh, T.L., Woods, C.M., Thissen, D., Heimberg, R.G., Chambless, D.L., & Rapee, R.M. (2004). More information from fewer questions: The factor structure and item properties of the original and brief fear of negative evaluation scale. Psychological Assessment, 16, 169-181.

Orlando, M. & Thissen, D. (2003). Further investigation of the performance of S-X²: An item fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 27, 289-298.

McLeod, L., Lewis, C., & Thissen, D. (2003). A Bayesian method for the detection of item preknowledge in computerized adaptive testing. Applied Psychological Measurement, 27, 121-137.

Thissen, D. (2003). Psychometric engineering as art: Variations on a theme. In H. Yanai, A. Okada, K. Shigemasu, Y. Kano, & J.J. Meulman (Eds), New developments in psychometrics: Proceedings of the International Meeting of the Psychometric Society IMPS 2001 (Pp. 3-18). Tokyo: Springer-Verlag.

Vevea, J.L., Edwards, M.C., Thissen, D., Reeve, B.B., Flora, D.B., Sathy, V., & Coon, C. (2002). User's guide for Augment v.2: Empirical Bayes Subscore Augmentation Software. Electronic Research Memorandum #2002-2. Chapel Hill, NC: University of North Carolina, L.L. Thurstone Psychometric Laboratory.

Flora, D.B., & Thissen, D. (2002). User's guide for IRTScore: Item response theory score approximation Software. Electronic Research Memorandum #2002-1. Chapel Hill, NC: University of North Carolina, L.L. Thurstone Psychometric Laboratory.

Thissen, D. (2001). Psychometric engineering as art. Psychometrika, 66, 473-486.

Thissen, D. & Wainer, H. (Eds) (2001). Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates.

Thissen, D. & Wainer, H. (2001). Overview of Test Scoring. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 1-19). Hillsdale, NJ: Lawrence Erlbaum Associates.

Wainer, H. & Thissen, D. (2001). True score theory: The traditional method. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 23-72). Hillsdale, NJ: Lawrence Erlbaum Associates.

Thissen, D., & Orlando, M. (2001). Item response theory for items scored in two categories. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 73-140). Hillsdale, NJ: Lawrence Erlbaum Associates.

Thissen, D., Nelson, L., Rosa, K., & McLeod, L.D. (2001). Item response theory for items scored in more than two categories. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 141-186). Hillsdale, NJ: Lawrence Erlbaum Associates.

McLeod, L.D., Swygert, K.A., & Thissen, D. (2001). Factor analysis for items scored in two categories. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 189-216). Hillsdale, NJ: Lawrence Erlbaum Associates.

Swygert, K.A., McLeod, L.D., & Thissen, D. (2001). Factor analysis for items scored in more than two categories. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 217-250). Hillsdale, NJ: Lawrence Erlbaum Associates.

Rosa, K., Swygert, K.A., Nelson, L., & Thissen, D. (2001). Item response theory applied to combinations of multiple-choice and constructed-response items---scale scores for patterns of summed scores. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 253-292). Hillsdale, NJ: Lawrence Erlbaum Associates.

Thissen, D., Nelson, L., & Swygert, K.A. (2001). Item response theory applied to combinations of multiple-choice and constructed-response items---approximation methods for scale scores. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 293-341). Hillsdale, NJ: Lawrence Erlbaum Associates.

Wainer, H., Vevea, J.L., Camacho, F., Reeve, B.B., Rosa, K., Nelson, L., Swygert, K.A., & Thissen, D. (2001). Augmented scores---"borrowing strength" to compute scores based on small numbers of items. In D. Thissen & H. Wainer (Eds), Test Scoring (Pp. 343-387). Hillsdale, NJ: Lawrence Erlbaum Associates.

Orlando, M., Sherbourne, C.D., & Thissen, D. (2000). Summed-score linking using item response theory: Application to depression measurement. Psychological Assessment, 12, 354-359.

Thissen, D. & Mislevy, R.J. (2000). Testing algorithms. In H. Wainer, N. Dorans, D. Eignor, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen (Eds.), Computerized adaptive testing: A primer (Second Edition). Hillsdale, NJ: Lawrence Erlbaum Associates, 101-133.

Thissen, D. (2000). Reliability and measurement precision. In H. Wainer, N. Dorans, D. Eignor, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen (Eds.), Computerized adaptive testing: A primer (Second Edition). Hillsdale, NJ: Lawrence Erlbaum Associates, 159-184.

Steinberg, L., Thissen, D. & Wainer, H. (2000). Validity. In H. Wainer, N. Dorans, D. Eignor, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen (Eds.), Computerized adaptive testing: A primer (Second Edition). Hillsdale, NJ: Lawrence Erlbaum Associates, 185-229.

Wainer, H., Dorans, N., Green, B., Mislevy, R.J., Steinberg, L. & Thissen, D. (2000). Future challenges. In H. Wainer, N. Dorans, D. Eignor, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen (Eds.), Computerized adaptive testing: A primer (Second Edition). Hillsdale, NJ: Lawrence Erlbaum Associates, 231-270.

Orlando, M., & Thissen, D. (2000). New item fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24, 50-64.

Yung, Y.F., McLeod, L.D., & Thissen, D. (1999). On the relationship between the higher-order factor model and the hierarchical factor model. Psychometrika, 64, 113-128.

Chen, W.H., & Thissen, D. (1999). Estimation of Item Parameters for The Three-Parameter Logistic Model Using The Marginal Likelihood of Summed Scores. British Journal of Mathematical and Statistical Psychology, 52, 19-37.

Thissen, D., Nelson, L., Billeaud, K., & McLeod, L. (1998). A brief introduction to item response theory for items scored in more than two categories. In Bourque, M.L. (Ed.), Proceedings of achievement levels workshop (Pp. 47-61). Washington, DC: National Assessment Governing Board.

Billeaud, K., Swygert, K., Nelson, L., & Thissen, D. (1998). Some ideas about item response theory applied to combinations of multiple-choice and constructed-response items---Scale scores for patterns of summed scores. In Bourque, M.L. (Ed.), Proceedings of achievement levels workshop (Pp. 65-76). Washington, DC: National Assessment Governing Board.

Williams, V.S.L., Billeaud, K., Davis, L.A., Thissen, D., & Sanford, E. (1998). Projecting to the NAEP scale: Results from the North Carolina End-of-Grade testing program. Journal of Educational Measurement, 35, 277-296.

Williams, V.S.L., Pommerich, M., & Thissen, D. (1998). A comparison of developmental scales based on Thurstone methods and item response theory. Journal of Educational Measurement, 35, 93-107.

Bock, R.D., Thissen, D., & Zimowski, M.F. (1997). IRT estimation of domain scores. Journal of Educational Measurement, 34, 197-211.

Chen, W.H. & Thissen, D. (1997). Local dependence indices for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.

Thissen, D. & Steinberg, L. (1997). A response model for multiple choice items. In W.J. van der Linden & R.K. Hambleton (Eds), Handbook of item response theory (Pp. 51-65). New York: Springer-Verlag.

Steinberg, L. & Thissen, D. (1996). Uses of item response theory and the testlet concept in the measurement of psychopathology. Psychological Methods, 1, 81-97.

Wainer, H. & Thissen, D. (1996). How is reliability related to the quality of test scores? What is the effect of local dependence on reliability? Educational Measurement: Issues and Practice, 15, 22-29.

Thissen, D., Pommerich, M., Billeaud, K., & Williams, V.S.L. (1995). Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19, 39-49.

Wang, X.B., Wainer, H., & Thissen, D. (1995). On the viability of some untestable assumptions in equating exams that allow examinee choice. Applied Measurement in Education, 8, 211-225.

Steinberg, L. & Thissen, D. (1995). Item response theory in personality research. In P. Shrout & S. Fiske (Eds.), Personality research, methods & theory: A Festschrift honoring Donald W. Fiske. Hillsdale, NJ: Lawrence Erlbaum Associates, 161-181.

Wainer, H., Wang, X.B., & Thissen, D. (1994). How well can we compare scores on test forms that are constructed by examinees' choice? Journal of Educational Measurement, 31, 183-199.

Lukhele, R., Thissen, D., & Wainer, H. (1994). On the relative value of multiple-choice, constructed-response, and examinee-selected items on two achievement tests. Journal of Educational Measurement, 31, 234-250.

Thissen, D., Wainer, H., & Wang, X.B. (1994). Are tests comprising both multiple-choice and free-response items necessarily less unidimensional than multiple-choice tests? An analysis of two tests. Journal of Educational Measurement, 31, 113-123.

Wainer, H., & Thissen, D. (1994). On examinee choice in educational testing. Review of Educational Research, 64, 159-195.

Pommerich, M., Billeaud, K., Williams, V.S.L., & Thissen, D. (1993). User's guide for the North Carolina End of Grade Tests. Raleigh, NC: North Carolina Department of Public Instruction.

Wainer, H. & Thissen, D. (1993). Combining multiple-choice and constructed response test scores: Toward a Marxist theory of test construction. Applied Measurement in Education, 6, 103-118.

Thissen, D. (1993). Repealing rules that no longer apply to psychological measurement. In N. Frederiksen, R.J. Mislevy & I. Bejar (Eds.), Test theory for a new generation of tests. Hillsdale, NJ: Lawrence Erlbaum Associates, 79-97.

Thissen, D., Steinberg, L. & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P.W. Holland & H. Wainer (Eds.), Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum Associates, 67-113.

Sireci, S.G., Thissen, D. & Wainer, H. (1991). On the reliability of testlet-based tests. Journal of Educational Measurement, 28, 237-247.

Wainer, H., Sireci, S.G. & Thissen, D. (1991). DIFferential testlet functioning: Definitions and detection. Journal of Educational Measurement, 28, 197-219.

Thissen, D. & Wainer, H. (1990). Confidence envelopes for item response theory. Journal of Educational Statistics, 15, 113-128.

Thissen, D. & Mislevy, R.J. (1990). Testing algorithms. In H. Wainer, N. Dorans, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen, Computerized adaptive testing: A primer. Hillsdale, NJ: Lawrence Erlbaum Associates, 103-135.

Thissen, D. (1990). Reliability and measurement precision. In H. Wainer, N. Dorans, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen, Computerized adaptive testing: A primer. Hillsdale, NJ: Lawrence Erlbaum Associates, 161-186.

Steinberg, L., Thissen, D. & Wainer, H. (1990). Validity. In H. Wainer, N. Dorans, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen, Computerized adaptive testing: A primer. Hillsdale, NJ: Lawrence Erlbaum Associates, 187-231.

Wainer, H., Dorans, N., Green, B., Mislevy, R.J., Steinberg, L. & Thissen, D. (1990). Future challenges. In H. Wainer, N. Dorans, R. Flaugher, B. Green, R. Mislevy, L. Steinberg & D. Thissen, Computerized adaptive testing: A primer. Hillsdale, NJ: Lawrence Erlbaum Associates, 233-272.

Thissen, D. & Mooney, J.A. (1989). Loglinear item response models, with applications to data from social surveys. Sociological Methodology 1989, 299-330.

Thissen, D., Steinberg, L. & Mooney, J.A. (1989). Trace lines for testlets: A use of multiple-categorical-response models. Journal of Educational Measurement, 26, 247-260.

Thissen, D., Steinberg, L. & Fitzpatrick, A.R. (1989). Multiple choice models: The distractors are also part of the item. Journal of Educational Measurement, 26, 161-176.

Thissen, D. & Steinberg, L. (1988). Data analysis using item response theory. Psychological Bulletin, 104, 385-395.

Thissen, D., Steinberg, L. & Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In H. Wainer & H. Braun (Eds.), Test Validity. Hillsdale, NJ: Erlbaum, pp. 147-169.

Wainer, H. & Thissen, D. (1987). Estimating ability with the wrong model. Journal of Educational Statistics, 12, 339-368.

Thissen, D. & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577.

Thissen, D. (1986). Measurement precision and "reliability": Some considerations of metrics and stopping rules in CAT. Proceedings of the 27th Annual Conference of the Military Testing Association. San Diego: NPRDC.

Thissen, D., Steinberg, L. & Gerrard, M. (1986). Beyond group mean differences: The concept of item bias. Psychological Bulletin, 99, 118-128.

Thissen, D. & Steinberg, L. (1984). A response model for multiple choice items. Psychometrika, 49, 501-519.

Thissen, D. & Wainer, H. (1983). Toward the measurement and prediction of victim proneness. Journal of Research in Crime and Delinquency, 20, 243-261.

Thissen, D. (1983). Timed testing: An approach using item response theory. In D. Weiss (Ed.), New Horizons in Testing: Latent Trait Test Theory and Computerized Adaptive Testing. N.Y.: Academic Press, pp. 179-203.

Thissen, D., Steinberg, L., Pyszczynski, T. & Greenberg, J. (1983). An item response theory for personality and attitude scales: Item analysis using restricted factor analysis. Applied Psychological Measurement, 7, 211-226.

Thissen, D. & Wainer, H. (1982). Some standard errors in item response theory. Psychometrika, 47, 397-412.

Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186.

Thissen, D. (1976). Information in wrong responses to the Raven Progressive Matrices. Journal of Educational Measurement, 13, 201-214.


Last modified 6/17/2016