Innovations In Clinical Neuroscience

NOV-DEC 2017

A peer-reviewed, evidence-based journal for clinicians in the field of neuroscience

Issue link: https://innovationscns.epubxp.com/i/924986

Volume 14, Number 11–12

ORIGINAL RESEARCH

Item response theory (IRT) analyses of the PANSS have been used to identify how different PANSS items measure symptom severity within specific symptom dimensions.3–6 Most of these studies focused on baseline data, and it remains unclear whether end-point or post-treatment data are similar or different in structure. In the first IRT analysis of the PANSS, Santor and colleagues4 used nonparametric IRT to analyze baseline PANSS data from 9,205 patients with schizophrenia, schizoaffective disorder, or schizophreniform disorder who were enrolled in either observational studies or clinical trials. Also using nonparametric models in a follow-up study, Khan and colleagues3 analyzed baseline PANSS scores from 7,187 patients. Levine et al5 used IRT to assess the consistency of the PANSS scale using the same dataset as the original Marder analysis, with a parametric graded response model. Levine did not rank items or propose subsets of items for removal. Weak PANSS items might be sample-dependent and could vary across country and stage of illness,7,8 with characteristic changes in IRT models seen between active and placebo interventions.9 Moreover, the methodological approach of ranking items within a factor domain dismisses the use of the PANSS as a unidimensional measurement when item scores are summed across all domains, implying that item quality in these previous analyses is judged with respect to each subdomain rather than the entire PANSS scale.
Although PANSS data have been the focus of many factor analytic studies in a wide variety of samples,10–12 as well as several applications of IRT models to domain subscales,3–5,9 few if any studies have carefully examined the psychometric properties of the total score as reflecting variation on an overall symptom severity dimension, or considered the psychometric properties of the items in the remission subset. A bifactor IRT model would separately identify a general factor that might be independent of other specific factors, thus helping to separate generalized symptoms from specific symptoms. The purpose of the present investigation is thus to better understand and evaluate the psychometric properties of PANSS total scores and the remission set. We aimed to determine 1) how well the total score and/or the general factor identified in an IRT bifactor model works to measure overall symptom severity; 2) whether there is a subset of items that might be superior to the total score for identifying patients who achieve "relief" from symptoms (i.e., when symptom ratings are between "absent" and "minimal" in terms of PANSS anchors); and 3) in this sample of individuals studied at the end of their participation in clinical trials, how the remission criteria compare to relief criteria. To address these questions, we applied a bifactor IRT model13–16 to a large sample of ratings on subjects diagnosed with schizophrenia assessed at the termination of 11 clinical trials. This bifactor model was specified to allow for one general dimension, representing overall symptom severity, and five specific domain dimensions (i.e., positive, negative, disorganized, excited, and anxiety/depression symptoms) representing unique variation that cannot be explained by a general factor. The utility of subsets of PANSS items, including the remission set, was compared to evaluate how symptom relief, or mild illness levels, can best be measured.
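The bifactor structure just described can be illustrated with a small numerical sketch. The six items and all loading values below are hypothetical, chosen only to show the pattern: every item loads on one general severity factor plus exactly one specific domain factor, with all factors orthogonal.

```python
import numpy as np

# Hypothetical bifactor loading matrix for six illustrative items. Columns:
# general, positive, negative, disorganized, excited, anxiety/depression.
# Each row has a nonzero general loading plus exactly one specific loading.
L = np.array([
    [0.7, 0.4, 0.0, 0.0, 0.0, 0.0],   # item tapping positive symptoms
    [0.6, 0.5, 0.0, 0.0, 0.0, 0.0],   # positive symptoms
    [0.5, 0.0, 0.6, 0.0, 0.0, 0.0],   # negative symptoms
    [0.6, 0.0, 0.0, 0.5, 0.0, 0.0],   # disorganized
    [0.4, 0.0, 0.0, 0.0, 0.6, 0.0],   # excited
    [0.5, 0.0, 0.0, 0.0, 0.0, 0.6],   # anxiety/depression
])

# With orthogonal factors, the model-implied item correlation matrix is
# L @ L.T plus item-specific uniqueness on the diagonal.
uniqueness = 1.0 - np.sum(L**2, axis=1)
R = L @ L.T + np.diag(uniqueness)
print(np.round(R, 2))
```

Note how two items sharing a specific factor (the first two rows) correlate above what their general loadings alone would imply; that excess is the "unique variation" the specific dimensions capture.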
A brief review of item response theory and bifactor models. The basic goal of applying an IRT model is to use a mathematical model (typically a logistic function) to characterize the relation between individual differences on a latent variable (i.e., trait levels) and the probability of responding in a particular category.17 For example, in the well-known graded response model (GRM)18 for ordered polytomous items, each item is characterized by a parameter that reflects the strength of the relation with the latent variable (called "discrimination" and symbolized by a) and a set of location parameters (called "thresholds" and symbolized by b) that indicate the trait levels at which the probability of responding above a given category is 0.50. Trait levels in IRT are typically reported in a z-score-like metric, such that the mean score in the population is zero with a standard deviation (SD) of 1. Once estimated, these item parameters define the category response curves (CRCs) for a given item. To illustrate, Figure 1 displays the CRCs for four PANSS items that vary in discrimination: a = 0.65, 0.99, 1.58, and 1.92 for Blunted Affect, Difficulty in Abstract Thinking, Conceptual Disorganization, and Delusions, respectively. For each item, from left to right, the CRCs provide a visual depiction of the probability of responding in Categories 1 to 6 as a function of symptom severity. Observe that as the item discrimination increases, the CRCs become more peaked, and are thus more "discriminating" (i.e., responses in particular categories convey more precision in terms of trait standing). A convenient feature of IRT models is that CRCs can be easily converted to item information curves (IICs). For example, Figure 2 displays the IICs for the four items shown in Figure 1. The lowest curve corresponds to the least discriminating item in Figure 1 and the highest curve corresponds to the most discriminating item.
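To make the GRM concrete, here is a minimal sketch in Python of how discrimination (a) and thresholds (b) translate into category response probabilities. The threshold values are hypothetical; only the discrimination a = 1.92 echoes the Delusions item above.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Category response probabilities under the graded response model.

    theta : latent trait level (z-score-like metric, mean 0, SD 1)
    a     : discrimination parameter
    b     : ordered threshold parameters (length K-1 for K categories)
    """
    b = np.asarray(b, dtype=float)
    # Cumulative probability of responding above each threshold:
    # P*(X >= k) = 1 / (1 + exp(-a * (theta - b_k)))
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(X >= lowest category) = 1 and P(X > highest) = 0,
    # then difference adjacent cumulatives to get per-category probabilities.
    upper = np.concatenate(([1.0], p_star))
    lower = np.concatenate((p_star, [0.0]))
    return upper - lower

# Hypothetical thresholds; discrimination echoes the Delusions item (a = 1.92).
probs = grm_category_probs(theta=0.5, a=1.92, b=[-1.5, -0.5, 0.5, 1.5, 2.5])
print(np.round(probs, 3))  # six category probabilities, summing to 1
```

Plotting `probs` across a grid of theta values reproduces the kind of category response curves shown in Figure 1: higher a makes the curves more peaked.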
Simply stated, items with higher discriminations provide more information, and the location of that information is determined by the threshold parameters. The IRT concept of information is critically important in judging item quality because the amount of information, conditional on trait level, is inversely related to an item's contribution to reducing an individual's standard error of measurement: the standard error of measurement is one divided by the square root of the information. As we will show shortly, item information functions can be added together to form an overall test information curve (TIC), used to judge the overall quality and measurement precision a set of items provides. Most applications of IRT modeling involve so-called "unidimensional" models, in which there is a single latent variable of interest. With the PANSS, however, we know from previous research and our own data explorations that item responses are highly multidimensional. This multidimensionality can severely bias parameter estimates when fitting a unidimensional (one-trait) IRT model. However, fitting IRT models within separate factor domains precludes using the total PANSS score as a measure of illness level and evaluates items only with respect to others in that particular domain. For this reason, in the present study, we fit a bifactor IRT model.15 The bifactor model specifies that each item is an indicator of a general trait (here, symptom severity), as well as one secondary specific dimension (e.g., positive symptoms). The general factor and the specific dimensions are orthogonal. As described below, although
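The relation between information and precision described above can be sketched as follows. The item information formula is Samejima's standard result for the GRM; the thresholds are hypothetical, and only the four discriminations are taken from Figure 1.

```python
import numpy as np

def grm_item_information(theta, a, b):
    """Item information for a graded-response item (Samejima's formula)."""
    b = np.asarray(b, dtype=float)
    # Cumulative boundary probabilities, padded with 1 and 0 at the extremes.
    p_star = np.concatenate(
        ([1.0], 1.0 / (1.0 + np.exp(-a * (theta - b))), [0.0]))
    w = p_star * (1.0 - p_star)           # derivative factor per boundary
    p_cat = p_star[:-1] - p_star[1:]      # per-category probabilities
    return a**2 * np.sum((w[:-1] - w[1:])**2 / p_cat)

# Discriminations from Figure 1; thresholds are hypothetical and shared.
items = {"Blunted Affect": 0.65, "Difficulty in Abstract Thinking": 0.99,
         "Conceptual Disorganization": 1.58, "Delusions": 1.92}
b = [-1.5, -0.5, 0.5, 1.5, 2.5]

# Summing item information gives test information; SEM = 1 / sqrt(information).
total_info = sum(grm_item_information(0.0, a, b) for a in items.values())
sem = 1.0 / np.sqrt(total_info)
print(round(total_info, 3), round(sem, 3))
```

Evaluating `total_info` across a grid of theta values traces out a test information curve, and the corresponding SEM curve shows where on the severity continuum the item set measures precisely.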
