Computing inter-rater reliability for observational data: an overview and tutorial
- Tutorials in Quantitative Methods for Psychology
, 2012
Cited by 36 (0 self)
Many research designs require the assessment of inter-rater reliability (IRR) to demonstrate consistency among observational ratings provided by multiple coders. However, many studies use incorrect statistical procedures, fail to fully report the information necessary to interpret their results, or do not address how IRR affects the power of their subsequent analyses for hypothesis testing. This paper provides an overview of methodological issues related to the assessment of IRR with a focus on study design, selection of appropriate statistics, and the computation, interpretation, and reporting of some commonly-used IRR statistics. Computational examples include SPSS and R syntax for computing Cohen's kappa and intraclass correlations to assess IRR.
The assessment of inter-rater reliability (IRR, also called inter-rater agreement) is often necessary for research designs where data are collected through ratings provided by trained or untrained coders. However, many studies use incorrect statistical analyses to compute IRR, misinterpret the results from IRR analyses, or fail to consider the implications that IRR estimates have on statistical power for subsequent analyses. This paper will provide an overview of methodological issues related to the assessment of IRR, including aspects of study design, selection and computation of appropriate IRR statistics, and interpreting and reporting results. Computational examples include SPSS and R syntax for computing Cohen's kappa for nominal variables and intraclass correlations (ICCs) for ordinal, interval, and ratio variables. Although it is beyond the scope of the current paper to provide a comprehensive review of the many IRR statistics that are available, references will be provided to other IRR statistics suitable for designs not covered in this tutorial.
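As a rough illustration of the Cohen's kappa statistic mentioned in the abstract above, the following is a minimal Python sketch for two coders rating the same items on a nominal scale. It is not the SPSS or R syntax from the paper; the function name `cohens_kappa` and the example ratings are hypothetical and chosen only for illustration. Kappa is computed as (observed agreement - chance agreement) / (1 - chance agreement), with chance agreement taken from each coder's marginal category proportions.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who assigned nominal codes to the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: proportion of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, multiply the two raters' marginal
    # proportions, then sum over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings, invented for this example (not data from the paper).
coder_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
coder_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the coders agree on 8 of 10 items (0.80), chance agreement is 0.52, and kappa is therefore 0.28 / 0.48. The sketch covers only the two-rater, unweighted case; designs with more raters or ordered categories would need a different statistic.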
An examination of interrater reliability for scoring the Rorschach Comprehensive System in eight data sets
- Journal of Personality Assessment
, 2002
Cited by 21 (5 self)
Exner, 1993) in 8 relatively large samples, including (a) students, (b) experienced re-
How does motivational interviewing work? Therapist interpersonal skill predicts client involvement within motivational interviewing sessions
- Journal of Consulting and Clinical Psychology
, 2005
Cited by 16 (3 self)
Treatment adherence, competence, and outcome in individual and family therapy for adolescent behavior problems
- Journal of Consulting and Clinical Psychology
, 2008
Cited by 14 (1 self)
This study examined the impact of treatment adherence and therapist competence on treatment outcome in a controlled trial of individual cognitive–behavioral therapy (CBT) and multidimensional family therapy (MDFT) for adolescent substance use and related behavior problems. Participants included 136 adolescents (62 CBT, 74 MDFT) assessed at intake, discharge, and 6-month follow-up. Observational ratings of adherence and competence were collected on early and later phases of treatment (192 CBT sessions, 245 MDFT sessions) by using a contextual measure of treatment fidelity. Adherence and competence effects were tested after controlling for therapeutic alliance. In CBT only, stronger adherence predicted greater declines in drug use (linear effect). In CBT and MDFT, (a) stronger adherence predicted greater reductions in externalizing behaviors (linear effect) and (b) intermediate levels of adherence predicted the largest declines in internalizing behaviors, with high and low adherence predicting smaller improvements (curvilinear effect). Therapist competence did not predict outcome and did not moderate adherence–outcome relations; however, competence findings are tentative due to relatively low interrater reliability for the competence ratings. Clinical and research implications for attending to both linear and curvilinear adherence effects in manualized treatments for behavior disorders are discussed.
Measurement issues in the alignment of standards and assessments: A case study (No
- Los Angeles, Calif.: University of California, National Center for
, 2005
Cited by 10 (1 self)
Center for the Study of Evaluation National Center for Research on Evaluation,
Temporal Stability of WISC–III Subtest Composite: Strengths and Weaknesses
Cited by 7 (5 self)
used to identify subtest-based cognitive strengths and weaknesses that are subsequently used to generate interventions. Given that intelligence is presumed to be an enduring trait, cognitive strengths and weaknesses identified via subtest analysis should also be stable over time. This was evaluated with 579 students who were twice tested with the WISC–III. Based on 66 subtest composites, 6 or 7 interpretable cognitive strengths and weaknesses were found on each WISC–III administration. However, subtest-based strengths and weaknesses replicated across test–retest occasions at chance levels (Mdn .02). Because subtest-based cognitive strengths and weaknesses are unreliable, recommendations based on them will also be unreliable. The Wechsler scales are the most popular individual measures of intelligence for children, adolescents, and adults (Alfonso, Oakland, LaRocca, & Spanakos, 2000; Belter & Piotrowski, 2001). Among the school-age population, millions of children have been administered the Wechsler Intelligence Scale for Children—Third Edition (WISC–III; Wechsler, 1991) as part of an evaluation to determine eligibility for special education services (Kamphaus,
Brief alcohol interventions: Do counsellors’ and patients’ communication characteristics predict change?
- Alcohol and Alcoholism
, 2008
Cited by 6 (0 self)
Aims: To identify communication characteristics of patients and counsellors during brief alcohol intervention (BAI) which predict changes in alcohol consumption 12 months later. Methods: Tape-recordings of 97 BAI sessions with hazardous drinkers were analysed using the Motivational Interviewing Skill Code (MISC). Outcome measures were (i) the baseline to 12-month difference in weekly drinking quantity, and (ii) the baseline to 12-month difference in heavy drinking episodes per month. Bivariate analyses were conducted for all MISC measures, and significant variables were included in multiple linear regression models. Results: Patient communication characteristics (ability to change) during BAI significantly predicted the weekly drinking quantity in the multiple linear regression model. There were significant differences for some of the counsellor skills in bivariate analyses but not in the multiple regression model adjusting for patients' talk characteristics. Changes in heavy drinking showed no significant association with patient or counsellor skills in the multiple linear regression model. Conclusion: Findings indicate that the more the patient expresses ability to change during the intervention, the more weekly alcohol use decreases. The role of the counsellor during the interaction, and its influence on the outcomes, was not clearly established. Implications for BAI and related research are discussed.
Reliability of the Services Assessment for Children and Adolescents
- Psychiatric Services
, 2001
Reliability and validity of the Matson Evaluation of Social Skills with Youngsters
- Behavior Modification
, 2010
Cited by 4 (0 self)
Social skills are an important part of development, and deficits in this area have long-term impacts on a child. As a result, clinicians should include a measure of social skills as part of a comprehensive assessment. There are a few well-researched measures of social skills that are currently used, including the Matson Evaluation of Social Skills with Youngsters (MESSY). The MESSY has been translated and studied internationally in more than nine countries; however, updated norms for the United States have not been conducted since the inception of the measure. The purpose of this article is to examine the psychometric properties of the MESSY using an updated norm sample and age cohorts. Overall results indicated strong internal consistency and good to strong convergent and divergent validity. Psychometric properties for the older age cohorts were stronger and more consistent than those for the 2- to 5-year-olds. This reflects the variability of development and difficulty of assessing social skills at this young age.
Interobserver reliability of tongue diagnosis using traditional Korean medicine for stroke patients
- Evidence-Based Complementary and Alternative Medicine, vol. 2012, Article ID
, 2012
Cited by 4 (0 self)
Observation of the tongue, also known as tongue diagnosis, is an important procedure in diagnosis by inspection in Traditional Korean medicine (TKM). We investigated the reliability of TKM tongue diagnosis in stroke patients by evaluating interobserver reliability regarding tongue indicators as part of the project named the Fundamental Study for the Standardization and Objectification of Pattern Identification in TKM for Stroke (SOPI-Stroke). A total of 658 patients with stroke admitted to 9 oriental medical university hospitals participated. Each patient was independently seen by two experts from the same department for an examination of the status of the tongue. Interobserver agreement about subjects regarding pattern identification with the same opinion between the raters (n = 451) was generally high, ranging from "moderate" to "excellent". Interobserver agreement was nearly perfect for certain signs of special tongue appearance (mirror, spotted, and bluish purple), poor for one of the tongue colors (pale) and moderate for others. Clinicians displayed measurable agreement regarding tongue indicators via both observation and pattern identification consistency. However, interobserver reliability regarding tongue color and fur quality was relatively low. Therefore, it is necessary to improve objectivity and reproducibility of tongue diagnosis through the development of detail-oriented criteria and enhanced training of clinicians.