What is the optimal assessment of speech? A multicentre, international evaluation of speech assessment in 2500 patients with a cleft

Introduction

A cleft lip and/or palate (CL/P) is the most common congenital craniofacial anomaly, with varying incidence rates among Asians (1:500), Caucasians (1:1000) and patients of African descent (1:2500).1–4 Causes of a cleft are multifactorial, as both environmental and genetic factors have been reported.4 Clefts can be categorised in multiple classification systems, of which a commonly used classification system includes four cleft types: a cleft lip (CL); a cleft lip and alveolus (CLA); a cleft palate (CP); and a cleft lip, alveolus and palate (CL(A)P).5 In addition, clefts can occur unilaterally or bilaterally.3

Due to the facial defects, functional and appearance-related problems can occur, of which the extent may depend on the cleft type; the severity of the cleft; and the coping of the individual and his/her environment.6 Functional problems such as speech problems, hearing impairment and orodental problems are often reported. As a result of the latter, difficulties with eating, drinking and breathing can occur as well.5 7

Given the broad range of problems a patient with a cleft may have to face, treatment of patients with CL/P is ideally done by a specialised and multidisciplinary cleft team in which speech therapists, maxillofacial and plastic surgeons, otolaryngologists, paediatricians, psychologists, orthodontists, geneticists and specialised nurses are involved.7 Treatment and monitoring of patients with CL/P consist of multiple surgical interventions to close the defect and to improve appearance if the patient so desires. Follow-up of hearing function is indicated in case of a CP, and grommets are regularly placed if necessary. Furthermore, psychological guidance is often indicated while the child grows up. Moreover, speech monitoring and long-term, intensive speech therapy are often necessary to improve the intelligibility of the child.5 7

The development of speech is often complex in patients with a CP (with or without a CL, CP±L). Persistent velopharyngeal incompetence, residual fistula, adenoid atrophy, surgical intervention and hearing problems influence speech disorder severity in this population.8–12 Speech problems in patients with CP±L can have a large impact on an individual’s life, as proper speech skills play an essential role in activities, social functioning and participation in society.13 Many treatment pathways are focused on speech improvement to ameliorate Quality of Life (QoL).14 Logically, speech assessment is an important parameter in cleft care.

However, no consensus has been reached regarding the best diagnostic speech outcome measures and their timing in this population.5 Developing scientifically solid instruments to assess speech in an objective manner is complicated, because listeners’ perception of speech deficits, even among experts, may differ substantially.15 An additional challenge is systematic assessment of the patient’s perspective, which is essential to include due to the impact of speech problems on the individual.16 Although widely accepted agreement seems essential for improvement of cleft care, finding consensus is complex, especially since speech outcomes should be comparable between different languages to facilitate international collaboration.

Recently, the International Consortium for Health Outcomes Measurement (ICHOM) developed the ICHOM Standard Set for Cleft Lip and Palate (ICHOM Standard Set), with different pathways for varying cleft types.5 Based on patient and expert consensus, a minimal, accessible set of outcome measures was established to enable benchmarking between cleft centres in a systematic manner. For speech assessment, an outcome set was included with both clinical measures and Patient Reported Outcome Measures (PROMs), being the patient’s and parent’s perspectives.

So far, the selected standardised speech outcome measures and their timing have not been evaluated. As an increasing number of centres are implementing this set, it is important to critically evaluate and optimise this ICHOM Standard Set. Three centres, the Boston Children’s Hospital (Boston, USA), Duke University Hospital (Durham, USA) and the Erasmus Medical Center (Rotterdam, The Netherlands), started clinical implementation and an international collaboration in 2015. The overarching aim of this collaboration is to share data and knowledge obtained by using the set in standard care. Additionally, they collaborate with McMaster University (Hamilton, Canada), which developed and field tested the CLEFT-Q questionnaire. The CLEFT-Q is a PROM that was specifically developed and validated for patients with a cleft. Several CLEFT-Q scales are included in the ICHOM Standard Set.

The objective of this study was to evaluate the current standardised speech outcome measures of the ICHOM Standard Set for patients with CP±L. More specifically, the value of every speech outcome measure was examined, as well as the best age intervals for assessment of these outcome measures. In addition, other speech assessment tools are discussed. Finally, recommendations are made for an optimal and complete assessment of speech in patients with CP±L, that is efficient and accessible for all cleft centres.

Discussion

Evaluation of the value of the current ICHOM speech outcome measures

All correlations between PROMs were moderate, except for the strong correlation of the SFunction with both the SDistress and the ICS in patients with a CP. The fact that the correlation between the SFunction and SDistress is stronger in patients with CP than in patients with CL(A)P suggests that the visibly different appearance in patients with CL(A)P plays a significant role in SDistress as well; in a social context, looking different may cause additional distress on top of having speech problems. This is supported by our finding that the ICS correlated moderately with SFunction, but weakly with SDistress in the CL(A)P group. Parent-reported speech intelligibility correlated more strongly with children’s self-report of their speech function than with the speech distress the children themselves experience. The latter could include distress about appearance. This finding suggests that the ICS can give an indication of ‘patient-reported’ SFunction in young children who cannot yet complete a PROM themselves (7 years and younger).

The PROMs showed weak correlations with the clinical outcome measures, except for the moderate correlation that was seen between the ICS and the PCC in both patient groups. Based on these findings, PROMs appear to be of added value, as they provide different information than that captured with the clinical outcome measures included in the Standard Set. They add a unique dimension to speech outcome measurement: a subjective dimension related to the patient’s experiences with everyday speaking situations. While clinical measures objectively appraise the quality of speech, they will probably be insufficient to adequately capture the more nuanced social, emotional and psychological aspects of SDistress and SFunction. With this additional self-report and parental information, clinicians can more comprehensively explore the patients’ problems concerning speech in order to find out whether additional treatment or guidance is indicated.

Evaluation of the impact of age of assessment on measurement outcomes

In both CLEFT-Q Speech Scales, the age group of 8–9 years has the worst scores. Speech improvement due to speech therapy or late closure of the hard palate (in certain protocols around the age of 9 years, when alveolar bone grafting is performed) might explain the higher, better scores in the age groups of 10–13 and 14–16 years. In the age groups of 17 years and up, however, CLEFT-Q scores appeared to decline whereas PCC scores improved. This finding suggests that (almost) adult patients with CP±L develop feelings of insecurity concerning their speech, although their speech sound production remains good, or even improves. This is in line with speech therapists’ experiences in the outpatient clinic, where patients were seen in person at the age of 22, but not at the age of 17–19. Quite often, when outcomes of the CLEFT-Q Scales and the PCC were discussed with the patient, (s)he reacted with surprise when told that no (cleft-related) problems were present in their speech. Taking into consideration the lower CLEFT-Q scores found in the field test in 8–9, 17–19 and 20–22 year-olds, additional assessment of a PROM in the age groups of 8–9 (the youngest age at which this PROM can be assessed) and 17–19 years should be considered for implementation in the ICHOM Standard Set. This would enable closer monitoring of patients, so that any concerns of patients with CP±L regarding their speech can be discussed in a timely manner.

The two CLEFT-Q Speech Scales appeared to capture overlapping information, as they correlate strongly in patients with CP. The constructs addressed by the SDistress questions cannot be measured in any other manner, whereas SFunction from the patient’s perspective might add less value to a PROM questionnaire. Therefore, implementation of the CLEFT-Q SDistress scale in patients with both cleft types is recommended in the age groups of 8–9 and 17–19 years (figure 3).

Figure 3

Overview of the newly proposed ICHOM Standard Set concerning speech assessment. Newly made recommendations are coloured in pink. *Suggestion for centres that have adequate resources to implement and are interested in research with speech outcomes. CAPS-A, Cleft Audit Protocol for Speech Augmented; ICHOM, International Consortium for Health Outcomes Measurement; ICS, Intelligibility in Context Scale; PCC, Percent Consonants Correct; SDistress, Speaking-Related Distress; SFunction, Speech Function; VPC, Velopharyngeal Competence Rating.

A ceiling effect in ICS outcomes of patients with CP, without clear differences between average scores in patients with CP and CL(A)P, suggests that the group with CP contains a diverse population in which the severity of speech problems varies widely. Furthermore, since the ICS was not specifically developed for a population with CP±L, it is debatable whether this tool captures the information necessary to identify all relevant speech problems in this patient group.

However, exclusion of the ICS could mean that a large part of the speech problems in the population with CP would remain undetected. Assessment at 5 and 12 years in patients with both cleft types, which is the current timing in the ICHOM Standard Set, therefore appears appropriate despite the ceiling effect.

Although VPC scores were relatively favourable in patients with CP, no changes regarding the implementation of the VPC scores are recommended, as the outcomes varied. The VPC can serve as a suitable screening tool, and outcomes are easily gathered through observation by a clinician. Hence, patient burden is low and the tool efficiently detects velopharyngeal problems.

The PCC scores indicated speech sound problems, especially in the younger age groups of the patients with CL(A)P. Twenty-two-year-olds with both cleft types generally showed mild speech sound problems. Therefore, the time points as currently implemented in the ICHOM Standard Set appear adequate.

However, the suitability of PCC assessment in a cleft set focusing on standardised outcome measures is still debatable, as intercentre and intracentre reliabilities have not yet been investigated thoroughly in all participating centres.15 Future research should include an examination of how PCC scores are scored and interpreted in different centres and/or different countries.
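For readers less familiar with the measure, the PCC is conventionally expressed as the number of consonants judged correct divided by the total number of consonants attempted, multiplied by 100. The sketch below is illustrative only: the function name and the sample ratings are hypothetical, and the rules for deciding whether an individual consonant counts as correct (the part on which centres may differ) are not modelled here.

```python
from typing import Sequence


def percent_consonants_correct(judgements: Sequence[bool]) -> float:
    """Return the PCC as a percentage.

    `judgements` holds one entry per consonant attempted in the speech
    sample: True if the rater judged it correct, False otherwise.
    (Hypothetical helper; how a rater reaches each judgement is out of scope.)
    """
    if not judgements:
        raise ValueError("at least one consonant judgement is required")
    correct = sum(1 for judged_correct in judgements if judged_correct)
    return 100.0 * correct / len(judgements)


# Hypothetical ratings for a 20-consonant sample: 17 correct, 3 incorrect.
sample = [True] * 17 + [False] * 3
pcc = percent_consonants_correct(sample)
print(f"PCC = {pcc:.1f}%")  # prints "PCC = 85.0%"
```

Because the score reduces to a single proportion, two raters who disagree on only a few consonant judgements will report different PCC values for the same recording, which is why the reliability question raised above matters for intercentre comparison.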

Future considerations regarding alternative speech outcome measures

In order to establish an optimal cleft set for speech assessment, other standardised outcome measures should be considered. Based on clinical experience with the ICHOM Standard Set, possible additional outcome measures are discussed here.

Regarding PROMs for speech assessment in patients with CP±L, the CLEFT-Q Scales seem to be the most suitable PROMs available. Their comprehensive psychometric examination and cross-cultural character make them accessible for all cleft centres that seek an efficient minimal cleft set that comprises all important speech parameters.17–19 The standardised approach for translation and validation of the CLEFT-Q questionnaire makes the PROM accessible even for centres that still need to translate the CLEFT-Q into their native language.33 34 Another cleft-specific PROM is the Cleft Hearing, Appearance and Speech Questionnaire (CHASQ). Whereas the psychometric properties of the CLEFT-Q were examined using Rasch measurement theory, classical test theory was used for the CHASQ.35 A recent cross-sectional questionnaire study that compared the CLEFT-Q with the CHASQ found that the majority of the patients with CP±L preferred the CLEFT-Q.35 Therefore, implementation of the CHASQ for speech does not seem to be of added value in the current cleft set.

Besides the VPC measure currently used, a more elaborate variant exists, namely the VPC-Summary (VPC-Sum). This includes assessment of hypernasality, passive VPI symptoms and the transcriptions of active non-oral consonant errors.36 VPC-Sum can be reported either as a score between 0 and 6, or as a dichotomised outcome (velopharyngeal competence or incompetence).36

Calculation of the VPC-Sum is based on single words, whereas VPC-rate is based on observation of spontaneous speech.37 VPC-Sum would be an interesting measure due to its efficiency, although it may not be feasible to implement VPC-Sum in all centres in the near future, as it is currently available in only five languages.31 Other alternatives such as nasopharyngoscopy or MRI are invasive, expensive and increase the patient burden,38 and are therefore not easily accessible for all centres.

The currently implemented PCC lacks any categorisation of consonant errors. The Eurocleft Speech Group created a research protocol with a phonetic framework, which was used in six centres and five different languages.39 It also included consonant production, but assessed at sentence level instead of in single words. Consonant production was categorised into three groups (correct, almost correct and incorrect). Incorrect consonants were further divided into 21 error categories, which were combined into five groups (nasal airflow, glottal realisations, alveolar deviations, sibilant deviations and other).39 Moreover, general speech quality was assessed concerning hypernasality and hyponasality, and voice quality.40 Expert rating of these outcomes requires periodic training to maintain sufficient inter-rater reliability. However, the protocol might be too detailed for implementation in an efficient, clinically oriented cleft set. Therefore, we suggest further categorising the PCC score, although not in as much detail as in the Eurocleft studies. Based on clinical experience with the ICHOM Standard Set, it is recommended that speech pathologists report whether any cleft-related, phonological or phonetic problems are detected.

Another clinical outcome measure, the Great Ormond Street Speech Assessment 1998 (GOS.SP.ASS’98), provides a comprehensive view of all speech-associated features for patients with CP±L.41 42 Its suitability for intercentre comparison would make it interesting for the ICHOM Standard Set5; however, it is too detailed for clinical audit.43 Subsequently, the Cleft Audit Protocol for Speech Augmented (CAPS-A) was developed for cleft-related problems, and could be an alternative for PCC.44 Given its rigorous psychometric assessment, it fits well into a set that seeks standardised outcome measures. The Americleft Speech Project found that acceptable inter-rater and intrarater reliability can be achieved.43 45 Furthermore, it is suitable for assessment in 5-year-olds, which enables detection of speech problems at an earlier age.46 However, the CAPS-A is limited in the types of statistical analyses it allows due to the scaling type used (equal appearing interval).47 A more practical challenge in implementing the CAPS-A would be the required training of all involved speech therapists, and the amount of time the assessment takes (15 min).44 Moreover, the CAPS-A was developed for, and is applicable in, English-speaking countries, necessitating translation and validation in other languages.43 The CAPS-A is therefore not ideal for centres interested in a minimal and efficient cleft set. However, centres with experience and resources are strongly encouraged to implement this tool in order to promote further international standardisation of elaborate speech assessment in patients with CP±L (figure 3). Implementation of the CAPS-A would also enable the use of the recently developed and validated CAPS-A-VPC-Sum score to reliably measure velopharyngeal function.48 Our suggestion for centres that consider implementing the CAPS-A is to assess it at ages 5–7, 10–13 and 20–22 years in order to enable long-term follow-up.

Limitations of the study

Data were analysed cross-sectionally. Longitudinal analyses to explore development of speech and for benchmarking will be possible in the future since data collection continues. Moreover, because this study included data from the CLEFT-Q field test, a higher number of outcome data from the CLEFT-Q scales were available for analyses than from the other outcome measures included in the ICHOM Standard Set.

This post was originally published on https://bmjopen.bmj.com