Development and psychometric properties evaluation of nurses’ innovative behaviours inventory in Iran: protocol for a sequential exploratory mixed-method study

This study will use a sequential exploratory mixed-methods design that combines qualitative and quantitative studies.24 The ontology and epistemology of mixed-method studies encompass the acknowledgement of multiple realities and the significance of comprehending phenomena from diverse perspectives.25 By incorporating both qualitative and quantitative methods, researchers aim to capture a more comprehensive understanding of the studied phenomenon. Qualitative methods delve into subjective experiences, meanings and contexts, providing depth and richness to the analysis. Quantitative methods, on the other hand, contribute statistical data and objective measurements, offering breadth and generalisability.26 Due to the intricate and multifaceted nature of innovative behaviours, characterised by creative problem-solving and idea generation,27 we have chosen a sequential mixed-methods design for the study protocol. This design will integrate both qualitative and quantitative data, providing a comprehensive exploration of the dimensions and intricacies of innovative behaviours in nursing. This approach will enable us to gather diverse insights, thereby enhancing the validity and reliability of the instrument under development. The qualitative phase aims to investigate the concept of nurses’ innovative behaviours (NIB) by gathering insights from nurses’ experiences and conducting a thorough review of existing literature. The primary components of NIB will then be formulated. In the second phase, the quantitative part of the study will evaluate the psychometric properties of the developed instrument (figure 1). This research study received ethical approval in November 2022. It is currently underway and is expected to last until December 2024.

The qualitative phase


The study population comprises nurses with a minimum of 1 year of experience in clinical practice. The selection of participants will be limited to nurses working within healthcare organisations such as hospitals and clinics in Tehran, Iran. At this stage, the researcher will formally invite eligible nurses to participate by communicating with clinical centres. Interested individuals will then contact the researcher for additional details about the study. Those who decide to take part will be asked to state their preferred interview time and location. Involvement in the study will be entirely voluntary, and participants will provide written informed consent after receiving a thorough overview of the study’s objectives.


Purposeful sampling will be employed to optimise the range of participants, thereby promoting diversity within the study group, as evident through demographic factors such as age, gender, education and marital status, as well as professional background.28 The recruitment process will persist until data saturation is achieved, signifying that no novel concepts emerge during the interviews.29

Data collection

The data collection process will use both inductive and deductive methods. Inductively, codes will be derived from semistructured personal interviews conducted with nurses. The individual interviews will be conducted at times and locations preferred by the participants. These interviews will enable the extraction of firsthand insights and perspectives from the participants. Deductively, codes will be extracted from the literature review, allowing for a comprehensive exploration of existing knowledge and theories related to NIBs. During the data collection phase, the interviews will be carefully analysed using the conventional qualitative content analysis method.30 Given the unpredictable nature of sample size in qualitative studies, data collection will continue until data saturation is achieved.31 The interviews will follow an interview guide32 that will be developed in consultation with experts in qualitative research and nursing researchers (online supplemental material) to ensure methodological rigour and relevance.33 Qualitative research experts will contribute to the robustness of the study’s design, aligning questions with established qualitative principles.34 Nursing researchers will provide domain-specific insights, ensuring the questions are meaningful within the context of the discipline.32


Prior to the interviews, the participants will be provided with a clear explanation of the study’s objectives and the purpose of the interviews presented in both written and verbal formats. Each interview will commence with an open-ended and factual question concerning the challenges they encounter in their work and the approaches they employ to address these challenges. Subsequent questions will be tailored based on their responses to the initial question and in accordance with the interview guide. Probing questions such as ‘What do you mean by this?’ or ‘Can you provide more detailed explanations?’ will be used when necessary to delve deeper into specific topics. Furthermore, participants will be given an opportunity at the end of each interview to address any points they feel may have been overlooked. Participants’ non-verbal cues, including tone, silence, emphasis and body gestures, will be documented during the interviews to capture a comprehensive understanding of their communication. All interviews will be recorded and transcribed verbatim immediately following the interview.

Data analysis

To analyse the collected data, conventional content analysis will be employed, following the Zhang and Wildemuth framework.35 Other methods of qualitative content analysis could also be used; however, because Zhang and Wildemuth clearly state the steps of qualitative content analysis, their framework will be followed. It is worth noting that this method of inductive qualitative content analysis does not differ significantly from other frameworks such as ‘Graneheim and Lundman’36 or ‘Hsieh and Shannon’.37 Before each subsequent interview is conducted, the transcription, analysis and coding of the previous interview will be completed. During the analysis process, codes, subcategories, categories and themes will be derived from the transcribed data. Related initial codes will be combined and labelled to form subcategories and categories. Through consensus among researchers, the underlying meaning of the text and the main themes will be extracted. This process will continue until the qualitative data analysis is complete, ultimately leading to a comprehensive understanding of the concept of NIBs. The extracted themes, main categories and relevant findings from the existing literature and instruments will be used to generate the primary item pool for the nurses’ innovative behaviours inventory (NIBI).

Accuracy and rigour

In the current study, efforts will be made to ensure data rigour. Lincoln and Guba38 proposed the criteria of credibility, dependability, transferability and confirmability as practical strategies for achieving rigour in research.38 A series of meticulous activities will be conducted throughout the research process. Beginning with the formulation of precise research questions, the study will involve repeated readings for a profound understanding of the data.39 A coding system with clear definitions will be established, and multiple coders will collaborate to enhance reliability.40 Regular team meetings, peer reviews and an audit trail will ensure transparency and consistency. Reflexivity, consensus building, member checking and continuous saturation checks will be integral components, fostering accuracy, dependability and a thorough exploration of the content.39

Item generation

Each code will be meticulously examined to create an initial item pool for the development of the scale. Within each specific subcategory, the items will be derived from the identified themes and subthemes, incorporating the participants’ actual expressions as much as possible. Subsequently, instructions and response options will be incorporated for each item. It will be essential to maintain simplicity and adhere to the published guidelines41 while constructing the items.

The NIBI will use a Likert scale with five response options, ranging from ‘always’ to ‘never’. The wording of these options aims to capture a spectrum of sentiments, ensuring a comprehensive and nuanced understanding of participants’ perspectives.42

Literature review

Following the establishment of a clear definition and dimensions of NIBs, a literature review will be undertaken. This review aims to identify any features that may not have been identified in the qualitative study or extracted in the qualitative section. If such features or statements are missing, they will be added to the item pool for further consideration. In this stage, the primary guide for the study will be York University’s five-step guide.43 This guide offers a comprehensive set of structured and detailed instructions for designing, implementing and reporting results in systematic review studies.44 The steps outlined in this guide include: determining the review question, establishing selection criteria for studies, identifying relevant studies, data extraction, data synthesis and plan for dissemination.45 Electronic databases including MEDLINE (via PubMed), Scopus, PsycINFO (via EBSCO), CINAHL (via EBSCO) and ProQuest will be searched without any time limitations. A combination of keywords such as “Nurses’ Innovative behaviors,” “Innovative Work Behavior” and “Tool (Scale, Inventory, Instrument, Questionnaire) Development” will be used to retrieve relevant literature.

The quantitative phase

In this phase, a quantitative study with a cross-sectional design employing an approach of instrument design and psychometrics will be conducted to evaluate the psychometric properties of the NIBI. The assessment will encompass various properties, including face validity, content validity, construct validity and reliability. These evaluations are crucial for ensuring the robustness and accuracy of the inventory.

Face validity assessment

Face validity refers to the extent to which the statements within a measurement tool appear to be relevant and appropriate for measuring the specific construct or subject they were designed to assess.46 The assessment of the face validity of this study will be conducted using both quantitative and qualitative approaches. In the quantitative approach, a 5-point Likert scale will be used to assess each item of the questionnaire. The scale consists of the following options: completely understandable (5 points), understandable to some extent (4 points), moderately understandable (3 points), slightly understandable (2 points) and not understandable at all (1 point). To evaluate face validity, a sample of 10 nurses will be selected using a convenience sampling method. The inclusion criteria for this stage include holding at least a bachelor’s degree in nursing, having more than 1 year of work experience, and expressing willingness to participate in the study. Nurses who were part of the qualitative phase will not be recruited for this step. They will be asked to review each item and select one option from the Likert scale. Subsequently, the impact score for each item will be calculated using the formula:

Impact score=frequency (%)×comprehensiveness.

Frequency is the percentage of respondents who assign a score of 4 or 5 to the item, and comprehensiveness is the item’s mean score on the Likert scale. If the impact score surpasses 1.5, the item will be considered suitable for further analysis and retained. Conversely, items scoring below 1.5 will not be eliminated but will undergo review and modification.47
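As a rough illustration of the impact score calculation, the following minimal sketch assumes that frequency is expressed as a proportion (0 to 1) of respondents scoring 4 or 5 and that comprehensiveness is the item’s mean rating, so that the maximum possible score is 5. The function name and the ratings are hypothetical and are not taken from the protocol.

```python
def impact_score(ratings):
    """Impact score for one item from 1-5 comprehensibility ratings.

    Assumption: frequency is the proportion of raters choosing 4 or 5,
    and comprehensiveness is the mean rating.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    frequency = sum(1 for r in ratings if r >= 4) / len(ratings)
    mean_rating = sum(ratings) / len(ratings)
    return frequency * mean_rating


# Hypothetical ratings of one item by 10 nurses.
ratings = [5, 4, 4, 3, 5, 4, 2, 5, 4, 4]
score = impact_score(ratings)
decision = "retain" if score > 1.5 else "review and modify"
print(f"impact score = {score:.2f}: {decision}")
```

Under these assumptions, an item rated 4 or 5 by most respondents clears the 1.5 cut-off comfortably, whereas an item that few respondents find understandable does not.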

To assess face validity qualitatively, face-to-face interviews will be conducted with the same group of 10 nurses involved in the quantitative stage. Items with an impact score below 1.5 will be specifically targeted for examination. The interviews will aim to explore the following aspects:

Difficulty: This refers to the comprehension of phrases or words that proved challenging for the respondents to understand.

Relevancy: This entails evaluating the accuracy and appropriateness of the items in relation to the considered structure and its dimensions, as perceived by the respondents.

Ambiguity: This involves investigating whether the expressions of the items were misunderstood or if the words lacked sufficient clarity of meaning.47

Content validity assessment

Content validity offers insights into how well elements of an assessment instrument align with and accurately represent the intended construct for a specific assessment objective.48 In this stage, content validity will be assessed through both qualitative and quantitative approaches.

For the qualitative assessment of content validity, a panel of 10 university professors from various Iranian universities, each possessing significant research expertise in the field of innovation, will be invited via email and in person to assess the questionnaire’s grammar, wording, item allocation and scaling. Their feedback will be used to make necessary modifications to the items.

Content validity ratio

To assess the necessity of the questionnaire items, the study will employ the content validity ratio (CVR). This ratio will be calculated by inviting experts to evaluate each item on a three-point scale, encompassing the categories of ‘not essential’, ‘useful but not essential’ and ‘essential’. The scale will range from 1 to 3, allowing experts to assign scores accordingly. The CVR will be computed using the following formula:

$$\mathrm{CVR} = \frac{N_e - N/2}{N/2}$$

where Ne represents the number of experts indicating an item as ‘essential’, and N represents the total number of experts participating. The determination of item acceptance will be based on the Lawshe Table (1975), which takes into account the number of experts involved.49 In this study, with 10 participating experts, the minimum acceptable CVR score, based on Lawshe’s recommendation, is set at 0.62. Items with a score lower than 0.62 will be removed.
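The CVR decision rule can be sketched as follows, assuming Lawshe’s standard formula CVR = (Ne − N/2)/(N/2) and the 0.62 cut-off stated above for 10 experts. The function name and the expert counts are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: (Ne - N/2) / (N/2)."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("need 0 <= n_essential <= n_experts and n_experts > 0")
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical counts of experts (out of 10) rating each item 'essential'.
essential_counts = {"item_1": 10, "item_2": 9, "item_3": 7}
for item, n_e in essential_counts.items():
    cvr = content_validity_ratio(n_e, 10)
    decision = "keep" if cvr >= 0.62 else "remove"
    print(f"{item}: CVR = {cvr:.2f} -> {decision}")
```

With 10 experts, an item needs at least 9 ‘essential’ ratings to reach the 0.62 threshold, since 8 ratings give a CVR of 0.60.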

Content Validity Index

Two Content Validity Indices (CVIs) will be calculated in this study: the Individual Item-CVI and the Scale-CVI. To determine the Individual Item-CVI, the questionnaire will be distributed to a panel of 10 experts,50 who will be asked to score each item on a four-point scale: (1) irrelevant, (2) moderately relevant, (3) relevant and (4) absolutely relevant. The Individual Item-CVI will then be calculated by dividing the number of experts who scored an item as either 3 or 4 by the total number of experts. Subsequently, modified kappa statistics will be employed to account for chance agreement. To calculate modified kappa, the probability of chance agreement must first be determined using the following formula, where N represents the number of panellists and A represents the number of panellists who agree that the item is relevant:

$$P_c = \left[\frac{N!}{A!\,(N-A)!}\right] \times 0.5^{N}$$

Next, the modified kappa will be computed using the following formula:

$$K = \frac{\text{I-CVI} - P_c}{1 - P_c}$$

Based on the criterion established by Fleiss (1981), kappa values exceeding 0.75 will be regarded as ‘excellent’. Furthermore, Polit and Yang highlighted that an Individual Item-CVI value surpassing 0.78 is equivalent to a modified kappa value greater than 0.75. Consequently, an I-CVI value above 0.78 will serve as evidence of the item’s sufficient relevance.51 Items with a score lower than 0.78 will be removed.
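The I-CVI and modified kappa steps can be illustrated with a short sketch. It assumes the commonly used forms of the two formulas, with the chance-agreement probability computed as [N!/(A!(N−A)!)] × 0.5^N and kappa as (I-CVI − Pc)/(1 − Pc). The function names and ratings are hypothetical.

```python
from math import comb


def item_cvi(ratings):
    """I-CVI: share of experts rating the item 3 or 4 on the 4-point scale."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def modified_kappa(n_agree, n_experts):
    """Modified kappa, adjusting the I-CVI for chance agreement.

    Assumed formulas: Pc = C(N, A) * 0.5**N and K = (I-CVI - Pc) / (1 - Pc).
    """
    i_cvi = n_agree / n_experts
    p_chance = comb(n_experts, n_agree) * 0.5 ** n_experts
    return (i_cvi - p_chance) / (1 - p_chance)


# Hypothetical relevance ratings of one item by 10 experts.
ratings = [4, 4, 3, 3, 4, 2, 4, 3, 1, 4]
n_agree = sum(1 for r in ratings if r >= 3)
print(f"I-CVI = {item_cvi(ratings):.2f}")
print(f"modified kappa = {modified_kappa(n_agree, len(ratings)):.3f}")
```

With 10 experts the chance-agreement term is small, so the modified kappa stays close to the I-CVI itself.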

To compute the Scale-CVI, the sum of the Individual Item-CVI values will be divided by the total number of items in the instrument.52 In this approach (S-CVI/Ave), the desired value is 0.90, while the minimum acceptable value is 0.80.50
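The averaging approach to the Scale-CVI amounts to a simple mean of the item-level indices, as in the sketch below. The function name and the I-CVI values are hypothetical.

```python
def scale_cvi_average(item_cvis):
    """S-CVI/Ave: mean of the I-CVI values across all items."""
    if not item_cvis:
        raise ValueError("item_cvis must not be empty")
    return sum(item_cvis) / len(item_cvis)


# Hypothetical I-CVI values for a five-item draft.
i_cvis = [1.0, 0.9, 0.8, 0.9, 1.0]
s_cvi = scale_cvi_average(i_cvis)
if s_cvi >= 0.90:
    verdict = "desired"
elif s_cvi >= 0.80:
    verdict = "acceptable"
else:
    verdict = "below minimum"
print(f"S-CVI/Ave = {s_cvi:.2f} ({verdict})")
```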

Item analysis

Before proceeding with the assessment of construct validity, a pilot study will be conducted to evaluate the internal consistency of the NIBI. At this stage, the primary tool, having undergone the face validity and content validity stages, will be distributed to 40 clinical nurses selected by convenience sampling. The inclusion criteria for this stage include holding at least a bachelor’s degree in nursing, having more than 1 year of work experience and expressing willingness to participate in the study. Nurses who were part of the qualitative phase will not be recruited for this step. This step aims to identify any potential issues by estimating Cronbach’s alpha and inter-item correlation. Items with a corrected item-total correlation below 0.3 will be excluded from the next stages of psychometric evaluation.53
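The corrected item-total correlation can be sketched as the Pearson correlation between each item and the sum of the remaining items. The example below is a minimal illustration using only the Python standard library. The function names and the response matrix are hypothetical, and the protocol itself plans to use statistical software for this step.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses):
    """Correlate each item with the total of all other items.

    responses: one list of item scores per respondent.
    """
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical pilot data: 6 respondents, 4 items scored 1-5.
responses = [
    [5, 4, 5, 2],
    [4, 4, 4, 3],
    [3, 3, 3, 3],
    [2, 2, 3, 4],
    [1, 2, 1, 2],
    [4, 5, 4, 1],
]
for j, r in enumerate(corrected_item_total(responses), start=1):
    flag = "exclude" if r < 0.3 else "keep"
    print(f"item {j}: r = {r:.2f} -> {flag}")
```

In this made-up data set the fourth item correlates negatively with the rest of the scale and would be flagged for exclusion.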

Construct validity

Construct validity refers to the extent to which an instrument is suitable for measuring the intended construct.54 It is a measure of how well the instrument aligns with the concept it is designed to measure.50

Sampling and sample size

The study population for this stage of the study will consist of nurses involved in clinical practice within healthcare organisations across Iran. Convenience sampling will be employed to select participants during this phase. Data collection will be conducted online. To facilitate this, a Google Forms online questionnaire will be developed, and the corresponding URL link will be distributed to nurses via email or popular social networking applications such as Telegram or WhatsApp. The data collected through Google Forms will be exported to an Excel file for further analysis.

For exploratory factor analysis (EFA), the minimum sample size will be set at 300 individuals,55 or, as an alternative guideline, 5–10 participants per item.56 In this research study, two independent samples will be collected. The first sample will consist of at least 300 participants and will be used for EFA. The second sample, comprising at least 200 participants distinct from the first, will be used for confirmatory factor analysis (CFA).57

Statistical data analysis

To assess the construct validity and uncover the latent constructs of the NIBI, EFA will be employed. In this process, the adequacy of the sampling and the appropriateness of the data will be evaluated using the Kaiser-Meyer-Olkin (KMO) measure and Bartlett’s test of sphericity.58 A KMO statistic exceeding 0.9 will be considered excellent.58 Factors will be extracted primarily on the basis of eigenvalues greater than 1 and a scree plot, with Varimax rotation applied. An item will be assigned to a latent factor when its factor loading is close to or exceeds 0.3, a threshold estimated using the formula CV=5.152/√(n−2), where CV represents the critical value of the factor loading and ‘n’ represents the sample size.59 The number of factors will be estimated using Horn’s parallel analysis.60 Furthermore, items with communalities below 0.2 will be excluded from the EFA.61
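Evaluating the formula CV=5.152/√(n−2) at the planned EFA sample size shows how its result relates to the 0.3 loading threshold. The sketch below is a direct evaluation of that formula, and the function name is hypothetical.

```python
from math import sqrt


def critical_value(n):
    """Evaluate CV = 5.152 / sqrt(n - 2) for a sample of size n."""
    if n <= 2:
        raise ValueError("sample size must be greater than 2")
    return 5.152 / sqrt(n - 2)


for n in (200, 300, 500):
    print(f"n = {n}: CV = {critical_value(n):.3f}")
```

For the planned minimum of 300 participants the formula gives a value just under 0.3, and the value falls as the sample grows.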

The CFA aims to assess the factor structure proposed by the EFA and confirm its validity in a different sample of participants. CFA will be conducted using the maximum-likelihood method, and model fit will be assessed using widely used goodness-of-fit indices: the root mean square error of approximation, Parsimonious Fit Index, Parsimonious Comparative Fit Index (PCFI), Tucker-Lewis Index, Comparative Fit Index (CFI), Incremental Fit Index and CMIN/DF. All statistical analyses will be conducted using SPSS (V.26)62 and AMOS63 software. A maximum error margin of 5% will be deemed acceptable for all tests.


Reliability pertains to the degree to which the observed variances in test scores can be attributed to genuine discrepancies in the characteristics being studied, as opposed to variations caused by random errors.47 To assess the reliability of the NIBI and measurement stability, two methods will be employed: internal consistency and test–retest reliability analysis.

Internal consistency

To evaluate the internal consistency of the measures, various statistical indicators will be used, namely Cronbach’s alpha (α), McDonald’s omega (Ω) and the average inter-item correlation (AIC). The thresholds for determining acceptable internal consistency will be based on previous research, where coefficient values of α and Ω exceeding 0.7 were considered satisfactory,64 while an AIC range of 0.2–0.4 was deemed acceptable.65
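Two of these indicators can be illustrated with the standard library alone. The sketch below computes Cronbach’s alpha from item and total-score variances and the AIC as the mean of all pairwise item correlations. McDonald’s omega is left out because it requires a fitted factor model. The function names and the data are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import variance


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(responses):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def average_inter_item_correlation(responses):
    """Mean Pearson correlation over all distinct item pairs."""
    k = len(responses[0])
    columns = [[row[j] for row in responses] for j in range(k)]
    pairs = list(combinations(range(k), 2))
    return sum(pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)


# Hypothetical data: 6 respondents, 3 items scored 1-5.
responses = [[5, 4, 5], [4, 4, 4], [3, 3, 3], [2, 2, 3], [1, 2, 1], [4, 5, 4]]
print(f"alpha = {cronbach_alpha(responses):.3f}")
print(f"AIC = {average_inter_item_correlation(responses):.3f}")
```

The made-up items here are far more strongly intercorrelated than the 0.2–0.4 range, which in practice could indicate redundant items.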

Test–retest reliability

To assess the stability of the NIBI, the intraclass correlation coefficients (ICC) will be employed.66 The ICC will be calculated using a two-way random effect model with a 2-week interval66 among a sample of 30 clinical nurses.
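The protocol does not state which form of the two-way random-effects ICC will be reported. As one possible illustration, the sketch below computes the single-measurement, absolute-agreement form, often written ICC(2,1), from the two-way ANOVA mean squares. The choice of this form, the function name and the data are assumptions.

```python
def icc_two_way_random_single(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: one row per participant, one column per occasion
    (for example [test, retest]).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-participant mean square
    ms_cols = ss_cols / (k - 1)               # between-occasion mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test and retest totals for 5 nurses.
scores = [[62, 64], [75, 73], [58, 60], [81, 80], [69, 71]]
print(f"ICC(2,1) = {icc_two_way_random_single(scores):.3f}")
```

Because this form measures absolute agreement, a systematic shift between test and retest lowers the coefficient even when the ranking of participants is unchanged.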


Responsiveness refers to the capability of an instrument to detect changes as they occur.67 To evaluate the responsiveness of the NIBI, the standard error of measurement (SEM) and the minimal detectable change (MDC) will be used.

The SEM will be calculated to quantify the errors in the scale scores. The formula for SEM is as follows:

SEM=SD Pooled × √ (1−ICC)

where SD Pooled represents the pooled SD.

The calculation of the MDC will be performed using the following formula:

$$\mathrm{MDC}_{95} = \mathrm{SEM} \times 1.96 \times \sqrt{2}$$

This formula accounts for the SEM and applies a multiplier of 1.96 to achieve a 95% confidence level. The resulting MDC value represents the minimum amount of change that can be reliably detected by the instrument.68 In line with established criteria, an acceptable level of MDC is defined as being below 30%. Furthermore, an MDC value below 10% will be considered excellent.69
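A worked example may help show how the SEM and MDC relate. The sketch below uses SEM = SD pooled × √(1−ICC) as given above and assumes the conventional form MDC95 = 1.96 × √2 × SEM. Expressing the MDC as a percentage of the mean score, so that it can be compared with the 30% and 10% criteria, is also an assumption. All function names and input values are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd_pooled, icc):
    """SEM = SD_pooled * sqrt(1 - ICC)."""
    if not 0 <= icc <= 1:
        raise ValueError("icc must lie between 0 and 1")
    return sd_pooled * sqrt(1 - icc)


def mdc95(sem):
    """Assumed conventional form: MDC95 = 1.96 * sqrt(2) * SEM."""
    return 1.96 * sqrt(2) * sem


def mdc_percent(mdc, mean_score):
    """Assumption: MDC% is the MDC relative to the mean score, times 100."""
    return mdc / mean_score * 100


# Hypothetical values: pooled SD of 10 points, ICC of 0.91, mean score of 60.
sem = standard_error_of_measurement(10.0, 0.91)
mdc = mdc95(sem)
print(f"SEM = {sem:.2f}, MDC95 = {mdc:.2f}, MDC% = {mdc_percent(mdc, 60.0):.1f}")
```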


Interpretability refers to the minimum important change in the score of the specific instrument, or the extent to which a change in the instrument’s score holds meaningful significance.70 To assess the interpretability of the NIBI, various factors will be examined, including the distribution of total scores across the entire sample, as well as the presence of floor and ceiling effects. In our study, the presence of floor and ceiling effects will be determined by calculating the percentage of participants who achieve the lowest and highest scores, respectively, on the NIBI. If more than 15% of respondents obtain the lowest or highest score, it will be considered indicative of the presence of floor or ceiling effects.47
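The floor and ceiling check reduces to two percentages compared against the 15% criterion, as in the minimal sketch below. The function name, the score range and the data are hypothetical.

```python
def floor_ceiling_effects(total_scores, min_possible, max_possible, threshold=15.0):
    """Percentage of respondents at the lowest and highest possible score."""
    if not total_scores:
        raise ValueError("total_scores must not be empty")
    n = len(total_scores)
    floor_pct = 100 * sum(1 for s in total_scores if s == min_possible) / n
    ceiling_pct = 100 * sum(1 for s in total_scores if s == max_possible) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        "floor_effect": floor_pct > threshold,
        "ceiling_effect": ceiling_pct > threshold,
    }


# Hypothetical totals on an inventory whose scores can range from 20 to 100.
totals = [20, 35, 50, 62, 70, 81, 88, 100, 100, 93]
print(floor_ceiling_effects(totals, min_possible=20, max_possible=100))
```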

Normal distributions, outliers

To examine the normality of the data, both univariate and multivariate distributions will be assessed using measures of skewness and kurtosis. The multivariate distribution will be examined to determine its normality and to identify any potential multivariate outliers. Multivariate normality will be evaluated using Mardia’s coefficient of multivariate kurtosis, aiming for a value below 20.58 Multivariate outliers will be identified using the Mahalanobis distance: cases whose Mahalanobis distance is significant at p<0.001 will be classified as multivariate outliers.71


The instrument responses will be scored using a Likert scale. To facilitate interpretation and comparison, the scores will be transformed into a 0–100 scale using the following formula.72

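$$\text{Transformed score} = \frac{\text{raw score} - \text{minimum possible raw score}}{\text{maximum possible raw score} - \text{minimum possible raw score}} \times 100$$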

By converting the scores to standard values, a higher average score, closer to 100, will indicate a higher level of innovative behaviours among clinical nurses.
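The 0–100 conversion can be sketched as a linear rescaling between the minimum and maximum possible raw scores. This assumes the usual linear transformation, and the function name and the 20-item example are hypothetical.

```python
def to_0_100(raw_score, min_possible, max_possible):
    """Linearly rescale a raw total so the possible range maps to 0-100."""
    if max_possible <= min_possible:
        raise ValueError("max_possible must exceed min_possible")
    if not min_possible <= raw_score <= max_possible:
        raise ValueError("raw_score lies outside the possible range")
    return (raw_score - min_possible) / (max_possible - min_possible) * 100


# Hypothetical 20-item inventory scored 1-5, so raw totals range from 20 to 100.
for raw in (20, 60, 100):
    print(f"raw {raw} -> {to_0_100(raw, 20, 100):.1f}")
```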
