The PRISM-Capabilities model for AI integrates the Practical, Robust Implementation and Sustainability Model (PRISM) [20] with the Capabilities Approach [21, 22] (Fig. 1). When combined, this model addresses historical shortcomings of AI in social science and CER by promoting ethical, human-centered collaboration. This ensures that research aligns with the values, morals, and needs of the communities being served, so that AI serves as a complement to, rather than a replacement for, human efforts. This approach also fosters participatory processes, shared learning, co-design, and co-ownership to ensure that AI-enabled CER is guided by community voice and lived experience [23, 24, 25].
Fig. 1 This figure illustrates the six interconnected and mutually reinforcing components of the PRISM-Capabilities model for AI, as applied to community-engaged research (CER)
PRISM is an implementation science framework for designing, delivering, and evaluating interventions [26] that incorporates the RE-AIM (Reach, Effectiveness, Adoption, Implementation, Maintenance) conceptual model [20]. We selected the PRISM framework because it explicitly incorporates organizational characteristics (e.g., culture, leadership), external environments (e.g., policy, funding), and the perspectives of multiple stakeholders (patients, providers, administrators, funders), making it particularly well-suited for complex, real-world settings and practical for implementation in diverse environments and at various levels (from local communities to national systems). In addition to implementation outcomes, PRISM focuses on sustainability and continuous feedback loops to support long-term change and ongoing improvement, which other implementation science frameworks may overlook. The domains of PRISM directly correspond to the types of data and decision points where AI methods (such as NLP, fairness audits, and simulation modeling) excel by enabling continuous learning, multilevel monitoring, and rapid feedback. PRISM allows for a holistic assessment of implementation efforts, including both process and outcome measures across multiple levels (patient, provider, organization, system) [20, 26]. The PRISM-Capabilities model for AI emphasizes iterative feedback and systems thinking [29, 30], making PRISM the most suitable implementation science framework for use with AI tools. Moreover, PRISM guided the HEALing Communities Study [31], which will be used to illustrate the PRISM-Capabilities model for AI in this paper.
The Capabilities Approach focuses on the freedoms and conditions that enable individuals and communities to achieve their goals [21, 27, 28]. When applied to CER, the Capabilities Approach emphasizes ethical imperatives rooted in autonomy and human dignity [32], which are critical when AI influences decisions and outcomes. It underscores the need to ground AI in local realities and ensure that individuals have a hand in shaping the data and insights that affect their communities [22, 28]. It also enhances transparency and accountability by embedding community voices in every phase of AI development and use [33]. This "bottom-up" approach elevates community contributions through shared ownership and local knowledge [34]. The PRISM-Capabilities model thus ensures that AI solutions are culturally relevant and tailored to community priorities, fostering equitable and effective outcomes.
As a non-linear model, PRISM-Capabilities supports iterative feedback aligned with human-centered design (HCD), where rapid cycles of human-AI collaboration refine implementation in real time. By incorporating systems thinking [35, 36], the model addresses the interconnected influences that shape CER outcomes. Co-creation and shared goals, such as clarifying the benefits of AI use, addressing bias, and ensuring transparency, are central to this process. Ultimately, the model positions community members as co-designers, co-analysts, and co-stewards of AI-enabled CER.
This paper first presents an overview of the PRISM-Capabilities model for AI (Fig. 1). Next, it uses the HEALing Communities Study (HCS), the largest implementation science study ever funded to address substance use [31], as a retrospective use case to demonstrate the PRISM-Capabilities model. Although AI was not widely available during the implementation phase of HCS, it could have enhanced the CER process. To strengthen the empirical foundation of the PRISM-Capabilities model for AI, this paper describes post-hoc analyses that will be completed to simulate the real-time utility of AI during HCS implementation, capitalizing on the extensive dataset generated by the HCS, while also presenting limitations. Finally, we describe the potential technical limitations of AI, such as hallucinations, explainability challenges, automation risks, and algorithmic bias, which could undermine ethical CER implementation, and propose safeguards.
The interconnected components of the PRISM-Capabilities model for human-AI collaboration in CER

By delineating the model's six components (Table 1), we offer a practical blueprint for translating the conceptual model into action. The model supports real-world application by detailing specific data types and analytic techniques (e.g., NLP, fairness audits, simulation modeling), promoting transparent human-AI collaboration, and surfacing key questions for participatory co-design. The guide is tailored to help research teams, AI experts, and community partners use AI in ways that enhance trust, contextual responsiveness, and ethical accountability throughout all phases of CER implementation.
Table 1 PRISM-Capabilities model components for human-AI collaboration in community-engaged research

Optimizing engagement of implementers, settings, and recipients

The PRISM-Capabilities model for AI begins with gathering data from key implementers, organizational leaders, community members, individuals with lived experience, and policymakers to identify challenges and priorities to improve intervention acceptability. Community engagement at this stage aims to co-define the research question, identify barriers, and ensure that diverse stakeholder voices are included from the outset of CER implementation. Human-AI collaboration in this phase could generate real-time insights using NLP, sentiment analysis, and other AI tools when drawing from qualitative data to support more inclusive and effective implementation. Topic modeling could also be applied to meeting transcripts to identify recurring themes in engagement and trust, helping to tailor implementation strategies to local needs.
AI tools could help answer questions like: Are there emotional tone, morale, or participation gaps across stakeholder groups? Who are the key community partners (identified via NLP in meeting transcripts)? What is the state of organizational infrastructure (assessed through partner feedback and documents)? How is the intervention perceived (measured through sentiment analysis)? What prior experiences shape implementer perspectives? What external factors (policy, funding, or local support) might be barriers to implementation? What skills and training gaps exist among implementers (identified through performance records)? When answering such questions, topic modeling could uncover recurring themes to enhance understanding of implementation challenges. Additionally, ML methods could support responsive and equitable decision-making by synthesizing diverse datasets, such as demographic trends, local health outcomes, and economic indicators, to construct a dynamic, data-driven model of the implementation context.
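To make this concrete, the following is a minimal sketch of how topic modeling might be applied to meeting transcripts, here using scikit-learn's LDA implementation; the transcripts, number of topics, and vocabulary settings are hypothetical illustrations rather than procedures drawn from any specific study.

```python
# Minimal sketch: topic modeling on coalition meeting transcripts with LDA.
# The transcripts below are hypothetical placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

transcripts = [
    "Coalition members discussed naloxone distribution and training gaps.",
    "Concerns about funding cuts and staff turnover dominated the meeting.",
    "Providers reported stigma as a barrier to treatment uptake.",
]

# Convert transcripts to a bag-of-words matrix, dropping very common terms.
vectorizer = CountVectorizer(stop_words="english", max_df=0.95)
doc_term = vectorizer.fit_transform(transcripts)

# Fit a small LDA model; in practice, the number of topics would be tuned
# with community input and model diagnostics.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Print the top words per topic so implementers can label recurring themes.
terms = vectorizer.get_feature_names_out()
for idx, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"Topic {idx}: {', '.join(top)}")
```

In practice, the labeled topics would be reviewed with community partners to confirm they reflect lived experience rather than modeling artifacts.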
Characteristics of implementers, settings, and recipients

This component considers the skills, capacities, readiness, and contextual factors of the individuals and systems involved in CER implementation to ensure alignment with local needs, contexts, and available resources. This is achieved by incorporating feedback from all stakeholders early in the CER process and enabling continuous refinement of core components and implementation strategies. To ensure contextual fit, all stakeholders must assess whether interventions align with community needs, values, and available resources. System dynamics modeling (SDM) could be used to visualize variations across sites, with NLP enhancing the process of understanding site-level differences in organizational readiness and capacity. SDM data sources may include in-depth interviews, focus group discussions, administrative records, and surveys. Real-time sentiment analysis could support timely adjustments by addressing questions like: How are participants responding to the intervention? What changes could increase impact?
Importantly, readiness indicators and other features used in AI models should be co-developed with community input. Tools like SHapley Additive exPlanations (SHAP) or Local Interpretable Model-agnostic Explanations (LIME) could help make AI outputs interpretable and actionable [37]. By quantifying how much each feature contributes to an individual prediction, SHAP and LIME enable transparent, consistent, and locally accurate explanations of complex ML models and could be used for auditing AI models, identifying bias, or building trust with stakeholders. AI models could also automate routine tasks and improve decision-making and engagement.
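As an illustration of how SHAP could make a readiness model's predictions auditable, the sketch below trains a simple classifier on synthetic data and reports each feature's average contribution; the feature names and data are hypothetical assumptions, not variables from any actual study.

```python
# Minimal sketch: explaining a readiness-prediction model with SHAP.
# Feature names and synthetic data are hypothetical illustrations.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
features = ["staff_count", "prior_trainings", "funding_score", "coalition_age"]
X = rng.normal(size=(200, 4))
# Synthetic outcome: readiness driven mostly by staffing and funding.
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)

# TreeExplainer computes per-feature contributions to each prediction.
explainer = shap.TreeExplainer(model)
vals = explainer.shap_values(X)
if isinstance(vals, list):   # older SHAP versions: one array per class
    vals = vals[1]
if vals.ndim == 3:           # newer SHAP versions: (samples, features, classes)
    vals = vals[:, :, 1]

# Mean absolute SHAP value per feature = overall importance, which
# stakeholders can check against local knowledge of what drives readiness.
importance = np.abs(vals).mean(axis=0)
for name, imp in sorted(zip(features, importance), key=lambda t: -t[1]):
    print(f"{name}: mean |SHAP| = {imp:.3f}")
```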
Equity assessment and risk management

The next component identifies potential disparities in implementation and outcomes and ensures inclusive access to benefits across diverse populations. When used in CER, AI could enhance and ensure equity assessment and risk management by continuously analyzing performance data, detecting trends in real time, and supporting equitable intervention distribution [38, 39]. These tools could uncover disparities in participation, access, and outcomes, particularly when implemented in collaboration with communities.
NLP methods, including supervised classification and unsupervised clustering, could analyze meeting transcripts, interviews, and narratives to detect linguistic biases, exclusionary framing, and disparities in how underrepresented groups are discussed by various stakeholders. These analyses could help identify patterns of disparities and disproportionate burden, prompting timely adaptations. AI could also integrate demographic and other contextual data to guide equitable resource allocation and monitor performance using indicators such as race, income, geography, or criminal-legal system involvement [40]. AI dashboards and fairness audits stratified by these variables could be used to visualize emerging inequities and track subgroup disparities to shape inclusive and effective interventions [40, 41].
Ultimately, equity assessment in this model is not just about data accuracy, but also about participatory oversight and actionable insights that reduce harm for all populations. To ensure equity metrics are transparent, accountable, and meaningful, communities must co-define risks by selecting which disparities to track, determining how to interpret subgroup errors, validating algorithmic outputs, and setting the thresholds that warrant action. Fairness-aware modeling (e.g., demographic parity checks and disparate impact audits) must be implemented as a continuous auditing mechanism that is governed collaboratively, rather than as a one-time evaluation. This transforms equity assessment into a dynamic, corrective mechanism that moves beyond static disparity reporting to enable actionable, real-time mitigation. Furthermore, researchers could develop AI-driven dashboards that provide transparent, data-informed outputs, enabling CER implementation teams to respond quickly.
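A demographic parity check and disparate impact ratio of the kind described above can be computed in a few lines; the sketch below uses hypothetical referral data, and the 0.8 flag threshold is a common heuristic that communities should replace with their own co-defined cutoff.

```python
# Minimal sketch: demographic parity gap and disparate impact ratio
# computed from hypothetical intervention-referral data.
import pandas as pd

df = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "referred": [1,   1,   0,   1,   1,   0,   0,   0],
})

# Selection (referral) rate per group.
rates = df.groupby("group")["referred"].mean()
parity_gap = rates.max() - rates.min()

# Disparate impact: ratio of the lowest to the highest selection rate.
# The 0.8 cutoff is a common heuristic; the threshold that warrants
# action should be co-defined with community partners.
di_ratio = rates.min() / rates.max()

print(rates)
print(f"Demographic parity gap: {parity_gap:.2f}")
print(f"Disparate impact ratio: {di_ratio:.2f}"
      + ("  <-- below 0.8, flag for review" if di_ratio < 0.8 else ""))
```

Run continuously against live implementation data, a check like this becomes the auditing mechanism described above rather than a one-time evaluation.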
Implementation and sustainability infrastructure

The PRISM-Capabilities model for AI highlights the importance of contextual factors in building and sustaining implementation systems [27, 42]. This component of the model evaluates organizational systems, resource flows, training, and operational supports to ensure effective intervention delivery and sustainability. It provides a framework for optimizing workflows, training, and resource planning through simulation and forecasting tools that incorporate diverse stakeholder inputs [43].
AI tools such as ML models, agent-based modeling (ABM), and SDM could be used with community input to simulate or estimate needs, such as the resources, staffing, fidelity, or community engagement necessary for achieving the desired outcomes [44, 45]. This allows for informed decision-making, intervention planning [46], and maintenance of standards throughout the implementation process [19]. AI tools could also support implementation fidelity and sustainability by analyzing multiple data sources, including meeting transcripts, session recordings, and technical assistance (TA) logs detailing the type of support offered, frequency of interactions, and specific implementation challenges addressed, to assess intervention fidelity and community responsiveness to the intervention. For example, NLP could identify procedural drift or flag low engagement by analyzing language use, while ML could integrate fidelity reports with demographic data to detect where implementation may falter [22]. Importantly, researchers and community members should co-specify thresholds for acceptability, so that AI models reflect shared expectations around fidelity and performance, and should empirically test these thresholds and the corresponding responses. This iterative testing is critical for developing data-informed decision rules that specify, when observed variables change in a community, what type of response is warranted and at what thresholds. Applying this strategy in studies like the HCS would allow communities and researchers to calibrate actions based on evidence and strengthen planning, training, and mid-course corrections through AI-informed learning cycles. Moreover, AI tools could detect early warning signs of resource strain (e.g., reduced meeting participation or burnout indicators) and simulate future implementation needs under various scenarios to support sustainability planning. ABM and SDM could test how variations in coalition leadership, staffing, or funding affect implementation success over time [19], and AI-driven forecasting tools, when developed with stakeholder input, ensure local relevance and accuracy.
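As a simplified illustration of the scenario testing described above, the toy simulation below tracks how staffing turnover might affect cumulative EBP delivery over two years; all rates and parameters are hypothetical and would need calibration against real coalition data before informing any decision.

```python
# Minimal sketch: a toy system-dynamics-style simulation of how staffing
# flows affect cumulative EBP delivery. All parameters are hypothetical.
def simulate(months=24, staff=10.0, turnover=0.05, hires_per_month=0.3,
             ebps_per_staff=2.0):
    delivered = 0.0
    history = []
    for _ in range(months):
        delivered += staff * ebps_per_staff              # EBP sessions this month
        staff = staff * (1 - turnover) + hires_per_month  # staffing stock update
        history.append((staff, delivered))
    return history

# Compare scenarios, e.g., higher turnover and slower hiring under
# funding strain, to support sustainability planning conversations.
baseline = simulate()
strained = simulate(turnover=0.12, hires_per_month=0.1)
print(f"Baseline EBP sessions after 2 years: {baseline[-1][1]:.0f}")
print(f"Strained EBP sessions after 2 years: {strained[-1][1]:.0f}")
```

A full SDM or ABM would add feedback loops (e.g., burnout increasing turnover) and be co-parameterized with coalition members, but even a sketch like this shows how scenario comparison supports planning.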
External environment

The PRISM-Capabilities model for AI incorporates how external factors such as policies, regulations, community assets, and broader socio-political factors shape the success and sustainability of CER [20, 27]. PRISM focuses on how systemic structures (e.g., laws, reimbursement systems, resource availability) affect intervention delivery and sustainability, while the Capabilities Approach examines how those same forces constrain or enable individuals' abilities to achieve desired outcomes. Together, they provide a complementary lens to assess how broader conditions impact equity and feasibility in CER. AI tools could enhance this by processing large volumes of unstructured and structured data. NLP could be used to analyze policy documents, clinical guidelines, legislative records, and media content to extract relevant shifts in regulation, reimbursement, or political sentiment that may affect intervention implementation. Geospatial mapping could help identify gaps in local infrastructure (e.g., healthcare or educational facilities), while image-recognition tools could assess geographic disparities in service delivery [47, 48]. However, all AI-generated interpretations of policy or resource data should be validated through community and expert review, particularly in contexts with contested or historically exclusionary policies. Ultimately, the value of AI lies not only in monitoring regulatory, political, or economic shifts, but in ensuring such insights are interpreted collaboratively and used to design ethically and practically grounded interventions that address real-world problems.
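A bare-bones version of the geospatial gap analysis mentioned above could compute straight-line distances between communities and service locations; the sketch below uses hypothetical coordinates and a community-defined distance threshold, whereas a production analysis would use travel times and validated facility data.

```python
# Minimal sketch: flagging communities far from any service provider using
# straight-line (haversine) distance. Coordinates are hypothetical.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    r = 6371.0  # Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

communities = {"Town A": (42.65, -73.75), "Town B": (43.10, -75.25)}
providers = [(42.70, -73.80), (42.95, -73.85)]

for name, (lat, lon) in communities.items():
    nearest = min(haversine_km(lat, lon, plat, plon)
                  for plat, plon in providers)
    # The 30 km cutoff is illustrative; the threshold should be set
    # with community partners based on realistic travel burden.
    flag = "GAP" if nearest > 30 else "ok"
    print(f"{name}: nearest provider {nearest:.0f} km [{flag}]")
```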
Ethical assessment and evaluation

Researchers conducting CER must prioritize ethical AI use across all six components of the PRISM-Capabilities model to ensure inclusivity, equity, safety, data privacy, and accountability throughout all phases of CER. In this area, NLP could analyze large volumes of feedback (e.g., outcome data, meeting transcripts, social media sentiment) to identify ethical concerns or outcomes that partners, participants, and community members may miss during human review. For instance, AI algorithms could detect early-stage implementation biases, such as unequal access across sociodemographic groups, and generate ethical impact reports to guide decision-making.
While AI tools are powerful for synthesis, they must not replace human judgment. Oversight is essential to contextualize AI outputs, especially given the complexity of behavior, cultural differences, and structural inequities across communities. Including data from diverse datasets (e.g., policies, administrative data, meeting minutes) enhances ethically grounded interpretations. In addition, researchers should use clear, well-contextualized prompts and integrate fact-checking to reduce hallucinations (i.e., inaccuracies that arise from overgeneralized or misaligned patterns in the training dataset) in AI-generated content [49, 50]. Though not eliminated entirely, hallucinations could be minimized through retrieval-augmented generation (RAG), in which AI retrieves real information from external sources (meeting minutes, survey data, focus group discussions, etc.) while generating its answer [51, 52]. In this process, community members could be actively involved in reviewing AI-generated outputs for accuracy.

Furthermore, AI tools must be deployed alongside strong data protection measures. This includes informed consent, clear explanation of AI's role, compliance with ethical and legal standards (e.g., HIPAA, GDPR), and enterprise-level safeguards like secure platforms, encryption, role-based access, and audit logs. Additional protections such as text and voice anonymization and differential privacy techniques are also crucial when working with sensitive data. Researchers should systematically evaluate the intended and unintended consequences of AI-supported decisions as they evolve over time, integrating this into real-time monitoring. Sociodemographic overlays should be used in conjunction with feedback and outcome data to identify disparities that may not be visible in raw performance metrics. Ethical safeguards must include embedded de-identification protocols, differential privacy layers, and automated audit trail systems within AI pipelines to ensure procedural justice throughout the data lifecycle.
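Returning to the RAG approach described above, the sketch below grounds a model's answer in retrieved meeting minutes using TF-IDF similarity; the minutes and the final LLM call are hypothetical placeholders, since retrieval pipelines and provider APIs vary widely.

```python
# Minimal sketch: retrieval-augmented generation (RAG) over meeting minutes.
# Retrieval uses TF-IDF cosine similarity; the LLM call at the end is a
# hypothetical placeholder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

minutes = [
    "March: coalition agreed to expand naloxone distribution at two sites.",
    "April: providers raised concerns about treatment waitlists in the county.",
    "May: communication campaign launch was delayed due to staffing.",
]

question = "What barriers to treatment access were discussed?"

# Retrieve the most relevant minutes to ground the model's answer.
vec = TfidfVectorizer(stop_words="english")
doc_matrix = vec.fit_transform(minutes)
scores = cosine_similarity(vec.transform([question]), doc_matrix)[0]
top_doc = minutes[scores.argmax()]

# The retrieved text is inserted into the prompt so the model answers
# from real records rather than its training data alone.
prompt = (f"Answer using only this excerpt from coalition minutes:\n"
          f"{top_doc}\n\nQuestion: {question}")
print(prompt)
# response = llm.generate(prompt)  # hypothetical LLM call; outputs should
#                                  # still be reviewed with community members
```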
Explicit mechanisms should also be in place to uphold transparency in AI decision-making, supported by real-time explainability features. Algorithmic bias stemming from data and representational imbalances is also a critical issue, and AI models trained on biased data may produce harmful outcomes. To mitigate this, researchers must use diverse datasets, conduct fairness audits, and implement interpretable models. Explainability tools such as SHAP or LIME could help explain how an ML model made a specific prediction, especially when the model itself is complex and not directly interpretable [37]. This could clarify how decisions are made and help stakeholders verify their logic. Participatory AI-checking ensures diverse voices, including researchers, implementers, and people with lived experience, are engaged throughout CER. Finally, researchers should also support the development of open-source explainability tools and community-governed AI systems [53].
To ensure ethical and equitable CER, we propose that all stakeholders involved in CER adopt an ethical checklist guided by the six components of the PRISM-Capabilities model for AI (Table 2). This checklist helps establish a foundation for ethical accountability, fosters trust, and promotes responsible AI integration during CER.
Table 2 Checklist to guide the ethical use of AI in CER

Machine learning/AI tools considered in the PRISM-Capabilities model

Having outlined the six components of the PRISM-Capabilities model, we now turn to the specific AI and ML tools that enable real-world application. To move beyond conceptual guidance, the following section (Table 3) operationalizes each of the six components with concrete AI methods, typical data sources, strengths and limitations, and human-AI collaboration points. This mapping is critical for researchers seeking to apply the model in practice, particularly in studies like HCS [31].
Table 3 ML/AI tools considered in the PRISM-Capabilities model for AI

NLP and sentiment analysis could rapidly synthesize qualitative feedback, while ML models could predict patterns and disparities, supporting equity and engagement goals. Generative AI and topic modeling could accelerate reporting and thematic analysis, but require human oversight to maintain accuracy and contextual sensitivity. Simulation and forecasting models could assist in sustainability planning, while explainable AI (XAI) methods and privacy-preserving technologies strengthen transparency, ethical oversight, and data protection. Across all applications, human-AI collaboration remains foundational, ensuring AI complements rather than replaces community knowledge and decision-making. This concrete synthesis moves beyond speculation to offer an actionable, ethically grounded roadmap for AI integration into CER that is fully aligned with the PRISM-Capabilities model for AI.
Case Study: a practical application of the PRISM-Capabilities model in the HEALing Communities Study (HCS)

HCS was the largest implementation science research effort to date to address fatal overdoses in the US. Guided by the PRISM framework and the RE-AIM conceptual model [20, 31, 54, 55], HCS was a multisite, community-level, cluster-randomized controlled trial designed to evaluate the effectiveness of the Communities that HEAL (CTH) intervention in reducing opioid-related overdose deaths in highly affected communities [56]. A total of 67 communities in Kentucky, Massachusetts, New York, and Ohio were randomly assigned to either the intervention arm (n = 34 communities) or the wait-list control arm (n = 33 communities), stratified by state. The study was approved by Advarra, an independent research review organization, which served as the single Institutional Review Board. Oversight was provided by a Data and Safety Monitoring Board (DSMB) chartered by the National Institute on Drug Abuse (NIDA) [31, 56, 57].
The CTH intervention unfolded in six phases (Fig. 2), emphasizing community engagement, evidence-based practices (EBPs), and data-driven decision-making by community coalitions, which were aided by visualizations made available via community-specific data dashboards [58, 59]. EBPs included increased naloxone distribution, expanded access to medications for opioid use disorder (MOUD), improved MOUD linkage and retention, promotion of safer opioid prescribing and dispensing, and communication campaigns to drive demand for EBPs and reduce stigma toward MOUD and people who use drugs [60].
Fig. 2 The phased approach for implementation of the Communities that HEAL (CTH) intervention of the HCS to reduce fatal overdose (Martinez, L.S., et al., Community engagement to implement evidence-based practices in the HEALing Communities Study. Drug and Alcohol Dependence, 2020. 217: 108326.)
The HCS utilized a vast amount of data from multiple sources (Table 4). To ensure fidelity to the CTH intervention, researchers implemented rigorous monitoring protocols, including monthly assessments of EBPs delivered in communities. In addition to qualitative data, HCS collected extensive administrative and epidemiological data. Some study sites used advanced modeling techniques to further refine predictive capabilities, such as SDM to capture the interconnected nature of the opioid crisis and intervention points to inform the deployment of EBPs with community coalitions [61]. This approach allowed for a more holistic view of the system-wide impacts of implemented EBPs. The New York sites also utilized ABM to simulate individual behaviors and interactions within the community to predict how much EBPs needed to increase to achieve the study outcomes [62]. The integration of these diverse data sources and analytical methods created a robust framework for evaluating the effectiveness of the CTH intervention. By combining qualitative insights, quantitative metrics, and advanced modeling techniques, HCS was able to provide a comprehensive assessment of community-level efforts to implement EBPs and reduce opioid overdose deaths.
Table 4 Sources of data used in the New York State site of the HCS

Potential empirical validation and scenarios for retrospective and real-world applications of the PRISM-Capabilities model for AI using HCS

While the PRISM-Capabilities model for AI was not used during HCS implementation, we offer concrete operationalization of the six components through retrospective and real-time applications of AI. Table 5 details how AI tools could be used to retrospectively analyze existing HCS data and to simulate real-time utility in CER, thereby validating the model. Table 1 also maps the six model components to corresponding AI tools and analytic objectives (e.g., sentiment shifts, engagement trajectories, policy simulations). Together, these tables illustrate the model's empirical utility and transition it from theoretical abstraction to a data-driven implementation roadmap.
Table 5 Empirical operationalization of the PRISM-Capabilities model using retrospective and real-time AI applications in the HEALing Communities Study (HCS)

Retrospective and real-time validation using HCS data

To operationalize the PRISM-Capabilities model for AI in dynamic CER settings, we are conducting a multi-pronged, post-hoc analysis using HCS data to evaluate the practical utility of the PRISM-Capabilities model for AI and empirically test each of its six components. Below, we describe potential retrospective analyses and real-time AI tools across all six components that could enable responsive implementation, ethical oversight, and iterative adaptation in CER. Our aim is to assess whether state-of-the-art AI techniques can replicate the barriers, facilitators, and outcomes observed during the original HCS implementation [69]. These analyses leverage both structured and unstructured data collected from 16 New York State communities (Table 4).
Optimizing engagement

Retrospective

As described in Table 5, NLP methods such as BERTopic and Latent Dirichlet Allocation (LDA) could be applied to coalition transcripts and interviews to uncover evolving themes when engaging community service providers for the deployment of EBPs. Sentiment analysis tools (e.g., VADER, TextBlob, RoBERTa-based models) could track emotional tone related to stigma, optimism, and resistance. Sequential pattern analysis could be used to map the alignment between coalition goals and researcher priorities, cross-referenced with fidelity data and TA logs to assess community coalitions' engagement trends. Furthermore, sequential pattern analysis could be used to understand the barriers and facilitators community service providers and other stakeholders faced when identifying and deploying EBPs.
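As an illustration of the sentiment tracking described above, the sketch below scores hypothetical coalition utterances with NLTK's VADER implementation; production use would aggregate scores across full transcripts over time and involve community review of the outputs.

```python
# Minimal sketch: tracking emotional tone in coalition discourse with VADER.
# The example utterances are hypothetical.
import nltk
nltk.download("vader_lexicon", quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

utterances = [
    "We are optimistic the naloxone rollout is finally gaining traction.",
    "Honestly, the stigma in this county makes outreach exhausting.",
]

sia = SentimentIntensityAnalyzer()
for text in utterances:
    # The compound score ranges from -1 (most negative) to +1 (most positive).
    score = sia.polarity_scores(text)["compound"]
    print(f"{score:+.2f}  {text}")
```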
Real-time

Transformer-based NLP models (e.g., RoBERTa) and real-time topic modeling could monitor ongoing transcripts, TA logs, and coalition involvement in the deployment of EBPs. These tools could be used to generate sentiment dashboards, flag disengagement, track participation equity, and detect signs of community fatigue. Automated alerts could support timely facilitation adjustments and re-engagement of underrepresented stakeholders and coalition members.
Characteristics of implementers, settings, and recipients

Retrospective

ML classifiers (e.g., Random Forests, XGBoost) could be trained on structured data from readiness assessments, staffing patterns, and coalition characteristics to predict effective EBP implementation. Clustering algorithms (e.g., k-means, DBSCAN) could be used to identify distinct community typologies that may benefit from tailored implementation support.
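The community typology clustering described above might look like the following sketch, which applies k-means to hypothetical readiness features; the features, scaling choices, and number of clusters are illustrative assumptions rather than HCS specifications.

```python
# Minimal sketch: clustering communities into typologies from hypothetical
# readiness features, to inform tailored implementation support.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Rows = communities; columns = readiness score, staff count, coalition size.
X = np.array([
    [0.8, 12, 25], [0.7, 10, 22], [0.3,  4,  8],
    [0.2,  3, 10], [0.9, 15, 30], [0.4,  5,  9],
])

# Standardize so no single feature dominates the distance metric.
X_scaled = StandardScaler().fit_transform(X)

# Two typologies here for illustration; the number of clusters would be
# chosen with stakeholders and diagnostics such as silhouette scores.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
print(labels)  # e.g., higher-capacity vs. emerging-capacity communities
```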
Real-time

ML classifiers and NLP analytics could be applied to readiness data, interviews, and implementation records to detect contextual misalignments (e.g., low local capacity, cultural misalignment). These tools could guide real-time adaptation by matching interventions to community strengths, flagging implementation risks, and dynamically tailoring TA and resource allocation.
Equity assessment and risk management

Retrospective

NLP tools could be used to surface patterns of underrepresentation, exclusionary language, and implicit bias in coalition discourse. Fairness-aware ML algorithms (e.g., reweighing, adversarial debiasing) could be used to detect disparities in resource distribution and access to EBPs across demographic groups. Findings could be triangulated with fidelity scores and intervention outcomes and used to validate equity concerns raised by the community coalitions involved in CER.
Real-time

Equity dashboards integrate MOU