Developing Conversational Data Analytics in Health and Care

The Conversational Data Analytics (CDA) project aims to develop a chatbot-based teaching platform that makes statistical techniques and coding skills more accessible to people who use data in their work but lack formal training in statistics. Ultimately, the platform could connect with digital NHS health records to help clinicians and clinically adjacent professionals such as GPs, social care staff, and occupational therapists gain statistically grounded insights into the populations they serve. 

The project is led by Dan Kumpik, a CHI-Zone postdoctoral researcher whose work focuses on the use of large language models (LLMs) for health and social care. Liverpool has a strong emphasis on collaboration between academics with different areas of expertise and with local industry partners. The breadth of activity aimed at supporting digital health and social care innovation in the Liverpool City Region makes it an exciting environment in which to develop projects like CDA. 

The first stage of the project has involved developing a prototype chatbot-based web application for teaching statistics. One challenge with LLM-based systems is that they can sometimes generate inaccurate responses, often referred to as “hallucinations.” To address this, the team uses a technique called retrieval-augmented generation (RAG), which constrains the chatbot’s responses using a trusted knowledge base. 
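To make the idea concrete, here is a minimal sketch of the retrieval step in RAG. It is illustrative only: the article does not describe how the CDA prototype retrieves passages, so the word-overlap scoring, the `KNOWLEDGE_BASE` contents, and the function names `retrieve` and `build_prompt` are all assumptions, and the call to an actual LLM is left out.

```python
import re
from collections import Counter

# Hypothetical stand-in for a trusted knowledge base. These passages are
# invented for illustration and are not taken from any real help file.
KNOWLEDGE_BASE = {
    "t_test": "An unpaired t test compares the means of two independent groups.",
    "chi_square": "A chi-square test assesses association between two categorical variables.",
    "correlation": "Pearson correlation measures linear association between two continuous variables.",
}


def tokenize(text):
    """Lower-case the text and count its alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, k=1):
    """Return the ids of up to k passages sharing the most words with the question."""
    query = tokenize(question)
    scored = []
    for doc_id, passage in KNOWLEDGE_BASE.items():
        # Counter intersection keeps the minimum count of each shared word.
        overlap = sum((query & tokenize(passage)).values())
        scored.append((overlap, doc_id))
    # Highest overlap first; ties broken alphabetically by id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for overlap, doc_id in scored[:k] if overlap > 0]


def build_prompt(question, k=1):
    """Assemble a prompt that asks the model to answer only from retrieved passages."""
    doc_ids = retrieve(question, k)
    context = "\n".join(f"[{d}] {KNOWLEDGE_BASE[d]}" for d in doc_ids)
    return (
        "Answer using only the passages below. "
        "If they do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How do I compare the means of two groups?"))
```

A production system would more likely use embedding-based similarity than raw word overlap, but the principle is the same: the model is handed vetted passages and asked to stay within them.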

The foundational knowledge base draws on help files from the statistical analysis software StatsDirect, developed by Professor Iain Buchan of the University’s Civic Health Innovation Labs (CHIL). The team is also building a complementary repository of validated statistical analysis templates written in R. These examples mirror the statistical procedures described in the StatsDirect documentation, meaning the chatbot can guide users not only on which statistical tests are appropriate but also how to implement them in code. 

Alongside this, the team has developed a custom safety module to detect harmful or inappropriate exchanges, together with an automated evaluation dataset and pipeline that allow the system's responses to be continuously tested and refined as development progresses. 
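As a rough illustration of what an automated evaluation pipeline can look like, the sketch below runs a chatbot function over a small set of test questions and reports the proportion of answers that mention the expected terms. Everything in it is hypothetical: the article does not describe the team's dataset or scoring criteria, so `EvalCase`, `EVAL_CASES`, `stub_chatbot`, and the keyword-matching check are assumptions chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    required_terms: tuple  # terms a good answer is expected to mention


# Hypothetical evaluation cases, invented for illustration.
EVAL_CASES = [
    EvalCase("Which test compares two independent group means?", ("t test",)),
    EvalCase("Which test suits two categorical variables?", ("chi-square",)),
]


def stub_chatbot(question):
    """Placeholder for the real system; returns canned answers."""
    if "means" in question:
        return "An unpaired t test is a common choice."
    return "I am not sure."


def run_evaluation(chatbot, cases):
    """Score each answer by keyword match and return (pass_rate, per-case results)."""
    results = []
    for case in cases:
        answer = chatbot(case.question).lower()
        passed = all(term.lower() in answer for term in case.required_terms)
        results.append((case.question, passed))
    pass_rate = sum(passed for _, passed in results) / len(results)
    return pass_rate, results


if __name__ == "__main__":
    rate, details = run_evaluation(stub_chatbot, EVAL_CASES)
    print(f"Pass rate: {rate:.0%}")
    for question, passed in details:
        print(("PASS" if passed else "FAIL"), question)
```

Re-running a harness like this after each change gives a quick signal of whether responses have improved or regressed, although keyword matching is only a crude proxy for answer quality.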

Another key priority has been ensuring the platform is designed around the needs of the people who will ultimately use it. Because LLM capabilities are advancing rapidly, the project has prioritised co-design to understand what users want from an AI-assisted statistics teaching platform now that these capabilities are available. 

The team has launched a co-creation process involving students and researchers who use statistics but are not statistical specialists. The first phase, currently ongoing, consists of semi-structured interviews exploring people’s experiences with statistics, data, and AI tools; the barriers they encounter when learning statistical methods; and their expectations for how an AI-based system might support their work.  

Once the interviews are complete, qualitative thematic analysis will be conducted to identify patterns across different user groups. These insights will inform the design of the next prototype, which will then be tested in a second user-experience phase before the team iterates further. 

Alongside the CDA project, Dan’s work within the CHI-Zone has also explored other applications of AI in health and care. This includes contributing to the research design of a study, within our Adult Social Care Testbed, that evaluates whether AI-enabled smart speakers can help people receiving domiciliary care. The study brings together a commercial smart speaker provider, local authorities in Merseyside, and care providers to trial the technology.

Over the next year, work will focus on refining the CDA prototype, continuing the co-design process with users, and moving closer to a platform that can support clinicians and other professionals in making better use of data in their work. 

Follow us on LinkedIn for the latest updates.
