In one of Matogen Applied Insights’ (MAI) earliest health-sector data science projects, a US-based medical company approached the team to build a tuberculosis monitoring dashboard from clinical trial data, together with an automated ETL (Extract, Transform, Load) and data wrangling process.
Monitoring tuberculosis patients
Keeping track of tuberculosis patients is critical because various exogenous and endogenous risk factors influence the likelihood of developing active disease following exposure to tuberculosis bacilli. In addition, traditional monitoring approaches, for example those that track HIV or TNF (Tumour Necrosis Factor), rely primarily on costly and time-intensive on-site visits and source data verification. To mitigate these difficulties, regulatory agencies encourage a risk-based monitoring (RBM) approach that identifies and tracks critical data and procedures according to their overall impact on trial integrity and subject safety.
Data wrangling and visualisation
As the first step in the ETL process, the MAI data scientist wrote a SAS script to extract the source information from the clinical database and transform the aggregate patient attributes into a usable format. The output data was then loaded into Power BI to create a dashboard that visualises critical data elements for all the participating sites. The dashboard uses the R-script functionality within Power BI to deliver detailed visualisations, including risk factor diagrams and individual site performance analyses with interactive plots and drill-down capabilities.
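The extract, transform and load steps described above can be illustrated with a minimal sketch. It is written in Python rather than the SAS used in the project, and it is not the team's actual script: the column names (`site_id`, `visits_expected`, `visits_completed`, `open_queries`), the sample rows and the two site-level indicators are all hypothetical, chosen only to show how patient-level records might be rolled up into a site-level table that a tool such as Power BI could import.

```python
import csv
import io
from collections import defaultdict

# Hypothetical patient-level extract standing in for the clinical database.
# All column names and values are illustrative assumptions.
RAW_EXTRACT = """site_id,patient_id,visits_expected,visits_completed,open_queries
S01,P001,4,4,0
S01,P002,4,3,2
S02,P003,4,2,5
S02,P004,4,4,1
"""


def extract(source_text):
    """Extract: read patient-level rows from a CSV export."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: aggregate patient-level rows into one record per site."""
    totals = defaultdict(
        lambda: {"patients": 0, "expected": 0, "completed": 0, "queries": 0}
    )
    for row in rows:
        site = totals[row["site_id"]]
        site["patients"] += 1
        site["expected"] += int(row["visits_expected"])
        site["completed"] += int(row["visits_completed"])
        site["queries"] += int(row["open_queries"])

    site_rows = []
    for site_id, t in sorted(totals.items()):
        # Guard against division by zero when no visits were scheduled.
        completion = t["completed"] / t["expected"] if t["expected"] else 0.0
        site_rows.append(
            {
                "site_id": site_id,
                "patients": t["patients"],
                "visit_completion_rate": round(completion, 3),
                "open_queries_per_patient": round(t["queries"] / t["patients"], 3),
            }
        )
    return site_rows


def load(site_rows):
    """Load: serialise the site-level table as CSV text for a BI tool to import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(site_rows[0].keys()))
    writer.writeheader()
    writer.writerows(site_rows)
    return buffer.getvalue()


site_table = transform(extract(RAW_EXTRACT))
output_csv = load(site_table)
print(output_csv)
```

Keeping the three stages as separate functions means the same transformation can be re-run unchanged whenever a new extract arrives, which is what makes an automated refresh of the dashboard data practical.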
This tuberculosis monitoring dashboard ensures that critical information reaches the right people in a timely manner, in a format that is easy to interpret even for end-users without a statistical background. By utilising Power BI’s “refresh” functionality, the automated process ensures that tuberculosis risk data is consistent, actionable and readily available to all clinical project management team members.