Authors: Xian Du, Meysam Safarzadeh, Maoqin Zhu, Shishir Prasad, Sudeshna Das, Joohyun Chung
Abstract
Pain monitoring and assessment traditionally rely on subjective methods such as self-reports and caregiver evaluations, which can be costly and are often inaccurate because of their inherent subjectivity and reliance on the individual’s communication skills. Many objective methods have been introduced to address these issues, primarily using single or multiple wearable sensor modalities. However, these approaches face challenges in home care settings, particularly regarding continuous wearability and discomfort, especially among elderly users. An alternative is to use patient monitoring tools, such as various imaging modalities, to detect pain-related facial expressions. In this paper, we developed a new transformer model to extract pain-related features from facial expressions captured through three imaging modalities (RGB, thermal, and depth) across sequential images. This method leverages the transformer's multi-attention mechanism to capture intricate semantic relationships within the visual signals for pain assessment. We demonstrated that this model achieves a classification accuracy of 37.56% on a pain intensity scale of 0 (no pain) to 4 (intense pain) using the publicly available MIntPAIN dataset. This accuracy is comparable to nurses' assessments (43% in four-class classification) and outperforms state-of-the-art machine learning models in this field. Using attention maps, we identified the five most critical facial action units for assessing pain, and these agree well with subjective pain assessment. The attention maps also visually reveal how much each modality contributes to the predicted pain level, offering insights that can help clinicians better understand the relationship between patients' facial expressions and their pain levels.
Access online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5201152
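The abstract describes a transformer that fuses per-frame features from RGB, thermal, and depth sequences and classifies pain intensity into five levels. The paper's exact architecture is not reproduced here; the following is a minimal PyTorch sketch under assumed names and settings (a small CNN frame encoder, feature dimension, head and layer counts, and token-level fusion are all illustrative) to show how multimodal sequential features can be fused with transformer attention for five-class pain classification.

```python
# Minimal sketch; the frame encoders, dimensions, and fusion scheme are
# illustrative assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn


class FrameEncoder(nn.Module):
    """Tiny CNN that maps one frame (C x H x W) to a d_model-dim feature vector."""
    def __init__(self, in_channels: int, d_model: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model),
        )

    def forward(self, x):  # x: (B, T, C, H, W)
        b, t, c, h, w = x.shape
        feats = self.net(x.reshape(b * t, c, h, w))   # (B*T, d_model)
        return feats.reshape(b, t, -1)                # (B, T, d_model)


class MultimodalPainTransformer(nn.Module):
    """Fuses RGB, thermal, and depth frame sequences with self-attention and
    classifies pain intensity into five levels (0 = no pain ... 4 = intense pain)."""
    def __init__(self, d_model: int = 128, n_heads: int = 4, n_layers: int = 2,
                 num_classes: int = 5, max_len: int = 64):
        super().__init__()
        self.encoders = nn.ModuleDict({
            "rgb": FrameEncoder(3, d_model),
            "thermal": FrameEncoder(1, d_model),
            "depth": FrameEncoder(1, d_model),
        })
        # Learned embeddings marking which modality / time step a token comes from.
        self.modality_emb = nn.ParameterDict(
            {m: nn.Parameter(torch.zeros(1, 1, d_model)) for m in self.encoders}
        )
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len, d_model))
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            batch_first=True,
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, rgb, thermal, depth):
        # Each input: (B, T, C, H, W); every frame becomes one token.
        tokens = []
        for name, x in (("rgb", rgb), ("thermal", thermal), ("depth", depth)):
            f = self.encoders[name](x)                         # (B, T, d_model)
            f = f + self.modality_emb[name] + self.pos_emb[:, : f.size(1)]
            tokens.append(f)
        seq = torch.cat(tokens, dim=1)                         # (B, 3T, d_model)
        cls = self.cls_token.expand(seq.size(0), -1, -1)
        seq = torch.cat([cls, seq], dim=1)
        out = self.transformer(seq)        # attention across modalities and time
        return self.head(out[:, 0])        # (B, 5) class logits from the CLS token


if __name__ == "__main__":
    model = MultimodalPainTransformer()
    rgb = torch.randn(2, 8, 3, 64, 64)       # batch of 2 clips, 8 frames each
    thermal = torch.randn(2, 8, 1, 64, 64)
    depth = torch.randn(2, 8, 1, 64, 64)
    logits = model(rgb, thermal, depth)
    print(logits.shape)                        # torch.Size([2, 5])
```

In this sketch, every frame of every modality becomes one token, so the self-attention weights over tokens can be read per modality and per frame, which is one simple way to obtain the kind of modality-contribution attention maps the abstract mentions.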
