Accurate mortality prediction in intensive care units (ICUs) is essential for
timely clinical care and efficient resource allocation. Traditional scoring
systems such as SOFA and APACHE II are static and cannot capture the dynamic,
multimodal nature of patient information. In this work, we hypothesize that a multimodal deep learning-based ICU mortality prediction
model can be developed using four heterogeneous data streams, namely structured
clinical parameters, temporal vital signs sequences, unstructured clinical text, and
chest X-ray images. The framework uses the multilayer perceptrons that encode
structured data, bidirectional long short-term memory networks that encode temporal signals, transformer-based language models that use Low-Rank Adaptation
(LoRA) to encode clinical narratives, and DenseNet-121 to encode radiographic
images. The merged representations are learned using an early fusion approach
and learned through a deep neural architecture to create mortality risk predictions. The model is tested on MIMIC-IV with better performance over traditional
scoring systems and unimodal baselines with a maximum of 0.95 on the AUROC.
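The architecture described above can be sketched as follows. This is a minimal, illustrative PyTorch implementation, not the authors' code: all dimensions (tabular features, vital-sign channels, embedding sizes) are assumed, the text encoder is a single LoRA-adapted projection standing in for a full transformer language model, and a tiny CNN stands in for DenseNet-121. The key ideas it demonstrates are the four modality-specific encoders, a low-rank (LoRA) update on a frozen layer, and early fusion by concatenating embeddings before a shared prediction head.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (LoRA)."""
    def __init__(self, dim_in, dim_out, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(dim_in, dim_out)
        self.base.weight.requires_grad_(False)  # freeze "pretrained" weights
        self.A = nn.Parameter(torch.randn(rank, dim_in) * 0.01)  # low-rank down-projection
        self.B = nn.Parameter(torch.zeros(dim_out, rank))        # low-rank up-projection
        self.scale = alpha / rank

    def forward(self, x):
        # Base output plus scaled low-rank correction B @ A @ x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

class EarlyFusionMortalityModel(nn.Module):
    """Hypothetical early-fusion model over four ICU data streams."""
    def __init__(self, n_tab=32, vitals_dim=7, emb=64):
        super().__init__()
        # Structured clinical parameters -> MLP encoder
        self.tab_enc = nn.Sequential(
            nn.Linear(n_tab, 128), nn.ReLU(), nn.Linear(128, emb))
        # Temporal vital-sign sequences -> bidirectional LSTM encoder
        self.lstm = nn.LSTM(vitals_dim, emb // 2,
                            batch_first=True, bidirectional=True)
        # Clinical text embeddings -> LoRA-adapted projection
        # (stand-in for a transformer language model with LoRA adapters)
        self.text_enc = LoRALinear(768, emb)
        # Chest X-ray -> small CNN (stand-in for DenseNet-121)
        self.img_enc = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, emb))
        # Early fusion: concatenate modality embeddings, then a deep head
        self.head = nn.Sequential(
            nn.Linear(4 * emb, 128), nn.ReLU(),
            nn.Dropout(0.3), nn.Linear(128, 1))

    def forward(self, tab, vitals, text_emb, image):
        h_tab = self.tab_enc(tab)
        _, (h_n, _) = self.lstm(vitals)                 # h_n: (2, B, emb//2)
        h_seq = torch.cat([h_n[0], h_n[1]], dim=-1)     # forward + backward states
        h_txt = self.text_enc(text_emb)
        h_img = self.img_enc(image)
        fused = torch.cat([h_tab, h_seq, h_txt, h_img], dim=-1)
        return torch.sigmoid(self.head(fused)).squeeze(-1)  # risk in (0, 1)

model = EarlyFusionMortalityModel()
risk = model(torch.randn(4, 32),          # structured features
             torch.randn(4, 48, 7),       # 48 time steps of 7 vitals
             torch.randn(4, 768),         # precomputed text embeddings
             torch.randn(4, 1, 64, 64))   # chest X-ray
```

Freezing the base weight in `LoRALinear` while training only the rank-`r` factors is what keeps LoRA fine-tuning parameter-efficient; the early-fusion head then learns cross-modal interactions jointly over all four embeddings.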
Moreover, SHAP-based interpretability offers clinically relevant insight,
revealing how physiological and textual features contribute to mortality risk.
These findings highlight the value of combining multimodal data to improve both
predictive accuracy and transparency, pointing toward a practical, clinically
useful decision-support system for critical care.