I found this paper, which seems to be what I was proposing! It labels clinical notes with ICD codes (multi-label text classification): https://arxiv.org/abs/2102.09136
Another article used the same data set, MIMIC-III, to evaluate RNNs and CNNs on ICD-9 code assignment. https://github.com/lsy3/clinical-notes-diagnosis-dl-nlp
The GitHub repo seems to provide code and cleaned data sets.
Paper: https://arxiv.org/pdf/1802.02311v2.pdf
This one used a different dataset to assign ICD-10 codes with BERT: http://ceur-ws.org/Vol-2380/paper_67.pdf
Use the resources from the GitHub project to assign ICD-9 codes with different multi-label text classification models.
Is it possible to optimize their models?
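To tell whether a change actually optimizes a model, we need one score to compare on. Below is a minimal sketch of micro-averaged F1 over the predicted code sets, a metric commonly used for multi-label code assignment. The function name `micro_f1`, the toy notes, and the ICD-9 codes are illustrative assumptions, not taken from the GitHub project.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 for multi-label predictions.

    gold, pred: lists of sets of ICD-9 codes, one set per clinical note.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have one entry per note")

    # Pool true positives, false positives and false negatives over all notes.
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): gold codes and two models' predictions for 3 notes.
gold = [{"401.9", "428.0"}, {"250.00"}, {"428.0", "584.9"}]
baseline = [{"401.9"}, {"250.00", "401.9"}, {"428.0"}]
tuned = [{"401.9", "428.0"}, {"250.00"}, {"428.0"}]

print("baseline micro-F1:", round(micro_f1(gold, baseline), 3))
print("tuned micro-F1:", round(micro_f1(gold, tuned), 3))
```

Micro-averaging weights every assigned code equally, so frequent codes dominate the score. A macro-averaged variant would be needed if rare codes matter just as much.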
Build prediction models using the ICD-9 codes plus covariates (insurance type, gender*, ethnicity, marital status, admission type) to see which ICD codes are most strongly associated with prolonged length of stay. https://towardsdatascience.com/predicting-hospital-length-of-stay-at-time-of-admission-55dfdfe69598
Compare the prediction models across the different multi-label text classification models, and see whether the results agree.
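One possible way to check agreement, assuming it is measured on the codes each classifier assigns to the same notes, is the mean Jaccard overlap for every pair of models. This is only a sketch: the model names, the toy predictions, and the helpers `jaccard` and `pairwise_agreement` are made up for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two code sets; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_agreement(predictions):
    """Mean per-note Jaccard overlap for every pair of models.

    predictions: dict mapping model name -> list of sets of codes,
    with the notes in the same order (and at least one note) for every model.
    """
    scores = {}
    for (name_a, preds_a), (name_b, preds_b) in combinations(predictions.items(), 2):
        if len(preds_a) != len(preds_b):
            raise ValueError("all models must predict on the same notes")
        per_note = [jaccard(a, b) for a, b in zip(preds_a, preds_b)]
        scores[(name_a, name_b)] = sum(per_note) / len(per_note)
    return scores


# Toy predictions (hypothetical) from three models on two notes.
predictions = {
    "cnn": [{"401.9", "428.0"}, {"250.00"}],
    "rnn": [{"401.9"}, {"250.00"}],
    "bert": [{"428.0"}, {"250.00", "401.9"}],
}

agreement = pairwise_agreement(predictions)
for pair, score in agreement.items():
    print(pair, score)
```

A low overlap between two models would suggest that downstream results, such as which codes look associated with prolonged length of stay, may depend on which classifier produced the codes.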