# COVID-19_RISK_PREDICTOR

***!!! For research purposes only !!!***

- [COVID-19_RISK_PREDICTOR](#covid-19_risk_predictor)
    - [Data availability](#data-availability)
    - [Usage](#usage)
        - [Installation](#installation)
        - [Requirements](#requirements)
        - [Activate conda environment](#activate-conda-environment)
        - [Run parser](#run-parser)
        - [Run model training](#run-model-training)
        - [Build Streamlit app](#build-streamlit-app)
        - [Unit Testing](#unit-testing)
    - [Contact information](#contact-information)

**Aim:** To develop a model that takes in demographics, lifestyle, and symptoms/conditions to predict the risk of
COVID-19 infection for patients.

## Data availability

Data were made available through the UAB Biomedical Research Information Technology Enhancement (U-BRITE) framework.
Access to the level-2 i2b2 data was granted through self-service, pursuant to an IRB exemption
([UAB CCTS i2b2 information](https://www.uab.edu/ccts/research-commons/berd/55-research-commons/informatics/325-i2b2)).

### Directory structure used to parse data from positive and negative cohorts

The dataset was transformed to adhere to the [OMOP Common Data Model Version 5.3.1](https://ohdsi.github.io/CommonDataModel/cdm531.html)
to enable systematic analyses of EHR data from disparate sources.

```directory
Cohorts/
├── positive               <--- positive cohort directory
│   ├── measurement.csv - tests and results
│   ├── condition_occurance.csv - conditions of patients
│   ├── observation.csv - observations such as smoking history
│   └── person.csv - demographic information
├── negative               <--- negative cohort directory
│   ├── measurement.csv - tests and results
│   ├── condition_occurance.csv - conditions of patients
│   ├── observation.csv - observations such as smoking history
│   └── person.csv - demographic information
└── README.md
```
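
Before running the parser, it can help to confirm that both cohort directories contain the four expected tables. The following is a minimal sketch and is not part of the repository: the helper name `missing_cohort_files` is hypothetical, and the file names are copied verbatim from the tree above.

```python
from pathlib import Path

# File names copied from the directory tree above (spelling preserved).
EXPECTED_FILES = (
    "measurement.csv",
    "condition_occurance.csv",
    "observation.csv",
    "person.csv",
)


def missing_cohort_files(cohorts_root, cohorts=("positive", "negative")):
    """Return {cohort: [missing file names]} for a Cohorts/ directory.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    root = Path(cohorts_root)
    missing = {}
    for cohort in cohorts:
        # Collect every expected file that is absent from this cohort.
        absent = [name for name in EXPECTED_FILES
                  if not (root / cohort / name).is_file()]
        if absent:
            missing[cohort] = absent
    return missing


if __name__ == "__main__":
    problems = missing_cohort_files("Cohorts")
    if problems:
        for cohort, names in problems.items():
            print(f"{cohort}: missing {', '.join(names)}")
    else:
        print("All expected cohort files are present.")
```

An empty dictionary means the layout matches the tree; otherwise each key names a cohort and lists the files that still need to be exported.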

## Usage

### Installation

Installation only requires fetching the source code. The following tool is required:

- Git

To fetch the source code, change into a directory of your choice and run:

```sh
git clone -b master \
    git@gitlab.rc.uab.edu:center-for-computational-genomics-and-data-science/public/covid-19_risk_predictor.git
```

### Requirements

*OS:*

Currently works only on Linux. Docker images may be explored later to make it usable on macOS (and
potentially Windows).

*Tools:*

- Anaconda3
    - Tested with version: 2020.02

### Activate conda environment

Change into the repository root directory and run the commands below:

```sh
# create conda environment. Needed only the first time.
conda env create --file configs/environment.yaml

# if you need to update existing environment
conda env update --file configs/environment.yaml

# activate conda environment
conda activate rico
```

### Run parser

```sh
python src/filter_dataset.py --pos Cohorts/positive/ --neg Cohorts/negative/
```

For help, use the `-h` argument:

```sh
python src/filter_dataset.py -h
```

Parsed files are saved in the `./results` directory.
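
The parser is driven by two directory arguments, `--pos` and `--neg`. As a rough sketch of how such an interface is commonly declared with `argparse` (an assumption for illustration only: the actual argument handling in `src/filter_dataset.py` may differ, and `build_parser` is a hypothetical name):

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a CLI that mirrors the --pos/--neg flags shown above.

    Hypothetical sketch; this is not the code in src/filter_dataset.py.
    """
    parser = argparse.ArgumentParser(
        description="Parse positive and negative cohort directories."
    )
    parser.add_argument("--pos", type=Path, required=True,
                        help="directory of the positive cohort")
    parser.add_argument("--neg", type=Path, required=True,
                        help="directory of the negative cohort")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["--pos", "Cohorts/positive/", "--neg", "Cohorts/negative/"]
)
print(f"positive cohort: {args.pos}")
print(f"negative cohort: {args.neg}")
```

`argparse` adds the `-h`/`--help` option automatically, which is why a script built this way answers `-h` without any extra code.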

### Run model training

```sh
python src/Model.py --input results/encoded-100-week-filter.csv
```

Output files are saved in the `./results` directory.

### Build Streamlit app

To demonstrate the application of these models, one of the four was chosen and a sample Streamlit app was created and included in the project. Please refer to `src/streamlit/RICO.py`.

**Note** - This Streamlit app demonstrates one of the models. It is not required by the pipeline and serves only to display the calculation and its interpretation; the questionnaire derived from the models can be used manually without it. The Streamlit app is therefore untested and should be used at your own risk, for demo purposes or as a guide for building on this work.

### Unit Testing

To test the functions in `filter_dataset.py`, run the following command:

```sh
python -m unittest -v testing/unit_test.py
```

To measure test coverage, run the following commands:

```sh
# run the unit tests under coverage
coverage run -m unittest -v testing/unit_test.py

# To get a coverage report
coverage report

# To get annotated HTML listings
coverage html
```

**Note** - Functions in `Model.py` are adapted from [this GitHub repo](https://github.com/yandexdataschool/roc_comparison),
which already includes unit tests.

## Contact information

For issues, please send an email with a clear description to:

Tarun Mamidi    -   tmamidi@uab.edu

Ryan Melvin     -   rmelvin@uabmc.edu