plt.title('User Requested RAM per CPU and per Node together for all Jobs')
plt.xlabel('Requested Gigs of RAM')
plt.ylabel('Number of Users Requesting')
```
%% Cell type:code id: tags:
```
# Shows requested CPU memory for array jobs alongside requested CPU memory for non-array jobs for easy comparison.
CPU_arraytask_fig = sns.distplot(CPU_arraytask['ReqMemCPU'], kde=False, label='CPU Array Task', color="green")
CPU_arraytask_fig.set_yscale('log')
CPU_nonarraytask_fig = sns.distplot(CPU_nonarraytask['ReqMemCPU'], kde=False, label='CPU Non Array Task')
CPU_nonarraytask_fig.set_yscale('log')
plt.legend(prop={'size': 12})
plt.title('User Requested RAM per CPU for Array Jobs vs Non Array Jobs')
plt.xlabel('Requested Gigs of RAM')
plt.ylabel('Number of Users Requesting')
```
%% Cell type:code id: tags:
```
# Shows requested node memory for array jobs alongside requested node memory for non-array jobs for easy comparison.
# Both series plot ReqMemNode so the two histograms are comparable.
Node_arraytask_fig = sns.distplot(Node_arraytask['ReqMemNode'], kde=False, label='Node Array Task', color="green")
Node_arraytask_fig.set_yscale('log')
Node_nonarraytask_fig = sns.distplot(Node_nonarraytask['ReqMemNode'], kde=False, label='Node Non Array Task')
Node_nonarraytask_fig.set_yscale('log')
plt.legend(prop={'size': 12})
plt.title('User Requested RAM per Node for Array Jobs vs Non Array Jobs')
plt.xlabel('Requested Gigs of RAM')
plt.ylabel('Number of Users Requesting')
```
%% Cell type:markdown id: tags:
# These are Plotly Express versions of some of the Seaborn graphs above. Run them only if you need more detail about the data in a graph. They will make your notebook run slower.
%% Cell type:markdown id: tags:
Graphs: <br>
User Requested RAM per CPU for all Jobs
<br>
User Requested RAM per CPU for Non Array Jobs
<br>
User Requested RAM per CPU for Array Jobs
<br>
User Requested RAM per Node for all Jobs
<br>
User Requested RAM per Node for Non Array Jobs
<br>
User Requested RAM per Node for Array Jobs
<br>
These graphs are histograms of the data for the month of March 2020.
The x axis shows the amount of requested RAM in gigs per CPU or per Node, from 0 to the maximum declared in the upperRAMlimit variable above, which is 5 gigs.
The y axis shows how many users requested that amount of RAM per CPU or Node.
Each graph can also show a box or violin plot above the histogram to mark where the min, max, median, and 3rd quartile are.
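
As a rough illustration of what the two axes mean, here is a minimal pure-Python sketch of the cutoff and the per-bin counting. The request values are made up, and `upper_ram_limit` and `nbins` are hypothetical stand-ins chosen for this example; whether the notebook's own cutoff includes the limit value itself is an assumption here.

```python
from collections import Counter

# Hypothetical per-job RAM requests in gigs (made-up values, not the March 2020 data).
requested_gigs = [0.5, 1.0, 1.0, 2.5, 4.0, 4.9, 5.0, 8.0, 16.0]

upper_ram_limit = 5  # assumed to play the role of the notebook's upperRAMlimit
nbins = 5            # assumed bin count, chosen only to keep the sketch small

# Keep only requests inside the plotted range [0, upper_ram_limit] (inclusive is an assumption).
in_range = [g for g in requested_gigs if 0 <= g <= upper_ram_limit]

bin_width = upper_ram_limit / nbins
# Assign each request to a bin; a request exactly at the limit goes into the last bin.
counts = Counter(min(int(g // bin_width), nbins - 1) for g in in_range)
bin_counts = [counts.get(i, 0) for i in range(nbins)]

for i, n in enumerate(bin_counts):
    print(f"{i * bin_width:.0f}-{(i + 1) * bin_width:.0f} gigs: {n}")
```

Requests above the limit (8 and 16 gigs in this made-up list) are dropped before binning, which is why the x axis stops at the limit. Each bar's height is the number of requests that fall in that bin.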
%% Cell type:code id: tags:
```
CPU_fig = px.histogram(CPU_cutoff, x="ReqMemCPU",
                       title='User Requested RAM per CPU for all Jobs',
                       labels={'ReqMemCPU':'ReqMemCPU'}, # can specify one label per df column
                       opacity=0.8,
                       log_y=True, # represent bars with log scale
                       marginal="box", # can be `box`, `violin`
                       hover_data=CPU_cutoff.columns,
                       nbins=30,
                       color_discrete_sequence=['goldenrod'] # color of histogram bars