RC Data Science
createAndParseSACCT
Commit 1a2043a5
Authored Apr 20, 2020 by Ryan Randles Jones
updated variable names and added doc strings
Parent: 114ca4aa
slurm-2sql.ipynb
%% Cell type:code id: tags:
```
import sqlite3
import slurm2sql
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from scipy.stats import skew
import plotly.express as px
```
%% Cell type:code id: tags:
```
# connects to the sqlite3 database of Slurm job info since March 2020
db = sqlite3.connect('/data/rc/rc-team/slurm-since-March.sqlite3')
```
%% Cell type:code id: tags:
```
#slurm2sql.slurm2sql(db, ['-S', '2020-04-01', '-a'])
```
%% Cell type:code id: tags:
```
# For example, you can then convert to a dataframe:
df1 = pd.read_sql('SELECT * FROM slurm', db)
# df_1 is the starting dataframe, holding every column of the slurm table
df_1 = pd.read_sql('SELECT * FROM slurm', db)
```
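The read above pulls every column of the `slurm` table into memory. As a hedged sketch, the same selection could instead be pushed into SQL so that only batch jobs and the memory columns are loaded. The example runs against an in-memory database with made-up rows; the toy values and the `fetch_batch_memory` helper are illustrative assumptions, and only the table and column names (`slurm`, `JobName`, `ReqMemCPU`, `ReqMemNode`) come from the notebook.

```python
import sqlite3


def fetch_batch_memory(conn, max_bytes):
    """Return (JobName, ReqMemCPU, ReqMemNode) rows for batch jobs
    whose ReqMemCPU is at or below max_bytes (hypothetical helper)."""
    query = (
        "SELECT JobName, ReqMemCPU, ReqMemNode FROM slurm "
        "WHERE JobName LIKE '%batch%' AND ReqMemCPU <= ? "
        "ORDER BY ReqMemCPU"
    )
    return conn.execute(query, (max_bytes,)).fetchall()


# Toy stand-in for the real database; the rows are invented for illustration.
demo_db = sqlite3.connect(":memory:")
demo_db.execute(
    "CREATE TABLE slurm (JobName TEXT, ReqMemCPU REAL, ReqMemNode REAL)"
)
demo_db.executemany(
    "INSERT INTO slurm VALUES (?, ?, ?)",
    [
        ("batch", 4e9, None),
        ("batch", 2e10, None),       # above the cutoff, filtered out
        ("interactive", 1e9, None),  # not a batch job, filtered out
        ("batch", 1e9, 8e9),
    ],
)

rows = fetch_batch_memory(demo_db, 1e10)
print(rows)
```

Note that SQLite's `LIKE` is case-insensitive for ASCII characters, whereas `str.contains('batch')` in pandas is case-sensitive, so the two filters are not exactly equivalent.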
%% Cell type:code id: tags:
```
# for displaying all available column options
pd.set_option('display.max_columns', None)
df1.head(5)
df_1.head(5)
```
%% Cell type:code id: tags:
```
df2 = df1.loc[:,['ReqMemCPU', 'ReqMemNode']]
#df2.head(5)
# df_2 is a dataframe with only the ReqMemCPU and ReqMemNode columns
df_2 = df_1.loc[:,['ReqMemCPU', 'ReqMemNode']]
#df_2.head(5)
```
%% Cell type:code id: tags:
```
# df_batch is a boolean mask that selects jobs whose JobName contains 'batch'
df_batch = df1.JobName.str.contains('batch')
#df2[df_batch]
#df_2[df_batch]
```
%% Cell type:code id: tags:
```
cutoff = df2[df_batch][(df2[df_batch].ReqMemCPU <= 1e+10)]
cutoff
# creates a dataframe of batch jobs whose ReqMemCPU is <= a given cutoff
CPU_cutoff = df_2[df_batch][(df_2[df_batch].ReqMemCPU <= 1e+10)] # 1e+10 bytes is 10 GB, assuming ReqMemCPU is stored in bytes
CPU_cutoff
```
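The cutoff `1e+10` is easier to interpret once converted to units. The sketch below assumes the memory columns hold plain byte counts, which is how slurm2sql documents them but is not confirmed by the notebook itself; the two helper functions are hypothetical names introduced only for illustration.

```python
def bytes_to_gb(n_bytes):
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return n_bytes / 10**9


def bytes_to_gib(n_bytes):
    """Convert bytes to binary gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


cutoff_bytes = 1e+10  # the threshold used in the notebook
print(f"{bytes_to_gb(cutoff_bytes):.2f} GB")    # prints "10.00 GB"
print(f"{bytes_to_gib(cutoff_bytes):.2f} GiB")  # prints "9.31 GiB"
```

Under that assumption, a cutoff of one gigabyte would be written as `1e+9` rather than `1e+10`.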
%% Cell type:code id: tags:
```
cutoff.describe(include=None, exclude=None)
# gives mean, min, max, std, and 3 percentiles for cutoff data
# can change what to include or exclude
CPU_cutoff.describe(include=None, exclude=None)
```
%% Cell type:code id: tags:
```
fig = px.histogram(cutoff, x="ReqMemCPU",
# creates histogram of ReqMemCPU for the month of March 2020
# uses the CPU memory cutoff applied in CPU_cutoff (1e+10 bytes)
# marginal can also show a box or violin plot above the histogram, marking the min, max, median, and quartiles
# the mean is at just under half a gig of requested memory per CPU
CPU_fig = px.histogram(CPU_cutoff, x="ReqMemCPU",
title='Histogram of ReqMemCPU',
labels={'ReqMemCPU':'ReqMemCPU'}, # can specify one label per df column
opacity=0.8,
log_y=True, # represent bars with log scale
marginal="box", # can be `box`, `violin`
hover_data=cutoff.columns,
hover_data=CPU_cutoff.columns,
nbins=30,
color_discrete_sequence=['indianred'] # color of histogram bars
color_discrete_sequence=['goldenrod'] # color of histogram bars
)
fig.show()
CPU_fig.show()
```
%% Cell type:code id: tags:
```
cutoff[['ReqMemCPU']].plot(kind='hist',bins=50,rwidth=1, logy=True)
plt.show()
# creates a dataframe of batch jobs whose ReqMemNode is <= a given cutoff
Node_cutoff = df_2[df_batch][(df_2[df_batch].ReqMemNode <= 1e+10)] # 1e+10 bytes is 10 GB, assuming ReqMemNode is stored in bytes
```
%% Cell type:code id: tags:
```
x = cutoff[['ReqMemCPU']]
print (skew(x))
```
# creates histogram of ReqMemNode for the month of March 2020
# uses the node memory cutoff applied in Node_cutoff (1e+10 bytes)
# marginal can also show a box or violin plot above the histogram, marking the min, max, median, and quartiles
# the mean is at just under half a gig of requested memory per node
%% Cell type:code id: tags:
```
cutoff = df2[df_batch][(df2[df_batch].ReqMemNode <= 1e+10)]
```
%% Cell type:code id: tags:
```
fig = px.histogram(cutoff, x="ReqMemNode",
Node_fig = px.histogram(Node_cutoff, x="ReqMemNode",
title='Histogram of ReqMemNode',
labels={'ReqMemNode':'ReqMemNode'}, # can specify one label per df column
opacity=0.8,
log_y=True, # represent bars with log scale
marginal="box", # can be `box`, `violin`
hover_data=cutoff.columns,
hover_data=Node_cutoff.columns,
nbins=30,
color_discrete_sequence=['indianred'] # color of histogram bars
color_discrete_sequence=['darkblue'] # color of histogram bars
)
fig.show()
```
%% Cell type:code id: tags:
```
cutoff[['ReqMemNode']].plot(kind='hist',bins=50,rwidth=1, logy=True)
plt.show()
```
%% Cell type:code id: tags:
```
x = cutoff[['ReqMemNode']]
print (skew(x))
Node_fig.show()
```
%% Cell type:code id: tags:
```
```