singularity_container.ipynb 8.5 KB
Newer Older
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# SIngularity Containers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is a container?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. A container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.\n",
    "\n",
    "Containers use the host system's kernel, and can access its hardware more directly. When you run a process in a container it runs on the host system, directly visible as a process on that system. Unlike a Virtual Machine, container is a virtualization at the software level, whereas VMs are virtualization at hardware level. If you are interested in finding out more differences between VM and a container, go to this [link](https://www.electronicdesign.com/dev-tools/what-s-difference-between-containers-and-virtual-machines)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Why use a container?\n",
    "\n",
    "Containers package together a program and all of its dependencies, so that if you have the container you can use it on any Linux system with the container system software installed. It doesn't matter whether the system runs Ubuntu, RedHat or CentOS Linux - if the container system is available then the program runs identically on each, inside its container. This is great for distributing complex software with a lot of dependencies, and ensuring you can reproduce experiments exactly. If you still have the container you know you can reproduce your work. Also since the container runs as a process on the host machine, it can be run very easily in a [SLURM job](https://docs.uabgrid.uab.edu/wiki/Slurm)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Docker vs Singularity?\n",
    "\n",
    "[Docker](https://www.docker.com/) is the most popular and widely used container system in the industry. But [Singularity](https://www.sylabs.io/singularity/) was built keeping HPC in mind, i.e a shared environment. Singularity is designed so that you can use it within SLURM jobs and it does not violate security constraints on the cluster. Though, since Docker is very popular and a lot of people were already using the Docker for their softwares, Singularity maintained a compatibility for Docker images. We'll be seeing this compatibility later in the notebook.\n",
    "\n",
    "Singularity is already available on Cheaha. To check the available modules for Cheaha, run the cell below:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
    "!module avail Singularity"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you might have already noticed that we already loaded a Singularity module while starting up this notebook. You can check the version of the Singularity loaded below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity --version"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic singularity command line functions:\n",
    "\n",
    "To check the basic functions or command line options provided run help on the singularity "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To check more information about a particular parameter, use help in conjunction with that parameter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
    "!singularity pull help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example 1:\n",
    "\n",
    "In this example we are going to be pulling images from [Singularity-Hub](https://singularity-hub.org/). This singularity image contains the tool [neurodebian](http://neuro.debian.net/).\n",
    "\n",
    "NeuroDebian provides a large collection of popular neuroscience research software for the Debian operating system as well as Ubuntu and other derivatives."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's begin by pulling the [Neurodebian Singularity image](https://singularity-hub.org/collections/209) from Singularity Hub"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity pull shub://neurodebian/neurodebian"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have pulled the image you should be able to see that image in your directory by simply running a 'ls' command"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!ls"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we'll try to execute a command from within the container. Remember that exec parameter allows you to achieve this functionality. Let's list the content of you /data/user/$USER directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity exec neurodebian-neurodebian-master-latest.simg ls /tmp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Hmmm, an error. Remember your singularity container image doesn't know about the directories on your host machine. It by default (in most containers) binds your HOME and tmp directory. So try to run the above command again but change the list path to your HOME directory.\n",
    "\n",
    "Now, all our raw data is generally in our /data/user/$USER locations, so we really need to access that location if our container has to be useful. SIngularity provides you with a parameter (-B) to bind path from your host machine to the container. Try the same command again, but with the bind parameter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity exec -B /data neurodebian-neurodebian-master-latest.simg ls /data/user/wsmonroe"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now rather then using ls command from within the container we are going to see one of the softwares within the container: [dcm2nii](https://www.nitrc.org/projects/mricron)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity exec -B /data neurodebian-neurodebian-master-latest.simg dcm2nii"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example 2:\n",
    "\n",
    "In this example we are going to be pulling a singularity image from [dockerhub](https://hub.docker.com/). This singularity image contains [google-cloud-sdk tools](https://cloud.google.com/sdk/).\n",
    "\n",
    "The Cloud SDK is a set of tools for Cloud Platform. It contains gcloud, gsutil, and bq command-line tools, which you can use to access Google Compute Engine, Google Cloud Storage, Google BigQuery, and other products and services from the command-line. You can run these tools interactively or in your automated scripts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity pull docker://jess/gcloud"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!singularity exec -B /data gcloud.simg gsutil"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!rm *.simg"
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}