rc issues
https://gitlab.rc.uab.edu/groups/rc/-/issues
2023-05-10T15:42:21-05:00

https://gitlab.rc.uab.edu/rc/devops/-/issues/394
Need for historical user account state history
2023-05-10T15:42:21-05:00
William E Warriner
### User story
As a Research Computing facilitator and data scientist, I need the most complete account state history we can manage to discover. A complete account state history will enable more accurate reporting of information such as grant portfolios and publications.
### Possible Solution
I have a (nearly) complete list of account creation dates for every user of Cheaha. In principle these are accurate, in the worst case, to about a week. We can update the `user_state` and `users` tables of the user registration sqlite database to reflect this information.
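A minimal sketch of how such a backfill might look. The real schema is not shown here, so the `user_state(username, state, date)` layout, the `created` state name, and the function name are all assumptions for illustration; the `users` table would need a matching update.

```python
import sqlite3


def backfill_creation_dates(conn, creation_dates, initial_state="created"):
    """Add a backdated state row for each user whose known creation date
    precedes the earliest row already stored in user_state."""
    inserted = 0
    for username, created in creation_dates.items():
        earliest = conn.execute(
            "SELECT MIN(date) FROM user_state WHERE username = ?", (username,)
        ).fetchone()[0]
        # ISO 8601 date strings sort chronologically, so string comparison works.
        if earliest is None or created < earliest:
            conn.execute(
                "INSERT INTO user_state (username, state, date) VALUES (?, ?, ?)",
                (username, initial_state, created),
            )
            inserted += 1
    conn.commit()
    return inserted


# Tiny in-memory stand-in for the registration database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_state (username TEXT, state TEXT, date TEXT)")
conn.execute("INSERT INTO user_state VALUES ('alice', 'ok', '2021-06-01')")
dates = {"alice": "2019-03-14", "bob": "2020-01-02"}
added = backfill_creation_dates(conn, dates)
print(added)
```

Running the function a second time with the same dates inserts nothing, so the backfill can be repeated safely as the list of creation dates improves.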
There is one thing I learned recently (about `user_state`) that I will incorporate, then this will be fully ready.

https://gitlab.rc.uab.edu/rc/devops/-/issues/393
Need for accurate, up-to-date user account state history
2023-05-30T09:50:19-05:00
William E Warriner
### User story
As a Research Computing facilitator and data scientist, I need UAB user account state history so that I can accurately report information about our platform, especially grants. For example, I want to be sure we aren't counting grants for users who are no longer affiliated with Cheaha, or who haven't been using Cheaha recently.
### Possible solution
Regularly poll UAB IDM database(s) for information to determine if RC account state needs to be updated. Compute new state from polled IDM information. If state has changed, add an entry to the `user_state` table of the user registration sqlite database reflecting that change.
At a recent Zoom meeting we discussed the potential need to add another state to the list of existing states. I propose the state "unaffiliated" for researchers whose IDM information reflects they are not affiliated with UAB. We also discussed an "inactive" state for researchers who are affiliated with UAB but have been in the certification state for some sufficient amount of time.
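The poll, compute, and record steps could be sketched roughly as follows. This is an illustration under assumptions: the IDM query is reduced to a single `affiliated` flag, the `user_state(username, state, date)` layout is hypothetical, and the one-year inactivity threshold is a placeholder for whatever "sufficient amount of time" turns out to mean.

```python
import sqlite3


def compute_state(affiliated, current_state, days_in_state, inactive_after_days=365):
    """Derive the RC account state from polled IDM information."""
    if not affiliated:
        return "unaffiliated"
    if current_state == "certification" and days_in_state >= inactive_after_days:
        return "inactive"
    return current_state


def record_if_changed(conn, username, new_state, when):
    """Append a user_state row only when the state actually changed."""
    row = conn.execute(
        "SELECT state FROM user_state WHERE username = ? "
        "ORDER BY date DESC, rowid DESC LIMIT 1",
        (username,),
    ).fetchone()
    if row is not None and row[0] == new_state:
        return False
    conn.execute(
        "INSERT INTO user_state (username, state, date) VALUES (?, ?, ?)",
        (username, new_state, when),
    )
    conn.commit()
    return True


# In-memory stand-in for the registration database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_state (username TEXT, state TEXT, date TEXT)")
conn.execute("INSERT INTO user_state VALUES ('alice', 'certification', '2022-01-01')")
new_state = compute_state(affiliated=True, current_state="certification", days_in_state=400)
changed = record_if_changed(conn, "alice", new_state, "2023-02-05")
```

Because rows are only appended, the table keeps the full history of transitions rather than just the latest state.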
### Notes
As states are added, it may be helpful to create a directed graph of state transitions as part of documentation, to parallel the concept of a Finite State Machine.
Incorporating new states for user accounts on cheaha

https://gitlab.rc.uab.edu/rc/rc-mail-template/-/issues/2
Updates to mail text
2023-04-18T17:30:10-05:00
William E Warriner
- All instances of `uabrc.github.io` should be replaced with `docs.rc.uab.edu`.
- "our Wiki pages" should be "our Docs pages".
- "Cheaha Quick Start" should be "Cheaha Getting Started"
- "Cluster" should be "Cheaha"
- "General information about Cheaha" should be "... about Research Computing"
- Point out membership in the relevant listservs is necessary to access RC services. Unsubscribing from any of the listservs results in loss of access to RC services.
- textwrap.dedent() allows "proper" PEP8 formatting in triple-quoted strings without inserting tabs into the body of the email.
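As a small illustration of the `textwrap.dedent()` point, the sketch below keeps the triple-quoted string indented with the surrounding code while producing a body with no leading whitespace. The function name and the message wording are placeholders, not the actual template text.

```python
import textwrap


def build_body(username):
    """Return an email body with the source-code indentation stripped."""
    body = textwrap.dedent(
        """\
        Hello {username},

        Your account on Cheaha is ready. Please see our Docs pages at
        https://docs.rc.uab.edu for the Cheaha Getting Started guide.
        """
    )
    # Format after dedenting so substituted values cannot affect indentation.
    return body.format(username=username)


print(build_body("alice"))
```

The backslash after the opening quotes avoids a blank first line in the message.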
I'm planning to propose a rewrite at some point after we get the UAB RC IT Home Page updated.

https://gitlab.rc.uab.edu/rc/devops/-/issues/370
RELION interactive app in OOD
2023-09-07T14:00:07-05:00
William E Warriner
### Background and justification
In the near future, we are retiring `sinteractive`. Researchers with CryoEM data who use RELION have expressed a need that exposes a gap in our services. Currently, the only way CryoEM researchers can interact with the RELION GUI with (up to) four GPUs is using `sinteractive`.
The need has been demonstrated directly in tickets and email discussions by three researchers and, from discussions with the CryoEM facility director (Terje Dokland), there are at least 6 other regular users who have not communicated with us. When the CryoEM facility is fully operational again (expected May-June 2023), we will see an increase in RELION use on Cheaha.
An additional detail is the need to know when requested jobs have started. The typical workflow is to request an entire `pascalnodes` node at once. It can take substantial time for the job to start, so a functional email feedback system is needed.
Update 2023-04-12: Reviewing with James Kizziah reveals the software has multiple windows that must be open simultaneously. They must all be discoverable and interactive as needed.
### Proposed action items:
- [ ] Add RELION as first-class interactive app in OOD.
- [ ] RELION job submission form allows selection of number of GPUs.
- [ ] Each RELION job allocation should be within a single node.
- [ ] Ensure `module load RELION/4.0.0` works as expected with multiple GPUs. (See rc/cluster-software#76)
- [ ] Ensure `module load RELION/4.0.0` has all functionality described [here](https://gitlab.rc.uab.edu/rc/devops/-/issues/370#note_74882)
- [ ] Functional "I would like to receive an email when the session starts" checkbox.
- [ ] Interactive environment with multiple window support.
### Related material
Related tickets:
- [ ] [RITM0616403](https://uabprod.service-now.com/nav_to.do?uri=sc_req_item.do?sys_id=46f2915b1bb1a5902e00eb93604bcb01)
- [ ] [RITM0616558](https://uabprod.service-now.com/nav_to.do?uri=sc_req_item.do?sys_id=1952b1a31b7de590a8e26571604bcb5b)
Related issues:
- rc/cheaha#15
- rc/cluster-software#76 (thanks @prema!)
Related projects:
- rc-data-science/facilitation-projects/cryoem-needs>

Ravi Tripathi

https://gitlab.rc.uab.edu/rc/devops/-/issues/355
Work with Data Science team to get a Proof of concept report for scratch violation per user
2023-05-23T09:56:34-05:00
Ravi Tripathi
[Active] Scratch Policy violation notification
Ravi Tripathi

https://gitlab.rc.uab.edu/rc/devops/-/issues/349
Replace CRI_XCBC-based OOD deploy with native OOD ansible method from OOD project
2023-09-27T06:48:33-05:00
Ravi Tripathi
OOD Ansible instructions: https://github.com/OSC/ood-ansible

https://gitlab.rc.uab.edu/rc/devops/-/issues/82
Add user state for deleted user
2023-05-26T10:58:24-05:00
Bo-Chun Chen
Currently, if we want to remove a user, we need to remove all entries in the database. That removes the history of the user on Cheaha.
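One way to keep that history would be to record removal as a new state instead of deleting rows. The sketch below assumes a hypothetical `user_state(username, state, date)` table and a `deleted` state name; neither is confirmed by the existing database.

```python
import sqlite3


def mark_deleted(conn, username, when):
    """Record removal as a new state row; earlier rows stay untouched."""
    conn.execute(
        "INSERT INTO user_state (username, state, date) VALUES (?, 'deleted', ?)",
        (username, when),
    )
    conn.commit()


def latest_state(conn, username):
    """Return the most recent state for a user, or None if unknown."""
    row = conn.execute(
        "SELECT state FROM user_state WHERE username = ? "
        "ORDER BY date DESC, rowid DESC LIMIT 1",
        (username,),
    ).fetchone()
    return row[0] if row else None


# In-memory stand-in for the registration database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_state (username TEXT, state TEXT, date TEXT)")
conn.execute("INSERT INTO user_state VALUES ('bob', 'ok', '2020-01-02')")
mark_deleted(conn, "bob", "2023-05-26")
print(latest_state(conn, "bob"))
```

Code that lists current users would then filter on the latest state rather than on the presence of rows.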
Might be related: https://github.com/jprorama/CRI_XCBC/pull/349
Bo-Chun Chen

https://gitlab.rc.uab.edu/rc/rabbitmq_agents/-/issues/152
Feat: Add a `created` column to `user_reg.db`
2024-03-27T09:59:19-05:00
William E Warriner
My understanding is the `last_update` column could potentially be overwritten by a repeat registration action. That should never happen, but just in case...the `created` column would contain the date of initial registration for each user.
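A possible migration is sketched below. It assumes the `last_update` column lives in a table named `users` and that seeding `created` from `last_update` is an acceptable starting estimate; both are assumptions, and the function name is illustrative.

```python
import sqlite3


def add_created_column(conn):
    """Add users.created once and seed it from last_update; safe to re-run."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
    if "created" not in columns:
        conn.execute("ALTER TABLE users ADD COLUMN created TEXT")
        # Best available estimate for existing rows.
        conn.execute("UPDATE users SET created = last_update")
        conn.commit()


# In-memory stand-in for user_reg.db (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, last_update TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', '2023-01-15')")
add_created_column(conn)
# A repeat registration overwrites last_update but leaves created alone.
conn.execute("UPDATE users SET last_update = '2024-03-01' WHERE username = 'alice'")
add_created_column(conn)  # second run is a no-op
print(conn.execute("SELECT created, last_update FROM users").fetchall())
```

New registrations would set `created` at insert time and never update it afterwards.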
Having a definitive source of this date is useful for metrics, reporting and facilitation.
Bo-Chun Chen

https://gitlab.rc.uab.edu/rc/self-reg-form/-/issues/7
SocketIO session disconnected
2024-03-20T15:23:51-05:00
Bo-Chun Chen
The self reg app failed twice within a short period of time.
Here's the error message from `/var/log/account/error.log`:
```sh
[2024-03-05 16:36:31 -0600] [317239] [ERROR] Error handling request /socket.io/?EIO=3&transport=polling&t=OuGmkEB&sid=85cf84fe4909470eb11acbe5a6543cd4
Traceback (most recent call last):
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/gunicorn/workers/base_async.py", line 55, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/gunicorn/workers/ggevent.py", line 127, in handle_request
    super().handle_request(listener_name, req, sock, addr)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/gunicorn/workers/base_async.py", line 108, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/flask/app.py", line 2463, in __call__
    return self.wsgi_app(environ, start_response)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/flask_socketio/__init__.py", line 46, in __call__
    start_response)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/engineio/middleware.py", line 60, in __call__
    return self.engineio_app.handle_request(environ, start_response)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/socketio/server.py", line 534, in handle_request
    return self.eio.handle_request(environ, start_response)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/engineio/server.py", line 393, in handle_request
    socket = self._get_socket(sid)
  File "/var/www/ood/register/account/venv/lib64/python3.6/site-packages/engineio/server.py", line 561, in _get_socket
    raise KeyError('Session is disconnected')
KeyError: 'Session is disconnected'
```

https://gitlab.rc.uab.edu/rc/devops/-/issues/529
Debug why the webapp for xnat is not loading properly.
2024-03-12T09:32:58-05:00
Eesaan Atluri

https://gitlab.rc.uab.edu/rc/devops/-/issues/528
Explore Wazuh for ovn/sdn sec observability
2024-03-13T09:44:55-05:00
John-Paul Robinson
https://wazuh.com/
> Wazuh delivers robust security monitoring and protection for your IT assets using its Security Information and Event Management (SIEM) and Extended Detection and Response (XDR) capabilities. Wazuh use cases are designed to safeguard your digital assets and enhance your organization's cybersecurity posture.
> These use cases encompass File Integrity Monitoring (FIM) ensuring the integrity of your critical files, Security Configuration Assessment (SCA) fortifying your system configurations against potential threats, Vulnerability Detection pinpointing potential weaknesses before they are exploited, and others. Explore our use cases and capabilities below.
Wazuh covers a broad range of use cases:
![image](/uploads/a9f476636a7370b748cd14b04056955c/image.png)

https://gitlab.rc.uab.edu/rc/terraform-openstack/-/issues/21
Remove remote exec provisioners from deploy factory project
2024-03-05T13:11:56-06:00
Krish Moodbidri

https://gitlab.rc.uab.edu/rc/rabbitmq_agents/-/issues/151
Move rabbitmq agents out of master node
2024-03-19T09:38:51-05:00
Bo-Chun Chen
This will include, but is not limited to, the following tasks:
- Replace local commands with API calls
- Containerize agents

https://gitlab.rc.uab.edu/rc/devops/-/issues/523
Restrict user resource on login node
2024-03-26T09:38:05-05:00
Bo-Chun Chen
I have been seeing this issue from day 1.
I came across this [ContainerSSH: Launch containers on demand](https://containerssh.io/v0.5/usecases/lab/), and I think this might be a solution for the login node abuse issue we have had for so long.
> ContainerSSH launches a new container for each SSH connection in Kubernetes, Podman or Docker. The user is transparently dropped in the container and the container is removed when the user disconnects. Authentication and container configuration are dynamic using webhooks, no system users required.

Bo-Chun Chen

https://gitlab.rc.uab.edu/rc/rabbitmq_agents/-/issues/150
API documentation for rabbitmq agents
2024-02-20T09:53:35-06:00
Bo-Chun Chen
While developing the group manage cli, I want to know what message the agent expects and what I should expect for the response. I found this is really hard since we do not have documents for every agent. I had to go through the agent and trace every line of the code to determine what fields I should include in my message.

https://gitlab.rc.uab.edu/rc/rabbitmq_agents/-/issues/148
Design of group management app
2024-03-26T09:50:29-05:00
Bo-Chun Chen
An excalidraw graph for the group management APP.
Users should be able to add user to their group easily
Bo-Chun Chen

https://gitlab.rc.uab.edu/rc/rabbitmq_agents/-/issues/146
Document how you set up a test env for testing the group_management app
2024-03-26T10:08:51-05:00
Eesaan Atluri
Users should be able to add user to their group easily
Bo-Chun Chen

https://gitlab.rc.uab.edu/rc/devops/-/issues/519
Move jprorama/CRI_XCBC repo to gitlab
2024-01-31T09:47:36-06:00
Bo-Chun Chen

https://gitlab.rc.uab.edu/rc/devops/-/issues/518
Move uabrc/CRI_XCBC repo to gitlab
2024-01-31T09:47:20-06:00
Bo-Chun Chen

https://gitlab.rc.uab.edu/rc/devops/-/issues/515
K8s worker route debugging
2024-02-09T14:28:13-06:00
Eesaan Atluri