Information Lifecycle Management (ILM) via GPFS policy engine
The GPFS policy engine is well described in this white paper. A good presentation overview of the policy file is here. The relevant documentation is available from IBM.
This project focuses on scheduled execution of lifecycle policies to gather and process data about file system objects and issue actions against those objects based on policy.
Running a policy
A policy is executed in the context of a SLURM batch job reservation using the submit-pol-job script:
submit-pol-job <outdir> <policy> <nodecount> <corespernode> <ram> <partition> <time>
Where the positional arguments are:
- outdir - the directory for the output files; it should be on a file system shared across the cluster (e.g. the /scratch directory of the user running the job)
- policy - path to the GPFS policy to execute (e.g. in ./policy directory)
- nodecount - number of nodes in the cluster that will run the policy
- corespernode - number of cores on each node to reserve
- ram - RAM per core; a "G" suffix can be used for gigabytes
- partition - the partition to submit the job to
- time - the time in minutes to reserve for the job
Note: the resource reservation is imperfect. The job wrapper calls a script, run-mmpol.sh, which is responsible for executing the mmapplypolicy command.
The command is directed to run on specific nodes by way of arguments to mmapplypolicy. Because the command is technically not run inside the job reservation, the resource constraints are not strictly enforced. The goal is to use the scheduler to ensure the policy run does not conflict with existing resource allocations on the cluster.
Running the policy "list-path-external"
The list-path-external policy provides an efficient tool to gather file stat data into a URL-encoded ASCII text file. The output file can then be processed downstream to create reports on storage patterns and use.
An example invocation would be:
submit-pol-job /path/to/output/dir \
/absolute/path/policy/list-path-external \
4 24 4G partition_name \
/path/to/listed/dir \
180
Some things to keep in mind:
- the submit-pol-job script may need a ./ prefix if it is not in your path
- use absolute paths for all directory arguments to avoid potential confusion
- make sure the output dir has sufficient space to hold the resulting file listing (it could be hundreds of gigabytes for a large collection of files)
The SLURM job output file is written to the directory from which the command was executed. It can be watched to observe progress in the generation of the file list. A listing of hundreds of millions of files may take a couple of hours to generate, and the output file may consume several hundred gigabytes.
The output file in /path/to/output/dir is named as follows:
- a prefix of "list-${SLURM_JOBID}"
- ".list" for the name of the policy rule type of "list"
- a tag for the list name defined in the policy file, "list-gather" for the list-path-external policy
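As a small illustration of the naming scheme above, the following Python sketch assembles the expected file name from its three parts. Joining the parts with dots is an assumption about how the pieces are concatenated; the function name and the job ID are hypothetical.

```python
def expected_list_filename(job_id: str, list_name: str = "list-gather") -> str:
    """Assemble the expected list output file name from its three parts."""
    prefix = f"list-{job_id}"  # prefix built from the SLURM job ID
    rule_type = "list"         # name of the policy rule type
    # Assumption: the three parts are joined with dots.
    return ".".join([prefix, rule_type, list_name])


# Hypothetical job ID, for illustration only.
print(expected_list_filename("123456"))
```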
The output file contains one line per file object stored under /path/to/listed/dir. No directories or non-file objects are included in this listing. Each entry is a space-separated set of file attributes selected by the SHOW command in the LIST rule. Entries are encoded according to RFC 3986 URI percent encoding. This means all spaces and special characters will be encoded, making it easy to split lines into fields using the space separator.
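Because percent encoding removes literal spaces from every field, an entry can be split on the space character and each field decoded afterwards. The Python sketch below shows the idea. The sample line and its three fields (size, user, path) are hypothetical; the real fields depend on the SHOW clause of the policy.

```python
from urllib.parse import unquote


def parse_list_line(line: str) -> list[str]:
    """Split one list-policy line on spaces and percent-decode each field."""
    # Encoded fields never contain a literal space, so a plain split is safe.
    raw_fields = line.rstrip("\n").split(" ")
    return [unquote(field) for field in raw_fields]


# Hypothetical entry: a size, a user name, and a path containing a space.
sample = "4096 alice /data/project/my%20file.txt\n"
print(parse_list_line(sample))
```

Decoding after splitting matters: decoding the whole line first would reintroduce spaces into paths and break the field boundaries.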
The output file is an unsorted list of files in uncompressed ASCII. Further processing is desirable to use less space for storage and provide organized collections of data.
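One possible form of this further processing is to split the listing into fixed-size, gzip-compressed chunks. The Python sketch below is an illustration only, not the project's own tooling. The function name, the three-digit zero-padded index, and the default chunk size of 5 million lines are assumptions.

```python
import gzip
from itertools import islice
from pathlib import Path


def split_and_compress(list_file: Path, outdir: Path,
                       lines_per_chunk: int = 5_000_000) -> list[Path]:
    """Split list_file into gzip chunks of at most lines_per_chunk lines."""
    outdir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with open(list_file, "rb") as src:
        while True:
            first = src.readline()
            if not first:
                break  # end of input reached
            # Assumed naming: list-000.gz, list-001.gz, ...
            chunk_path = outdir / f"list-{len(chunks):03d}.gz"
            with gzip.open(chunk_path, "wb") as dst:
                dst.write(first)
                # Stream the remaining lines of this chunk one at a time.
                for line in islice(src, lines_per_chunk - 1):
                    dst.write(line)
            chunks.append(chunk_path)
    return chunks
```

Lines are streamed rather than collected in memory, which keeps the memory footprint small even when the input is hundreds of gigabytes.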
Processing the output file
Split and compress
Pre-parse output for Python
Processing GPFS log outputs is controlled by the run-convert-to-parquet.sh script and assumes the GPFS log has been split into a number of files of the form list-XXX.gz, where XXX is an incrementing numeric index. This creates an array job where each task in the array reads the quoted text in one file, parses it into a dataframe, and exports it as a parquet file with the name list-XXX.parquet.
While the file is being parsed, the top-level directory (tld) is extracted for each entry and added as a separate column to make common aggregations easier.
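The tld extraction can be sketched in Python as follows. This is a minimal illustration, assuming the tld is the first path component below the listed directory; the function name and example paths are hypothetical, and the project's script may define the tld differently.

```python
from pathlib import PurePosixPath
from typing import Optional


def top_level_dir(path: str, root: str) -> Optional[str]:
    """Return the first path component below root, or None if not under root."""
    try:
        relative = PurePosixPath(path).relative_to(root)
    except ValueError:
        return None  # path is not located under root
    # relative.parts is empty when path equals root itself.
    # For a file directly under root, the component is the file name.
    return relative.parts[0] if relative.parts else None


# Hypothetical paths, for illustration only.
print(top_level_dir("/data/user/alice/project/file.txt", "/data/user"))
```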
This script is written to parse the list-path-external policy format with quoted special characters.
Usage: ./run-convert-to-parquet.sh [ -h ]
[ -o | --outdir ] [ -n | --ntasks ] [ -p | --partition ]
[ -t | --time ] [ -m | --mem ]
gpfs_logdir
- outdir: Path to save parquet outputs. Defaults to ${gpfs_logdir}/parquet
- gpfs_logdir: Directory path containing the split log files as *.gz
All other options control the array job resources. The default resources can parse a 5-million-line file in approximately 3 minutes, so they should cover all common use cases.
Running reports
Disk usage by top level directories
A useful report is the top level directory (tld) report. This is akin to running du -s * in a directory of interest, but much faster since there is no walking of directory trees. Only the list policy output file is used, reducing the operation to parsing and summing the data in the list policy output file.
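The summing step can be sketched in Python as below. This is an illustration under assumptions, not the project's report code: it expects records that have already been reduced to (tld, size in bytes) pairs, and the function name and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Iterable


def tld_usage(records: Iterable[tuple[str, int]]) -> dict[str, int]:
    """Sum file sizes in bytes per top-level directory."""
    totals = defaultdict(int)
    for tld, size in records:
        totals[tld] += size
    return dict(totals)


# Hypothetical (tld, size) pairs, for illustration only.
records = [("alice", 4096), ("bob", 1024), ("alice", 2048)]

# Print the largest consumers first, similar to sorting du output.
for tld, total in sorted(tld_usage(records).items(),
                         key=lambda item: item[1], reverse=True):
    print(f"{total:>12d}  {tld}")
```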
Comparing directory similarity
Scheduling regular policy runs via cron
The policy run can be scheduled automatically with the cronwrapper script.
Simply append the submit-pol-job command and its arguments to the cronwrapper in a crontab line.
For example, to run it every morning at 4 am you would add:
0 4 * * * /path/to/cronwrapper submit-pol-job <outdir> <policy> <nodecount> <corespernode> <ram> <partition>