Explore symlinks as a tool to maintain namespace compatibility between gpfs4 and gpfs5
Migration from gpfs4 to gpfs5 with pixstore/s3 space management comes with some limitations. Specifically, we need to physically move data to gpfs5 in order to take advantage of the s3 migration capabilities.
A good target for this is to pre-migrate the least frequently used content of gpfs4 to gpfs5. The problem with this is that it leaves holes in the gpfs4 namespace where the migrated content used to be and degrades the user-curated file namespace in their /data/user or /data/projects spaces.
It's possible that we could use symlinks to avoid file names disappearing from gpfs4. Assuming we mount gpfs5 into the cluster at /future/data, an unused file moved from gpfs4 at /data/projects/projectA/oldfiles/file1 to /future/data/projects/projectA/oldfiles/file1 can be replaced with a symlink reflecting the move. This keeps the user-curated namespace intact and makes sure users can continue to use the current gpfs4 namespace without being surprised by missing files.
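The move-and-link step could be sketched roughly as follows. This is a minimal illustration, not a proposed tool: the function name migrate_with_symlink and its arguments are hypothetical, and it assumes the gpfs5 path mirrors the gpfs4 path under the new root, as in the example above. Ownership, ACLs and files that are open during the move are not handled.

```python
import os
import shutil
from pathlib import Path


def migrate_with_symlink(src, old_root, new_root):
    """Move src to the same relative path under new_root, leave a symlink behind.

    Hypothetical helper for illustration only.
    """
    src = Path(src)
    if src.is_symlink():
        raise ValueError(f"{src} is already a symlink")

    # e.g. projects/projectA/oldfiles/file1 (raises ValueError if src is outside old_root)
    rel = src.relative_to(old_root)
    # Use an absolute target so the link does not depend on where it lives.
    dest = Path(os.path.abspath(new_root)) / rel
    if os.path.lexists(dest):
        raise FileExistsError(f"{dest} already exists")

    dest.parent.mkdir(parents=True, exist_ok=True)
    # Across filesystems shutil.move copies the file, then removes the source.
    shutil.move(str(src), str(dest))
    # Recreate the original name as a symlink pointing at the new location.
    os.symlink(dest, src)
    return dest


# Example call with the paths used in this issue:
# migrate_with_symlink("/data/projects/projectA/oldfiles/file1", "/data", "/future/data")
```

Note that there is a short window between the move and the symlink creation in which the original name does not exist; a real implementation would need to decide whether that is acceptable.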
In the meantime, the gpfs5+s3 process on the pixstore can auto-migrate the data to s3. This migration maintains the inode on gpfs5 (preserving the namespace) but marks the data as migrated to s3.
The user will continue to access their original file locations in /data. If they then open() a file that has been migrated to /future, the OS will transparently locate that file in the future namespace via the symlink. As the OS starts reading the content of the file through its gpfs4 path, pixstore will automatically reconstitute the data in gpfs5 from s3 and the application will receive the data.
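Applications need no changes for this, since a plain open() on the gpfs4 path follows the symlink. Tooling that wants to know whether a given path has been migrated could check where the link resolves. The sketch below assumes the hypothetical helper name resolve_migrated and that everything under the new root counts as migrated; it only inspects the link and does not open the file.

```python
import os
from pathlib import Path


def resolve_migrated(path, new_root):
    """Return the resolved target if path is a symlink into new_root, else None.

    Hypothetical helper for illustration only.
    """
    if not os.path.islink(path):
        return None
    target = Path(os.path.realpath(path))
    root = Path(os.path.realpath(new_root))
    # The file counts as migrated only if it resolves to somewhere under new_root.
    return target if root in target.parents else None


# A normal read needs no special handling; the OS follows the link:
# with open("/data/projects/projectA/oldfiles/file1", "rb") as f:
#     data = f.read()
```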
This approach should allow transparent access to data during migration.
Note: In the case where a user deletes files on gpfs4 (i.e. removes the symlinks for files migrated to /future), that would leave the "orphaned" file on gpfs5+s3. If they use flags that follow symlinks, then the inode on gpfs5 in /future will be removed. This will remove the file and its content from visibility via gpfs (4 or 5) and effectively make it appear deleted. It should be noted, however, that if the file content was migrated to s3, the data will still reside there. There is no hook on file deletes from gpfs5 for file content migrated to s3. Cleanup of orphaned s3 content or of gpfs5 files from symlink deletes will require a reconciliation process.
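One half of such a reconciliation, finding gpfs5 files whose gpfs4 symlink has been removed, might look something like the sketch below. The name find_orphans is hypothetical, and the sketch assumes the tree under the new root holds only migrated content that mirrors the gpfs4 layout. It reads directory entries and link targets only and never opens a file. Orphaned s3 content is not covered here.

```python
import os
from pathlib import Path


def find_orphans(old_root, new_root):
    """Yield files under new_root that no symlink at the mirrored old_root path points to.

    Hypothetical helper for illustration only.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    for dirpath, _dirnames, filenames in os.walk(new_root):
        for name in filenames:
            target = Path(dirpath) / name
            # Path where the symlink should live in the original namespace.
            link = old_root / target.relative_to(new_root)
            linked = link.is_symlink() and (
                os.path.realpath(link) == os.path.realpath(target)
            )
            if not linked:
                yield target


# Example call with the paths used in this issue:
# for orphan in find_orphans("/data", "/future/data"):
#     print(orphan)
```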
The delete scenarios require their own issue and investigation. They're just mentioned here for completeness.