Updated 10/18/23 with specific issues and solutions.
Description of the migration
What is happening?
Original Path | New Path |
/soft | /common/software/install/migrated |
/panfs/roc/msisoft | /common/software/install/migrated |
/panfs/roc/soft/el6 | /common/software/install/migrated.softel6 |
/panfs/roc/intel | /common/software/install/migrated.intel |
Original Path | New Path |
/panfs/roc/soft/modulefiles.common | /common/software/modulefiles/migrated/common |
/panfs/roc/soft/modulefiles.hpc | /common/software/modulefiles/migrated/hpc |
/panfs/roc/soft/modulefiles.centos7 | /common/software/modulefiles/migrated/centos7 |
/panfs/roc/soft/modulefiles.mesabi | /common/software/modulefiles/migrated/mesabi |
/panfs/roc/soft/modulefiles.mangi | /common/software/modulefiles/migrated/mangi |
/panfs/roc/soft/modulefiles.k40 | /common/software/modulefiles/migrated/k40 |
/panfs/roc/soft/modulefiles.v100 | /common/software/modulefiles/migrated/v100 |
/panfs/roc/soft/modulefiles.legacy | /common/software/modulefiles/migrated/legacy |
/panfs/roc/intel/modulefiles | /common/software/modulefiles/migrated/intel |
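Read together, the two tables amount to a prefix-rewriting rule. The shell function below is a minimal sketch of that rule, useful for translating a path found in an old script. The function name `migrated_path` is hypothetical. The sketch assumes that everything below a listed directory kept its relative layout after the move, which the tables do not state explicitly.

```shell
# Translate a pre-migration path into its post-migration equivalent.
# Only the prefixes listed in the tables are handled; any other path is
# printed unchanged. More specific prefixes are tested first.
migrated_path() {
    case $1 in
        /panfs/roc/soft/modulefiles.*)
            # Does not check that the suffix is one of the listed names.
            printf '%s\n' "/common/software/modulefiles/migrated/${1#/panfs/roc/soft/modulefiles.}" ;;
        /panfs/roc/intel/modulefiles|/panfs/roc/intel/modulefiles/*)
            printf '%s\n' "/common/software/modulefiles/migrated/intel${1#/panfs/roc/intel/modulefiles}" ;;
        /panfs/roc/soft/el6|/panfs/roc/soft/el6/*)
            printf '%s\n' "/common/software/install/migrated.softel6${1#/panfs/roc/soft/el6}" ;;
        /panfs/roc/intel|/panfs/roc/intel/*)
            printf '%s\n' "/common/software/install/migrated.intel${1#/panfs/roc/intel}" ;;
        /panfs/roc/msisoft|/panfs/roc/msisoft/*)
            printf '%s\n' "/common/software/install/migrated${1#/panfs/roc/msisoft}" ;;
        /soft|/soft/*)
            printf '%s\n' "/common/software/install/migrated${1#/soft}" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}
```

For example, `migrated_path /panfs/roc/msisoft/mamba/0.11.3/bin/python` prints `/common/software/install/migrated/mamba/0.11.3/bin/python`.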
- Symlinks that refer to a '/panfs' directory have been updated to point to the equivalent location in '/common/software'
- Configuration files containing references to '/panfs' directories have been updated to reference the equivalent locations
- Executable files containing '/panfs' directories in their RPATH or RUNPATH have been patched to refer to the equivalent directories
- Modulefiles have been made more specific to ensure that dependencies are found correctly (e.g. updating PERL5LIB for perl modules to the new location)
- Modules that don't respond to any of the above methods are reinstalled
Why make this change?
When did this happen?
Changes you might need to make
Check your bashrc for references to old paths
Many software packages and workflow customizations will modify your bashrc file, which is used to initialize settings for new shell sessions. You might define or modify environment variables, define functions and aliases, or load modules among other possible customizations. Some software packages like conda will automatically modify your bashrc file in order to enable special features, so you may have changed your bashrc file even if you've never opened it yourself.
The file is located at ~/.bashrc and is a plain-text file that you can open with your favorite text editor. Check this file for references to any of the old paths to software installs or modulefiles, and update them to the new paths or remove them if the modifications to your environment are no longer necessary.
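One way to run this check from the command line is to search the file for the old prefixes. The sketch below is an illustration only: the function name `find_old_paths` is hypothetical, and the pattern covers just the old locations named in the tables on this page.

```shell
# List lines (with line numbers) that mention a pre-migration location.
# Matches /panfs/roc/msisoft, /panfs/roc/soft, /panfs/roc/intel and a
# top-level /soft, but deliberately not /common/software.
# Defaults to ~/.bashrc when no file is given.
find_old_paths() {
    grep -nE '/panfs/roc/(msisoft|soft|intel)|(^|[^[:alnum:]_/.-])/soft(/|$)' "${1:-$HOME/.bashrc}"
}
```

Running `find_old_paths` with no argument checks ~/.bashrc. No output means no references to the old locations were found.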
As a common example of this, if you ever ran 'conda init' or an equivalent command, you will have a block in your ~/.bashrc file that looks like the following:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/panfs/roc/msisoft/mamba/0.11.3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/panfs/roc/msisoft/mamba/0.11.3/etc/profile.d/conda.sh" ]; then
        . "/panfs/roc/msisoft/mamba/0.11.3/etc/profile.d/conda.sh"
    else
        export PATH="/panfs/roc/msisoft/mamba/0.11.3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
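If the block in your file references '/panfs/roc/msisoft' as in this example, the paths can be rewritten in one pass instead of by hand. The following is a minimal sketch, not an official tool: the function name `update_bashrc_paths` and the `.pre-migration` backup suffix are hypothetical choices, and only the '/panfs/roc/msisoft' prefix is handled.

```shell
# Rewrite /panfs/roc/msisoft to /common/software/install/migrated in a file.
# A copy of the original is kept next to it as <file>.pre-migration.
# Defaults to ~/.bashrc when no file is given.
update_bashrc_paths() {
    file=${1:-$HOME/.bashrc}
    backup=$file.pre-migration
    cp -p "$file" "$backup" || return 1
    # Read from the backup and write the edited text over the original.
    sed 's#/panfs/roc/msisoft#/common/software/install/migrated#g' "$backup" > "$file"
}
```

After running `update_bashrc_paths`, start a new shell session so that the updated file is read. If something looks wrong, copy the `.pre-migration` file back over ~/.bashrc.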
Load additional modules during your jobs
Some compiled software hard-codes hints to the location of its dependencies when you build it (for example, RPATH or RUNPATH entries in the executable). Later, when you run this software, it uses these hints to find library files and other dependencies that are not otherwise visible in your current environment. Unfortunately, hints of this type in software that was built before the migration point to the old locations and no longer work. As a result, you may start seeing 'missing library' errors for workflows that previously worked without issue.
Often you can resolve this by loading the module corresponding to the missing dependency. If you are unsure which module you should load, it will usually be one or more of the modules you loaded when you originally compiled the software. Common modules that might need to be loaded like this include gcc, cuda, and mkl.
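To see which libraries an executable cannot find, and to confirm that loading a module fixed the problem, you can inspect the output of `ldd`. The sketch below is illustrative. The function names are hypothetical, and it assumes the usual `ldd` output format in which an unresolved library is reported as `name => not found`.

```shell
# Read ldd output on stdin and print the name of each unresolved library.
parse_missing() {
    awk '/=> not found/ { print $1 }'
}

# Print the shared libraries that the given executable cannot resolve
# in the current environment.
missing_libs() {
    ldd "$1" | parse_missing
}
```

A typical sequence would be `missing_libs ./my_program` (a hypothetical executable), then `module load gcc` or whichever module provides the library, then `missing_libs ./my_program` again. An empty result on the second run means every library was resolved.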
Patch or rebuild software that hard-codes old paths
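For an executable that you built yourself, one possible approach is to inspect its RPATH or RUNPATH and rewrite any old prefix in place. The sketch below uses `readelf` (part of binutils) and `patchelf`. Whether `patchelf` is available in your environment is an assumption. The function names are hypothetical, and only the '/panfs/roc/msisoft' prefix is handled. If patching is not possible, rebuilding the software with the current modules loaded is the fallback.

```shell
OLD_PREFIX=/panfs/roc/msisoft
NEW_PREFIX=/common/software/install/migrated

# Translate an RPATH string from the old prefix to the new one.
# Pure text processing; no file is touched.
translate_rpath() {
    printf '%s\n' "$1" | sed "s#$OLD_PREFIX#$NEW_PREFIX#g"
}

# Show the RPATH/RUNPATH entries recorded in an executable or library.
show_rpath() {
    readelf -d "$1" | grep -E 'RPATH|RUNPATH'
}

# Rewrite the recorded search path of an executable in place.
# Assumes patchelf is installed; keep a copy of the file before running this.
patch_rpath() {
    old=$(patchelf --print-rpath "$1") || return 1
    patchelf --set-rpath "$(translate_rpath "$old")" "$1"
}
```

For a hypothetical executable `./my_program`, `show_rpath ./my_program` shows whether an old path is present, and `patch_rpath ./my_program` rewrites it.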
Re-link files that point to old paths
Another potential issue will show up as 'Host is down:' errors when trying to run an executable file. This happens when the executable in the environment you are using is actually a link to that executable in a '/panfs/roc' location. You can find broken links of this type by running:
find ~/.conda/envs -xtype l
To see where one of these links points, list it:
ls -lha /users/3/dunn0404/.conda/envs/myenv/bin/python
lrwxrwxrwx. 1 dunn0404 msistaff 42 Oct 12 12:08 /users/3/dunn0404/.conda/envs/myenv/bin/python -> /panfs/roc/msisoft/mamba/0.11.3/bin/python
Then re-create the link so that it points to the equivalent location under '/common/software':
ln -nsf /common/software/install/migrated/mamba/0.11.3/bin/python /users/3/dunn0404/.conda/envs/myenv/bin/python
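When an environment contains many such links, the same re-linking can be done in a loop. The following is a minimal sketch under stated assumptions: the function name `relink_migrated` is hypothetical, only links whose target starts with '/panfs/roc/msisoft' are touched, and file names are assumed not to contain newlines.

```shell
# Re-point every symlink under a directory whose target starts with
# /panfs/roc/msisoft to the same relative path under
# /common/software/install/migrated. Other links are left alone.
relink_migrated() {
    old=/panfs/roc/msisoft
    new=/common/software/install/migrated
    find "$1" -type l | while IFS= read -r link; do
        target=$(readlink "$link")
        case $target in
            "$old"/*)
                # Keep the part after the old prefix and re-create the link.
                ln -nsf "$new${target#"$old"}" "$link"
                ;;
        esac
    done
}
```

For example, `relink_migrated ~/.conda/envs/myenv` would process a single environment. Running `find ~/.conda/envs -xtype l` afterwards shows any links that are still broken.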
Common issues you might see
Host is down errors
Since the old software library was located on a network storage appliance that has now been partially turned off, you might see errors of the type:
Host is down:
when trying to run software, even when you wouldn't expect the particular command you are using to need to access another host. This error is showing up because some part of the command, usually the location of an executable file, references one of the old software install locations. So far we've seen this most commonly with python, R, Rscript, and ruby commands that use a conda environment.
The resolution for this issue is usually to update broken links to the old software paths and remove references to old paths in your bashrc.
Missing libraries
One of the more common issues you might see is a missing library file. These errors will look something like the following:
error while loading shared libraries: libsomething.so.16: cannot open shared object file: No such file or directory
This error indicates that the library 'libsomething.so.16' isn't available in your environment. The resolution for this issue is usually to load the module that provides this dependency or patch the impacted executables to reference the updated paths.
Conda environments not working
Due to the specifics of how they are installed, conda environments are especially prone to issues from the software migration. There are a variety of ways that a conda environment might fail after the migration, but you can address the majority of them by updating broken links to the old software paths and removing references to old paths in your bashrc.
One additional issue you may see is an error about an SSL CA certificate that prevents you from creating new environments. You can fix this by manually specifying the location of the certificate file for the conda module you are using. For instance, if you are using the 'mamba' module you might do the following:
Find the root of the module install:
$ module show mamba
-------------------------------------------------------------------
/common/software/modulefiles/migrated/common/mamba/0.11.3:
prepend-path PATH /common/software/install/migrated/mamba/0.11.3/bin
-------------------------------------------------------------------
The root directory for this module is the directory that contains 'bin'. So in this case, it would be:
/common/software/install/migrated/mamba/0.11.3
The SSL certificate for conda modules is located under '$root/ssl/cert.pem', which in this case would be:
/common/software/install/migrated/mamba/0.11.3/ssl/cert.pem
You can then indicate the location of this certificate to your conda config by running:
conda config --set ssl_verify /common/software/install/migrated/mamba/0.11.3/ssl/cert.pem
At this point you should be able to create new conda environments again without SSL errors.
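The path arithmetic in these steps can be written as a small helper. This is a sketch only: the function name `conda_cert_path` is hypothetical, and it assumes the '$root/ssl/cert.pem' layout described above holds for the module you are using.

```shell
# Given the 'bin' directory shown by 'module show', print the expected
# location of the module's SSL certificate ($root/ssl/cert.pem).
conda_cert_path() {
    bindir=${1%/}                 # drop a trailing slash, if any
    root=$(dirname "$bindir")     # the directory that contains 'bin'
    printf '%s\n' "$root/ssl/cert.pem"
}
```

With the bin directory from the example, `conda_cert_path /common/software/install/migrated/mamba/0.11.3/bin` prints `/common/software/install/migrated/mamba/0.11.3/ssl/cert.pem`, which can then be passed to `conda config --set ssl_verify`.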
R libraries not working
Modules that simply stop working
Some of MSI's modules unexpectedly broke during the migration. While we did our best to patch all of the software installations to avoid this outcome, the wide variety in the design of software distribution means that this just isn't possible in some cases. If you find a module that is no longer working after the migration that doesn't match the descriptions of other common errors on this page, please report it to help@msi.umn.edu so we can flag it for reinstallation.