My installation of nvidia-compilers-25.3-CUDA-12.8.0.eb fails when the makelocalrc command is executed:

    ERROR: Shell command failed!
        full command              ->  /scratch/leuven/sys/x0139045/foz-software-stack-66141841/rocky9/icelake/2025a/software/nvidia-compilers/25.3-CUDA-12.8.0/Linux_x86_64/25.3/compilers/bin/makelocalrc -x /scratch/leuven/sys/x0139045/foz-software-stack-66141841/rocky9/icelake/2025a/software/nvidia-compilers/25.3-CUDA-12.8.0/Linux_x86_64/25.3/compilers/bin
        exit code                 ->  1
        called from               ->  'install_step' function in /apps/leuven/common/software/EasyBuild/5.2.1/lib/python3.6/site-packages/easybuild/easyblocks/generic/nvidiabase.py (line 477)
        working directory         ->  /lustre1/scratch/sys/x0139045/foz-software-stack-66141841/rocky9/icelake/2025a/build/nvidiacompilers/25.3/system-system-CUDA-12.8.0/nvhpc_2025_253_Linux_x86_64_cuda_multi
        output (stdout + stderr)  ->  /tmp/eb-_xblgyfc/run-shell-cmd-output/makelocalrc-tskfn41c/out.txt
        interactive shell script  ->  /tmp/eb-_xblgyfc/run-shell-cmd-output/makelocalrc-tskfn41c/cmd.sh

It looks like this command does not work as its help message says, but instead uses an interactive installer that does not pick up the correct installation path:

    Welcome to the NVIDIA HPC SDK Linux installer!

    You are installing NVIDIA HPC SDK 2025 version 25.3 for Linux_x86_64.
    Please note that all Trademarks and Marks are the properties
    of their respective owners.

    Press enter to continue...

    A single system installation is appropriate for a single system or a
    homogeneous cluster. A network installation should be selected for a
    heterogeneous cluster. For either a single system or network installation,
    the HPC SDK configuration (localrc) is created at install time and saved
    in the installation directory.

    An auto installation is appropriate for any scenario. The HPC SDK
    configuration (localrc) is created at first use and stored in each user's
    home directory.

    1 Single system install
    2 Network install
    3 Auto install

    Please choose install option:

    Please specify the directory path under which the software will be installed.
    The default directory is /opt/nvidia/hpc_sdk, but you may install anywhere
    you wish, assuming you have permission to do so.

    Installation directory? [/opt/nvidia/hpc_sdk]
    ERROR: mkdir /opt/nvidia/hpc_sdk: permission denied
    ERROR: installation directory (/opt/nvidia/hpc_sdk) not created
    Exiting...
    ERROR: unable to create /scratch/leuven/sys/x0139045/foz-software-stack-66141841/rocky9/icelake/2025a/software/nvidia-compilers/25.3-CUDA-12.8.0/Linux_x86_64/25.3/compilers/bin//localrc

I managed to solve this by replacing the following code block in the nvidiabase.py easyblock, at easybuild-easyblocks/easybuild/easyblocks/generic/nvidiabase.py (lines 471 to 477 in df3ca30):

    if LooseVersion(self.version) >= LooseVersion('22.9'):
        bin_subdir = os.path.join(compilers_subdir, "bin")
        cmd = f"{makelocalrc_filename} -x {bin_subdir}"
    else:
        cmd = f"{makelocalrc_filename} -x {compilers_subdir} -g77 /"

    run_shell_cmd(cmd)

with

    if LooseVersion(self.version) >= LooseVersion('22.9'):
        bin_subdir = os.path.join(compilers_subdir, "bin")
        cmd = f"{makelocalrc_filename} -x {bin_subdir}"
        # Answers for the interactive prompts, in order: Enter to continue,
        # option 1 (single system install), then the installation directory.
        answers = "\n1\n%s\n" % bin_subdir
    else:
        cmd = f"{makelocalrc_filename} -x {compilers_subdir} -g77 /"
        answers = None

    run_shell_cmd(cmd, stdin=answers)

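
To illustrate how the stdin answers line up with the installer prompts, here is a minimal standalone sketch. It does not call makelocalrc or EasyBuild. The names `FAKE_INSTALLER`, `build_answers` and `run_with_answers` are hypothetical, and the fake installer is only a stand-in that reads three answers in the order the prompts appear in the output above.

```python
import subprocess
import sys

# Hypothetical stand-in for the interactive installer: it reads three lines
# from stdin, in the same order as the prompts shown in the installer output.
FAKE_INSTALLER = r"""
input()                 # "Press enter to continue..."
option = input()        # "Please choose install option:"
install_dir = input()   # "Installation directory? [...]"
print(f"option={option} dir={install_dir}")
"""


def build_answers(install_dir):
    """Build the stdin text: empty line, option 1, then the install directory."""
    return f"\n1\n{install_dir}\n"


def run_with_answers(install_dir):
    """Run the fake installer, feeding the answers through stdin."""
    result = subprocess.run(
        [sys.executable, "-c", FAKE_INSTALLER],
        input=build_answers(install_dir),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_with_answers("/path/to/compilers/bin"))
```

Passing the answers as a single string only works as long as the prompts keep appearing in this order. If the installer adds or reorders prompts in a later release, the answers would no longer match.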
Before opening a PR for this, I wanted to check whether somebody can reproduce the problem or whether it only happens on our end. Given the above, I find it very odd that the test builds for nvidia-compilers-25.3-CUDA-12.8.0.eb succeeded when this easyconfig was added to the repo.
The full build log is attached. This occurs with EasyBuild 5.2.1.
easybuild-nvidia-compilers-25.3-20260306.094811.vyyki.log