CentOS 8 - PBS Pro Single-Node Installation

1 Install dependencies

Install the dependencies needed for the build:

yum install -y gcc make rpm-build libtool hwloc-devel \
libX11-devel libXt-devel libedit-devel libical-devel \
ncurses-devel perl postgresql-devel postgresql-contrib python3-devel tcl-devel \
tk-devel swig expat-devel openssl-devel libXext libXft \
autoconf automake

However, libical-devel fails to install this way here and has to be installed manually.

On the aarch64 architecture:

sudo rpm -ivh https://rpmfind.net/linux/centos/8-stream/AppStream/aarch64/os/Packages/libical-devel-3.0.3-3.el8.aarch64.rpm

This then sets off a chain of "recursive dependencies"; you can write a script to install them recursively.
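One way to script this, as a rough sketch: ask rpm for a dry run (`--test`), pull the missing dependency names out of its "Failed dependencies" message, and hand them to yum, which resolves the rest of the chain from the enabled repositories. The helper name `missing_deps` is made up for illustration, and the sketch assumes rpm's usual "... is needed by ..." wording.

```shell
#!/bin/sh
# Hypothetical helper: read rpm's "Failed dependencies" output on stdin
# and print each missing dependency name once, one per line.
missing_deps() {
    sed -n 's/^[[:space:]]*\([^[:space:]]*\).* is needed by .*$/\1/p' | sort -u
}

# Example usage (not run here; the URL is the one from the command above):
# url=https://rpmfind.net/linux/centos/8-stream/AppStream/aarch64/os/Packages/libical-devel-3.0.3-3.el8.aarch64.rpm
# rpm -ivh --test "$url" 2>&1 | missing_deps | xargs -r sudo yum install -y
# sudo rpm -ivh "$url"
```

If yum cannot find one of the names in any enabled repository, that package still has to be fetched by hand in the same way.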

Install the dependencies needed at runtime:

$ yum install -y expat libedit postgresql-server postgresql-contrib python3 \
sendmail sudo tcl tk libical

2 Create a non-root user and download the source

Create the user:

$ sudo useradd -m stu
$ sudo passwd stu

Download the source:

$ git clone -b release_20_0_branch https://github.com/openpbs/openpbs.git
$ cd openpbs

3 Build and install

$ ./autogen.sh
$ ./configure --help # list the available build options
$ ./configure --prefix=/opt/pbs

$ make
$ sudo make install
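The build steps above can be wrapped in a small function that stops at the first failing step. This is only a sketch: `run`, `build_openpbs`, and the `DRY_RUN` switch are names invented here, and the function assumes it is called from inside the openpbs source directory.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set
# (DRY_RUN is a hypothetical switch added for this sketch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build and install from the current source directory.
# $1: install prefix (defaults to /opt/pbs, as used above).
build_openpbs() {
    prefix=${1:-/opt/pbs}
    run ./autogen.sh &&
    run ./configure --prefix="$prefix" &&
    run make &&
    run sudo make install
}

# Example usage (not run here):
# DRY_RUN=1 build_openpbs    # print the steps only
# build_openpbs              # actually build and install
```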

4 Configuration

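$ sudo systemctl stop firewalld.service # stop the firewall

$ sudo /opt/pbs/libexec/pbs_postinstall # run the post-install script to initialize the node
$ sudo vi /etc/pbs.conf # single node: set PBS_START_MOM to 1 so that the server also acts as a compute node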
$ sudo chmod 4755 /opt/pbs/sbin/pbs_iff /opt/pbs/sbin/pbs_rcp
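If you would rather not edit the file by hand, the PBS_START_MOM change can be scripted. The sketch below assumes the file uses plain `KEY=value` lines and that GNU sed (`sed -i`) is available, which is the case on CentOS; the function name `enable_mom` is made up for illustration.

```shell
#!/bin/sh
# Set PBS_START_MOM=1 in a pbs.conf-style file.
# $1: path to the config file (defaults to /etc/pbs.conf).
enable_mom() {
    conf=${1:-/etc/pbs.conf}
    if grep -q '^PBS_START_MOM=' "$conf"; then
        # The key exists: rewrite its value in place.
        sed -i 's/^PBS_START_MOM=.*/PBS_START_MOM=1/' "$conf"
    else
        # The key is missing: append it.
        echo 'PBS_START_MOM=1' >> "$conf"
    fi
}
```

Since /etc/pbs.conf is owned by root, the function needs to run in a root shell when pointed at the real file.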

5 Start the service

$ sudo /etc/init.d/pbs start

Apply the PATH and MANPATH updates:

$ . /etc/profile.d/pbs.sh

6 Create the node and the work queue

$ sudo /opt/pbs/bin/qmgr
qmgr: create node hostname
qmgr: create queue workq queue_type=e,enabled=t,started=t
qmgr: ^D
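The same qmgr session can also be run without typing at the prompt, since qmgr reads commands from standard input. A minimal sketch, where `qmgr_setup_commands` is a hypothetical helper and the node name defaults to the output of `hostname`:

```shell
#!/bin/sh
# Print the qmgr commands from the session above.
# $1: node name (defaults to this machine's hostname).
qmgr_setup_commands() {
    node=${1:-$(hostname)}
    printf 'create node %s\n' "$node"
    printf 'create queue workq queue_type=e,enabled=t,started=t\n'
}

# Example usage (not run here): pipe the commands into qmgr.
# qmgr_setup_commands | sudo /opt/pbs/bin/qmgr
```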

7 Test

Test submitting a job as a non-root user:

$ su stu
$ qstat -a
$ echo "sleep 60" | qsub
$ qstat -f Job_ID
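Because qsub prints the ID of the new job on standard output, the Job_ID placeholder above does not have to be copied by hand. A small sketch (the function name `submit_and_show` is made up here):

```shell
#!/bin/sh
# Submit a 60-second sleep job and print its full status.
submit_and_show() {
    # Capture the job ID that qsub prints; give up if submission fails.
    job_id=$(echo "sleep 60" | qsub) || return 1
    qstat -f "$job_id"
}

# Example usage (not run here), as the non-root user:
# submit_and_show
```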

8 Some problems

8.1 qstat: cannot connect to server

Check the hosts file and add the machine's IP address and hostname to it:

$ uname -a
$ sudo vi /etc/hosts
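Before editing, it can help to check whether the hostname is already listed. The sketch below looks for the name as a whole field on any non-comment line of a hosts-format file; `has_host_entry` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the given name appears as a hostname or alias in a
# hosts-format file.
# $1: hostname to look for; $2: file (defaults to /etc/hosts).
has_host_entry() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # strip comments before checking
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Example usage (not run here):
# has_host_entry "$(hostname)" || echo "no /etc/hosts entry for $(hostname)"
```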

8.2 qmgr: cannot connect to server node

Check the node with pbsnodes -a.

Delete the node and create it again:

$ sudo /opt/pbs/bin/qmgr
qmgr: delete node hostname
qmgr: create node hostname
qmgr: ^D

8.3 Node state != free

Same as above: delete the node and create it again.

In the normal case:

MOM = hostname
state = free
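To spot problem nodes quickly, the `pbsnodes -a` output can be filtered for states other than free. This sketch assumes the usual layout, with each node name on an unindented line followed by indented `key = value` lines; `nodes_not_free` is a made-up name.

```shell
#!/bin/sh
# Read `pbsnodes -a` output on stdin and print "<node> <state>" for
# every node whose state is not exactly "free".
nodes_not_free() {
    awk '
        /^[^[:space:]]/ { node = $1 }    # unindented line: node name
        $1 == "state" && $2 == "=" {
            state = $0
            sub(/^[[:space:]]*state = /, "", state)
            if (state != "free") print node, state
        }
    '
}

# Example usage (not run here):
# pbsnodes -a | nodes_not_free
```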

8.4 Other

$ tail -n 10 /var/spool/pbs/server_logs/20200610 # view the log

$ qmgr
qmgr: set queue workq resources_max.walltime = 120:00:00 # set the maximum walltime for jobs in the queue
qmgr: set node localhost.localdomain max_user_run=1 # limit each user to 1 running job on this node
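The log file in the tail command above is named after the date (YYYYMMDD), so the path changes every day. A small sketch for building it, assuming the same /var/spool/pbs location; `server_log_path` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the path of the server log for a given day.
# $1: date as YYYYMMDD (defaults to today).
server_log_path() {
    day=${1:-$(date +%Y%m%d)}
    printf '/var/spool/pbs/server_logs/%s\n' "$day"
}

# Example usage (not run here):
# sudo tail -n 10 "$(server_log_path)"
```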

References:

1 https://github.com/openpbs/openpbs

2 https://github.com/openpbs/openpbs/blob/master/INSTALL

3 https://pkgs.org/

4 https://rpmfind.net/linux/RPM/index.html

5 http://community.pbspro.org/