社消 Platform: Client Mount Failures Caused by a Non-Persistent NFS Kernel Module Configuration

[Screenshot 2025-08-15 101037]

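After a failure on the NFS server node, troubleshooting was completed, the node was rebooted, and scheduling was resumed. The rabbitmq pod nevertheless stayed in the ContainerCreating state, and its detailed events showed that it could not mount the NFS directory.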
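
To locate the affected pods and read their events, something like the following could be used. This is a minimal sketch, not the commands run during the incident: the helper name `stuck_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods -A --no-headers` (NAMESPACE, NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods -A --no-headers` output on stdin
# and print "namespace/name" for every pod whose STATUS is ContainerCreating.
stuck_pods() {
  awk '$4 == "ContainerCreating" { print $1 "/" $2 }'
}

# Example usage (needs cluster access, so it is left commented out):
#   kubectl get pods -A --no-headers | stuck_pods
#   kubectl -n <namespace> describe pod <pod-name>   # see the Events section
```

The Events section at the end of the `kubectl describe pod` output is where a failed volume mount is reported.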

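Logging in to the NFS node showed that the NFS container was stuck in a restart loop. Its logs showed the cause: the kernel had not loaded the nfsd module, so the container kept restarting.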
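
The same check can be scripted. The sketch below is an illustration only: `module_loaded` is a hypothetical helper name, and it assumes nfsd is built as a loadable module, since a driver compiled into the kernel is not listed in /proc/modules.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named module appears in a
# /proc/modules-style file (the module name is the first field of each line).
module_loaded() {
  name="$1"
  modules_file="${2:-/proc/modules}"
  awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

# Example usage on the NFS node:
#   module_loaded nfsd && echo "nfsd is loaded" || echo "nfsd is NOT loaded"
#   docker logs --tail 20 nfs-server
```

The optional second argument exists only so the function can be tried against a saved copy of /proc/modules instead of the live file.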

[Screenshot 2025-08-15 101705]

A former colleague most likely forgot to configure the nfsd module to load automatically at boot, so it was time to clean up after them...

root@spt-rabbitmq-service:~# lsmod | grep nfsd
# Load the module temporarily (lasts until the next reboot)
root@spt-rabbitmq-service:~# modprobe nfsd
root@spt-rabbitmq-service:~# lsmod | grep nfsd
nfsd                  524288  0
nfs_acl               262144  1 nfsd
auth_rpcgss           262144  2 nfsd,rpcsec_gss_krb5
lockd                 262144  2 nfsd,nfs
grace                 262144  2 nfsd,lockd
sunrpc                589824  8 nfsd,nfsv4,auth_rpcgss,lockd,rpcsec_gss_krb5,nfs_acl,nfs
# Load the module automatically at boot
root@spt-rabbitmq-service:~# echo "nfsd" >> /etc/modules-load.d/nfsd.conf
root@spt-rabbitmq-service:~# 
# Restart the NFS server container
root@spt-rabbitmq-service:~# docker restart nfs-server 
nfs-server
root@spt-rabbitmq-service:~# docker logs -f nfs-server --tail 20
----> starting rpc.nfsd on port 2049 with 8 server thread(s)
----> all services started normally

==================================================================
      SERVER STARTUP COMPLETE
==================================================================
----> list of enabled NFS protocol versions: 4.2, 4.1, 4, 3
----> list of container exports:
---->   /nfs *(rw,fsid=0,sync,no_root_squash,no_subtree_check,insecure)
---->   /nfs/rabbitmq *(rw,fsid=1,sync,no_root_squash,no_subtree_check,insecure)
---->   /nfs/rabbitmqBus *(rw,fsid=2,sync,no_root_squash,no_subtree_check,insecure)
----> list of container ports that should be exposed:
---->   111 (TCP and UDP)
---->   2049 (TCP and UDP)
---->   32765 (TCP and UDP)
---->   32767 (TCP and UDP)

==================================================================
      READY AND WAITING FOR NFS CLIENT CONNECTIONS
==================================================================
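
One detail of the session above: the `echo "nfsd" >> ...` line appends another `nfsd` entry every time it is run. A minimal sketch of an idempotent variant follows. `ensure_module_line` is a hypothetical helper name, and the path in the usage comment is the one used in the session.

```shell
#!/bin/sh
# Hypothetical helper: make sure a modules-load.d style file contains the
# module name exactly once, creating the file and its directory if needed.
ensure_module_line() {
  module="$1"
  conf="$2"
  mkdir -p "$(dirname "$conf")"
  touch "$conf"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF "$module" "$conf" || printf '%s\n' "$module" >> "$conf"
}

# Example usage (as root):
#   ensure_module_line nfsd /etc/modules-load.d/nfsd.conf
```

On systemd-based systems, files under /etc/modules-load.d/ are read at boot by systemd-modules-load.service, which is what makes the setting survive a reboot.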

Finally, wait for the pod to retry the mount, or delete the pod manually so that it is recreated.
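
The manual route could look like the sketch below. `recreate_pod` is a hypothetical helper name, the namespace and pod name are placeholders, and the sketch assumes the pod is managed by a controller (for example a StatefulSet or Deployment), since a bare pod would not come back after deletion.

```shell
#!/bin/sh
# Hypothetical helper: delete a pod so that its controller creates a fresh one,
# which then retries the NFS mount.
recreate_pod() {
  namespace="$1"
  pod="$2"
  kubectl -n "$namespace" delete pod "$pod"
}

# Example usage (placeholders, left commented out):
#   recreate_pod <namespace> <pod-name>
#   kubectl -n <namespace> get pod -w   # watch the replacement reach Running
```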
