Document parsing fails when RAGFlow is started in Docker with docker-compose-gpu.yml [tested and solved]
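0. Background
To run RAGFlow on GPU resources, you can start it with docker-compose-gpu.yml. (However, the official startup example targets x86 (CPU) platforms rather than NVIDIA GPUs, and docker-compose-gpu.yml is community-maintained rather than actively maintained by the RAGFlow team, so it has a few minor issues.)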
1. Problem
After starting RAGFlow in Docker with docker-compose-gpu.yml, document parsing fails.
Error message:
18:10:23 [ERROR][Exception]: NCCL Error 2: unhandled system error (run with NCCL_DEBUG=INFO for details)
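To confirm that a failed parse is caused by this NCCL error rather than something else, the container log can be filtered for NCCL lines. The following is a minimal Python sketch that assumes the log format shown above; `find_nccl_errors` is a hypothetical helper name, not part of RAGFlow.

```python
import re

# Matches the relevant part of lines such as:
# 18:10:23 [ERROR][Exception]: NCCL Error 2: unhandled system error (...)
NCCL_ERROR = re.compile(r"NCCL Error (\d+): (.*)")


def find_nccl_errors(log_text):
    """Return (code, message) pairs for every NCCL error line in log_text."""
    found = []
    for line in log_text.splitlines():
        match = NCCL_ERROR.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2).strip()))
    return found


# Sample line copied from the error reported above.
sample = (
    "18:10:23 [ERROR][Exception]: NCCL Error 2: unhandled system error "
    "(run with NCCL_DEBUG=INFO for details)"
)
for code, message in find_nccl_errors(sample):
    print(f"NCCL error {code}: {message}")
```

In practice, `log_text` would be the captured output of `docker logs ragflow-server`.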
2. Solution
(1) Modify docker-compose-gpu.yml (only minor changes)
Below is the complete docker-compose-gpu.yml after modification; it can be copied as-is.
# The RAGFlow team do not actively maintain docker-compose-gpu.yml, so use them at your own risk.
# However, you are welcome to file a pull request to improve it.
include:
  - ./docker-compose-base.yml
services:
  ragflow:
    depends_on:
      mysql:
        condition: service_healthy
    image: ${RAGFLOW_IMAGE}
    container_name: ragflow-server
    ports:
      - ${SVR_HTTP_PORT}:9380
      - 80:80
      - 443:443
    volumes:
      - ./ragflow-logs:/ragflow/logs
      - ./nginx/ragflow.conf:/etc/nginx/conf.d/ragflow.conf
      - ./nginx/proxy.conf:/etc/nginx/proxy.conf
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
    env_file: .env
    ipc: host
    shm_size: 8g
    environment:
      - TZ=${TIMEZONE}
      - HF_ENDPOINT=${HF_ENDPOINT}
      - MACOS=${MACOS}
      - NCCL_DEBUG=INFO
    networks:
      - ragflow
    restart: on-failure
    # https://docs.docker.com/engine/daemon/prometheus/#create-a-prometheus-configuration
    # If you're using Docker Desktop, the --add-host flag is optional. This flag makes sure that the host's internal IP gets exposed to the Prometheus container.
    extra_hosts:
      - "host.docker.internal:host-gateway"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
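Parameter notes:
ipc: host: lets the container share the host's IPC namespace, which resolves the NCCL multi-GPU communication problem
shm_size: 8g: increases the shared memory size (the default of 64 MB is not enough)
NCCL_DEBUG=INFO (under environment): makes NCCL print detailed diagnostics, as the error message itself suggests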
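Whether the shared memory setting took effect can be checked from inside the container. The following is a minimal Python sketch, assuming a Python 3 interpreter is available in the container; `shm_size_mib` and `exceeds_docker_default` are hypothetical helper names, and the 64 MiB threshold is simply the Docker default cited above.

```python
import os
import shutil

# Docker's default /dev/shm size for a container, as cited above.
DOCKER_DEFAULT_SHM_MIB = 64


def shm_size_mib(path="/dev/shm"):
    """Return the total size in MiB of the filesystem that holds path."""
    return shutil.disk_usage(path).total / (1024 * 1024)


def exceeds_docker_default(total_mib):
    """Return True if the size is larger than Docker's 64 MiB default."""
    return total_mib > DOCKER_DEFAULT_SHM_MIB


# Only query /dev/shm where it exists (it is absent on some systems).
if os.path.isdir("/dev/shm"):
    size = shm_size_mib()
    print(f"/dev/shm: {size:.0f} MiB, above default: {exceeds_docker_default(size)}")
else:
    print("/dev/shm not found on this system")
```

Saved as, say, check_shm.py (a hypothetical file name), it could be run with `docker exec -i ragflow-server python3 - < check_shm.py`. Note that with ipc: host the container is likely to report the host's /dev/shm size rather than 8g, since the IPC namespace is shared with the host.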
(2) Restart RAGFlow with docker-compose-gpu.yml
docker compose -f docker-compose-gpu.yml up -d
(3) Follow the ragflow-server logs to confirm the server is running
docker logs -f ragflow-server
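Besides watching the log by eye, startup can also be checked by polling the web port. The following is a minimal Python sketch, assuming the RAGFlow web UI answers on http://localhost:80 (inferred from the 80:80 port mapping in the compose file); `is_reachable` and `wait_until_reachable` are hypothetical helper names.

```python
import http.client
import time
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the target is a local service.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_reachable(url, timeout=3.0):
    """Return True if the server at url sends back any HTTP response."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False


def wait_until_reachable(url, attempts=30, delay=2.0):
    """Poll url up to `attempts` times, sleeping `delay` seconds after each miss."""
    for _ in range(attempts):
        if is_reachable(url):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until_reachable("http://localhost:80")` returns True once the port answers, or False after roughly a minute of refused connections. A reachable port only shows that the web server is up; whether document parsing works still has to be checked as in the next step.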
(4) Check whether document parsing now succeeds
A successful parse shows the following result:
With that, the problem is solved!