xiaoSUM
changed the title
Rhino-bird Angel in Practice - Liu Qian
2021Tencent Rhino-bird Open-source Training Program—Angel
Aug 12, 2021
xiaoSUM
changed the title
2021Tencent Rhino-bird Open-source Training Program—Angel
2021Tencent Rhino-bird Open-source Training Program—Angel--Liu Qian
Aug 16, 2021
I. Angel algorithm example
1.1 LR-spark-on-angel output
1.2 Debug
1. Version conflict involving netty-all-4.1.1.Final.jar and json4s-jackson_2.11-3.4.2.jar
Edit the pom files of angel-ps and spark-on-angel so that they use the versions above.
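As a rough sketch of what that pom change could look like, the fragment below pins the two artifacts to the versions named above. The group IDs are the standard Maven coordinates for these libraries. Placing them under `<dependencyManagement>` is an assumption; the actual angel-ps and spark-on-angel pom files may declare them directly under `<dependencies>` or through version properties, so adapt accordingly.

```xml
<!-- Sketch only: pins the two versions mentioned in this issue.
     The dependencyManagement placement is an assumption, not taken
     from the actual angel-ps / spark-on-angel pom files. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
      <version>4.1.1.Final</version>
    </dependency>
    <dependency>
      <groupId>org.json4s</groupId>
      <artifactId>json4s-jackson_2.11</artifactId>
      <version>3.4.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Running `mvn dependency:tree` in each module afterwards is one way to confirm which versions are actually resolved.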
2. Software versions with which the project ran successfully
apache-maven-3.8.1
hadoop-2.7.2
jdk1.8.0_161
protobuf-2.5.0
scala-2.11.8
spark-2.3.0-bin-hadoop2.7
angel-2.4.0-bin
II. PyTorch on Angel algorithm example
1.1 DeepFM for PyTorch on Angel output
http://hadoop001:8088/cluster/apps
1.2 Debug
1. CMake error
Add the following to the Dockerfile:
ENV Torch_DIR=/opt/libtorch/share/cmake/Torch
2. PyTorch and torchvision versions do not match
Add torchvision=0.4.2 to the Dockerfile.
3. spark-submit submission script
source /home/liuqian/angel/angel/dist/target/angel-2.4.0-bin/bin/spark-on-angel-env.sh
4. Memory problem
Set yarn.scheduler.capacity.maximum-am-resource-percent to 0.6.
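For reference, this property normally lives in `capacity-scheduler.xml` in the Hadoop configuration directory and caps the fraction of cluster resources that may be used by ApplicationMasters (the default is 0.1). A minimal fragment with the value used here might look like the following; the file location on this particular cluster is an assumption.

```xml
<!-- capacity-scheduler.xml (assumed location: the Hadoop conf directory) -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <!-- Raised from the default 0.1 so that more ApplicationMasters can start -->
  <value>0.6</value>
</property>
```

After editing the file, `yarn rmadmin -refreshQueues` (or a ResourceManager restart) should be enough for the new value to take effect.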
5. Unreasonable memory allocation in the submission script
ps log
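Switched to a machine with more physical memory and set up the same environment as before; the YARN settings and submission script are as follows: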