ValueError: 151859 is not in list #5
Comments
Hi, it looks like your input_ids are of float type? They should be long. Could something be going wrong in the preprocessing?
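I didn't do any extra processing; the data and code are essentially unchanged. The input is indeed long at the start. The float type is probably the result of an inf changing the dtype of the whole tensor. I can see inf values, and I suspect fp16 is the cause (my training machine doesn't support bf16). Have you tried training with fp16?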
It may be an fp16 precision issue. Could you try forcibly casting input_ids to Long? We run with bf16 on A100s, so we haven't tried fp16.
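A cast definitely won't work. An inf means the precision has already been lost, so the original values can't be recovered. I did try bf16 on a 3090 yesterday and it does work, but it ran out of GPU memory.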
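To illustrate why a cast back to Long cannot help, here is a minimal sketch using only the Python standard library. It assumes the ids pass through IEEE 754 half precision somewhere in the pipeline, which this thread suspects but has not confirmed. The helper name `survives_fp16` is hypothetical and not part of the repository.

```python
import struct

def survives_fp16(token_id: int) -> bool:
    """Return True if token_id round-trips exactly through IEEE 754 half precision."""
    try:
        # "<e" is the struct format for a little-endian 16-bit float.
        packed = struct.pack("<e", float(token_id))
    except OverflowError:
        # The value exceeds the fp16 range (largest finite value is 65504);
        # a tensor cast to fp16 would yield inf at this point.
        return False
    return struct.unpack("<e", packed)[0] == token_id

print(survives_fp16(1024))    # small id, exactly representable
print(survives_fp16(151859))  # the id from the error message, out of fp16 range
```

The second call reports that 151859 does not survive, since it is above the largest finite fp16 value. Once such an id has become inf, casting back to an integer type has nothing to recover the original value from.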
image = image[ : image.index(self.config.visual['image_start_id'] + 2)]
ValueError: 151859 is not in list
I'm using the original data and code and keep getting this error. Could you take a look?
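As a debugging aid, the failing lookup could be wrapped so that the error reports whether the sequence contains corrupted values. This is only a sketch under the assumption that `image` is a plain Python list at that point; `slice_before_marker` is a hypothetical helper, not a function from the repository.

```python
import math

def slice_before_marker(ids, marker_id):
    """Return ids up to (not including) marker_id, with a diagnostic error if it is absent."""
    try:
        end = ids.index(marker_id)
    except ValueError:
        # Count values that cannot be valid token ids (inf, nan, fractional floats).
        bad = [v for v in ids
               if isinstance(v, float) and (not math.isfinite(v) or not v.is_integer())]
        raise ValueError(
            f"marker {marker_id} not in sequence; "
            f"{len(bad)} of {len(ids)} values are inf/nan/fractional"
        ) from None
    return ids[:end]

print(slice_before_marker([10, 11, 151859, 12], 151859))  # → [10, 11]
```

If the count of bad values in the message is nonzero, the ids were damaged before the lookup, which points at the precision of the upstream tensor rather than at the data itself.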