DCPMM Cache Crash? #151

Open
iskyryan opened this issue Sep 10, 2020 · 1 comment
Comments

iskyryan commented Sep 10, 2020

Can having too much DRAM cause stability issues with DCPMM? We currently have 4x128GB DRAM with 2x512GB DCPMM (a 2:1 DCPMM-to-DRAM ratio) in Memory Mode, running analytics with a Python Darc library. The system works fine with an 80GB file, but with anything 100GB and up it crashes and reboots. We suspect some sort of issue with the memory controller. We understand the "performance" recommendation is 4:1; will going beyond that cause problems?
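
For reference, the capacities and ratio described above can be checked with a few lines of arithmetic. This is only a sketch: the function name `pmem_to_dram_ratio` is made up for illustration, and the DIMM counts and sizes are the ones quoted in the comment. In Memory Mode the DRAM generally acts as a cache in front of the persistent memory, so the capacity the OS sees should be roughly the DCPMM total rather than the sum of both.

```python
def pmem_to_dram_ratio(dram_dimms, dram_gb, pmem_dimms, pmem_gb):
    """Return (total DRAM GB, total DCPMM GB, DCPMM:DRAM ratio).

    Hypothetical helper for illustration only; it is not part of ipmctl
    or any other tool mentioned in this thread.
    """
    dram_total = dram_dimms * dram_gb
    pmem_total = pmem_dimms * pmem_gb
    return dram_total, pmem_total, pmem_total / dram_total


# Configuration from the comment above: 4x128GB DRAM, 2x512GB DCPMM.
dram, pmem, ratio = pmem_to_dram_ratio(4, 128, 2, 512)
print(f"DRAM {dram} GB, DCPMM {pmem} GB, ratio {ratio:g}:1")
# prints: DRAM 512 GB, DCPMM 1024 GB, ratio 2:1
```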

@sscargal
Contributor

@iskyryan I would not expect any issues with that DRAM:PMEM configuration. Could you please provide more information about the issue:

  • What error(s) do you see in or around the time of the unplanned reboot? Is it an OOM (Out of Memory) Killer reboot, a Machine Check Exception (MCE), or something else?
  • What mode are you using? Memory Mode or AppDirect?
  • What server make/model are you using?
  • What BIOS version do you have? (Output of `dmidecode -t bios`)
  • What PMem Firmware do you have? (Output of `ipmctl show -dimm`)
  • What OS are you using? Linux or Windows + Version?
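
The information requested above could be gathered in one pass with something like the following. This is a rough sketch, not an official tool: the `collect` function and the report keys are hypothetical names, `uname -a` is added here as an assumption for the Linux OS/kernel question, and `dmidecode` and `ipmctl` typically need root privileges to return useful output.

```python
import shutil
import subprocess

# Commands taken from the questions above; "uname -a" is an added
# assumption covering the OS/kernel version on Linux.
COMMANDS = {
    "bios": ["dmidecode", "-t", "bios"],
    "pmem_dimms": ["ipmctl", "show", "-dimm"],
    "kernel": ["uname", "-a"],
}


def collect(commands=None):
    """Run each command and return {name: output or a short error note}."""
    report = {}
    for name, argv in (commands or COMMANDS).items():
        # Skip tools that are not installed instead of raising.
        if shutil.which(argv[0]) is None:
            report[name] = f"{argv[0]}: not installed"
            continue
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=30
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            report[name] = f"{argv[0]}: failed ({exc})"
            continue
        # Fall back to stderr so permission errors remain visible.
        report[name] = result.stdout.strip() or result.stderr.strip()
    return report


if __name__ == "__main__":
    for name, output in collect().items():
        print(f"=== {name} ===\n{output}\n")
```

Run as root, then paste the output into the issue. Error messages around the time of the reboot (OOM killer, MCE) still need to be pulled from the system logs separately.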
