some errors in training magic-point model #318

zaijianlixiang1996 · 2024-03-13T13:50:31Z

Hi rpautrat, sorry to trouble you. I have a problem training magic-point on Synthetic Shapes:
(superpoint) cc@cc-System-Product-Name:~/下载/SuperPoint-master/superpoint$ python experiment.py train configs/magic-point_shapes.yaml magic-point_synth
/home/cc/anaconda3/envs/superpoint/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/cc/anaconda3/envs/superpoint/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/cc/anaconda3/envs/superpoint/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/cc/anaconda3/envs/superpoint/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/cc/anaconda3/envs/superpoint/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/cc/anaconda3/envs/superpoint/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
[03/13/2024 21:46:57 INFO] Running command TRAIN
[03/13/2024 21:46:57 INFO] Number of GPUs detected: 1
[03/13/2024 21:46:58 INFO] Extracting archive for primitive draw_lines.
[03/13/2024 21:46:59 INFO] Extracting archive for primitive draw_polygon.
[03/13/2024 21:47:00 INFO] Extracting archive for primitive draw_multiple_polygons.
[03/13/2024 21:47:02 INFO] Extracting archive for primitive draw_ellipses.
[03/13/2024 21:47:03 INFO] Extracting archive for primitive draw_star.
[03/13/2024 21:47:04 INFO] Extracting archive for primitive draw_checkerboard.
[03/13/2024 21:47:05 INFO] Extracting archive for primitive draw_stripes.
[03/13/2024 21:47:07 INFO] Extracting archive for primitive draw_cube.
[03/13/2024 21:47:08 INFO] Extracting archive for primitive gaussian_noise.
[03/13/2024 21:47:10 INFO] Caching data, fist access will take some time.
[03/13/2024 21:47:10 INFO] Caching data, fist access will take some time.
[03/13/2024 21:47:10 INFO] Caching data, fist access will take some time.
2024-03-13 21:47:10.805028: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2024-03-13 21:47:10.891160: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-03-13 21:47:10.891239: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: NVIDIA GeForce RTX 4060 Ti major: 8 minor: 9 memoryClockRate(GHz): 2.58
pciBusID: 0000:01:00.0
totalMemory: 15.70GiB freeMemory: 15.49GiB
2024-03-13 21:47:10.891247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2024-03-13 21:47:11.361531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2024-03-13 21:47:11.361549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2024-03-13 21:47:11.361552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2024-03-13 21:47:11.361608: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2024-03-13 21:47:11.361626: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14988 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 4060 Ti, pci bus id: 0000:01:00.0, compute capability: 8.9)
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:11 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
[03/13/2024 21:47:12 INFO] Scale of 0 disables regularizer.
2024-03-13 21:47:12.284377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2024-03-13 21:47:12.284413: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2024-03-13 21:47:12.284416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2024-03-13 21:47:12.284418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2024-03-13 21:47:12.284468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14988 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 4060 Ti, pci bus id: 0000:01:00.0, compute capability: 8.9)
[03/13/2024 21:47:14 INFO] Start training
2024-03-13 21:47:20.463620: F tensorflow/core/common_runtime/bfc_allocator.cc:458] Check failed: c->in_use() && (c->bin_num == kInvalidBinNum)
Aborted (core dumped)
Can you give me some advice?

rpautrat · 2024-03-13T15:34:53Z

Hi, this is most probably due to an incompatible TensorFlow version. This has been reported in multiple previous issues, including this one: #238 (comment)
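To make a version mismatch easier to report, it can help to collect the GPU's compute capability and the installed TensorFlow version in one place. The sketch below is only an illustration: the helper names `gpu_from_log` and `installed_tf_version` are hypothetical and not part of SuperPoint or TensorFlow, and the regular expression assumes the "Created TensorFlow device" line format shown in the log above. It does not decide which TensorFlow version is compatible; that still has to be checked against the linked issue.

```python
import re

# Assumed format, taken from the startup log above:
# "... name: NVIDIA GeForce RTX 4060 Ti, pci bus id: 0000:01:00.0, compute capability: 8.9)"
_DEVICE_RE = re.compile(
    r"name: (?P<name>[^,]+), pci bus id: [^,]+, "
    r"compute capability: (?P<major>\d+)\.(?P<minor>\d+)"
)


def gpu_from_log(log_text):
    """Return (gpu_name, (major, minor)) parsed from a TF startup log, or None."""
    match = _DEVICE_RE.search(log_text)
    if match is None:
        return None
    capability = (int(match.group("major")), int(match.group("minor")))
    return match.group("name"), capability


def installed_tf_version():
    """Return the installed TensorFlow version string, or None if it cannot be imported."""
    try:
        import tensorflow as tf  # imported lazily so the parser works without TF
    except ImportError:
        return None
    return tf.__version__
```

Calling `gpu_from_log` on the saved training log and `installed_tf_version()` inside the `superpoint` environment gives the two values worth quoting when comparing against the earlier reports.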

2896963297 · 2024-11-05T14:40:55Z

Hi, did you manage to solve this problem? I ran into it too. Could you give me some pointers?
