
[Bug] AttributeError: function 'TVMGetLastPythonError' not found. Did you mean: 'TVMAPISetLastPythonError'? #1112

Closed
David-Sharma opened this issue Oct 23, 2023 · 16 comments
Labels
bug Confirmed bugs

Comments

@David-Sharma (Contributor)

🐛 Bug

(base) C:\Users\dmsha\dev\mlc>python -m mlc_llm.build --model Llama-2-7b-chat-hf --target vulkan --quantization q4f16_1 --llvm-mingw path/to/llvm-mingw

Compiling models under Windows 11 had not been an issue for me until about a week ago, and I have not been able to resolve it. A reload has not cleared this problem.

To Reproduce

Steps to reproduce the behavior:

1. (base) C:\Users\dmsha\dev\mlc>python -m mlc_llm.build --model Llama-2-7b-chat-hf --target vulkan --quantization q4f16_1 --llvm-mingw path/to/llvm-mingw

Using path "dist\models\Llama-2-7b-chat-hf" for model "Llama-2-7b-chat-hf"
Target configured: vulkan -keys=vulkan,gpu -max_num_threads=256 -max_shared_memory_per_block=32768 -max_threads_per_block=256 -supports_16bit_buffer=1 -supports_8bit_buffer=1 -supports_float16=1 -supports_float32=1 -supports_int16=1 -supports_int32=1 -supports_int8=1 -supports_storage_buffer_storage_class=1 -thread_warp_size=1

The following two lines repeat many times:
[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_dim
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_strategy
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

[14:37:27] D:\a\package\package\tvm\src\relax\ir\expr.cc:174: Check failed: index < tuple_info->fields.size() (197 vs. 197) : Index out of bounds: Tuple params is of size 197, and cannot be accessed with index 197
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\build.py", line 46, in <module>
    main()
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\build.py", line 42, in main
    core.build_model_from_args(parsed_args)
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\core.py", line 648, in build_model_from_args
    new_params = utils.convert_weights(param_manager, params, args)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\utils.py", line 229, in convert_weights
    mod_transform = relax.transform.LazyTransformParams()(mod_transform)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\ir\transform.py", line 238, in __call__
    return _ffi_transform_api.RunPass(self, mod)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\_ffi\_ctypes\packed_func.py", line 239, in __call__
    raise_last_ffi_error()
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\_ffi\base.py", line 415, in raise_last_ffi_error
    _LIB.TVMGetLastPythonError.restype = ctypes.c_void_p
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\ctypes\__init__.py", line 389, in __getattr__
    func = self.__getitem__(name)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\ctypes\__init__.py", line 394, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: function 'TVMGetLastPythonError' not found. Did you mean: 'TVMAPISetLastPythonError'?

Expected behavior

As before, I expect a folder containing the weights and library files.

Environment

  • Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): Vulkan
  • Operating system (e.g. Ubuntu/Windows/MacOS/...): Windows 11
  • Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...) PC RTX3060
  • How you installed MLC-LLM (conda, source): Conda
  • How you installed TVM-Unity (pip, source):pip
  • Python version (e.g. 3.10): V3.11.4
  • GPU driver version (if applicable):
  • CUDA/cuDNN version (if applicable):
  • TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):

USE_NVTX: OFF
USE_GTEST: AUTO
SUMMARIZE: OFF
USE_IOS_RPC: OFF
USE_MSC: OFF
USE_ETHOSU:
CUDA_VERSION: NOT-FOUND
USE_LIBBACKTRACE: AUTO
DLPACK_PATH: 3rdparty/dlpack/include
USE_TENSORRT_CODEGEN: OFF
USE_THRUST: OFF
USE_TARGET_ONNX: OFF
USE_AOT_EXECUTOR: ON
BUILD_DUMMY_LIBTVM: OFF
USE_CUDNN: OFF
USE_TENSORRT_RUNTIME: OFF
USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF
USE_CCACHE: AUTO
USE_ARM_COMPUTE_LIB: OFF
USE_CPP_RTVM:
USE_OPENCL_GTEST: /path/to/opencl/gtest
USE_MKL: OFF
USE_PT_TVMDSOOP: OFF
MLIR_VERSION: NOT-FOUND
USE_CLML: OFF
USE_STACKVM_RUNTIME: OFF
USE_GRAPH_EXECUTOR_CUDA_GRAPH: OFF
ROCM_PATH: /opt/rocm
USE_DNNL: OFF
USE_VITIS_AI: OFF
USE_MLIR: OFF
USE_RCCL: OFF
USE_LLVM: llvm-config --link-static
USE_VERILATOR: OFF
USE_TF_TVMDSOOP: OFF
USE_THREADS: ON
USE_MSVC_MT: OFF
BACKTRACE_ON_SEGFAULT: OFF
USE_GRAPH_EXECUTOR: ON
USE_NCCL: OFF
USE_ROCBLAS: OFF
GIT_COMMIT_HASH: 30b4fa3c13fc80d5c9151a9dc445d22c57ced3e0
USE_VULKAN: ON
USE_RUST_EXT: OFF
USE_CUTLASS: OFF
USE_CPP_RPC: OFF
USE_HEXAGON: OFF
USE_CUSTOM_LOGGING: OFF
USE_UMA: OFF
USE_FALLBACK_STL_MAP: OFF
USE_SORT: ON
USE_RTTI: ON
GIT_COMMIT_TIME: 2023-10-17 21:33:54 -0700
USE_HEXAGON_SDK: /path/to/sdk
USE_BLAS: none
USE_ETHOSN: OFF
USE_LIBTORCH: OFF
USE_RANDOM: ON
USE_CUDA: OFF
USE_COREML: OFF
USE_AMX: OFF
BUILD_STATIC_RUNTIME: OFF
USE_CMSISNN: OFF
USE_KHRONOS_SPIRV: OFF
USE_CLML_GRAPH_EXECUTOR: OFF
USE_TFLITE: OFF
USE_HEXAGON_GTEST: /path/to/hexagon/gtest
PICOJSON_PATH: 3rdparty/picojson
USE_OPENCL_ENABLE_HOST_PTR: OFF
INSTALL_DEV: OFF
USE_PROFILER: ON
USE_NNPACK: OFF
LLVM_VERSION: 17.0.2
USE_OPENCL: OFF
COMPILER_RT_PATH: 3rdparty/compiler-rt
RANG_PATH: 3rdparty/rang/include
USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF
USE_OPENMP: OFF
USE_BNNS: OFF
USE_CUBLAS: OFF
USE_METAL: OFF
USE_MICRO_STANDALONE_RUNTIME: OFF
USE_HEXAGON_EXTERNAL_LIBS: OFF
USE_ALTERNATIVE_LINKER: AUTO
USE_BYODT_POSIT: OFF
USE_HEXAGON_RPC: OFF
USE_MICRO: OFF
DMLC_PATH: 3rdparty/dmlc-core/include
INDEX_DEFAULT_I64: ON
USE_RELAY_DEBUG: OFF
USE_RPC: ON
USE_TENSORFLOW_PATH: none
TVM_CLML_VERSION:
USE_MIOPEN: OFF
USE_ROCM: OFF
USE_PAPI: OFF
USE_CURAND: OFF
TVM_CXX_COMPILER_PATH: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.35.32215/bin/HostX64/x64/cl.exe
HIDE_PRIVATE_SYMBOLS: OFF

  • Any other relevant information:
    python -c "import tvm; print(tvm.__file__)"
    C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\__init__.py

python -c "import tvm; print(tvm._ffi.base._LIB)"
<CDLL 'C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\tvm.dll', handle 7ffa41c20000 at 0x230eabbea10>

Additional context

A blank folder C:\Users\dmsha\dev\mlc\dist\Llama-2-7b-chat-hf-q4f16_1 is created.

@David-Sharma added the bug (Confirmed bugs) label on Oct 23, 2023
@junrushao (Member)

This is an issue from apache/tvm#15596; there seem to have been multiple reports in MLC LLM. CC @Lunderberg, the original author of that PR, if you could take a look.

@junrushao (Member)

Would you mind sharing a Python stack trace for this error message?

The following two lines repeat many times:
[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_dim
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.
[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_strategy
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

This will help us find out exactly where this .shard_dim is used in the Python codebase.

@Lunderberg (Contributor)

Hmm, the AttributeError: function 'TVMGetLastPythonError' not found seems rather odd. Did you recompile TVM after pulling?
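For reference, the failure mode is easy to reproduce directly from Python, since ctypes resolves DLL symbols lazily and raises AttributeError when the lookup fails. Here is a diagnostic sketch (not from the thread; the DLL path is copied from the reporter's environment and will differ on other machines):

```python
import ctypes

# Load the TVM runtime DLL directly; adjust the path to your install.
lib = ctypes.CDLL(r"C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\tvm.dll")

try:
    # Attribute access triggers a GetProcAddress lookup under the hood;
    # it raises AttributeError if the symbol is absent from the DLL's
    # export table, which is exactly what the traceback above shows.
    lib.TVMGetLastPythonError
    print("TVMGetLastPythonError is exported from tvm.dll")
except AttributeError:
    print("TVMGetLastPythonError is missing from tvm.dll's exports")
```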

@tqchen (Contributor) commented Oct 23, 2023

I added a note in another thread. By default MSVC does not export functions that are not explicitly marked for export, so in this case it is because TVMGetLastPythonError is not marked via TVM_DLL in its declaration header.

@Lunderberg (Contributor) commented Oct 23, 2023

Ah, Windows seems to be the key point: Windows doesn't expose symbols by default, so the extern "C" alone is insufficient. I had been thinking about exposure to non-TVM libraries, and missed the exposure to other portions of TVM on Windows.

Can you try with this hotfix applied?

Edit: Hehe, good timing @tqchen, and I like that it looks like a simple fix. 😁

@tqchen (Contributor) commented Oct 24, 2023

Another fix that can likely resolve this build issue: apache/tvm#15973

@Sing-Li (Contributor) commented Oct 25, 2023

[screenshot of the "Note" troubleshooting section from the docs]

Please pardon my (possibly) related comment. The "Note" section above, with its instruction to download the binary, "rename it to zstd.dll and copy to the same folder as tvm.dll", is almost impossible for new users to accomplish, because they have to understand how conda relates to Python, and where miniconda keeps its Python site libraries, before they can find "the same folder as tvm.dll". I hope one of the above fixes will make sure that zstd.dll is always included as part of the tvm bundle/nightly 🙏

@junrushao (Member)

That's a nightmare I was trying hard to solve :((

The zstd.dll dependency is introduced by LLVM, which we use to generate efficient code, but somehow it is not shipped by default in some Windows distributions... I was trying to statically link it into libtvm.dll, but it failed miserably in many different ways... CC @tqchen if you have a better idea :((
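As a quick diagnostic (a sketch, not from the thread), one can check whether zstd.dll is resolvable on the DLL search path at all:

```python
import ctypes

# Windows searches the loading module's directory and PATH, so this
# succeeds only if zstd.dll sits somewhere the loader can find it.
try:
    ctypes.CDLL("zstd.dll")
    print("zstd.dll is loadable")
except OSError:
    print("zstd.dll is not on the DLL search path")
```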

@tqchen (Contributor) commented Oct 26, 2023

I wonder if the wheel bundler can ship that DLL like the other ones.

@David-Sharma (Contributor, Author)

Can you solve this with a modification of the docs? (This is how I understand it.)

Original docs: It is likely zstd, a dependency to LLVM, was missing. Please download the precompiled binary, rename it to zstd.dll and copy to the same folder as tvm.dll

Modified: It is likely zstd, a dependency to LLVM, was missing. Please download the precompiled binary, rename it to zstd.dll and copy to the same folder as tvm.dll. Hint: perform a search for "tvm.dll" and identify the folder whose path includes the name of the current environment, e.g. mlc-chat-venv. Copy zstd.dll to that folder.
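The hint could also be made concrete with a one-liner (a sketch; it assumes tvm.dll sits in the tvm package directory, as it does in the environment shown above):

```python
# Print the folder that contains tvm.dll, i.e. where zstd.dll
# should be copied. Run inside the activated conda environment.
import os
import tvm

print("Copy zstd.dll into:", os.path.dirname(tvm.__file__))
```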

@tqchen (Contributor) commented Oct 26, 2023

@David-Sharma great suggestion, would you mind opening a PR?

@David-Sharma (Contributor, Author)

@tqchen Submitted #1135

@tqchen (Contributor) commented Oct 27, 2023

@David-Sharma do you mind directly forking and updating the respective files under https://github.com/mlc-ai/mlc-llm/tree/main/docs?

@junrushao (Member)

I think it's fixed now :))

@junrushao (Member)

Let's consolidate this into #1135. Let me know if it works now, by the way!
