-
Notifications
You must be signed in to change notification settings - Fork 3.2k
Nuget PDB files should not be included in the nuget #2742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Just a quick note, we prefer having these symbols part of the package. It makes a lot of things easier. The overhead is also not that big compared to the size of dependencies. |
Hi and Happy New year @nietras I think that the decompressed package now weighs in at 578MBs with 2 symbol packs and the included MacOS dylib. As @hanabi1224 explains here the size is an issue for redistribution to users rather than developers. I make simplification packages for end user tools so the debugging portion isn't something that my end users should need to worry about. They build the tool and use it. If or when they find a bug I fix it and re-ship the package. I would like them to be able to build from a package that could be around a 5fith of the size, I appreciate that other users have different needs. So I suggested the symbol packages even though my research suggests they may be a bit cumbersome to use. @nietras, What overhead do you encounter? |
Happy new year to you too @YanYas I definitely understand and think disk footprint is important, but I do not think the pdb files are the main culprit here. Having pdb files available in a symbol server is an option, but often this becomes a problem, if the server is not available for eternity. The main problem is how OnnxRuntime employs a "one size fits all" approach to nuget packaging. Made worse by there being "one big" nuget package for different execution providers. This means class library use of OnnxRuntime is problematic since you have to deal with this then defining the execution provider for any users of the class library. In my humble opinion the packaging and project structure needs to be overhauled to be modular and pluggable, this would match the true scope and benefit of the runtime. Below I try to give an example of how OnnxRuntime could be packaged as a small underlying set of packages that can then be references from meta-packages if need be. Each execution provider should have it's own set of packages. Prefix below with
Naming is just examples :) You can replace You can still have a single meta-package If there is any interest in this I might make a more formal issue proposing the restructuring to give us the above. |
I now understand and appreciate your point about only being able to debug if the symbol server is available, which isn't always the case.I think that's a really good proposal. I'd like to hear what others think about this one, but you'd have my vote |
Has a user explicitly complained about the size on disk? The primary disadvantage would be download time -- the loading or running time should not be impacted (only relevant dlls are loaded).
Fragmenting a single package into multiple packages is not without drawbacks (e.g. one for native assets, another for managed, a third for PDB files). Besides the user confusion of which package to install, there is also a namespace explosion, which increases maintenance and versioning friction. @YanYas, can you simply delete the PDB and other unrequired files after the package is installed, if disk space is an issue? This way your application can keep the disk utilization to a bare minimum for the end users. |
I think the package should at least provide a way for ppl to be able to opt-out the large PDB file in CI environment instead of doing it manaually. Personally I use below custom build target in Directory.Build.targets to achive it. <Target Name="RemoveOnnxRuntimePdb" AfterTargets="AfterBuild">
<WriteLinesToFile
File="$(OutDir)onnxruntime.pdb"
Lines="dummy"
Overwrite="true"
Encoding="Unicode" />
</Target>
<Target Name="RemoveRuntimePdbFromNuget" AfterTargets="ComputeFilesToPublish">
<ItemGroup>
<RuntimePdbFromNugetToRemove
Include="@(ResolvedFileToPublish)"
Condition=" '%(ResolvedFileToPublish.PackageName)' != ''
and '%(ResolvedFileToPublish.AssetType)' == 'native'
and '%(Extension)' == '.pdb'
" >
<Dummy>$(MSBuildThisFileDirectory)dummy.pdb</Dummy>
</RuntimePdbFromNugetToRemove>
<ResolvedFileToPublish Remove="@(RuntimePdbFromNugetToRemove)" />
<RuntimeDummyPdbToAdd Include="%(RuntimePdbFromNugetToRemove.Dummy)">
<AssetType>%(RuntimePdbFromNugetToRemove.AssetType)</AssetType>
<CopyToPublishDirectory>%(RuntimePdbFromNugetToRemove.CopyToPublishDirectory)</CopyToPublishDirectory>
<DestinationSubPath>%(RuntimePdbFromNugetToRemove.DestinationSubPath)</DestinationSubPath>
<PackageName>%(RuntimePdbFromNugetToRemove.PackageName)</PackageName>
<PackageVersion>%(RuntimePdbFromNugetToRemove.PackageVersion)</PackageVersion>
<RelativePath>%(RuntimePdbFromNugetToRemove.RelativePath)</RelativePath>
</RuntimeDummyPdbToAdd>
<ResolvedFileToPublish Include="@(RuntimeDummyPdbToAdd)" />
<Message Text="Removed native pdb files from nuget: @(RuntimePdbFromNugetToRemove)" Importance="High" Condition=" '@(RuntimePdbFromNugetToRemove)' != '' " />
</ItemGroup>
</Target> |
Nuget specifically has separate symbol packages for a reason. It is not good practice to include PDB files in your package, especially when they are 130MB. :-( It is also not reasonable to ask people to manually remove files from a nuget package they've downloaded. All this is to say... yes please on this issue. |
@nietras Thanks for writing up the proposal. In the upcoming 1.2 release (early March), we plan on separating the managed assembly into a separate package called Microsoft.ML.OnnxRuntime.Managed. Each execution provider will be delivered in it's own separate package without the managed assembly. This aligns with your proposal partially. At this time we do not plan to separate the pdb files and x86/linux/mac binaries into separate packages for the sake of simplicity of our first party users. xref: #2184 |
I'm having the same issue. Including pdb file vastly increase the pack size, could we put pdb file into seperate folder (at least not runtime..), or put them in a separate symbol package, like what ML.Net do. |
I encountered this issue just now. With the PDB files included release size goes beyond acceptable for a small inference project. Glad to hear that this is going to be fixed in the near future. |
Change has been merged. The issue will be fixed in onnxruntime 1.9 release. |
@YanYas Now I know: nuget.org doesn't accept symbol packages for native C/C++ code. |
@snnn So symbols are unavailable now? Is there an open issue for this? |
You can get it from https://github.com/microsoft/onnxruntime/releases/ . And in the future releases, we plan to publish them to https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/microsoft-public-symbols |
@snnn unfortunately releases does not include all releases, that is it does not include CUDA 11 builds which means we cannot find pdbs for these https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/onnxruntime-cuda-11/NuGet/Microsoft.ML.OnnxRuntime.Gpu/versions/1.21.0-dev-20241028-0514-dd28f09ce2
Source link and proper symbols upload to server would be greatly appreciated, it's now been over 3 years whats the progress on that? |
I appreciate that debug symbols are important for developers but they Symbol packages can be published separately, saving a lot of space for the reuse of the Nuget. as mentioned here.
To see how to publish the symbol packages, there is a guide here.
https://docs.microsoft.com/en-us/nuget/create-packages/symbol-packages-snupkg
The text was updated successfully, but these errors were encountered: