Your kernel implementation is simpler than I expected, but it does not outperform cuSPARSE or other fast sparse-kernel libraries. The evaluation also seems biased: all of your test files time single-layer operations with no framework overhead, while the baseline measurements are taken inside a framework. For a fairer comparison, could you supply the relevant artifacts, or adjust the methodology so that framework overhead is included for both your implementation and the baselines?
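A harness along the lines requested above might look like the following sketch. This is illustrative only: SciPy's CSR SpMM stands in for the cuSPARSE/DGL paths (which need a GPU environment), and `time_spmm` and the matrix sizes are hypothetical names, not part of the repository under discussion. The point is that both the raw kernel and the framework-level call should go through the identical warmup/synchronize/average loop.

```python
# Hypothetical timing harness: measure every SpMM path the same way
# (same warmup, same averaging) so framework overhead is either included
# in both measurements or excluded from both. SciPy's CSR matmul stands
# in here for cuSPARSE / DGL's update_all, which require a GPU setup.
import time
import numpy as np
import scipy.sparse as sp

def time_spmm(spmm_fn, n_warmup=3, n_runs=10):
    """Average wall-clock time of one SpMM call after warmup."""
    for _ in range(n_warmup):   # warmup: exclude one-time setup cost
        spmm_fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        spmm_fn()
    return (time.perf_counter() - start) / n_runs

rng = np.random.default_rng(0)
A = sp.random(1024, 1024, density=0.01, format="csr", random_state=0)
X = rng.standard_normal((1024, 64))

t = time_spmm(lambda: A @ X)
print(f"avg SpMM time: {t * 1e3:.3f} ms")
```

On a GPU, the same loop would additionally need a device synchronization (e.g. `torch.cuda.synchronize()`) before reading the clock, otherwise asynchronous kernel launches make the framework path look artificially fast.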
I agree that the experimental results could be biased by framework overhead, since we did not account for it when designing the baselines. For DGL, we run SpMM via the update_all function under torch.no_grad().
It is also expected that our implementation does not outperform cuSPARSE: the main goal of the experiments is to validate our observations, not to beat optimized kernel libraries.
Unfortunately, due to a career change, I do not have the bandwidth to produce more results for an accurate comparison against the baselines. I am, however, happy to answer any questions about the technical details of our work.