DFSU.read() very slow on AMD CPU or linux #709

bhlevca · 2024-08-01T01:57:01Z

Describe the bug
I am expecting some differences but not 30x.

There is something in the read algorithm that may be favoured by Intel CPU.

I have the following code:

dfs = Dfsu(filename)
dsp = dfs.read(x=x, y=y)

The dfsu file is large ~ 52 GB.
The same read operation takes 17-18 sec on an Intel i7 CPU laptop running Windows 10 and 420 sec on a powerful AMD Ryzen 3950X desktop running Linux.
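To make timings like these comparable across machines, a small wall-clock helper can wrap the call. This is only a sketch: `timed` is a hypothetical helper name, not part of MIKE IO, and the workload in the example is a stand-in for `dfs.read(x=x, y=y)`.

```python
import time
from typing import Any, Callable, Tuple


def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call func(*args, **kwargs) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; with a real file this would be: timed(dfs.read, x=x, y=y)
total, seconds = timed(sum, range(1_000_000))
print(f"read took {seconds:.3f} s")
```

Running the same script twice in a row on each machine also shows whether the second run is faster, which would point at file caching rather than the CPU.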

To Reproduce
Steps to reproduce the behavior:

Get a large DFSU file and call read(x=x, y=y).

System information:

Python version 3.12.4
MIKE IO version 1.7.1

jsmariegaard · 2024-09-11T06:29:01Z

@JesperGr - this must be related to MIKE Core ...

bhlevca · 2024-09-19T13:02:18Z

@JesperGr - this must be related to MIKE Core ...

Is there anything we can do about it? It may be related to what you do in the read function. Is it using pandas in the underlying code? I know that pandas code is slower on AMD platforms.

jsmariegaard · 2024-09-19T15:16:07Z

@bhlevca could you try with a profiler - I hear Scalene is great https://github.com/plasma-umass/scalene :-)
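As a standard-library alternative to Scalene, a call can also be profiled with `cProfile`. The sketch below is an assumption about how one might wrap the read call; `profile_call` is a hypothetical helper, and the example workload stands in for `dfs.read`. Note that `cProfile` only sees Python-level calls, so time spent inside a native library shows up as one opaque entry at the boundary.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile and return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Stand-in workload; with a real file this would be: profile_call(dfs.read, x=x, y=y)
_, report = profile_call(sorted, [3, 1, 2])
print(report)
```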

bhlevca · 2024-09-19T15:44:35Z

I guess that I need to use the mikeio source files to do useful profiling

jsmariegaard · 2024-09-20T07:44:14Z

You get those when you install MIKE IO (pure python)

bhlevca · 2024-09-20T10:04:51Z

Usually, when I debug, I point PYTHONPATH to the git folder.
From what you're saying, it will be enough to use the pip-installed mikeio for profiling purposes.
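When both a git checkout and a pip-installed copy may be on the path, it helps to confirm which one Python actually resolves. A minimal sketch, where `module_origin` is a hypothetical helper and `"json"` is used only as a stand-in for `"mikeio"`:

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Replace "json" with "mikeio" to see whether the pip install or a PYTHONPATH checkout wins.
print(module_origin("json"))
```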

bhlevca · 2024-10-25T18:20:34Z

I used Scalene, but I didn't get any extra information other than that the read() function takes minutes on an AMD CPU, whether on Linux or Windows. Scalene did not get inside the read() function.
I looked at the mikecore source files, but they are complex, and it would be easier for you to determine the problem if you have access to an AMD CPU.

I tested on an AMD computer with dual boot:

- Windows on an Intel CPU: 16 sec
- Windows on an AMD CPU: 85 sec
- Linux on the same AMD CPU: 442 sec

If you don't have the time, please give me some instructions on where to look: which files, what is calling what, and how I can debug this. Thanks.

jsmariegaard · 2024-10-27T15:27:36Z

@JesperGr do you know anything about read speed of dfsu files on AMD using MIKE Core?

bhlevca · 2024-10-27T20:24:11Z

I don't, but if you give some guidance on how to do it, I will try to test the MIKE CORE read()

JesperGr · 2024-10-28T10:56:29Z

I am not aware of any performance differences when reading DFS files for AMD compared to Intel processors. Common performance issues are usually related to disc performance and not processor performance.

To test that, you could try running a raw MIKE Core Python read test, i.e. one not involving mikeio at all. Something similar to the ReadingDfs2File method in:

https://github.com/DHI/mikecore-python/blob/master/tests/examples_dfs2.py

which loops over all items and time steps.
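A raw read loop of that kind might look roughly like the sketch below. The MIKE Core names used here (`DfsFileFactory.DfsGenericOpen`, `ItemInfo`, `FileInfo.TimeAxis.NumberOfTimeSteps`, `ReadItemTimeStep`, `Data`) are written from memory of the mikecore-python examples and should be checked against the linked file; `open_generic` and `time_full_read` are hypothetical helper names.

```python
import time


def open_generic(filename):
    """Open a DFS file with MIKE Core.

    Assumption: import path and method name as in the mikecore-python examples.
    The import is local so the timing helper below can be used without mikecore.
    """
    from mikecore.DfsFileFactory import DfsFileFactory

    return DfsFileFactory.DfsGenericOpen(filename)


def time_full_read(dfs):
    """Read every item at every time step; return (values_read, seconds)."""
    n_items = len(dfs.ItemInfo)
    n_steps = dfs.FileInfo.TimeAxis.NumberOfTimeSteps
    values_read = 0
    start = time.perf_counter()
    for step in range(n_steps):
        for item in range(1, n_items + 1):  # assumed: item numbers are 1-based
            data = dfs.ReadItemTimeStep(item, step)
            values_read += len(data.Data)
    return values_read, time.perf_counter() - start
```

With a real file, usage would be along the lines of `dfs = open_generic("file.dfsu")`, then `time_full_read(dfs)`, then `dfs.Close()`. If this loop is just as slow as `mikeio`'s `read()`, the bottleneck sits below MIKE IO.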

bhlevca · 2024-10-28T11:45:09Z

There is a known problem with Pandas performance on AMD processors compared with Intel processors because of the MKL library. I assumed that a 3D DFSU has the same issue because I thought that some calculations are needed to decrypt when reading the file. I am going to put the file on an SSD to test your assumption. Also, I am going to try your suggestions on the Intel laptop and on the big AMD workstation with the current disks.


bhlevca · 2024-10-29T20:14:01Z

You are right. The disk transfer speed is the main problem.
I tested on various disks, including an SSD, and the read() function time varies wildly.
Closing.
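For anyone who wants to check disk transfer speed independently of MIKE IO and MIKE Core, a plain sequential read of the file gives a rough baseline. This is a sketch under simple assumptions (one pass, unbuffered binary reads, 16 MB blocks); `read_throughput` is a hypothetical helper name.

```python
import time


def read_throughput(path, block_size=16 * 1024 * 1024):
    """Read a file sequentially in binary mode; return (total_bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

Dividing `total_bytes` by `seconds` gives bytes per second. A repeated run on the same file may be served from the operating system's file cache, so the first run after a reboot (or on a file larger than RAM) is the more representative number.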
