-Wframe-larger-than= in drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c #1455

Closed
nathanchance opened this issue Sep 9, 2021 · 8 comments
Labels
-Wframe-larger-than=
[ARCH] arm32: This bug impacts ARCH=arm
[ARCH] s390: This bug impacts ARCH=s390
[BUG] Untriaged: Something isn't working
Clean build: Issue needs to be fixed to get a warning-free all*config build
[CONFIG] allmodconfig: Issue affects allmodconfig on certain architectures
CONFIG_WERROR: Has in an error with CONFIG_WERROR (all{mod,yes}config) (or emits a non-compiler warning)
duplicate: This issue or pull request already exists

@nathanchance

In several different configurations, I see an excessive amount of stack usage in certain functions within drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:

arm32-allmodconfig.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3043:6: error: stack frame size (1384) exceeds limit (1024) in function 'bw_calcs' [-Werror,-Wframe-larger-than]
arm32-allmodconfig.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5560) exceeds limit (1024) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]

arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3043:6: error: stack frame size (1376) exceeds limit (1024) in function 'bw_calcs' [-Werror,-Wframe-larger-than]
arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5384) exceeds limit (1024) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]

s390x-allmodconfig.log:drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5184) exceeds limit (2048) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]

Initial report.
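
For triaging logs like the ones above, a minimal Python sketch (not part of the original report) can pull the function name, frame size, and limit out of each `-Wframe-larger-than` diagnostic and compute how far over the limit each function is. The helper name `parse_frame_warnings` and the regular expression are illustrative assumptions based only on the message format shown in these logs.

```python
import re

# Assumed pattern, derived from the diagnostic text quoted in this issue.
FRAME_RE = re.compile(
    r"stack frame size \((\d+)\) exceeds limit \((\d+)\) in function '([^']+)'"
)


def parse_frame_warnings(log_text):
    """Return (function, size, limit, overage) tuples, one per matching line."""
    results = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            size, limit = int(match.group(1)), int(match.group(2))
            results.append((match.group(3), size, limit, size - limit))
    return results


# Two lines copied from the arm32 allmodconfig log above.
sample = """\
arm32-allmodconfig.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3043:6: error: stack frame size (1384) exceeds limit (1024) in function 'bw_calcs' [-Werror,-Wframe-larger-than]
arm32-allmodconfig.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5560) exceeds limit (1024) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]
"""

warnings = parse_frame_warnings(sample)
for function, size, limit, overage in warnings:
    print(f"{function}: {size} bytes, {overage} over the {limit} byte limit")
```

Sorting the resulting tuples by overage is one way to see which function needs attention first; here `calculate_bandwidth` is far worse than `bw_calcs`.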

@nathanchance added the [BUG] Untriaged, CONFIG_WERROR, and -Wframe-larger-than= labels Sep 9, 2021
@kees added the Clean build label Sep 22, 2021
@nathanchance added the [ARCH] arm32, [ARCH] s390, and [CONFIG] allmodconfig labels Dec 10, 2021
@nathanchance added this to the allmodconfig milestone Dec 10, 2021
@nathanchance

With ARCH=arm allmodconfig:

calculate_bandwidth

$ python3 $CBL_GIT/frame-larger-than/frame_larger_than.py drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.o calculate_bandwidth
calculate_bandwidth:
	4	int32_t                       	number_of_aligned_displays_with_no_margin
	4	int32_t                       	number_of_displays_enabled_with_margin
	4	int32_t                       	number_of_displays_enabled
	4	enum bw_defines               	nbp_state_change_enable_blank
	4	enum bw_defines               	mode_check
	4	enum bw_defines               	rotation_check
	4	enum bw_defines               	fbc_check
	4	enum bw_defines               	lb_size_check
	4	enum bw_defines               	vsr_check
	4	enum bw_defines               	hsr_check
	4	enum bw_defines               	pipe_check
	4	enum bw_defines               	voltage
	4	enum bw_defines               	yclk_message
	4	enum bw_defines               	sclk_message
	1	bool                          	lpt_enabled
	1	bool                          	fbc_enabled
	1	bool                          	d1_underlay_enable
	1	bool                          	d0_underlay_enable
	4	int32_t                       	k
	4	int32_t                       	j
	4	int32_t                       	i
	4	int32_t                       	num_cursor_lines
	4	uint32_t                      	max_chunks_fbc_mode
	0	const uint32_t                	dmif_chunk_buff_margin
	0	const uint32_t                	s_high
	0	const uint32_t                	s_mid6
	0	const uint32_t                	s_mid5
	0	const uint32_t                	s_mid4
	0	const uint32_t                	s_mid3
	0	const uint32_t                	s_mid2
	0	const uint32_t                	s_mid1
	0	const uint32_t                	s_low
	0	const int32_t                 	low
	0	const int32_t                 	mid
	0	const int32_t                 	high
	0	const int32_t                 	pixels_per_chunk
	4	enum bw_defines*              	surface_type
	4	enum bw_defines*              	tiling_mode
	4	struct bw_fixed*              	sclk
	4	struct bw_fixed*              	yclk
kcalloc:
kmalloc_array:
	4	size_t                        	bytes
	4	size_t                        	__a
	4	size_t                        	__b
	4	size_t*                       	__d
	4	size_t                        	bytes
kmalloc:
	4	unsigned int                  	index
kcalloc:
kmalloc_array:
	4	size_t                        	bytes
	4	size_t                        	__a
	4	size_t                        	__b
	4	size_t*                       	__d
	4	size_t                        	bytes
kmalloc:
	4	unsigned int                  	index
kcalloc:
kmalloc_array:
	4	size_t                        	bytes
	4	size_t                        	__a
	4	size_t                        	__b
	4	size_t*                       	__d
	4	size_t                        	bytes
kmalloc:
	4	unsigned int                  	index
kcalloc:
kmalloc_array:
	4	size_t                        	bytes
	4	size_t                        	__a
	4	size_t                        	__b
	4	size_t*                       	__d
	4	size_t                        	bytes
kmalloc:
	4	unsigned int                  	index
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_equ:
bw_equ:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_div:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_neq:
bw_mtn:
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_leq:
bw_neq:
bw_mtn:
bw_mtn:
bw_mtn:
bw_div:
bw_div:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_min2:
bw_mtn:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_mtn:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_max2:
bw_leq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_leq:
bw_leq:
bw_leq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_leq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_max2:
bw_div:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_ltn:
bw_div:
bw_div:
bw_div:
bw_div:
bw_ltn:
bw_div:
bw_div:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_mtn:
bw_min2:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_max2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_equ:
bw_div:
bw_min2:
bw_max2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_mod:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_equ:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_max2:
bw_div:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_mtn:
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_max2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_max2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_max3:
bw_max2:
bw_max2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_max3:
bw_max2:
bw_max2:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_div:
bw_min2:
bw_div:
bw_div:
bw_div:
bw_min2:
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_div:
bw_div:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_min2:
bw_mtn:
bw_div:
bw_div:
bw_max2:
bw_div:
bw_div:
bw_max2:
bw_max3:
bw_max2:
bw_max2:
bw_div:
bw_min2:
bw_div:
bw_div:
bw_div:
bw_min2:
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_equ:
bw_mtn:
bw_ltn:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_equ:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_leq:
bw_leq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_min2:
bw_mtn:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_max3:
bw_max2:
bw_max2:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_ltn:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_ltn:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_max3:
bw_max2:
bw_max2:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_leq:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_ltn:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_mtn:
bw_ltn:
bw_ltn:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_max2:
bw_ltn:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_max2:
bw_ltn:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_mtn:
bw_equ:
bw_equ:
bw_equ:
bw_equ:
bw_equ:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_max2:
bw_fixed_to_int:
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_mtn:
bw_leq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_min2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_fixed_to_int:
bw_min2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_mtn:
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_max2:
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_ltn:
bw_mtn:
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_ltn:
bw_mtn:
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_div:
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_mtn:
bw_mtn:
bw_max2:
bw_ltn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_ltn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_ltn:
bw_meq:
bw_ltn:
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_mtn:
bw_ltn:
bw_ltn:
bw_leq:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_mtn:
bw_div:
bw_div:
bw_max2:
bw_div:
bw_div:
bw_max2:
bw_max3:
bw_max2:
bw_max2:
bw_max2:
bw_div:
bw_div:
bw_div:
bw_max2:
bw_max2:
bw_ltn:
bw_ltn:
bw_div:
bw_max2:
bw_max2:
bw_max3:
bw_max2:
bw_max2:
bw_max3:
bw_max2:
bw_max2:
bw_max2:
bw_max2:
bw_max2:
bw_max2:
bw_ltn:
bw_ltn:
bw_mtn:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_ltn:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_ltn:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_mtn:
bw_div:
bw_max2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_min2:
bw_div:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_div:
bw_div:
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_max2:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_mtn:
bw_mtn:
bw_div:
bw_div:
bw_min2:
bw_mtn:
bw_div:
bw_div:
bw_min2:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_div:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_sub:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_div:
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_add:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_max2:
bw_div:
bw_ltn:
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_div:
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_add:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_div:
bw_mtn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_sub:
	8	struct bw_fixed               	res
	8	struct bw_fixed               	res
bw_min2:
bw_div:
bw_fixed_to_int:
bw_meq:
bw_div:
bw_fixed_to_int:
bw_meq:
bw_div:
bw_fixed_to_int:
bw_meq:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_div:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_equ:
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_equ:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_equ:
bw_ltn:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_equ:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_equ:
bw_int_to_fixed:
	8	struct bw_fixed               	res

bw_calcs

$ python3 $CBL_GIT/frame-larger-than/frame_larger_than.py drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.o bw_calcs
bw_calcs:
	4	struct bw_calcs_data*         	data
kzalloc:
kmalloc:
	4	unsigned int                  	index
	4	unsigned int                  	index
kmalloc_large:
	4	unsigned int                  	order
	4	unsigned int                  	order
populate_initial_data:
	4	int                           	num_displays
	4	int                           	j
	4	int                           	i
	1	bool                          	__ret_do_once
	4	int                           	__ret_warn_on
	4	unsigned int                  	pixel_clock_100hz
	1	bool                          	__already_done
	4	int                           	num_displays
	4	int                           	j
	4	int                           	i
	1	bool                          	__ret_do_once
	4	int                           	__ret_warn_on
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
fixed31_32_to_bw_fixed:
	8	struct bw_fixed               	result
fixed31_32_to_bw_fixed:
	8	struct bw_fixed               	result
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
fixed31_32_to_bw_fixed:
	8	struct bw_fixed               	result
fixed31_32_to_bw_fixed:
	8	struct bw_fixed               	result
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
	4	unsigned int                  	pixel_clock_100hz
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
fixed31_32_to_bw_fixed:
	8	struct bw_fixed               	result
fixed31_32_to_bw_fixed:
	8	struct bw_fixed               	result
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
all_displays_in_sync:
	0	const struct pipe_ctx*[]      	active_pipes
	4	int                           	i
	4	int                           	num_active_pipes
	0	const struct pipe_ctx*[]      	active_pipes
	4	int                           	i
	4	int                           	num_active_pipes
	8	struct bw_fixed               	high_sclk
	1	uint8_t                       	yclk_lvl
	8	struct bw_fixed               	mid1_sclk
	8	struct bw_fixed               	mid2_sclk
	8	struct bw_fixed               	mid3_sclk
	8	struct bw_fixed               	mid4_sclk
	8	struct bw_fixed               	mid5_sclk
	8	struct bw_fixed               	mid6_sclk
	8	struct bw_fixed               	low_sclk
	8	struct bw_fixed               	high_yclk
	8	struct bw_fixed               	mid_yclk
	8	struct bw_fixed               	low_yclk
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
bw_int_to_fixed:
	8	struct bw_fixed               	res
bw_fixed_to_int:
bw_fixed_to_int:
bw_fixed_to_int:
is_display_configuration_supported:
	4	uint32_t                      	int_max_clk
	4	uint32_t                      	int_max_clk
bw_fixed_to_int:
bw_fixed_to_int:

Seems like we are just getting destroyed by excessive inlining, which makes sense given the inlined functions are small. One odd piece about all of this code is that struct bw_fixed just has an integer value:

struct bw_fixed {
	int64_t value;
};

Seems like it would just be better to use int64_t directly, which should help clang out with constant propagation; that seems like it might be the issue here (or perhaps clang should just do a better job of figuring out that this is what is going on?). This diff massively improves the situation. With ARCH=arm allmodconfig:

Prior to patch:

drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3051:6: warning: stack frame size (1384) exceeds limit (1024) in 'bw_calcs' [-Wframe-larger-than]
bool bw_calcs(struct dc_context *ctx,
     ^
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: warning: stack frame size (5560) exceeds limit (1024) in 'calculate_bandwidth' [-Wframe-larger-than]
static void calculate_bandwidth(
            ^
2 warnings generated.

After patch:

drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: warning: stack frame size (1152) exceeds limit (1024) in 'calculate_bandwidth' [-Wframe-larger-than]
static void calculate_bandwidth(
            ^
1 warning generated.

After patch (CONFIG_KASAN=n):

drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: warning: stack frame size (1088) exceeds limit (1024) in 'calculate_bandwidth' [-Wframe-larger-than]
static void calculate_bandwidth(
            ^
1 warning generated.

This diff gets me slightly closer (with KASAN: 1144, without: 1072):

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/bw_fixed.c b/drivers/gpu/drm/amd/display/dc/calcs/bw_fixed.c
index 39b4d87ec1e9..c817d0e36e6b 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/bw_fixed.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/bw_fixed.c
@@ -46,14 +46,6 @@ static uint64_t abs_i64(int64_t arg)
                return (uint64_t)(-arg);
 }

-int64_t bw_int_to_fixed_nonconst(int64_t value)
-{
-       int64_t res;
-       ASSERT(value < BW_FIXED_MAX_I32 && value > BW_FIXED_MIN_I32);
-       res = value << BW_FIXED_BITS_PER_FRACTIONAL_PART;
-       return res;
-}
-
 int64_t bw_frc_to_fixed(int64_t numerator, int64_t denominator)
 {
        int64_t res;
diff --git a/drivers/gpu/drm/amd/display/dc/inc/bw_fixed.h b/drivers/gpu/drm/amd/display/dc/inc/bw_fixed.h
index 97c4d90a11f4..a7072112025e 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/bw_fixed.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/bw_fixed.h
@@ -62,16 +62,14 @@ static inline int64_t bw_max3(int64_t v1,
        return bw_max2(bw_max2(v1, v2), v3);
 }

-int64_t bw_int_to_fixed_nonconst(int64_t value);
 static inline int64_t bw_int_to_fixed(int64_t value)
 {
-       if (__builtin_constant_p(value)) {
-               int64_t res;
+       if (__builtin_constant_p(value))
                BUILD_BUG_ON(value > BW_FIXED_MAX_I32 || value < BW_FIXED_MIN_I32);
-               res = value << BW_FIXED_BITS_PER_FRACTIONAL_PART;
-               return res;
-       } else
-               return bw_int_to_fixed_nonconst(value);
+       else
+               ASSERT(value < BW_FIXED_MAX_I32 && value > BW_FIXED_MIN_I32);
+
+       return value << BW_FIXED_BITS_PER_FRACTIONAL_PART;
 }

 static inline int32_t bw_fixed_to_int(int64_t value)
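
For readers who want to try the collapsed helper outside the kernel tree, here is a minimal userspace sketch of the same shape: one range check followed by one scaling step on a plain int64_t. The names (`demo_int_to_fixed`, `demo_fixed_to_int`) and the constant values are assumptions for illustration, not the driver's definitions, and the standard `assert` stands in for the kernel's ASSERT and BUILD_BUG_ON.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration; the kernel header defines its own constants. */
#define DEMO_FRAC_BITS 24
#define DEMO_MAX_INT ((int64_t)((1ULL << (63 - DEMO_FRAC_BITS)) - 1))
#define DEMO_MIN_INT (-DEMO_MAX_INT)

/* Convert an integer to fixed point: one range check, one scaling step. */
static inline int64_t demo_int_to_fixed(int64_t value)
{
	assert(value < DEMO_MAX_INT && value > DEMO_MIN_INT);
	/* Multiply instead of shifting so negative inputs stay well defined in ISO C. */
	return value * ((int64_t)1 << DEMO_FRAC_BITS);
}

/* Recover the integer part; the division truncates toward zero. */
static inline int32_t demo_fixed_to_int(int64_t value)
{
	return (int32_t)(value / ((int64_t)1 << DEMO_FRAC_BITS));
}
```

Because both helpers take and return scalars, there is no aggregate temporary for the optimizer to get rid of, which is the property the diff above is after.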

There are probably further ways to clean this up but at the top of dce_calcs.c, there is a comment:

/*
 * NOTE:
 *   This file is gcc-parseable HW gospel, coming straight from HW engineers.
 *
 * It doesn't adhere to Linux kernel style and sometimes will do things in odd
 * ways. Unless there is something clearly wrong with it the code should
 * remain as-is as it provides us with a guarantee from HW that it is correct.
 */

so we probably cannot modify it too much (that sentiment seems recent too). I will continue to poke my head around tomorrow.

@nickdesaulniers
Member

This was worked around in
6f6cb17

moving the description from #1918 to track this here:

6f6cb17 mentions 5k+ stack usage. (#1455)

With clang-18 (after https://reviews.llvm.org/rGe698695fbbf62e6676f8907665187f2d2c4d814b), I only see:

$ ARCH=arm make LLVM=1 -j128 allyesconfig drivers/gpu/drm/
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dce_calcs.c:77:13: error: stack frame size (1056) exceeds limit (1024) in 'calculate_bandwidth' [-Werror,-Wframe-larger-than]
   77 | static void calculate_bandwidth(
      |             ^

So there's a slight improvement in clang-18, but more is needed to be below the -Wframe-larger-than= threshold.

Tangentially related to #1752.

See also: llvm/llvm-project#41896

@nickdesaulniers
Member

bw_int_to_fixed, bw_add, bw_sub, and bw_mod seem to be producing lots of stack variables according to frame-larger-than.py. Making those all noinline decreases what frame-larger-than reports, but doesn't change (or makes worse) the amount of stack usage. (marking them all __always_inline doesn't help either).

FWICT, it looks like SROA is failing to decompose the many little struct bw_fixed because they escape to bw_int_to_fixed and bw_frc_to_fixed. I wonder if there's a phase ordering issue where SROA is running before inlining and not after?


So if I build this driver for 32b x86, calculate_bandwidth only uses 156B of stack, and 240B for x86_64, but 1096B for arm32.

For example, on x86_64 -mllvm -debug-only=sroa:

SROA function: calculate_bandwidth
SROA alloca:   %agg.tmp12775 = alloca %struct.bw_fixed, align 8
  Rewriting FCA loads and stores...
Slices of alloca:   %agg.tmp12775 = alloca %struct.bw_fixed, align 8
  [0,8) slice #0 (splittable)
    used by:   call void @llvm.lifetime.start.p0(i64 8, ptr %agg.tmp12775) #17
  [0,8) slice #1 (splittable)
    used by:   store i64 %call12776, ptr %coerce.dive12777, align 8
  [0,8) slice #2 (splittable)
    used by:   %5950 = load i64, ptr %coerce.dive12779, align 8
  [0,8) slice #3 (splittable)
    used by:   call void @llvm.lifetime.end.p0(i64 8, ptr %agg.tmp12775) #17
Pre-splitting loads and stores
  Searching for candidate loads and stores
Rewriting alloca partition [0,8) to:   %agg.tmp12775.sroa.0 = alloca i64, align 8
  rewriting [0,8) slice #0 (splittable)
   Begin:(0, 8) NewBegin:(0, 8) NewAllocaBegin:(0, 8)
    original:   call void @llvm.lifetime.start.p0(i64 8, ptr %agg.tmp12775) #17
          to:   call void @llvm.lifetime.start.p0(i64 8, ptr %agg.tmp12775.sroa.0)
  rewriting [0,8) slice #1 (splittable)
   Begin:(0, 8) NewBegin:(0, 8) NewAllocaBegin:(0, 8)
    original:   store i64 %call12776, ptr %coerce.dive12777, align 8
          to:   store i64 %call12776, ptr %agg.tmp12775.sroa.0, align 8
  rewriting [0,8) slice #2 (splittable)
   Begin:(0, 8) NewBegin:(0, 8) NewAllocaBegin:(0, 8)
    original:   %5950 = load i64, ptr %coerce.dive12779, align 8
          to:   %agg.tmp12775.sroa.0.0.load = load i64, ptr %agg.tmp12775.sroa.0, align 8
  rewriting [0,8) slice #3 (splittable)
   Begin:(0, 8) NewBegin:(0, 8) NewAllocaBegin:(0, 8)
    original:   call void @llvm.lifetime.end.p0(i64 8, ptr %agg.tmp12775) #17
          to:   call void @llvm.lifetime.end.p0(i64 8, ptr %agg.tmp12775.sroa.0)
  Speculating PHIs
  Rewriting Selects
Deleting dead instruction:   call void @llvm.lifetime.end.p0(i64 8, ptr %agg.tmp12775) #17
Deleting dead instruction:   %5950 = load i64, ptr %coerce.dive12779, align 8
Deleting dead instruction:   %coerce.dive12779 = getelementptr inbounds %struct.bw_fixed, ptr %agg.tmp12775, i32 0, i32 0
Deleting dead instruction:   store i64 %call12776, ptr %coerce.dive12777, align 8
Deleting dead instruction:   %coerce.dive12777 = getelementptr inbounds %struct.bw_fixed, ptr %agg.tmp12775, i32 0, i32 0
Deleting dead instruction:   call void @llvm.lifetime.start.p0(i64 8, ptr %agg.tmp12775) #17
Deleting dead instruction:   %agg.tmp12775 = alloca %struct.bw_fixed, align 8
SROA alloca:   %agg.tmp12761 = alloca %struct.bw_fixed, align 8
...

but 32b arm:

SROA function: calculate_bandwidth
SROA alloca:   %agg.tmp9057 = alloca %struct.bw_fixed, align 8
  Rewriting FCA loads and stores...
    original:   %5950 = load [1 x i64], ptr %coerce.dive9059, align 8
          to:   %.fca.0.load = load i64, ptr %.fca.0.gep, align 8
Can't analyze slices for alloca:   %agg.tmp9057 = alloca %struct.bw_fixed, align 8
  A pointer to this alloca escaped by:
    call void @bw_int_to_fixed(ptr sret(%struct.bw_fixed) align 8 %agg.tmp9057, i64 noundef 8)
SROA alloca:   %agg.tmp9050 = alloca %struct.bw_fixed, align 8
  Rewriting FCA loads and stores...
    original:   %5944 = load [1 x i64], ptr %coerce.dive9052, align 8
          to:   %.fca.0.load9119 = load i64, ptr %.fca.0.gep9118, align 8
Can't analyze slices for alloca:   %agg.tmp9050 = alloca %struct.bw_fixed, align 8
  A pointer to this alloca escaped by:
    call void @bw_int_to_fixed(ptr sret(%struct.bw_fixed) align 8 %agg.tmp9050, i64 noundef 4)
...

By the time we get to the initial SROA pass on calculate_bandwidth though, on x86_64 the temporary aggregates have no users (I think SROA just deletes them as dead). But on arm32 we have a bunch of memcpy's:

  call void @llvm.lifetime.start.p0(i64 8, ptr %tmp235) #17
  call void @bw_int_to_fixed(ptr sret(%struct.bw_fixed) align 8 %tmp235, i64 noundef 1)
  call void @llvm.memcpy.p0.p0.i32(ptr align 8 %arrayidx234, ptr align 8 %tmp235, i32 8, i1 false)
  call void @llvm.lifetime.end.p0(i64 8, ptr %tmp235) #17

or, it seems that for x86_64 SROA is able to boil away the above somehow. x86 version:

  call void @llvm.lifetime.start.p0(i64 8, ptr %tmp236) #17
  %call237 = call i64 @bw_int_to_fixed(i64 noundef 1) #19
  %coerce.dive238 = getelementptr inbounds %struct.bw_fixed, ptr %tmp236, i32 0, i32 0
  store i64 %call237, ptr %coerce.dive238, align 8
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %arrayidx235, ptr align 8 %tmp236, i64 8, i1 false)
  call void @llvm.lifetime.end.p0(i64 8, ptr %tmp236) #17

oh, is clang unwrapping the struct for x86_64 but not arm32? Yeah: https://godbolt.org/z/hzxbsKbc3 the struct gets unwrapped by clang's codegen of LLVM IR for x86, x86_64, aarch64, but not arm32...

@nickdesaulniers
Member

From what I've tracked down so far, there seems to be a difference between X86_64ABIInfo::classifyArgumentType and ARMABIInfo::classifyArgumentType. Reading ARM AAPCS32 §8.2 Argument Passing Conventions and §6.5 Parameter Passing, we have:

When a Composite Type argument is assigned to core registers (either fully or partially), the behavior is as if the argument had been stored to memory at a word-aligned (4-byte) address and then loaded into consecutive registers using a suitable load-multiple instruction.

I think ARMABIInfo::classifyArgumentType is actually missing support for that.

@nickdesaulniers
Member

Looking at this further today, I think it's actually -freg-struct-return that is helping 32b x86.

x86 (before SROA):

  %compression_rate234 = getelementptr inbounds %struct.bw_calcs_data, ptr %162, i32 0, i32 162
  %arrayidx235 = getelementptr [12 x %struct.bw_fixed], ptr %compression_rate234, i32 0, i32 1
  call void @llvm.lifetime.start.p0(i64 8, ptr %tmp236) #17
  %call237 = call i64 @bw_int_to_fixed(i64 inreg noundef 1) #19
  %coerce.dive238 = getelementptr inbounds %struct.bw_fixed, ptr %tmp236, i32 0, i32 0
  store i64 %call237, ptr %coerce.dive238, align 4
  call void @llvm.memcpy.p0.p0.i32(ptr align 4 %arrayidx235, ptr align 4 %tmp236, i32 8, i1 false)
  call void @llvm.lifetime.end.p0(i64 8, ptr %tmp236) #17

x86 (after SROA):

  %compression_rate234 = getelementptr inbounds %struct.bw_calcs_data, ptr %data, i32 0, i32 162
  %arrayidx235 = getelementptr [12 x %struct.bw_fixed], ptr %compression_rate234, i32 0, i32 1
  %call237 = call i64 @bw_int_to_fixed(i64 inreg noundef 1) #18
  store i64 %call237, ptr %arrayidx235, align 4

arm (before SROA):

  %compression_rate233 = getelementptr inbounds %struct.bw_calcs_data, ptr %data, i32 0, i32 162
  %arrayidx234 = getelementptr [12 x %struct.bw_fixed], ptr %compression_rate233, i32 0, i32 1
  call void @llvm.lifetime.start.p0(i64 8, ptr %tmp235) #18
  call void @bw_int_to_fixed(ptr sret(%struct.bw_fixed) align 8 %tmp235, i64 noundef 1)
  call void @llvm.memcpy.p0.p0.i32(ptr align 8 %arrayidx234, ptr align 8 %tmp235, i32 8, i1 false)
  call void @llvm.lifetime.end.p0(i64 8, ptr %tmp235) #18

(after SROA, same)

ARM AAPCS32 §6.4 Result Return mentions:

A Composite Type not larger than 4 bytes is returned in r0. The format is as if the result had been stored in memory at a word-aligned address and then loaded into r0 with an LDR instruction. Any bits in r0 that lie outside the bounds of the result have unspecified values.
A Composite Type larger than 4 bytes, or whose size cannot be determined statically by both caller and callee, is stored in memory at an address passed as an extra argument when the function was called

bw_fixed is a struct wrapping an int64_t (8B).
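
To make the quoted rule concrete, here is a small sketch of the two return shapes side by side. It is not the driver's code: `abi_fixed`, `abi_wrap`, and `abi_raw` are hypothetical names. The wrapper is a Composite Type of 8 bytes, so under the rule quoted above its result goes through memory; the bare int64_t is a fundamental type and, as far as I read the same section of AAPCS32, a double-word fundamental type comes back in r0 and r1.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the driver's type: a struct wrapping one 64-bit value. */
struct abi_fixed {
	int64_t value;
};

/* The wrapper adds no padding, yet it is still a composite type. */
static_assert(sizeof(struct abi_fixed) == sizeof(int64_t),
	      "wrapper should be exactly 8 bytes");

/* Returned by value: larger than 4 bytes, so AAPCS32 returns it via memory. */
struct abi_fixed abi_wrap(int64_t v)
{
	struct abi_fixed res = { .value = v };
	return res;
}

/* Returns a fundamental type, so the composite rule does not apply. */
int64_t abi_raw(int64_t v)
{
	return v;
}
```

Compiling this with something like `clang --target=arm-linux-gnueabi -O2 -S -emit-llvm` should show the difference in the function signatures; I would expect an sret pointer argument on `abi_wrap` only, matching the IR above.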

@nickdesaulniers
Member

https://godbolt.org/z/vvMoza9es demonstrates the issue more. When returning a value larger than the word size, the IR isn't great for structures/aggregates/composite types. 32b x86 is saved by -freg-struct-return.

It's really hard to understand what is getting placed in what stack slot though, I wonder if stack-slot-coloring is having some kind of issue. Maybe it's time to beef up the debug output from that pass.

@nickdesaulniers
Member

nickdesaulniers commented Aug 23, 2023

I was able to claw back tens of bytes by rewriting calculate_bandwidth a bit (removing the two stupid kcallocs at the beginning), but I think there might be a larger win in rewriting bw_fixed.h to accept a pointer to a struct bw_fixed rather than return a struct bw_fixed. I'll try to see tomorrow how awful a yak shave that is. Maybe we can get away with just bw_frc_to_fixed (and the 3 other fn's in that TU): since they're not defined in the same TU as calculate_bandwidth, I don't think the return value can be decomposed as easily.

Or move most of drivers/gpu/drm/amd/display/dc/dml/calcs/bw_fixed.c into drivers/gpu/drm/amd/display/dc/dml/calcs/dce_calcs.c since that's pretty much the only user.
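
A rough sketch of what the pointer-taking variant might look like, next to the by-value form it would replace. This is only an illustration under assumptions: `out_fixed`, `out_int_to_fixed_byval`, `out_int_to_fixed`, and the value of `OUT_FRAC_BITS` are hypothetical, and whether this shape actually reduces stack usage in calculate_bandwidth has not been measured here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's struct bw_fixed. */
struct out_fixed {
	int64_t value;
};

/* Number of fractional bits; assumed here for illustration. */
#define OUT_FRAC_BITS 24

/* Current style: the result is returned by value. */
struct out_fixed out_int_to_fixed_byval(int64_t v)
{
	struct out_fixed res = { .value = v * ((int64_t)1 << OUT_FRAC_BITS) };
	return res;
}

/* Proposed style: the caller names the destination explicitly. */
void out_int_to_fixed(struct out_fixed *out, int64_t v)
{
	assert(out != NULL);
	out->value = v * ((int64_t)1 << OUT_FRAC_BITS);
}
```

With the second form a caller can pass the final destination directly, for example the address of an array element, so the source no longer asks for a temporary that is then copied into place.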

@nickdesaulniers
Member

duplicating this to #39. Once the two issues identified by #39 (comment) are fixed in clang, we can reopen this if necessary.

One additional thing is conspiring here (and it is unavoidable):

  1. AAPCS32 §6.4 Result Return: "A Composite Type larger than 4 bytes, or whose size cannot be determined statically by both caller and callee, is stored in memory at an address passed as an extra argument when the function was called", which is leading to issues, particularly in AMDGPU.

Dunno about s390 though... but one of the two issues in clang is very much a result of dce_calcs.

@nickdesaulniers nickdesaulniers added the duplicate This issue or pull request already exists label Oct 13, 2023