v3d backports from 6.13 #6620

Draft: wants to merge 786 commits into base: rpi-6.12.y

@6by9 (Contributor) commented Jan 21, 2025

@roliver-rpi for testing.
I have compile tested it, but not booted it.

6by9 and others added 30 commits January 17, 2025 14:31
The default link frequency of 450MHz has been noted to interfere
with GPS if they are in close proximity.
Add the option for 453 and 456MHz to move the signal slightly out
of the band. (447MHz cannot be offered as corruption is then observed
on the 133x992 10bit mode).

Signed-off-by: Dave Stevenson <[email protected]>

fixup imx477 gps
Copy of the imx708 change.

Signed-off-by: Dave Stevenson <[email protected]>
Attempting to start a non-idle channel causes an error message to be
logged, and is inefficient. Test for emptiness of the desc_issued list
before doing so.

Signed-off-by: Phil Elwell <[email protected]>
The Raspberry Pi RP1 includes 2 M3 cores running firmware. This driver
adds a mailbox communication channel to them via a doorbell and some
shared memory.

Signed-off-by: Phil Elwell <[email protected]>
The RP1 firmware runs a simple communications channel over some shared
memory and a mailbox. This driver provides access to that channel.

Signed-off-by: Phil Elwell <[email protected]>
Declare the communications channel to RP1.

Signed-off-by: Phil Elwell <[email protected]>
Provide remote access to the PIO hardware in RP1. There is a single
instance, with 4 state machines.

Signed-off-by: Phil Elwell <[email protected]>
Declare the device that proxies RP1's PIO hardware.

Signed-off-by: Phil Elwell <[email protected]>
Use the PIO hardware on RP1 to implement a PWM interface.

Signed-off-by: Phil Elwell <[email protected]>
Enable building of the pwm-pio-rp1 driver, which is Pi 5-specific.

Signed-off-by: Phil Elwell <[email protected]>
Add an overlay to enable a single-channel PIO-assisted PWM interface on any
header pin.

Signed-off-by: Phil Elwell <[email protected]>
This is a SPI to powerline chipset with host-side Ethernet interface.
It is usually used in e-mobility environments, e.g. on the
Electric Vehicle Supply Equipment (EVSE) side.

Signed-off-by: Michael Heimpold <[email protected]>
The documentation isn't very clear explaining how to enable SPI CS
active-high and it takes a long time to understand it. Adding a specific
overlay as a simple example on how to invert this signal can help
understand the solution.

Link: https://forums.raspberrypi.com/viewtopic.php?t=378222
Signed-off-by: Iker Pedrosa <[email protected]>
If autonomous speed negotiation is unreliable then brcm_pcie_set_gen()
can be used to override/limit this behaviour. However, setting the limit
after the linkup poll means the link can temporarily enter a bad speed
which may result in link failure. Move the speed setup to just prior to
releasing perst_n.

Fixes: 0693b42 ("PCI: brcmstb: Split post-link up initialization to brcm_pcie_start_link()")

Signed-off-by: Jonathan Bell <[email protected]>
Using increased bit depth for no reason increases power
consumption, and differs from the behaviour prior to the
conversion to use the HDMI helper functions.

Initialise the state max_bpc and requested_max_bpc to the
minimum value supported. This only affects Raspberry Pi,
as the other users of the helpers (rockchip/inno_hdmi and
sun4i) only support a bit depth of 8.

Signed-off-by: Dave Stevenson <[email protected]>
If an infoframe was ever enabled, duplicate_state would
memcpy the infoframe including the "set" flag. The
infoframe functions then never cleared it, so once set
it was always set. This was most obvious with the HDR
infoframe as it resulted in bad colour rendering.

Signed-off-by: Dave Stevenson <[email protected]>
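
The failure mode described above can be sketched in plain userspace C. This is a minimal illustration only: `struct infoframe`, `struct conn_state` and both function names are hypothetical stand-ins, not the DRM HDMI helper API. A state duplicated with `memcpy` inherits the `set` flag, so the flag stays set unless the update path clears it explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for one infoframe slot in the connector state. */
struct infoframe {
	bool set;
	unsigned char data[32];
};

/* Hypothetical stand-in for the connector state. */
struct conn_state {
	struct infoframe hdr_drm;
};

/* Duplicating by memcpy carries the "set" flag over to the new state. */
static void duplicate_state(struct conn_state *dst,
			    const struct conn_state *src)
{
	assert(dst && src);
	memcpy(dst, src, sizeof(*dst));
}

/*
 * Recompute the infoframe for the new state. Clearing the slot first
 * ensures a stale "set" flag does not survive when the frame is no
 * longer wanted.
 */
static void update_hdr_infoframe(struct conn_state *state, bool want_hdr)
{
	assert(state);
	memset(&state->hdr_drm, 0, sizeof(state->hdr_drm));
	if (want_hdr)
		state->hdr_drm.set = true;
}
```

Without the `memset`, a state duplicated from one that once had HDR enabled would keep reporting the infoframe as set.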
Drop from RGB to YUV422 output if RGB couldn't be supported
within the defined max_bpc and TMDS rates, and then try
dropping max_bpc.

Signed-off-by: Dave Stevenson <[email protected]>
Signed-off-by: Giedrius Trainavičius <[email protected]>
Signed-off-by: Giedrius Trainavičius <[email protected]>
"media: i2c: imx477: Add options for slightly modifying the link freq"
created a link frequency menu with two items instead of one.
Correct this.

Signed-off-by: Dave Stevenson <[email protected]>
As per the subject, there was a copy/paste error that caused
pio_sm_unclaim from a driver to result in a call to
pio_sm_claim. Fix it.

Signed-off-by: Phil Elwell <[email protected]>
Passing bad parameters to an API call without a pio pointer will cause
a NULL pointer exception when the persistent error is set. Guard
against that.

Signed-off-by: Phil Elwell <[email protected]>
DSI0 and DSI1 have different widths for the command FIFO (24bit
vs 32bit), but the driver was assuming the 32bit width of DSI1
in all cases.
DSI0 also wants the data packed as 24bit big endian, so the
formatting code needs updating.

Handle the difference via the variant structure.

Signed-off-by: Dave Stevenson <[email protected]>
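
As a rough userspace sketch of handling the FIFO width through a variant structure: the struct, its fields and the packing function below are hypothetical and are not the vc4 DSI driver's code. The 24-bit big-endian packing for DSI0 follows the description above; the byte order used for the 32-bit case is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-variant description; not the vc4 driver's struct. */
struct dsi_variant {
	unsigned int fifo_bytes;	/* 3 for a 24-bit FIFO, 4 for 32-bit */
	bool big_endian;		/* first payload byte in the top bits */
};

static const struct dsi_variant dsi0_variant = {
	.fifo_bytes = 3, .big_endian = true,
};

/* Byte order for the 32-bit case is an assumption for this sketch. */
static const struct dsi_variant dsi1_variant = {
	.fifo_bytes = 4, .big_endian = false,
};

/*
 * Pack up to fifo_bytes payload bytes starting at buf[offset] into one
 * FIFO word. Bytes past the end of the payload are left as zero.
 */
static uint32_t pack_fifo_word(const struct dsi_variant *v,
			       const uint8_t *buf, size_t len, size_t offset)
{
	uint32_t word = 0;
	unsigned int i;

	assert(v->fifo_bytes >= 1 && v->fifo_bytes <= 4);

	for (i = 0; i < v->fifo_bytes && offset + i < len; i++) {
		unsigned int shift = v->big_endian ?
			8 * (v->fifo_bytes - 1 - i) : 8 * i;

		word |= (uint32_t)buf[offset + i] << shift;
	}
	return word;
}
```

A caller would advance `offset` by `v->fifo_bytes` per word, so the same loop serves both interfaces.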
Noted while setting up a display on CM5IO: adding
"dtoverlay=vc4-kms-dsi-ili7881-7inch" fails as it tries to
find the regulator/backlight/touch on i2c_csi_dsi, which pointed
at i2c_csi_dsi0 by default.

Adding the dsi0 override switched the overlay to dsi0 and pointed
the i2c at i2c_csi_dsi0, which all works.

The default with i2c_csi_dsi needs to be consistent in using
dsi1/csi1 and the corresponding i2c interface (i2c_csi_dsi1).

Signed-off-by: Dave Stevenson <[email protected]>
Although VBUS_EN on GPIO42 appears on the CM5's 100-way headers,
USB_OC_N on GPIO43 does not. Remove the signal name to avoid further
confusion and disappointment.

Signed-off-by: Phil Elwell <[email protected]>
The deep link into the website is not that stable, so let's
replace it with a textual description of where to find the
product information.

Signed-off-by: Michael Heimpold <[email protected]>
pelwell and others added 27 commits January 17, 2025 14:32
The DT property arm,cpu-registers-not-fw-configured tells the kernel
that the ARM architectural timer has not been configured by the
firmware. This prevents the use of a vDSO - a faster alternative to a
syscall for some common kernel operations.

However, on Pi 4 the firmware does configure the timer, so this property
is unnecessary. Delete it.

Signed-off-by: Phil Elwell <[email protected]>
Make sure the sdhost driver doesn't use requests bigger than SWIOTLB
can handle.

Copied from [1].

Link: raspberrypi#6589
Signed-off-by: Phil Elwell <[email protected]>
[1] d4dd9bc ("mmc: bcm2835: Take SWIOTLB memory size limitation
into account")
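
The idea of limiting request size to what the bounce buffer can map can be sketched as a simple clamp. This is a hedged illustration: the function name is hypothetical, and in a real driver the second argument would come from the DMA layer rather than being passed in by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the largest request size the driver should advertise, given
 * its own limit and the largest mapping the DMA layer can handle.
 * Both values are in bytes.
 */
static size_t clamp_max_req_size(size_t driver_max, size_t dma_max)
{
	assert(driver_max > 0);
	return dma_max < driver_max ? dma_max : driver_max;
}
```

When no bounce buffering is in play the DMA limit is effectively unbounded and the driver's own limit applies unchanged.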
We can avoid calling the v3d_clock_up_put and v3d_clock_up_get
when a job is submitted to a CPU queue. We don't need to change
the V3D core frequency to run a CPU job as it is executed on
the CPU. This way we avoid delaying timestamp CPU jobs by 4.5ms,
which is the time it takes the firmware to increase the V3D
core frequency.

Fixes: fe6a858 ("drm/v3d: Correct clock settng calls to new APIs")
Signed-off-by: Jose Maria Casanova Crespo <[email protected]>
Reviewed-by: Maíra Canal <[email protected]>
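
A minimal sketch of the queue check described above, assuming a hypothetical queue enum and a reference counter that stands in for the `v3d_clock_up_get`/`v3d_clock_up_put` calls. None of these types are the driver's own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical queue identifiers for this sketch. */
enum queue { QUEUE_BIN, QUEUE_RENDER, QUEUE_CSD, QUEUE_CPU };

/* Hypothetical device state: counts outstanding clock-up requests. */
struct gpu_dev {
	int clock_up_refs;
};

/* Only jobs that run on the GPU need the core clock raised. */
static bool queue_needs_clock_up(enum queue q)
{
	return q != QUEUE_CPU;
}

static void job_submit(struct gpu_dev *dev, enum queue q)
{
	assert(dev);
	if (queue_needs_clock_up(q))
		dev->clock_up_refs++;	/* stands in for v3d_clock_up_get() */
}

static void job_done(struct gpu_dev *dev, enum queue q)
{
	assert(dev);
	if (queue_needs_clock_up(q))
		dev->clock_up_refs--;	/* stands in for v3d_clock_up_put() */
}
```

Using the same predicate on both paths keeps the get and put calls balanced.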
Prior to [1], an fb_ops member of 0 was interpreted as a request for a
default value. This saves source code but requires special handling by
the framework, slowing down all accesses for no runtime benefit.

Use the new __FB_DEFAULT_ macros to explicitly select default handlers
in the bcm2708_fb driver. Also remove the pointless wrappers around
cfb_fillrect and cfb_imageblit - call them directly.

Link: https://forums.raspberrypi.com/viewtopic.php?p=2286016#p2286016
Signed-off-by: Phil Elwell <[email protected]>
[1] 8813e86 ("fbdev: Remove default file-I/O implementations")
Now that the upstream driver is overclockable, switch to using it in
preference of the downstream driver (which can be deleted by a followup
commit).

Signed-off-by: Phil Elwell <[email protected]>
The principal differences between the downstream SDHOST driver and the
version accepted upstream are that the upstream version loses the
overclock support and DMA configuration via DT, but gains some tidying
up (and maintenance by the upstream devs).

Add the missing features (with the exception of the low-overhead logging)
as a patch to the upstream driver.

Signed-off-by: Phil Elwell <[email protected]>
Commit ceddfd4 ("media: i2c: imx219: Support four-lane operation")
added support for device tree to allow configuration of the sensor to
use 4 lanes with a link frequency of 363MHz, and amended the advertised
pixel rate to 280.8MPix/s.

However it didn't change any of the PLL settings, so actually it would
have been running effectively overclocked in the MIPI block, and with
the frame rate and exposure calculations being wrong.

The pixel rate and link frequency advertised were taken from the "Clock
Setting Example" section of the datasheet. However those are based on an
external clock of 12MHz, and are unachievable with a clock of 24MHz (it
seems PREPLLCLK_VT_DIV and PREPLLCK_OP_DIV can ONLY be set via the
automatic configuration documented in "9-1-2 EXCK_FREQ setting depend on
INCK frequency").

Dropping all support for the 363MHz link frequency would cause problems
for existing users, so allow it from device tree, but log a warning that
the requested value is not being truly applied.

Fixes: ceddfd4 ("media: i2c: imx219: Support four-lane operation")
Co-developed-by: Peyton Howe <[email protected]>
Signed-off-by: Peyton Howe <[email protected]>
Signed-off-by: Dave Stevenson <[email protected]>
List V4L2_PIX_FMT_YUV422P as supported by the PiSP backend hardware.

Signed-off-by: Naushir Patuck <[email protected]>
These fields should not be set by either the user or the kernel driver
so remove them. Replace them with padding bytes to maintain backward
compatibility with existing userland applications.

Signed-off-by: Naushir Patuck <[email protected]>
Commit e442e5c ("arch:arm:boot:dts:overlays: Added waveshare 13.3inch
panel support") added an extra touch controller for the new panels.
On systems with old panels, it ends up spamming the kernel log as that
touch controller isn't there to respond.

Fixes: e442e5c ("arch:arm:boot:dts:overlays: Added waveshare 13.3inch panel support")
Signed-off-by: Dave Stevenson <[email protected]>
Commit be431df upstream

Create a function `drm_gem_shmem_create_with_mnt()`, similar to
`drm_gem_shmem_create()`, that has a mountpoint as an argument. This
function will create a shmem GEM object in a given tmpfs mountpoint.

This function will be useful for drivers that have a special mountpoint
with flags enabled.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit 0992b25 upstream

For some applications, such as applications that use huge pages, we might
want to have a different mountpoint, for which we pass mount flags that
better match our usecase.

Therefore, create a new function `drm_gem_object_init_with_mnt()` that
allows us to define the tmpfs mountpoint where the GEM object will be
created. If this parameter is NULL, then we fall back to `shmem_file_setup()`.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit f2a4bcb upstream

Replace the open-coded v3d_perfmon_find() with the real thing.

Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Maíra Canal <[email protected]>
Signed-off-by: Maíra Canal <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit 56cf76e upstream

If the scheduler initialization fails, GEM initialization must fail as
well. Therefore, if `v3d_sched_init()` fails, free the DMA memory
allocated and return the error value in `v3d_gem_init()`.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
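
The error path described above follows the usual "undo what you allocated, then propagate the error" pattern. The sketch below shows that pattern in userspace C with hypothetical names; `malloc` stands in for the DMA allocation and `sched_should_fail` exists only to drive the failure case.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device context for this sketch. */
struct dev_ctx {
	void *pt;		/* stands in for the DMA-allocated memory */
	int sched_should_fail;	/* test knob, not part of any real driver */
};

/* Stand-in for the scheduler setup; fails on request. */
static int sched_init(struct dev_ctx *ctx)
{
	return ctx->sched_should_fail ? -ENOMEM : 0;
}

/*
 * Allocate, then initialise the scheduler. If the scheduler fails,
 * release the allocation and return the scheduler's error code.
 */
static int gem_init(struct dev_ctx *ctx)
{
	int ret;

	assert(ctx);
	ctx->pt = malloc(4096);
	if (!ctx->pt)
		return -ENOMEM;

	ret = sched_init(ctx);
	if (ret) {
		free(ctx->pt);
		ctx->pt = NULL;
		return ret;
	}
	return 0;
}
```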
Commit eb8d395 upstream

Create a separate "tmpfs" kernel mount for V3D. This will allow us to
move away from the shmemfs `shm_mnt` and gives the flexibility to do
things like set our own mount options. Here, the interest is to use
"huge=", which should allow us to enable the use of THP for our
shmem-backed objects.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit 8dd6074 upstream

Currently, we are using an alignment of 128 kB to insert a node, which
ends up wasting memory as we perform plenty of small BO allocations
(<= 4 kB). We require that allocations are aligned to 128 kB, so for any
allocation smaller than that, we are wasting the difference.

This implies that we cannot effectively use the whole 4 GB address space
available for the GPU in the RPi 4. Currently, we can allocate up to
32000 BOs of 4 kB (~140 MB) and 3000 BOs of 400 kB (~1.3 GB). This can be
quite limiting for applications that have a high memory requirement, such
as vkoverhead [1].

By reducing the page alignment to 4 kB, we can allocate up to 1000000 BOs
of 4 kB (~4 GB) and 10000 BOs of 400 kB (~4 GB). Moreover, by performing
benchmarks, we were able to attest that reducing the page alignment to
4 kB can provide a general performance improvement in OpenGL
applications (e.g. glmark2).

Therefore, this patch reduces the alignment of the node allocation to 4
kB, which will allow RPi users to explore the whole 4 GB virtual
address space provided by the hardware. Also, this patch allows users to
fully run vkoverhead in the RPi 4/5, solving the issue reported in [1].

[1] zmike/vkoverhead#14

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
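
The effect of alignment on small allocations can be worked through with a little arithmetic. The helpers below are a simplified model with hypothetical names: each object is assumed to occupy its size rounded up to the alignment, and fragmentation and any other allocator constraints are ignored. That makes the result an upper bound, not a reproduction of the measured counts quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define KB 1024ULL
#define GB (1024ULL * 1024ULL * 1024ULL)

/* Round size up to the next multiple of align. */
static uint64_t round_up_to(uint64_t size, uint64_t align)
{
	assert(align != 0);
	return (size + align - 1) / align * align;
}

/*
 * Upper bound on how many objects of a given size fit in an address
 * space when every object is padded out to the alignment.
 */
static uint64_t max_objects(uint64_t space, uint64_t size, uint64_t align)
{
	return space / round_up_to(size, align);
}
```

In this model a 4 kB object aligned to 128 kB wastes 124 kB, and a 4 GB space holds at most 32768 of them; with 4 kB alignment the bound rises to 1048576.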
Commit e4c1772 upstream

The V3D MMU also supports 64KB and 1MB pages, called big and super pages,
respectively. In order to set a 64KB page or 1MB page in the MMU, we need
to make sure that page table entries for all 4KB pages within a big/super
page are correctly configured.

In order to create a big/super page, we need a contiguous memory region.
That's why we use a separate mountpoint with THP enabled. In order to
place the page table entries in the MMU, we iterate over the 16 4KB pages
(for big pages) or 256 4KB pages (for super pages) and insert the PTE.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
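
The iteration over the 16 or 256 4KB entries can be sketched as follows. This is an illustration under stated assumptions: the PTE layout (page frame number in the low bits, flags in the high bits) and the flag values are hypothetical and are not taken from the V3D register definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_4K_SHIFT		12
#define PTES_PER_BIG_PAGE	16	/* 64KB / 4KB */
#define PTES_PER_SUPER_PAGE	256	/* 1MB / 4KB */

/* Hypothetical flag bits for this sketch. */
#define PTE_VALID		(1u << 31)
#define PTE_BIG_PAGE		(1u << 30)
#define PTE_SUPER_PAGE		(1u << 29)

/*
 * Write one PTE per 4KB page covered by a big or super page. Every
 * entry carries the large-page flag and points at consecutive 4KB
 * frames of the same contiguous physical region.
 */
static void map_large_page(uint32_t *pt, size_t first_idx, uint64_t phys,
			   unsigned int nptes, uint32_t flags)
{
	uint64_t pfn = phys >> PAGE_4K_SHIFT;
	unsigned int i;

	assert(nptes == PTES_PER_BIG_PAGE || nptes == PTES_PER_SUPER_PAGE);
	assert(pfn % nptes == 0);	/* physical start is large-page aligned */
	assert(first_idx % nptes == 0);	/* so is the virtual slot */

	for (i = 0; i < nptes; i++)
		pt[first_idx + i] = (uint32_t)(pfn + i) | flags | PTE_VALID;
}
```

The alignment assertions reflect why a contiguous, suitably aligned backing region is needed before a large page can be used at all.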
Commit 20d69e8 upstream

Although Big/Super Pages could appear naturally, it would be quite hard
to have 1MB or 64KB allocated contiguously naturally. Therefore, we can
force the creation of large pages allocated contiguously by using a
mountpoint with "huge=within_size" enabled.

Therefore, as V3D has a mountpoint with "huge=within_size" (if user has
THP enabled), use this mountpoint for BO creation if available. This
will allow us to create large pages allocated contiguously and make use
of Big/Super Pages.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit 0df4a13 upstream

Add a modparam for turning off Big/Super Pages, so that a user who
doesn't want Big/Super Pages enabled can disable them by setting
the modparam to false.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit 9f8e1c9 upstream

Add a new V3D parameter to expose the support of Super Pages to
userspace. The userspace might want to know this information to
apply optimizations that are specific to kernels with Super Pages
enabled.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit d28292a upstream

Function drm_gem_shmem_create_with_mnt() creates an object
without using the mountpoint if gemfs is NULL.

Drop the else branch calling drm_gem_shmem_create().

Signed-off-by: Matthias Brugger <[email protected]>
Signed-off-by: Maíra Canal <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit e987e22 upstream

When the new register addresses were introduced for V3D 7.x, we added
new masks for performance counter sources on V3D 7.x. Nevertheless,
we never apply these new masks when setting the sources.

Fix the performance counter source settings on V3D 7.x by introducing
a new macro, `V3D_SET_FIELD_VER`, which allows field settings to vary
by version. Using this macro, we can provide different values for
source mask based on the V3D version, ensuring that sources are
correctly configured on V3D 7.x.

Fixes: 0ad5bc1 ("drm/v3d: fix up register addresses for V3D 7.x")
Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
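
To illustrate why the mask has to follow the hardware version, here is a small sketch of a version-dependent field setter. It is written as a function instead of a macro, and the field widths, the four-slots-per-register layout and all names are assumptions for illustration; they are not the definitions behind `V3D_SET_FIELD_VER`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical source-field widths for this sketch. */
#define SRC_WIDTH_MASK_OLD	0x7fu	/* narrower field on older versions */
#define SRC_WIDTH_MASK_V7	0xffu	/* wider field assumed on 7.x */

/*
 * Build the register bits for one counter source. The mask applied
 * depends on the major version, so a source index that only fits the
 * wider field is not silently truncated on newer hardware.
 */
static uint32_t source_field(unsigned int major_ver, unsigned int slot,
			     uint32_t source)
{
	uint32_t width_mask = major_ver >= 7 ?
		SRC_WIDTH_MASK_V7 : SRC_WIDTH_MASK_OLD;
	unsigned int shift = 8 * slot;	/* assumed: four slots per register */

	assert(slot < 4);
	return (source & width_mask) << shift;
}
```

With the older mask a source value of 0x90 would lose its top bit, which is the kind of misconfiguration the fix above addresses.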
Commit 21f1435 upstream

If the active performance monitor (`v3d->active_perfmon`) is being
destroyed, stop it first. Currently, the active perfmon is not
stopped during destruction, leaving the `v3d->active_perfmon` pointer
stale. This can lead to undefined behavior and instability.

This patch ensures that the active perfmon is stopped before being
destroyed, aligning with the behavior introduced in commit
7d1fd36 ("drm/v3d: Stop the active perfmon before being destroyed").

Cc: [email protected] # v5.15+
Fixes: 26a4dc2 ("drm/v3d: Expose performance counters to userspace")
Signed-off-by: Christian Gmeiner <[email protected]>
Signed-off-by: Maíra Canal <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
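
The stale-pointer problem and its fix can be sketched in userspace C. The types and function names below are hypothetical simplifications; a real driver would also capture counter values and handle locking when stopping a perfmon.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the perfmon and device structures. */
struct perfmon {
	int id;
};

struct gpu {
	struct perfmon *active_perfmon;
};

/* Stop pm if it is the active one, clearing the device's pointer. */
static void perfmon_stop(struct gpu *gpu, struct perfmon *pm)
{
	if (gpu->active_perfmon != pm)
		return;
	gpu->active_perfmon = NULL;
}

/*
 * Destroy a perfmon. Stopping it first guarantees the device never
 * keeps a pointer to freed memory.
 */
static void perfmon_destroy(struct gpu *gpu, struct perfmon *pm)
{
	assert(gpu && pm);
	perfmon_stop(gpu, pm);
	free(pm);
}
```

Destroying a perfmon that is not active leaves `active_perfmon` untouched, so other users are unaffected.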
Commit c6eabba upstream

Add a new ioctl, DRM_IOCTL_V3D_PERFMON_SET_GLOBAL, to allow
configuration of a global performance monitor (perfmon).
Use the global perfmon for all jobs to ensure consistent
performance tracking across submissions. This feature is
needed to implement a Perfetto datasource in user-space.

Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Maíra Canal <[email protected]>
Signed-off-by: Maíra Canal <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit 4ee06e3 upstream.
This commit fixes several miscellaneous documentation errors. Mostly,
delete/update comments that are outdated or are leftovers from past code
changes. Apart from that, remove double-spaces in several comments.

Signed-off-by: Maíra Canal <[email protected]>
Acked-by: Iago Toral Quiroga <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit dc4afc0de9654f88676d77094a38f9451d519011 upstream.

CPU jobs, like Cache Clean jobs, execute synchronously once the DRM
scheduler starts running them. Consequently, a global `v3d->cpu_job`
variable is unnecessary, as everything is managed within the
`v3d_cpu_job_run()` function.

This commit removes the `v3d->cpu_job` pointer, as it is not needed.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
@popcornmix (Collaborator) commented

Need to confirm this does not introduce the NULL job issue in 6.13 reported here: #6624

@pelwell (Contributor) commented Jan 22, 2025

Either way, the result would be a useful datapoint.
