[Pallas] Unable to Modify Input Ref in Pallas Kernel #24656

shangz-ai · 2024-11-01T05:18:26Z

Description

Hello,
I'm wondering whether a Pallas kernel can write to its input refs. For example, if I want to both read from and write to the same tensor in a Pallas kernel, how can I achieve that?
Below is a minimal script; it doesn't behave as I expected.
Thanks!

System info (python version, jaxlib version, accelerator, etc.)

import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_kernel(x_ref, y_ref, o_ref):
  # In this code, `x_ref`, `y_ref` and `o_ref` are (8,)-shaped `Ref`s
  x = x_ref[:]
  y = y_ref[:]
  o_ref[:] = x + y

def add_inplace_kernel(x_ref, y_ref, o_ref):
  # In this code, `x_ref`, `y_ref` and `o_ref` are (8,)-shaped `Ref`s
  x = x_ref[:]
  y = y_ref[:]
  o = x + y
  y_ref[:] = o

x, y = jnp.arange(8), jnp.arange(8, 16)

print("=======regular add=========")
add = pl.pallas_call(add_kernel, out_shape=jax.ShapeDtypeStruct((8,), jnp.int32))
print("x ", x)
print("y ", y)
o = add(x, y)
print("o ", o)
print("=======regular add=========")

print("=======inplace add=========")
inplace_add = pl.pallas_call(add_inplace_kernel, out_shape=jax.ShapeDtypeStruct((8,), jnp.int32))
print("x ", x)
print("y ", y)
o_dummy = inplace_add(x, y)
print("after inplace add y ", y)
print("o_dummy ", o_dummy)
print("=======inplace add=========")

From this script I see

=======regular add=========
x  [0 1 2 3 4 5 6 7]
y  [ 8  9 10 11 12 13 14 15]
o  [ 8 10 12 14 16 18 20 22]
=======regular add=========
=======inplace add=========
x  [0 1 2 3 4 5 6 7]
y  [ 8  9 10 11 12 13 14 15]
after inplace add y  [ 8  9 10 11 12 13 14 15]
o_dummy  [0 0 0 0 0 0 0 0]
=======inplace add=========

but I would like inplace_add to give me y = [ 8 10 12 14 16 18 20 22]. Is that possible?


shangz-ai · 2024-11-01T05:19:00Z

I also found a closely related issue: #22276

justinjfu · 2024-11-01T18:09:51Z

This doesn't work as intended because x and y (the JAX arrays) live in HBM, and Pallas copies them to the innermost level of the memory hierarchy (e.g. VMEM on TPUs) before invoking the kernel. Therefore, when you modify y_ref you are only modifying the copy, not the actual y that is resident in HBM.

Try aliasing the input and output to the same ref using the input_output_aliases argument to pl.pallas_call. For your in-place add, use:

inplace_add = pl.pallas_call(add_inplace_kernel,
  out_shape=jax.ShapeDtypeStruct((8,), jnp.int32),
  input_output_aliases={1:0})

Pallas will copy outputs back to HBM, so this will trigger Pallas to copy your updates to y_ref back to HBM.
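For reference, here is a minimal end-to-end sketch of the aliasing approach described above. It makes two assumptions that are not part of the original reply: the kernel writes the sum through o_ref (the output ref that is aliased with y) rather than through y_ref, and interpret=True is passed so the sketch can run on a machine without a TPU or GPU.

```python
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_inplace_kernel(x_ref, y_ref, o_ref):
  # o_ref is the output ref. With input_output_aliases={1: 0} it is
  # aliased with input 1 (y), so the sum is written through o_ref.
  o_ref[:] = x_ref[:] + y_ref[:]

x, y = jnp.arange(8), jnp.arange(8, 16)

inplace_add = pl.pallas_call(
    add_inplace_kernel,
    out_shape=jax.ShapeDtypeStruct(y.shape, y.dtype),
    input_output_aliases={1: 0},  # input index 1 (y) -> output index 0
    interpret=True,  # assumption: interpreter mode, so no accelerator is needed
)

# The updated values come back as the call's return value.
new_y = inplace_add(x, y)
print("new_y ", new_y)
```

Because JAX arrays are immutable, the Python variable y is not changed by the call; the updated values are the returned array, so a caller would typically rebind with y = inplace_add(x, y).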

shangz-ai · 2024-11-01T21:41:05Z

Thanks a lot! I think it is exactly what I want. Let me try to work on my real kernel to see if I can get what I need.

shangz-ai added the bug Something isn't working label Nov 1, 2024

shangz-ai changed the title ~~[Pallas] Unable to Update Input Ref in Pallas Kernel~~ [Pallas] Unable to Modify Input Ref in Pallas Kernel Nov 1, 2024

justinjfu added the pallas Issues pertaining to Pallas (GPU or TPU) label Nov 1, 2024
