Skip to content

Conversation

@SyamGadde
Copy link
Contributor

Another fix to allow arrays backed by PooledBuffers to be used with offsets.
Also, there's no reason someone shouldn't be able to use different slices of the same array as input and output, so also check if the offset matches to determine whether it is an in-place transform -- does not check that the two slices don't overlap in some way, a general solution to which would be complicated.

@zachjweiner
Copy link

zachjweiner commented Mar 22, 2020

Hi @SyamGadde - sorry to drag up an old PR, but I would love to get gpyfft working for PooledBuffers with offsets. Unfortunately, your proposed fix here no longer seems to work: PooledBuffers do not (currently) have a buf attribute. I'm struggling to see how to go about an alternative fix - do you happen to have any ideas?

@geggo
Copy link
Owner

geggo commented Mar 23, 2020

Do you have a short example that shows what you would like to achieve but which fails? I am using gpyfft with memory pool allocated arrays, so I don‘t see the problem.

Check for offsets in this PR is useful, will add this.

Gregor

@zachjweiner
Copy link

I'll share one in a little bit - but the issue is using sliced pool-allocated arrays (i.e., with offsets) - in which case the attempt to slice and pass a 0-offset array to the transform fails.

@zachjweiner
Copy link

Here's a reproducer:

import pyopencl as cl
import pyopencl.array as cla
import numpy as np

context = cl.create_some_context()
queue = cl.CommandQueue(context)
pool = cl.tools.MemoryPool(cl.tools.ImmediateAllocator(queue))

a = np.random.rand(256, 256, 256)
a_gpu = cla.to_device(queue, a)
b_gpu = cla.zeros(queue, (256, 256, 129), 'complex128')

from gpyfft import FFT
transform = FFT(context, queue, a_gpu, out_array=b_gpu, real=True)

a_pooled = cla.to_device(queue, np.random.rand(3, 256, 256, 256), allocator=pool)

# succeeds because no offset
event, = transform.enqueue_arrays(data=a_pooled[0], result=b_gpu, forward=True)
event.wait()

# fails with offset
event, = transform.enqueue_arrays(data=a_pooled[1], result=b_gpu, forward=True)
event.wait()

@geggo
Copy link
Owner

geggo commented Mar 24, 2020

Thanks,
it seems the problem is rooted in the fact that a PooledBuffer does not expose the same interface as a Buffer. In particular it is missing get_sub_region and getitem, needed to slice a subarray which is needed for nonzero offset. clFFT, on the other hand, expects that data starts at offset=0.

Looking at pyopencl sources (wrap_mempool.cpp) I see that PooledBuffer is derived from memory_object_holder, not buffer. This might be an oversight, but I don't know. You might ping Andreas Klöckner, author of pyopencl.

Alternatively as a workaround, the gpyfft wrapper could temporarily create sub buffers via OpenCL low level calls if it encounters nonzero offset, but this should be better handled by pyopencl.

best
Gregor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants