Conversation
|
@JerrySievert looks ok to me, wdyt? |
|
I'll reproduce locally and give you an informed response soon :) |
|
I also have noticed bug in current implementation. For PostgreSQL 16+ you have function #if PG_VERSION_NUM >= PG_VERSION_16
/* Copied from postgres 15 source, since it was removed from 16. */
static bool
MemoryContextContains(MemoryContext context, void *pointer)
{
MemoryContext ptr_context;
/*
* NB: Can't use GetMemoryChunkContext() here - that performs assertions
* that aren't acceptable here since we might be passed memory not
* allocated by any memory context.
*
* Try to detect bogus pointers handed to us, poorly though we can.
* Presumably, a pointer that isn't MAXALIGNED isn't pointing at an
* allocated chunk.
*/
if (pointer == NULL || pointer != (void *) MAXALIGN(pointer))
return false;
/*
* OK, it's probably safe to look at the context.
*/
ptr_context = *(MemoryContext *) (((char *) pointer) - sizeof(void *));
return ptr_context == context;
}
#endifBut this DOES NOT WORK in pg 16 anymore. Starting from this version, theese 8 bytes store bit fields: size + memory context method. This is NOT POINTER. So, this function will always return This function called in /* FreeChunkValueArrayBuffer relase valueBufferArray memory. */
void FreeChunkBufferValueArray(ChunkData *chunkData)
{
uint32 columnIndex = 0;
if (chunkData == NULL)
{
return;
}
for (columnIndex = 0; columnIndex < chunkData->columnCount; columnIndex++)
{
if (chunkData->valueBufferArray[columnIndex] != NULL && !MemoryContextContains(ColumnarCacheMemoryContext(), chunkData->valueBufferArray[columnIndex]))
{
pfree(chunkData->valueBufferArray[columnIndex]->data);
pfree(chunkData->valueBufferArray[columnIndex]);
}
}
} |
|
Hm, in this case can be that the idea behind |
|
@JerrySievert @wuputah I have create another PR that fixes this behaviour #274 |
|
Updated MR with new info about I think this realization is simplier comparing to #274. Here we only NULLify. But in 274 have added a new struct, repalloc... How do you think @JerrySievert @wuputah @ashenBlade ? |
If we nullify this, it will not be available for us anymore. We will have to read and decompress it again. |
|
@ashenBlade
|
|
I think a better solution is to make a copy of the data, not mess with trying to synchronize a lower and higher level caching mechanism, nor try to cover all of the possible edge-cases. and add a test replicating the issue to the regression tests. |
|
@JerrySievert Maybe simply remove I'll think how implement a test. We need to determine that cache was used. The only idea came to mind is to add elog that writes "Columnar: clearing the cache by 10 percent has started". If it will be NOTICE level, it may be spam in daily workload. If it will have a DEBUG level, it will spam in test, all DEBUG messages from postgres will be written to test output.. |
|
just setting the cache level to fairly low and using the example you provided, but pared down, should trigger the issue. there should be a logical existing test file to put it into, though |
|
if caching is disabled, the |
2f982fb to
3a2505f
Compare
3a2505f to
0b1c73c
Compare
|
ahh... found where it's read (columnar_write.c) @JerrySievert I've added copying and a test. But from performance view, better to use approach without copying from #274 |
tests were given in hydradatabase#273
|
@JerrySievert hello, what do you think?
|
|
I'm just here as an advisor, I have no merge powers :) but, it would be up to @wuputah to determine whether the merge makes sense. this doesn't seem to deal with the multiple copy and cache level synchronization though, which is a much bigger task |

Hello.
This MR fixes an issue with a cache. Currently cache entries contain already freed memory. MR fixes this.
SQL code to reproduce
broken_cache.txt
Steps:
Description
Currently an error
occurs when cache is full and we try to free up space by 10%.
We call in the
EvictCachebut
str->datacontains a garbage, soBogusFreewith anERRORis executed (see Path to BogusFree chapter below)The reason to a garbage is the following.
A
valueBuffer (StringInfo)is added to a cache in theColumnarAddCacheEntryAlso it's added tochunkData->valueBufferArrayIn the function
FreeChunkBufferValueArrayvalueBufferwill be freedand also
EvictCachetries to freed it as our cache is fullif (totalAllocationLength >= (columnar_page_cache_size * 1024 * 1024))And we got an ERROR.
To fix this, we mustn't add
valueBufferArraytochunkData->valueBufferArraywhen previously have added it to a cachePath to BogusFree
Backtrace: