Conversation
It looks good to me. Since the storage culling does not require running the EVM, I think it would be simple to add some tests, such as https://github.com/GatlingX/GPU-EVM/blob/main/src/common/fuzz_algos/tests/test_fuzz_dedup.cu.
```cuda
void cull_storages_mark(FuzzStorageBuffer& storage_corpus, GpuArrayView<bool> passes_mask) {
    assert(storage_corpus.size <= passes_mask.size);
    // We calculate the max storage kv size, and then don't allow any max-size storages to be culled
    auto max_size = max_storage_size(storage_corpus.kv_sizes.view());
```
Convert `max_storage_size()` to an inline function, or move it into `cull_storages_mark()` if no other function will use it.
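A minimal host-side sketch of what folding the helper into the marking function could look like. The names mirror the diff, but the container types, return value, and culling rule here are assumptions for illustration, not the real `FuzzStorageBuffer` API:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative host-side model of cull_storages_mark() with the max-size
// scan inlined, instead of calling a separate max_storage_size() helper.
// Returns a mask of storages marked for culling.
std::vector<bool> cull_storages_mark(const std::vector<std::size_t>& kv_sizes,
                                     const std::vector<bool>& passes_mask) {
    assert(kv_sizes.size() <= passes_mask.size());
    // max_storage_size() folded inline, as suggested in the review:
    const std::size_t max_size =
        kv_sizes.empty() ? 0 : *std::max_element(kv_sizes.begin(), kv_sizes.end());
    std::vector<bool> cull(kv_sizes.size(), false);
    for (std::size_t i = 0; i < kv_sizes.size(); ++i) {
        // Never cull a max-size storage; otherwise cull the ones that
        // did not pass.
        cull[i] = (kv_sizes[i] != max_size) && !passes_mask[i];
    }
    return cull;
}
```

If another caller does appear later, pulling the scan back out into an `inline` helper in the `.cuh` is a one-line change.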
```cuda
void add_votes_to_storages(const GpuArrayView<bool> interesting_mask, const GpuArrayView<size_t> storage_indices, GpuArrayView<StorageCaseMetadata> metadata) {
    gpuevm_nvtx_push("add_votes_to_storages");
    auto [grid_size, block_size] = get_cuda_max_occupancy(interesting_mask.size, (void*)add_votes_to_storages_kernel);
    add_votes_to_storages_kernel<<<grid_size, block_size>>>(interesting_mask, storage_indices, metadata);
```
Should we add some asserts here to make sure the kernel launch has enough threads to cover all elements?
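One way to phrase that check, sketched as a small host-side predicate (the function name is hypothetical; `grid_size` and `block_size` stand in for the values returned by `get_cuda_max_occupancy()`):

```cpp
#include <cstddef>

// Illustrative coverage check to assert before a kernel launch: every
// element must map to at least one thread, and both launch dimensions
// must be nonzero.
bool launch_covers(std::size_t n_elems, std::size_t grid_size,
                   std::size_t block_size) {
    return grid_size > 0 && block_size > 0 &&
           grid_size * block_size >= n_elems;
}
```

The call site would then be `assert(launch_covers(interesting_mask.size, grid_size, block_size));` just before the `<<<grid_size, block_size>>>` launch.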
```cuda
__global__ void set_storage_indices_kernel(GpuArrayView<size_t> storage_indices, uint32_t n_tested_storages, uint32_t n_test_threads) {
    auto tid = blockIdx.x * blockDim.x + threadIdx.x;
    // For tid < n_tested_storages, we're already done with the cyclical assignment
    if (tid < n_tested_storages) return;
```
We can index `storage_indices` with an offset of `n_tested_storages` instead of early-returning, e.g. writing to `storage_indices[n_tested_storages + tid]`.
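A host-side model of that suggestion (illustrative, not the real kernel): instead of launching `n_test_threads` threads and early-returning for `tid < n_tested_storages`, launch only the remaining threads and write past the already-assigned prefix at an offset:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative host-side model of set_storage_indices_kernel with the
// reviewer's offset indexing: thread tid fills slot
// storage_indices[n_tested_storages + tid], wrapping cyclically over the
// tested storages. This writes the same values as the early-return
// version, since (n_tested_storages + tid) % n == tid % n.
void set_storage_indices(std::vector<std::size_t>& storage_indices,
                         std::uint32_t n_tested_storages,
                         std::uint32_t n_test_threads) {
    for (std::uint32_t tid = 0; tid + n_tested_storages < n_test_threads; ++tid) {
        storage_indices[n_tested_storages + tid] = tid % n_tested_storages;
    }
}
```

The upside is that no threads are wasted on the early return; the launch only needs `n_test_threads - n_tested_storages` threads.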
```cuda
    return marked_inputs;
}

__global__ void add_votes_to_storages_kernel(const GpuArrayView<bool> interesting_mask, const GpuArrayView<size_t> storage_indices, GpuArrayView<StorageCaseMetadata> metadata) {
```
Yes, use the indices to access the metadata.

Or we could use a 2D array to make the code clearer.
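A host-side sketch of the index-based variant (the `StorageCaseMetadata` struct and its vote counter are assumptions about the real type): each interesting test case votes for the storage it was paired with by indirecting through `storage_indices`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real metadata struct.
struct StorageCaseMetadata {
    std::uint32_t votes = 0;
};

// Illustrative host-side model of add_votes_to_storages_kernel: indirect
// through storage_indices so several test cases can vote for the same
// storage. On the GPU this increment would need to be an atomicAdd, since
// multiple threads may target the same metadata slot.
void add_votes_to_storages(const std::vector<bool>& interesting_mask,
                           const std::vector<std::size_t>& storage_indices,
                           std::vector<StorageCaseMetadata>& metadata) {
    for (std::size_t tid = 0; tid < interesting_mask.size(); ++tid) {
        if (interesting_mask[tid]) {
            metadata[storage_indices[tid]].votes += 1;
        }
    }
}
```

The 2D-array alternative would instead shape the launch as (storages × test cases per storage), making the storage index implicit in one grid dimension rather than stored in a side array.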
This is a squashed version of the full storage-culling implementation; we'll need to review the whole of it to bring it into the current version of gpu-evm (moving it manually into the .cu and .cuh files is probably the right way).