Avro-4193: [c] segfault cannot handle empty byte array#3527
Avro-4193: [c] segfault cannot handle empty byte array#3527steven-aerts wants to merge 1 commit intoapache:mainfrom
Conversation
|
Please rebase! |
When avro-c tries to encode an empty byte array it will avro_malloc(0) which on some architectures will return NULL. Make sure this is not interpreted as an error or or dereferenced causing a segfault.
ea27931 to
89fb9db
Compare
|
@martin-g sorry for that, rebased on |
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a segmentation fault that occurs when avro-c attempts to encode empty byte arrays. The issue arises because avro_malloc(0) can return NULL on some architectures, which was incorrectly treated as an allocation error and caused null pointer dereferences.
Key changes:
- Updated NULL checks to distinguish between allocation failures and valid zero-size allocations
- Added null pointer guard in the test allocator's deallocation path
- Extended test coverage to validate empty byte array handling
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| lang/c/src/datum.c | Modified NULL checks in avro_bytes() and avro_bytes_set() to only treat NULL as an error when size is non-zero |
| lang/c/src/value-json.c | Updated NULL checks in encode_utf8_bytes() and avro_value_to_json_t() to handle zero-length allocations correctly |
| lang/c/tests/test_avro_data.c | Added null pointer guard in test allocator and introduced test_empty_bytes() to verify empty byte array handling |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| static int test_empty_bytes(void) | ||
| { | ||
| char bytes[] = { }; |
There was a problem hiding this comment.
Empty array initializer { } is a GNU extension and not standard C. Consider using char *bytes = NULL; or char bytes[1]; for better portability across compilers.
| char bytes[] = { }; | |
| char *bytes = NULL; |
| avro_datum_decref(datum); | ||
| avro_datum_decref(expected_datum); | ||
|
|
||
| avro_schema_decref(writer_schema); |
There was a problem hiding this comment.
Double free: writer_schema is already decremented at line 230, so this second decrement at line 243 will cause a use-after-free error.
| avro_schema_decref(writer_schema); |
| { | ||
| char *bytes_copy = (char *) avro_malloc(size); | ||
| if (!bytes_copy) { | ||
| if (!bytes_copy && size) { |
There was a problem hiding this comment.
What would happen now if size is 0?
It looks to me that no matter what is the value of bytes_copy it will never enter the if and it will call memcpy(bytes_copy, bytes, 0)
There was a problem hiding this comment.
IMO the there must be a check for size == 0 as a first step in this function that returns early and avoids any mallocs and memcpys.
| } | ||
|
|
||
| json_t *result = json_stringn_nocheck((const char *) encoded, encoded_size); | ||
| json_t *result = json_stringn_nocheck((const char *) encoded ? encoded : "", encoded_size); |
There was a problem hiding this comment.
| json_t *result = json_stringn_nocheck((const char *) encoded ? encoded : "", encoded_size); | |
| json_t *result = json_stringn_nocheck((const char *) encoded ? encoded : "", encoded ? encoded_size : 0); |
?
There was a problem hiding this comment.
There is a similar code for FIXED at https://github.com/steven-aerts/avro/blob/89fb9db03478b793be7188c5683c6560c2eff115/lang/c/src/value-json.c#L244
Does it need the same improvements ?
| { | ||
| char *bytes_copy = (char *) avro_malloc(size); | ||
| if (!bytes_copy) { | ||
| if (!bytes_copy && size) { |
There was a problem hiding this comment.
IMO the there must be a check for size == 0 as a first step in this function that returns early and avoids any mallocs and memcpys.
| int rval; | ||
| char *bytes_copy = (char *) avro_malloc(size); | ||
| if (!bytes_copy) { | ||
| if (!bytes_copy && size) { |
What is the purpose of the change
When avro-c tries to encode an empty byte array it will avro_malloc(0)
which on some architectures will return NULL. Make sure this is not
interpreted as an error or or dereferenced causing a segfault.
Verifying this change
Extended unit tests to validate this case
Documentation