Skip to content

Add CBOR parsing primitives#72

Merged
werwurm merged 24 commits intomainfrom
werwurm/cbor_parser
Sep 25, 2025
Merged

Add CBOR parsing primitives#72
werwurm merged 24 commits intomainfrom
werwurm/cbor_parser

Conversation

@werwurm
Copy link
Contributor

@werwurm werwurm commented Sep 3, 2025

Add primitives for parsing CBOR headers, extracting slices of
octet and text strings, and recursively skipping unknown fields.

@werwurm werwurm requested review from Copilot and seidelrj September 3, 2025 05:47

This comment was marked as outdated.

Copy link
Contributor

@seidelrj seidelrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like Copilot, I'm also confused about n20_cbor_read_iterate_item. It's not in the header either.

@werwurm werwurm requested a review from seidelrj September 8, 2025 21:59
Similar to n20_stream_t which offers safe writing to a buffer
n20_istream_t offers a safe way to read from a buffer lowering the risk
for out of bounds access.
Add primitives for parsing CBOR headers, extracting slices of
octet and text strings, and recursively skipping unknown fields.
@werwurm werwurm requested a review from Copilot September 10, 2025 18:36
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds CBOR parsing primitives to support reading and parsing CBOR-encoded data. The changes implement functions for reading CBOR headers and recursively skipping CBOR items, complementing the existing CBOR write functionality.

  • Adds n20_cbor_read_header function to parse CBOR type and value information from an input stream
  • Adds n20_cbor_read_skip_item function to recursively skip over CBOR items including nested structures
  • Comprehensive test coverage for both functions with various CBOR types and edge cases

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
include/nat20/cbor.h Adds function declarations for the new CBOR reading primitives with documentation
src/core/cbor.c Implements the CBOR header reading and item skipping functions
src/core/test/cbor.cpp Extensive test suite covering various CBOR types, edge cases, and error conditions

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

src/core/cbor.c Outdated
*n = 0;

uint8_t additional_bytes = 1 << (additional_info - 24);
for (int i = 0; i < additional_bytes; i++) {
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable type mismatch: additional_bytes is uint8_t but i is int in the loop condition. This could cause warnings or unexpected behavior. Change i to uint8_t or size_t for consistency.

Suggested change
for (int i = 0; i < additional_bytes; i++) {
for (uint8_t i = 0; i < additional_bytes; i++) {

Copilot uses AI. Check for mistakes.

switch (type) {
case n20_cbor_type_array_e:
for (size_t i = 0; i < n; i++) {
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential integer overflow: n is uint64_t but loop variable i is size_t. On 32-bit systems, this could cause infinite loops if n exceeds SIZE_MAX. Add a bounds check or use a safer iteration approach.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have the check above to make sure n is <= SIZE_MAX before we get to this for loop.

Comment on lines 148 to 155
for (size_t i = 0; i < n; i++) {
if (!n20_cbor_read_skip_item(s)) {
return false;
}
}
break;
case n20_cbor_type_map_e:
for (size_t i = 0; i < n; i++) {
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential integer overflow: n is uint64_t but loop variable i is size_t. On 32-bit systems, this could cause infinite loops if n exceeds SIZE_MAX. Add a bounds check or use a safer iteration approach.

Suggested change
for (size_t i = 0; i < n; i++) {
if (!n20_cbor_read_skip_item(s)) {
return false;
}
}
break;
case n20_cbor_type_map_e:
for (size_t i = 0; i < n; i++) {
if (n > SIZE_MAX) {
/* Too many items to skip safely on this platform. */
return false;
}
for (size_t i = 0; i < (size_t)n; i++) {
if (!n20_cbor_read_skip_item(s)) {
return false;
}
}
break;
case n20_cbor_type_map_e:
if (n > SIZE_MAX) {
/* Too many items to skip safely on this platform. */
return false;
}
for (size_t i = 0; i < (size_t)n; i++) {

Copilot uses AI. Check for mistakes.
src/core/cbor.c Outdated
break;
case n20_cbor_type_bytes_e:
case n20_cbor_type_string_e: {
uint8_t const *slice = n20_istream_get_slice(s, n);
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential integer overflow: n is uint64_t but n20_istream_get_slice likely expects size_t. On 32-bit systems, this could cause truncation or incorrect behavior if n exceeds SIZE_MAX. Add a bounds check before the call.

Suggested change
uint8_t const *slice = n20_istream_get_slice(s, n);
if (n > SIZE_MAX) {
return false;
}
uint8_t const *slice = n20_istream_get_slice(s, (size_t)n);

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n20_istream_get_slice will return NULL if the requested size is too large. The NULL check below is the correct behavoir.

@werwurm werwurm requested a review from Copilot September 10, 2025 18:53
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

src/core/cbor.c Outdated
Comment on lines 120 to 136
if (additional_info < 24 || additional_info > 27) {
*n = additional_info;
return true;
}

*n = 0;

uint8_t additional_bytes = 1 << (additional_info - 24);
for (uint8_t i = 0; i < additional_bytes; i++) {
uint8_t byte = 0;
if (!n20_istream_get(s, &byte)) {
return false;
}
*n = (*n << 8) | byte;
}

return true;
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition on line 121 will incorrectly handle additional_info values 28-31. These should be treated as special cases (28-30 are invalid, 31 indicates indefinite length), but the current logic assigns them directly to *n which is incorrect.

Suggested change
if (additional_info < 24 || additional_info > 27) {
*n = additional_info;
return true;
}
*n = 0;
uint8_t additional_bytes = 1 << (additional_info - 24);
for (uint8_t i = 0; i < additional_bytes; i++) {
uint8_t byte = 0;
if (!n20_istream_get(s, &byte)) {
return false;
}
*n = (*n << 8) | byte;
}
return true;
if (additional_info < 24) {
*n = additional_info;
return true;
} else if (additional_info >= 24 && additional_info <= 27) {
*n = 0;
uint8_t additional_bytes = 1 << (additional_info - 24);
for (uint8_t i = 0; i < additional_bytes; i++) {
uint8_t byte = 0;
if (!n20_istream_get(s, &byte)) {
return false;
}
*n = (*n << 8) | byte;
}
return true;
} else if (additional_info >= 28 && additional_info <= 30) {
// Invalid/reserved additional_info values
return false;
} else if (additional_info == 31) {
// Indefinite length
*n = (uint64_t)-1;
return true;
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think copilot is on to something here. Right now, if you get an additional_data > 27, the current code will mark the header as valid, and return n as a value between 28-31. This will mess up n20_cbor_read_skip_item. E.g., if you had an array type, and additional_data=28, we would say the array had length 28 and skip that many items. We should also return an error on value 31 since we don't support it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I saw that. I have to think about this because. I tried to avoid returning error codes here. I may have to rethink this stance.

Comment on lines +147 to +151
case n20_cbor_type_array_e:
if (n > SIZE_MAX) {
/* Prevent overflow in the loop counter. */
return false;
}
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check n > SIZE_MAX is always false since n is a uint64_t and SIZE_MAX is typically also 64-bit on 64-bit systems. This check should use a more appropriate limit or handle the case where n cannot fit in a size_t for the loop counter.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks are really irrelevant because the loops will break when the buffer underruns, and because there are no zero sized cbor items to skip there is no real chance spin indefinitely.

Comment on lines +159 to +160
if (n > SIZE_MAX) {
/* Prevent overflow in the loop counter. */
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as with arrays - the check n > SIZE_MAX is ineffective. Additionally, for maps, the actual number of items to skip is 2 * n (key-value pairs), so the overflow check should account for this multiplication.

Suggested change
if (n > SIZE_MAX) {
/* Prevent overflow in the loop counter. */
if (n > SIZE_MAX / 2) {
/* Prevent overflow in the loop counter (2 * n items to skip). */

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not apply, because we are actually skipping two items per iteration. Also, because an item is at least one byte in size, the underlying stream will necessarily underrun before the counter can loop.

Comment on lines +174 to +177
if (n > SIZE_MAX) {
/* Prevent uncaught truncation. */
return false;
}
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same SIZE_MAX comparison issue exists here. This check will not prevent truncation when casting uint64_t to size_t on systems where they are the same size.

Copilot uses AI. Check for mistakes.
src/core/cbor.c Outdated
Comment on lines 120 to 136
if (additional_info < 24 || additional_info > 27) {
*n = additional_info;
return true;
}

*n = 0;

uint8_t additional_bytes = 1 << (additional_info - 24);
for (uint8_t i = 0; i < additional_bytes; i++) {
uint8_t byte = 0;
if (!n20_istream_get(s, &byte)) {
return false;
}
*n = (*n << 8) | byte;
}

return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think copilot is on to something here. Right now, if you get an additional_data > 27, the current code will mark the header as valid, and return n as a value between 28-31. This will mess up n20_cbor_read_skip_item. E.g., if you had an array type, and additional_data=28, we would say the array had length 28 and skip that many items. We should also return an error on value 31 since we don't support it.


switch (type) {
case n20_cbor_type_array_e:
for (size_t i = 0; i < n; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have the check above to make sure n is <= SIZE_MAX before we get to this for loop.

@werwurm werwurm requested a review from a team as a code owner September 19, 2025 22:13
@werwurm werwurm changed the base branch from werwurm/istream to main September 19, 2025 22:14
@github-actions
Copy link

github-actions bot commented Sep 19, 2025

LCOV of commit 598c17e during lcov-test-coverage-report #35

Summary coverage rate:
  lines......: 95.2% (1794 of 1884 lines)
  functions..: 99.4% (162 of 163 functions)
  branches...: 83.5% (1000 of 1197 branches)

Files changed coverage rate: n/a

@werwurm werwurm requested review from Copilot and seidelrj September 20, 2025 00:51
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

#include <variant>
#include <vector>

#include "gtest/gtest.h"
Copy link

Copilot AI Sep 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This include is redundant as gtest/gtest.h is already included at line 17. Remove the duplicate include.

Suggested change
#include "gtest/gtest.h"

Copilot uses AI. Check for mistakes.
Comment on lines 20 to 23
#include <sys/types.h>

#include <cstdint>
#include <initializer_list>
Copy link

Copilot AI Sep 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The includes sys/types.h and initializer_list appear to be unused in the visible code. Consider removing them unless they are used elsewhere in the file.

Suggested change
#include <sys/types.h>
#include <cstdint>
#include <initializer_list>
#include <cstdint>

Copilot uses AI. Check for mistakes.
Comment on lines 20 to 23
#include <sys/types.h>

#include <cstdint>
#include <initializer_list>
Copy link

Copilot AI Sep 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The includes sys/types.h and initializer_list appear to be unused in the visible code. Consider removing them unless they are used elsewhere in the file.

Suggested change
#include <sys/types.h>
#include <cstdint>
#include <initializer_list>
#include <cstdint>

Copilot uses AI. Check for mistakes.
Comment on lines +166 to +169
if (n > SIZE_MAX) {
/* Prevent overflow in the loop counter. */
return false;
}
Copy link

Copilot AI Sep 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overflow check has the same issue as above. The condition n > SIZE_MAX will never be true on systems where uint64_t and size_t have the same width. Consider using if (n > SIZE_MAX / 2) or implementing a different overflow prevention strategy.

Copilot uses AI. Check for mistakes.
n20_istream_get_slice now returns a boolean and fills an n20_slice_t
structure, which allows for more consistent control flow.
This patch also adds the n20_istream_get_string_slice variant, which
fills an n20_string_slice_t structure instead.

Complete test coverage for stream module.
@werwurm werwurm changed the base branch from main to werwurm/revisit_istream_get_slice September 20, 2025 02:39
src/core/cbor.c Outdated

if (additional_info < 24 || additional_info > 27) {
if (additional_info > 27) {
/* Reserved additional info value. And this code does
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Reserved additional info value. And this code does
/* Reserved additional info value. And this code does not

* @brief Alias for @ref n20_istream_s
*/
typedef struct n20_istream_s n20_istream_t;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* specified size and returns a true if the stream is in a good state
* specified size and returns true if the stream is in a good state

* the slice_out structure will point to the slice of the stream
* buffer beginning at the current read position and of size @p size.
*
* @param s The input stream to read from.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** @brief Gets a buffer slice from the input stream.
/** @brief Gets a string slice from the input stream.

@werwurm werwurm changed the base branch from werwurm/revisit_istream_get_slice to main September 25, 2025 16:24
@werwurm werwurm requested a review from seidelrj September 25, 2025 16:24
@werwurm werwurm merged commit 277a90e into main Sep 25, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants