Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix signed vs unsigned comparisons #3515

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

ffesti
Copy link
Contributor

@ffesti ffesti commented Jan 10, 2025

This is just the first half. More to come.

Related: #3226

Use unsigned int for all kind of counts

Related: rpm-software-management#3226
Cast results from strlen to ssize_t as our string handling code uses
pointers whihc are signed. These places should probably replaced by
proper C++ strings at some point.

Related: rpm-software-management#3226
@ffesti ffesti requested a review from a team as a code owner January 10, 2025 12:02
@ffesti ffesti requested review from pmatilai and removed request for a team January 10, 2025 12:02
Use the right type of counting variable in for loops. Match the thing
being counted instad of just using int.

Related: rpm-software-management#3226
RPM tags are unsigned. Use a constant in the right range instead of -2.

Related: rpm-software-management#3226
Fwrite returns ssize_t while the sizes we want to right are unsigned.
Cast them to match in the comparison.

Related: rpm-software-management#3226
Use unsigned variable where appropriate and also use unsignend constants
for offset calculations.

Related: rpm-software-management#3226
Indices in rpmfi are signed to allow for -1 while counts are unsigned
rpm_count_t. Cast the indices after they have been checked to be >= 0

Related: rpm-software-management#3226
Use corrected signedness for variabes or cast when necessary.

Related: rpm-software-management#3226
Turn linelen to unsigned internally.

Related: rpm-software-management#3226
@pmatilai
Copy link
Member

pmatilai commented Jan 10, 2025

Eeek. One can spend weeks reviewing this kind of stuff, and those weeks we should be spending on other stuff just now.

Maybe lets just reduce this to the stuff where an obviously wrong type is used originally for now, like size_t vs int in the strpool (although the commit message talks about unsigned int which it isn't). Needing casts tends to suggests it needs a wholly different approach.

Mass-changes are better done at a beginning of a cycle, not when we're nearing the end.

@@ -58,18 +58,21 @@ static char *base64_encode_block(const char *plaintext_in, int length_in, char *

#define BASE64_DEFAULT_LINE_LENGTH 64

char *rpmBase64Encode(const void *data, size_t len, int linelen)
char *rpmBase64Encode(const void *data, size_t len, int linelen_)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ew, don't change parameter names for internal tweaks, this makes the code mightily confusing.

@@ -976,12 +976,12 @@ static int isHardLink(FileListRec flp, FileListRec tlp)
*/
static int checkHardLinks(FileRecords & files)
{
for (int i = 0; i < files.size(); i++) {
for (rpm_count_t i = 0; i < files.size(); i++) {
Copy link
Member

@pmatilai pmatilai Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think rpm_count_t is right here: rpm_count_t reflects the limit inside headers, but here the data is in a vector that can hold much more, so given enough files in a buildroot, rpm_count_t could roll over and behave in various interesting ways.

I almost wrote it should use the same type as files.size() returns, but what this really should do is use iterators to avoid the issue entirely.

@pmatilai
Copy link
Member

...etc. This is tricky stuff.

@@ -337,7 +337,7 @@ static rpmRC parseForVerify(char * buf, int def, FileEntry entry)
if ((p = strstr(buf, name)) == NULL)
return RPMRC_OK;

for (pe = p; (pe-p) < strlen(name); pe++)
for (pe = p; (pe-p) < (ssize_t) strlen(name); pe++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another observation on this is all these strlen() casts to signed is actually an anti-pattern, because in this context we know pe is always >= p so the difference is always unsigned and casting it to signed makes it look otherwise.

Casting en masse is a clear sign that here's an issue that needs a different solution. Casts should only really be used as the last resort on isolated spots where they tell the reader "watchout, there's something special going on." And what happens in here is nothing special, it's just ancient C code that needs a better overall solution.

Finally, we shouldn't be adding C-style casts at all anymore. For the initial C++ build conversion there was simply no choice, but we're in permanent C++ land now and should use the C++ forms: https://en.cppreference.com/w/cpp/language/explicit_cast

@@ -26,9 +26,9 @@ struct poolHashBucket_s {
};

struct poolHash_s {
int numBuckets;
size_t numBuckets;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poolHashCreate() and poolHashResize() numBuckets argument type should also change accordingly.

fprintf(stderr, "Hashsize: %i\n", ht->numBuckets);
fprintf(stderr, "Keys: %i\n", ht->keyCount);
fprintf(stderr, "Hashsize: %li\n", ht->numBuckets);
fprintf(stderr, "Keys: %li\n", ht->keyCount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens to compile on x86_64 but isn't right: size_t isn't guaranteed long, and "i" denotes signed integer where this is changing them to unsigned. %zu is the appropriate formatting for size_t.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants