Skip to content

Conversation

@DeeDeeG
Copy link
Contributor

@DeeDeeG DeeDeeG commented Nov 2, 2025

Context

See #115.
(Fixes: #115)

Explanation

Strings starting with "arm[...?]" should not be checked before strings starting with "arm_aarch64[...?]", since the former will prematurely match, preventing the latter from ever matching.

[EDIT: a variant of one of the alternate solutions below was used instead.]

Available Alternate Solutions

Alternatively, the longer max comparison character count could be restored (was 20 chars before #114), so that the comparison is more "exact match" rather than "starts with," per documentation of this comparison function: https://docs.gtk.org/glib/func.ascii_strncasecmp.html. (I tested this and it works. I don't have a strong preference but I simply picked one of the available solutions to get a PR out quickly for your consideration.)

[EDIT: Lengthened all of the string comparisons to "the fixed string's non-null ASCII character count + 1" bytes, to include the fixed string's terminating NUL byte, so as to remove ambiguities between longer strings and shorter substrings that could otherwise be matched at the beginnings of those fixed strings. Particularly addresses the fact that the first three characters of "arm" == the first three characters of "arm_arch64" (bug).]

We could drop the check for the "arm" string entirely, which it seems may have been the intended solution in the review discussions of #114, but I'm not sure if anyone's scripts out there use that, or what, it would have been harder for me to verify that approach doesn't break anyone's stuff so I opted for this that doesn't remove any string checks, only reorders them.

There are likely even more solutions to the problem... But this PR is a simple one, and works for me in testing.

Verification Process

I ran a CI workflow with and without this change.

That shows some change from current main branch is indeed needed, and that this PR does indeed work as a fix.

@black-sliver
Copy link

black-sliver commented Nov 2, 2025

I think a decision needs to be made if this is supposed to be "startswith" or "equals". For "startswith", this change looks correct, however that makes it easier to accidentally match something. Say a future architecture is called "aarch64v2", then this will possibly produce an invalid output instead of giving an error.

To change it back to "equals", you'd need to add +1 to each of the lengths because the comparison will then include the NUL byte.

@probonopd probonopd requested a review from TheAssassin November 2, 2025 12:47
@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Nov 2, 2025

Okay, along the lines of what @black-sliver just said, I can confirm all of the following approaches work, regardless of the order the various strings are checked, and should be basically equivalent to each-other in practice AFAICT:

Those all increase the string comparison to be long enough to include the NUL byte at the end of the hard-coded string side of the comparison.

(See docs for g_ascii_strncasecmp: https://gnome.pages.gitlab.gnome.org/libsoup/glib/glib-String-Utility-Functions.html#g-ascii-strncasecmp)

So, with any of the above, the comparison function is looking for "exact match," e.g. "starting and ending with arm[NUL]" rather than "starts with arm" where any trailing characters after that are ignored, so longer strings starting with arm would match undesirably, e.g. "first three chars of arm == first three chars of arm_aarch64" (overly broad comparison catching false matches).

Those are all viable solutions to #115, as is the current PR as already written. As long as it fixes the problem, I'm willing to post whatever solution maintainers here prefer, having just shown four of them that all work. 👍

@TheAssassin
Copy link
Member

I think a decision needs to be made if this is supposed to be "startswith" or "equals". For "startswith", this change looks correct, however that makes it easier to accidentally match something. Say a future architecture is called "aarch64v2", then this will possibly produce an invalid output instead of giving an error.

In my opinion, we should just literally match the strings. Checking only the prefix makes no sense. Typically, however, someone will show up depending on the old behavior. I think it would be ok to break these people's workflows...

We had that docs page explaining our interpretation of ISAs and the identifier strings we use and how they compare to, e.g., Debian's packaging (and we may want to add others, like Docker platforms, GOARCH, ...).

@black-sliver
Copy link

I think that makes DeeDeeG@1dd660a the preferable solution then.

@TheAssassin
Copy link
Member

Agreed. @DeeDeeG could you make sure that this solution is provided here?

(Side note: C string comparisons are so annoying...)

@DeeDeeG DeeDeeG force-pushed the fix-archname-comparison-order-arm_aarch64 branch from f7e34d8 to 9212dff Compare November 9, 2025 16:04
@DeeDeeG DeeDeeG changed the title Fix comparison order for arch names to not skip arm_aarch64 Lengthen string comparisons for arch names, so as to not skip arm_aarch64 Nov 9, 2025
@DeeDeeG
Copy link
Contributor Author

DeeDeeG commented Nov 9, 2025

Hi again folks, I updated the PR with the solution from DeeDeeG@1dd660a as requested.

Effectively checks that the strings being compared start and end with
the specified characters (by including the terminating NUL byte of the
fixed string), rather than only checking that a string *starts with*
the non-null ASCII characters of the fixed string in the comparison.

For example, comparing the first three characters of "pen" and "pencil"
would be a match ("pen" == "pen"). Comparing the first four bytes
instead, we get "pen[NUL byte]" != "penc".

Prevents ambiguity between "arm" and "arm_aarch64", which makes the
"arm_aarch64" code path reachable again in the large if/else of
string comparisons.
@DeeDeeG DeeDeeG force-pushed the fix-archname-comparison-order-arm_aarch64 branch from 9212dff to 0d511be Compare November 9, 2025 16:25
Copy link

@black-sliver black-sliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks correct, have not tested it myself though.

@TheAssassin
Copy link
Member

Thank you very much!

@TheAssassin TheAssassin merged commit 80404e9 into AppImage:main Nov 12, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Changed behavior in extract_arch_from_text

3 participants