Skip to content

Conversation

@vighnesh-sawant
Copy link

#42
Pending:- Add more tests for the .sh file
Need some inputs from maintainers/anyone. Please look at the comments on the issue.
Any suggestions , criticism are welcome

@CLAassistant
Copy link

CLAassistant commented Sep 22, 2025

CLA assistant check
All committers have signed the CLA.

@vighnesh-sawant
Copy link
Author

@aaronbrethorst Can you please look at this and #42.

@aaronbrethorst
Copy link
Member

aaronbrethorst commented Oct 12, 2025

@vighnesh-sawant

Thanks for tackling the config endpoint! The overall approach here is solid—using go:generate to capture build information and embedding it with go:embed is exactly the right pattern for Go. You've clearly put thought into how to gather git metadata and structure the response to match the OneBusAway API.

That said, there are some issues that need addressing before this can merge. Let me walk through what needs to change and why.

Critical Issues That Block Merging

1. The Four TODO Fields

Lines 50-54 in internal/restapi/config.go pass "TODO" for id, name, serviceDateFrom, and serviceDateTo. While the production API response shows these populated, looking at different deployments suggests these might be bundle-specific metadata that isn't always present.

For now, just use empty strings instead of "TODO":

configData := models.NewConfigData(
    git,
    "",  // id - bundle-specific, empty for now
    "",  // name - bundle-specific, empty for now
    "",  // serviceDateFrom - will implement later
    "",  // serviceDateTo - will implement later
)

The service date fields (serviceDateFrom and serviceDateTo) should represent the date range covered by your GTFS data—Unix timestamps in milliseconds. These would come from querying the calendar and calendar_dates tables. But that's complex enough that it should be a separate PR. Empty strings are fine for the initial implementation.

2. The Git Dirty Logic is Backwards

Lines 21-25 in config.go have the wrong logic for interpreting git's dirty state:

if git_dirty == "1" {
    git_dirty = "true"
} else {
    git_dirty = "false"
}

The issue is that git diff --exit-code returns exit code 0 when the working directory is clean and 1 when it's dirty. But your shell script captures $? (the exit code), which means 0 should map to "false" and 1 should map to "true". Your current logic does the opposite.

Fix it like this:

if git_dirty == "0" {  // exit code 0 = clean
    git_dirty = "false"
} else {
    git_dirty = "true"
}

3. Shell Script Won't Work on macOS

Line 3 of scripts/git-properties.sh uses date --rfc-3339=seconds, which is a GNU-specific flag. This will break on macOS (which uses BSD date) and probably on other Unix systems too.

Replace it with a more portable version:

date -u +"%d.%m.%Y @ %H:%M:%S UTC"

This works identically across GNU and BSD date implementations.

4. Fragile String Parsing

Line 20 of config.go assumes the shell script outputs exactly 20 lines in exactly the right order:

git_branch, git_build_host, git_build_time, ... := lines[0], lines[1], lines[2], ...

This is brittle. If the script changes, if a git command fails, if someone adds a line—everything breaks. Even worse, it'll panic at runtime if there aren't enough lines.

A more robust approach is to use key-value pairs in the shell script output:

echo "git.branch=$(git rev-parse --abbrev-ref HEAD)"
echo "git.build.host=$(hostname)"
echo "git.build.time=$(date -u +"%d.%m.%Y @ %H:%M:%S UTC")"
# ... and so on

Then parse with strings.Cut:

props := make(map[string]string)
for _, line := range lines {
    key, value, ok := strings.Cut(line, "=")
    if ok {
        props[key] = strings.TrimSpace(value)
    }
}

git := models.NewGitProperties(
    props["git.branch"],
    props["git.build.host"],
    props["git.build.time"],
    // ...
)

This handles missing values gracefully (they'll be empty strings) and makes the script easier to maintain.

Major Issues That Should Be Fixed

5. Incomplete Git Metadata

Lines 6-8, 11-12, and 24 of the shell script return "N/A" for tag-related fields. The production API populates these when tags exist, and they're useful for identifying deployed versions.

Here's how to populate them:

git describe --tags --always  # git.build.version
git describe --tags --abbrev=0  # git.closest.tag.name
git rev-list $(git describe --tags --abbrev=0)..HEAD --count  # git.closest.tag.commit.count
git describe --tags  # git.commit.id.describe
git describe --tags --abbrev=0  # git.commit.id.describe.short
git tag -l --points-at HEAD | tr '\n' ',' | sed 's/,$//'  # git.tags

Each of these can fail if no tags exist, so wrap them in conditionals:

if git describe --tags >/dev/null 2>&1; then
    echo "git.build.version=$(git describe --tags --always)"
else
    echo "git.build.version=N/A"
fi

This way you get real version information when tags are available, but degrade gracefully when they're not.

6. Build User vs. Commit User Confusion

Lines 4-5 use git config user.email and git config user.name for git.build.user.* fields. These return the committer's configured identity, not the person/machine running the build.

For build context, you probably want:

echo "git.build.user.name=$(whoami)"
echo "git.build.user.email=${USER}@$(hostname)"

Or in CI environments, read from environment variables like $CI_USER_NAME or $GITHUB_ACTOR. The current approach works locally but will be wrong in automated builds.

7. Missing Directory Creation

The shell script writes to internal/restapi/git/git_properties.txt, but that directory might not exist initially (especially in CI or when someone clones fresh). Add this to your script:

mkdir -p internal/restapi/git

Or handle it in the Makefile's generate target. Otherwise the first build on a clean checkout will fail.

Minor Issues and Style Notes

8. Go Naming Conventions

The GitProperties struct uses Git_branch, Git_build_host, etc. Go convention is to use camelCase for struct fields: GitBranch, GitBuildHost. The JSON tags are correct (json:"git.branch"), but the field names should follow Go idioms.

This is mostly cosmetic, but it'll matter if other code ever needs to work with these fields directly.

9. The NewGitProperties Constructor is Unwieldy

That function signature has 20 string parameters. It works, but it's error-prone—it's easy to pass arguments in the wrong order, and if you ever need to add a field, every call site breaks.

The key-value parsing approach I suggested in issue #4 eliminates this problem entirely. But if you keep the current approach, at least add a comment documenting the parameter order.

10. Test Coverage Doesn't Test the Important Parts

Your tests in internal/models/config_test.go verify JSON marshaling and unmarshaling, which is great. But they don't test the shell script output, the string parsing logic, or error cases (what if git isn't installed? what if the properties file is malformed?).

Consider adding tests for:

  • Parsing well-formed properties
  • Handling missing lines or fields
  • Handling empty values
  • Verifying the dirty state logic

The marshaling tests are fine to keep, but the real complexity is in the parsing layer.

11. Documentation

Add a comment at the top of scripts/git-properties.sh explaining the output format and any dependencies (git must be available, script must be run from repo root, etc.). Also document in config.go that go generate must be run before building, and that the Makefile handles this automatically.

Design Considerations

12. When Does Regeneration Happen?

The Makefile adds generate as a dependency for both run and test. This means git properties get regenerated every time you run tests or start the server. That's probably fine, but it means the embedded build info reflects the most recent make invocation, not necessarily when the binary was compiled.

This could be surprising if you build once, then later run tests without rebuilding—the binary's embedded git info will update even though the binary itself didn't change.

Consider whether you want generation tied to build instead of run/test. There's no universally right answer here, but it's worth thinking about the behavior you want.

13. What Happens When the Script Fails?

If any git command fails (repo is corrupt, git isn't installed, script runs outside a repo), the shell script will output partial or garbled data. The build will succeed, but the config endpoint will return nonsense.

You could add error handling to the script (exit non-zero on failure, which would break the build), or you could add validation in Go (check that required fields are non-empty before responding). I'd probably do both—fail the build if git info can't be gathered, and log a warning at runtime if embedded data looks wrong.

Summary

The implementation is on the right track. Using go:generate and go:embed is the correct approach, and the structure matches the production API. The main blockers are:

  1. Make the four TODO fields blank (id, name, service dates)
  2. Fix the git dirty logic (it's backwards)
  3. Fix shell script portability (macOS compatibility)
  4. Add robust parsing (handle missing/malformed data)

Once those are addressed, the rest is polish. The minor issues won't break anything, but fixing them will make the code more maintainable and idiomatic.

Let me know if you have questions about any of this. Happy to discuss the service date calculation or help with testing strategies if needed.

@vighnesh-sawant
Copy link
Author

Hey i fixed some issues, have not yet fixed all yet! due to time constraints at college.
I'll tag you once It's ready.

@vighnesh-sawant
Copy link
Author

@aaronbrethorst
6. Will using hostname and whoami work on CI environments too? Since different CI environments have different env variables supporting all of them wont be possible.
Could you please confirm what would be best for
git.build.host (hostname?)
git.build.user.email(${USER}@$(hostname))
git.build.user.name(whoami?)

  1. For this i have used a different approach, is this approach fine?

10,11,13 is something i am still working on!

@vighnesh-sawant
Copy link
Author

vighnesh-sawant commented Oct 15, 2025

13->
Is failing the build a good idea?
I was thinking of leaving the fields empty and logging a warning if something unexpected happens.
Please let me know what will be better!
Should having git installed be necessary for building the project?

@aaronbrethorst
Copy link
Member

The Java app is using git-commit-id-plugin for this purpose, so I'd suggest doing some research into how it handles these problems. If the information isn't available in CI, then I'd suggest leaving it ""

@aaronbrethorst
Copy link
Member

Hi @vighnesh-sawant are there still more changes to do here?

@vighnesh-sawant
Copy link
Author

Tests and making the script handle failures is left, I will try to get it implemented soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants