
Added compute script, removed unused scripts, cleaned up script formatting #10

Open

lintyfresh wants to merge 5 commits into main from feature/compute

Conversation

@lintyfresh (Collaborator)

Added compute script, removed unused scripts, cleaned up script formatting, updated README

@natalya-patrikeeva (Member) left a comment

Now that the scripts are correct, I made a few code suggestions to make them more elegant and performant (avoid loops, use vectorized ops, document your functions), but overall it looks great!


# what name should the investigate_df be saved under? change each time you
# run this script to avoid overriding
investigate_json = "investigate_2"
Member

A better way might be to check if the file exists, then remove it and save with the same name. We can probably use one log file and append to it. Or are you saying you want to keep several copies, like investigate_1 and investigate_2?
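
For example, a minimal sketch of the single-log-file idea (the function name, log path handling, and record shape here are placeholders, not the script's actual names):

import json
import os


def append_results(new_records: list[dict], log_path: str) -> None:
    # Append this run's records to one shared JSON log instead of
    # creating investigate_1, investigate_2, ... files.
    if os.path.exists(log_path):
        with open(log_path, "r") as f:
            records = json.load(f)
    else:
        records = []
    records.extend(new_records)
    with open(log_path, "w") as f:
        json.dump(records, f, indent=2)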

# run this script to avoid overriding
investigate_json = "investigate_2"

# what name should the investigate_df be saved under? change each time you
Member

Not sure about this comment and the metrics_2 file below. Is there a metrics_1?


# what name should the investigate_df be saved under? change each time you
# run this script to avoid overriding
metrics_json = "metrics_2"
Member

Same idea as above: let's not make new files unless we can't avoid it.



def sort_outputs(s):
# opens all JSON files within a file path
Member

If we want to keep all results in one JSON, we could have a key for errors and filter on that later instead of having two files.
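
Roughly what I have in mind (just a sketch; the field names are made up for illustration):

# Each record carries an "error" field instead of being split across two files.
results = [
    {"image": "img_001", "benchmark_name": "counting", "output": "4", "error": None},
    {"image": "img_002", "benchmark_name": "counting", "output": None, "error": "parse failure"},
]

# Filter later instead of maintaining separate success/error files.
clean = [r for r in results if r["error"] is None]
failed = [r for r in results if r["error"] is not None]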

# Functions


def sort_outputs(s):
Member

I would pick a more meaningful argument name than "s".

Member

Let's also add docstrings to all functions.

Member

Typing can also be helpful!
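
Putting these suggestions together, sort_outputs could look something like this (a sketch only; I'm guessing that it takes a directory path and returns the parsed JSON records):

import json
from pathlib import Path


def sort_outputs(output_dir: str) -> list[dict]:
    """Open all JSON files within a directory and return their parsed contents.

    Parameters
    ----------
    output_dir : str
        Path to the directory containing the model output JSON files.

    Returns
    -------
    list[dict]
        One parsed record per JSON file, in sorted filename order.
    """
    records = []
    for json_path in sorted(Path(output_dir).glob("*.json")):
        with open(json_path, "r") as f:
            records.append(json.load(f))
    return records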



def lookup_truth(row):
# looks up ground truth for a given row based on the benchmark name and
Member

Docstring here, please.




def compute_accuracy(row):
# compares accuracy of llm output to ground_truth based on benchmark_name
Member

Add a docstring.

Member

I don't love the if/elses and the repeated return 1 and return 0. This can be rewritten with less code using helper functions.
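
For example, each comparison can return its result directly and the branching can collapse into a lookup (a sketch; the benchmark names and column names below are placeholders, not the real ones):

def compute_accuracy(row) -> int:
    """Return 1 if the LLM output matches the ground truth for this row, else 0."""
    # Hypothetical benchmark names, used only for illustration: map each
    # benchmark to its comparison instead of chaining if/elses that each
    # end in `return 1` or `return 0`.
    checks = {
        "counting": lambda r: int(r["output"]) == int(r["ground_truth"]),
        "caption_match": lambda r: str(r["output"]).strip().lower()
        == str(r["ground_truth"]).strip().lower(),
    }
    check = checks.get(row["benchmark_name"])
    return int(check(row)) if check is not None else 0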


with open("/zfs/projects/students/ltdarc-usf-intern-2025/LLM_benchmarks/inputs/ground_truth.json", "r") as f:
    truth = json.load(f)
image_index = list(truth.keys())
for image in image_index:
Member

I think we can load all images at once, avoiding the loop.
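
For instance, something like this builds the whole lookup in one step (a sketch; it assumes each value in ground_truth.json is a dict of fields per image):

import json

import pandas as pd

with open("/zfs/projects/students/ltdarc-usf-intern-2025/LLM_benchmarks/inputs/ground_truth.json", "r") as f:
    truth = json.load(f)

# One DataFrame from the whole dict at once, no per-image loop.
truth_df = (
    pd.DataFrame.from_dict(truth, orient="index")
    .rename_axis("image")
    .reset_index()
)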


    return full_investigate_df


def lookup_truth(row):
Member

I wonder if this function should take in the df, not a row, and run on the whole df rather than row by row.
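
For example, a whole-df version could map the ground truth onto every row at once (a sketch; the "image" and "ground_truth" column names are assumptions):

import pandas as pd


def lookup_truth(df: pd.DataFrame, truth: dict) -> pd.DataFrame:
    """Attach ground truth to all rows at once, keyed on the image name."""
    # .map does the per-row lookup inside pandas, so there is no explicit
    # Python loop or row-wise apply.
    out = df.copy()
    out["ground_truth"] = out["image"].map(truth)
    return out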



full_investigate_df.to_json(
    f"/zfs/projects/students/ltdarc-usf-intern-2025/LLM_benchmarks/outputs/metrics/{investigate_json}.json",
    orient='records')
Member

Does this overwrite each run?
