To ensure people are actually solving the problem rather than abusing the visibility of challenge test cases to write code that only works for the tests, there need to be secret test cases that are run upon submission.
There are different ways secret test cases could work:
- Enumerated – like public test cases, secret test cases would be specified explicitly in the challenge file, making them effectively transparent for anyone reading them.
- Generative – Challenges would somehow describe the format of their input and based on this, valid inputs could be generated. With these generated inputs, submissions could be tested in one of the following ways:
  - if we have a known solution available, we could compare the output of this solution with the output of the submission
  - if we have a validator (a function taking an (input, output) pair and deciding if output is a correct solution for input), we could do property-based testing
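As a rough illustration of the generative approach with a known solution, here is a minimal Python sketch. Everything in it is an assumption made for the example: the challenge is taken to be "sort a list of integers", and `generate_inputs`, `reference_solution`, `hardcoded_submission` and `failing_inputs` are hypothetical names, not part of any existing code.

```python
import random
from typing import Callable, List

IntList = List[int]


def generate_inputs(count: int, max_len: int = 20, seed: int = 0) -> List[IntList]:
    """Generate `count` random integer lists.

    Stands in for a generator driven by the challenge's input format
    description (assumed here: a list of up to `max_len` integers).
    """
    rng = random.Random(seed)  # seeded so failures can be reproduced
    return [
        [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]
        for _ in range(count)
    ]


def reference_solution(xs: IntList) -> IntList:
    """Known-good solution for the assumed 'sort a list' challenge."""
    return sorted(xs)


def hardcoded_submission(xs: IntList) -> IntList:
    """A submission that only handles a single (assumed) public test case."""
    if xs == [3, 1, 2]:
        return [1, 2, 3]
    return xs


def failing_inputs(
    submission: Callable[[IntList], IntList],
    reference: Callable[[IntList], IntList],
    inputs: List[IntList],
) -> List[IntList]:
    """Return every generated input on which the two outputs differ."""
    # Each function gets its own copy so in-place mutation cannot leak.
    return [x for x in inputs if submission(list(x)) != reference(list(x))]


secret_inputs = generate_inputs(50)
print(failing_inputs(sorted, reference_solution, secret_inputs))  # []
print(len(failing_inputs(hardcoded_submission, reference_solution, secret_inputs)) > 0)
```

A submission written only against the public test case passes that case but is caught as soon as a generated input is not already sorted.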
It would probably be most beneficial to have generative tests for our secret tests, as this would reduce the burden on challenge contributors and enable us to be more confident that a solution doesn't just work for the explicit test cases. As a "downside", we would have to add the requirement to specify the input format – however, this description could be used in other contexts as well (see #2).
Whether we should use validators (possibly a collection of validators per challenge, each testing a single property) or a complete solution is a separate discussion. Specifying the relationship between inputs and outputs through validators may be easier, more obvious and less error-prone than writing a complete solution. It is also closer to how we typically unit test actual code.
On the other hand, full-blown solutions are very straightforward to integrate: we could send them to the execution service in the same way as all other submissions.
For validators/property definitions, it would likely make the most sense to write them in some embeddable language that we can call directly from our backend code.
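To make the validator idea more concrete, the sketch below shows what a collection of single-property validators might look like for an assumed "sort a list of integers" challenge. Python is used only for illustration and is not a suggestion for the embeddable language; `is_ordered`, `is_permutation`, `VALIDATORS` and `violated_properties` are hypothetical names.

```python
from collections import Counter
from typing import Callable, List

IntList = List[int]
Validator = Callable[[IntList, IntList], bool]


def is_ordered(inp: IntList, out: IntList) -> bool:
    """Property 1: the output is in non-decreasing order."""
    return all(a <= b for a, b in zip(out, out[1:]))


def is_permutation(inp: IntList, out: IntList) -> bool:
    """Property 2: the output contains exactly the input's elements."""
    return Counter(inp) == Counter(out)


# Each validator checks one property of an (input, output) pair.
VALIDATORS: List[Validator] = [is_ordered, is_permutation]


def violated_properties(inp: IntList, out: IntList) -> List[str]:
    """Return the names of all properties the (input, output) pair violates."""
    return [v.__name__ for v in VALIDATORS if not v(inp, out)]


print(violated_properties([3, 1, 2], [1, 2, 3]))  # []
print(violated_properties([3, 1, 2], [1, 2]))     # ['is_permutation']
print(violated_properties([3, 1, 2], [3, 1, 2]))  # ['is_ordered']
```

Keeping one property per validator means a failed submission can be told which property it broke, rather than only that its output was wrong.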