test(refactor): add withToxiproxy* in withTools #4833
mkleczek wants to merge 1 commit into PostgREST:main
Conversation
wolfgangwalther left a comment:
Just some comments about the code while I am reading through and making myself familiar with toxiproxy.
I have not reached a conclusion about the higher-level design questions yet, so don't take my current review as an approval of the general approach yet :). I will need to get familiar with the tool, the follow-up PRs and the bigger picture first.
```nix
proxy_port=''$(${randomPort})

${withToxiproxyServer} ${withToxiproxyProxy} -l "$TCP_PGHOST:$proxy_port" -u "$TCP_PGHOST:$PGPORT" \
  env "TOXI_PGPORT=$proxy_port" "$_arg_command" "''${_arg_leftovers[@]}"
```
I think a better user interface would be one where PGPORT is replaced with the proxy port, so that downstream consumers don't need to know whether to connect to TOXI_PGPORT or to PGPORT.
The addition of postgrest-with-toxiproxy-xxx should ideally be transparent.
WDYT?
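A minimal sketch of that suggested interface, assuming nothing about the actual withTools code (the fixed port numbers and the name UNPROXIED_PGPORT below are placeholders for illustration): the wrapper would re-export PGPORT as the proxy's listen port, so downstream consumers keep connecting to $PGPORT without knowing a proxy exists.

```shell
# Hypothetical sketch only - not the real withTools implementation.
PGPORT=5432                 # stand-in for the port the db server listens on
proxy_port=54321            # would come from ${randomPort} in withTools

UNPROXIED_PGHOST_PORT_DOC="the proxy forwards here"
UNPROXIED_PGPORT="$PGPORT"  # placeholder name: the real upstream port
export PGPORT="$proxy_port" # downstream consumers only ever see this

echo "proxy listens on $PGPORT, forwards to $UNPROXIED_PGPORT"
```

With this shape, a test suite started under the wrapper needs no TOXI_PGPORT at all; toxiproxy would listen on $PGPORT and forward to the saved upstream port.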
Good point - if we don't need unix socket access then I agree.
```nix
UNIX_PGHOST="$PGHOST"
export TCP_PGHOST="localhost"
```
I'm not sure whether we still need to keep unix sockets here at all. We originally used them to prevent port-in-use errors during tests. Now we need to add the complexity of finding the right port anyway, so there is no point in keeping the additional complexity of supporting both unix sockets and TCP.
I'd agree. The only thing is load tests and the performance impact (if any) of running them with TCP.
Why don't we create another dedicated test suite like postgrest-test-recovery (suggested before here) only focused on TCP? That way we don't have to modify previous test infra?
> Why don't we create another dedicated test suite like postgrest-test-recovery (suggested before here) only focused on TCP?

Fine for me.
In #4860 I've extracted a (private) library test-utils that should be shared across different test suites and will make creating additional test suites easier.
But I don't think it is relevant to this PR really (see below).
> That way we don't have to modify previous test infra?

This PR does not really change anything in the existing infra (except starting the db server on TCP in addition to the unix socket, which required very small adjustments in tests - adding an explicit PGPORT).
> except starting the db server on TCP in addition to unix socket, which required very small adjustments in tests - adding explicit PGPORT

That looks like a change, unless it was gated by a bool env var like:

postgrest/nix/tools/withTools.nix (line 33 in 08ec8d1)
> Why don't we create another dedicated test suite like postgrest-test-recovery (suggested before here) only focused on TCP?
Essentially, I conclude the same in #4868. I would not call it "recovery", see my reasoning in that issue. The key thing is: I don't want to run TCP connections on the host system, because that just leads to port conflicts down the road - we've always had these eventually.
> In #4860 I've extracted a (private) library test-utils that should be shared across different test suites and will make creating additional test suites easier.
> But I don't think it is relevant to this PR really (see below).
See that same issue above why I think what we're currently doing in the observability test suite is fundamentally wrong - and as such that PR is, too. We can't really use haskell code to "inspect" PostgREST, because we'd always only test the library, not the executable in its entirety. Which is fine for some things, but surely not for listener related things.
> See that same issue above why I think what we're currently doing in the observability test suite is fundamentally wrong - and as such that PR is, too. We can't really use haskell code to "inspect" PostgREST, because we'd always only test the library, not the executable in its entirety. Which is fine for some things, but surely not for listener related things.
That's obviously not true, see for example https://github.com/mkleczek/postgrest/blob/9af29496ad943381233d2752ca8f6dd6c10487c4/test/observability/Observation/ToxiSpec.hs#L43
I'm not sure what the link is supposed to tell me, but it's certainly not a counterpoint to my argument. Just because you are testing the listener there doesn't invalidate my point, which is:
- We will very likely want to split library from executable.
- Listener should be in the executable (even if it was technically implemented as a cabal library, that's not the point)
- You can't test the full executable by referencing it as a library - and we certainly are not doing so right now, because the observability tests are setting the config via Haskell, for example.
- Once you run the executable, you can't use haskell code to inspect it anymore.
Your tests are not running the executable, they are using the still-not-split postgrest as a library.
> I'm not sure what the link is supposed to tell me, but it's certainly not a counterpoint to my argument. Just because you are testing the listener there doesn't invalidate my point, which is:
>
> - We will very likely want to split library from executable.

Then it is even more important to test the library, isn't it?

> - Listener should be in the executable (even if it was technically implemented as a cabal library, that's not the point)

I am not sure I understand that. The listener is both a library and in the executable.

> - You can't test the full executable by referencing it as a library - and we certainly are not doing so right now, because the observability tests are setting the config via Haskell, for example.
> - Once you run the executable, you can't use haskell code to inspect it anymore.
>
> Your tests are not running the executable, they are using the still-not-split postgrest as a library.
I know that, but I don't understand why it would invalidate the tests. There are various kinds of tests necessary to ensure high quality - unit tests, white box tests, black box tests - all with different trade-offs.
You could use the same argument for postgrest-test-spec - those tests don't cover the whole executable either. That does not make them invalid or not useful.
And again - there are good arguments for writing tests in Haskell - that already was discussed in #4671.
I think this discussion is probably better off in #4868, because many of the points you are addressing here, I already did there.
This commit introduces toxiproxy in withTools, adding the following helpers:
* withToxiproxyServer
* withToxiproxyProxy
* withToxiproxyPgProxy
I have thought about this a bit more now, see #4868. I don't think

That's what these helpers are for, to provide different conditions. Not to provide an environment.

Toxiproxy is different, because we need to provide an environment with toxiproxy available - and that always needs to be the same if the tests make use of it. So we need to solve this on the "environment" level - and first discuss how we'd like to set up our environment in general. I mentioned virtual machines / NixOS VM tests before, you mentioned containers. I wrote some things up in #4868, but they don't really touch on the details, yet.

Some rough brainstorming for which kinds of things we'll want to do with the environment:
Still have to think more about this, but at this stage I'm confident we'll not want
Sidestepping a little the discussion about any future directions: IMHO you are going into the territory of making "better" the enemy of the good. The point is that we don't have any infrastructure to test recovery, and just writing a simple "showcase" test allowed us to spot #4681.

I am not sure if what you oppose is

I also don't quite understand what you are saying about
The first.
I am still working on that part.
You can do

What alternatives do you have to

If we were using such a
Of course these scripts provide some kind of environment - they need to, by the nature of how they work. But they are really bad at providing an isolated environment, which is really what we need for any of these.
(there is one other option of when the with helpers are useful, which I did not consider before: if we're reusing that part of code for multiple test-suites - however, I don't see us using toxiproxy across multiple test-suites)
But it is!
The discussion whether
I agree here and that was my idea - not sure if they should be added now though. OTOH you also need to have the possibility to manipulate
Yeah, I had assumed that to be the basic pre-requisite to make toxiproxy useful. That's what I mean by not being transparent to the test. Once you make use of toxiproxy in a test-suite, it becomes a requirement.
Then maybe I don't understand how these are supposed to be used, yet. My understanding was that these conditions are to be managed by each test case, not by the environment for the whole test suite. Isn't that the great advantage of toxiproxy over something like socat?

Once they are managed by each test case, we will very quickly come up with the requirement to isolate proxies for each test case, so that they can still run in parallel without affecting each other. At that point, a
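To illustrate the isolation requirement (the naming scheme below is a made-up example, not anything this PR provides): each test case would get its own proxy, so toxics injected by one case cannot leak into another case running in parallel.

```shell
# Made-up sketch: derive a per-test-case toxiproxy proxy name, so every
# case manipulates only its own proxy and parallel runs stay independent.
proxy_name_for() {
  # $1: a test case identifier
  printf 'pg-%s\n' "$1"
}

proxy_name_for listener-reconnect
proxy_name_for pool-timeout
```

Each such proxy would also need its own randomly chosen listen port, which is exactly the port-allocation complexity discussed above.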
So really - there are a couple of important but separate things to consider:
To be honest, this PR was started when

I get where you are coming from and I am happy to restrict the scope of this PR to starting Pg on a TCP socket and providing
I disagree on that part. The

Yes, we would certainly want to test the effects of pipelining on performance. But not with our loadtests.
This is the most critical piece for me right now. As discussed earlier, I'm not convinced that we'll want to

Once we solve this piece and know where we want to go, we can start providing the right tools for that test framework to work well with toxiproxy. Of course this interacts with how we provide what I call the "environment", i.e. various tools all running at the same time and us being able to manage them well.

Since I'm not so convinced about using containers, I am thinking primarily about how to set these environments up in a NixOS VM test - and then do one of:
Those are the two things I am working on right now.
FWIW, on
I am not sure what your idea is, but Toxiproxy is a server that you can manipulate at runtime via a REST API. So you can either use your language's REST client to inject "toxics" or you can use
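For example, a sketch of injecting a latency toxic over that REST API (the proxy name "pg" and toxiproxy's default API port 8474 are assumptions for illustration, not part of this PR):

```shell
# Sketch: build the JSON body for a toxiproxy "latency" toxic, which
# delays traffic through the proxy by a given number of milliseconds.
latency_toxic() {
  # $1: latency in milliseconds, applied to downstream traffic
  printf '{"name":"lag","type":"latency","stream":"downstream","attributes":{"latency":%d}}\n' "$1"
}

latency_toxic 500
# A test case would then POST it to the proxy and clean up afterwards:
#   curl -s -X POST http://localhost:8474/proxies/pg/toxics -d "$(latency_toxic 500)"
#   curl -s -X DELETE http://localhost:8474/proxies/pg/toxics/lag
```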