This anti-pattern was formerly known as [Agent obsession](https://github.com/lucasvegi/Elixir-Code-Smells/tree/main#agent-obsession).

## Sending unnecessary data

#### Problem

Sending a message to a process can be an expensive operation if the message is big enough. That's because the message is fully copied to the receiving process, which may be CPU and memory intensive. This is due to Erlang's "share nothing" architecture, where each process has its own memory, which simplifies and speeds up garbage collection.

The copying is more obvious when using `send/2`, `GenServer.call/3`, or the initial data in `GenServer.start_link/3`. Notably, it also happens when using `spawn/1`, `Task.async/1`, `Task.async_stream/3`, and so on. It is more subtle there because the anonymous function passed to these functions captures the variables it references, and all captured variables are copied over. As a result, you can accidentally send far more data to a process than it actually needs.
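To make the capture behavior concrete, here is a minimal sketch that inspects what a closure carries with it. The module `CaptureSketch`, its functions, and the map standing in for a connection are all hypothetical names chosen for illustration. The sketch relies on `:erlang.fun_info/2` with the `:env` item, which returns the free variables of a fun, and on the debugging helper `:erts_debug.flat_size/1` to estimate term size in words.

```elixir
defmodule CaptureSketch do
  # Hypothetical stand-in for a connection: one small field next to a large one.
  # A list is used for the large field because lists are copied between
  # processes, whereas large binaries are reference-counted and shared.
  def fake_conn do
    %{remote_ip: {127, 0, 0, 1}, body: Enum.to_list(1..100_000)}
  end

  # The closure references `conn`, so the whole map is captured.
  def capture_whole(conn), do: fn -> conn.remote_ip end

  # Only `ip` is referenced inside the closure, so only the tuple is captured.
  def capture_field(conn) do
    ip = conn.remote_ip
    fn -> ip end
  end

  # Returns the list of terms captured by a closure defined in a compiled module.
  def captured(fun) do
    {:env, env} = :erlang.fun_info(fun, :env)
    env
  end
end

conn = CaptureSketch.fake_conn()
whole_env = CaptureSketch.captured(CaptureSketch.capture_whole(conn))
field_env = CaptureSketch.captured(CaptureSketch.capture_field(conn))

# Size in words of what each closure would drag along when sent to a process.
IO.inspect(:erts_debug.flat_size(whole_env), label: "whole conn captured")
IO.inspect(:erts_debug.flat_size(field_env), label: "only ip captured")
```

Both closures return the same value when called, but the first one carries the entire map in its environment, and that environment is what gets copied when the closure is handed to `spawn/1` or `Task.async/1`.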

#### Example

Imagine you were to implement some simple reporting of IP addresses that made requests against your application. You want to do this asynchronously and not block processing, so you decide to use `spawn/1`. It may seem like a good idea to hand over the whole connection because you might need more data later. However, passing the connection results in copying a lot of unnecessary data, such as the request body, params, and so on.

```elixir
# log_request_ip sends the IP to some external service
spawn(fn -> log_request_ip(conn) end)
```

This problem also occurs when accessing only the relevant parts:

```elixir
spawn(fn -> log_request_ip(conn.remote_ip) end)
```

This will still copy over all of `conn`, because the `conn` variable is being captured inside the spawned function. The function then extracts the `remote_ip` field, but only after the whole `conn` has been copied over.

`send/2` and the `GenServer` APIs also rely on message passing. In the example below, the `conn` is once again copied to the underlying `GenServer`:

```elixir
GenServer.cast(pid, {:report_ip_address, conn})
```

#### Refactoring

This anti-pattern has many potential remedies:

* Limit the data you send to the absolute minimum necessary instead of sending an entire struct. For example, don't send an entire `conn` struct if all you need is a couple of fields.
* If the only process that needs data is the one you are sending to, consider making the process fetch that data instead of passing it.
* Some abstractions, such as [`:persistent_term`](https://www.erlang.org/doc/man/persistent_term.html), allow you to share data between processes, as long as such data changes infrequently.

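As a rough illustration of the last remedy, the sketch below stores a rarely changing value with `:persistent_term.put/2` and reads it inside a task with `:persistent_term.get/1`. The key `{:my_app, :blocked_ips}` and the stored set are hypothetical and exist only for this example.

```elixir
# Hypothetical key and value, chosen only for illustration.
key = {:my_app, :blocked_ips}
:persistent_term.put(key, MapSet.new([{10, 0, 0, 1}]))

task =
  Task.async(fn ->
    # The closure captures only the small `key` tuple. The stored term is
    # looked up inside the new process instead of being captured and copied.
    blocked = :persistent_term.get(key)
    MapSet.member?(blocked, {10, 0, 0, 1})
  end)

result = Task.await(task)
IO.inspect(result, label: "blocked?")
```

This only fits data that is written rarely, since updating or erasing a persistent term is considerably more expensive than reading it.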
In our case, limiting the input data is a reasonable strategy. If all we need *right now* is the IP address, then let's work only with that and make sure that only the IP address is captured by the closure.
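A minimal sketch of that refactoring follows. The `IpReporter` module, its `log_request_ip/1` stub, and the plain map standing in for a connection are assumptions made so the snippet runs on its own. The part that matters is binding the field to a variable before building the closure.

```elixir
defmodule IpReporter do
  # Hypothetical stand-in for the external reporting call from the example.
  def log_request_ip(ip), do: IO.inspect(ip, label: "request ip")

  def report_async(conn) do
    # Read the field in the calling process, so the closure below
    # captures only `ip_address` and not the whole `conn`.
    ip_address = conn.remote_ip
    spawn(fn -> log_request_ip(ip_address) end)
  end
end

# A plain map stands in for a real connection struct (assumption).
conn = %{remote_ip: {127, 0, 0, 1}, params: %{}, body: Enum.to_list(1..10_000)}
pid = IpReporter.report_async(conn)
```

For the `GenServer` variant, the equivalent change would be `GenServer.cast(pid, {:report_ip_address, conn.remote_ip})`. The field access is evaluated in the caller before the message is built, so only the IP address is copied.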