feat: Add support to send OTEL traces via OTLP#4899
feat: Add support to send OTEL traces via OTLP#4899jamescrosswell wants to merge 7 commits intomainfrom
Conversation
Semver Impact of This PR⚪ None (no version bump detected) 📋 Changelog PreviewThis is how your changes will appear in the changelog. Features ✨
Dependencies ⬆️
Other
🤖 This preview updates automatically when you update the PR. |
Instructions and example for changelogPlease add an entry to Example: ## Unreleased
### Features
- Add support to send OTEL traces via OTLP ([#4899](https://github.com/getsentry/sentry-dotnet/pull/4899))If none of the above apply, you can opt out of this check by adding |
| // Finally we configure OpenTelemetry to send traces to Sentry | ||
| .AddSentry() | ||
| // Finally, we configure OpenTelemetry to send traces to Sentry | ||
| .AddSentry(dsn) |
There was a problem hiding this comment.
This is unfortunately a bit more complex for SDK users to initialise because the TracerProviderBuilder.AddOtlpExporter method has no override with an IServiceProvider parameter (so there's no way to read the DSN from the sentry options that get added to the service registry later on).
| <PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.8.1" /> | ||
| <PackageReference Include="OpenTelemetry.Instrumentation.Http" Version="1.8.1" /> | ||
| <PackageReference Include="OpenTelemetry.Instrumentation.Runtime" Version="1.8.0" /> | ||
| <PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.8.1" /> |
There was a problem hiding this comment.
Without adding an explicit reference to this package, we get an error because the other packages try to load types from a different assembly version. It's the workaround for this problem basically.
| /// </para> | ||
| /// </param> | ||
| /// <returns>The supplied <see cref="TracerProviderBuilder"/> for chaining.</returns> | ||
| public TracerProviderBuilder AddSentry(TextMapPropagator? defaultTextMapPropagator = null) |
There was a problem hiding this comment.
We could mark this overload obsolete if we wanted to encourage SDK users to move to the OTLP version of the integration... then remove it in a future major release.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4899 +/- ##
==========================================
- Coverage 73.87% 73.69% -0.18%
==========================================
Files 494 495 +1
Lines 17868 17919 +51
Branches 3509 3523 +14
==========================================
+ Hits 13200 13206 +6
- Misses 3808 3851 +43
- Partials 860 862 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Odd that we're still seeing this after having moved to conventional commits. It doesn't look like the danger workflow is aware of our release.yml file or accounts for this in it's logic. @Flash0ver any ideas what we might be doing wrong? |
I'm not sure how to remove the Danger Changelog Missing automation off the top of my head ... I'll look into it tomorrow. |
src/Sentry/Internal/Hub.cs
Outdated
| return NoOpTransaction.Instance; | ||
| } | ||
|
|
||
| if (_options.Instrumenter == Instrumenter.OpenTelemetry) |
There was a problem hiding this comment.
This would be a pretty major change...
- It would break the
Sentry.Extensions.AIintegration for a start (we'd need to provide aSentry.Extensions.AI.OpenTelemetryintegration instead). - It may break lots of user code that assumes we support mixed instrumentation in a single process (which we did up until now)
Potentially we leave this commented out as a todo for v7.0.0 and we just strongly advise SDK users not to use mixed instrumentation until then as we can't guarantee it will work as expected.
@Flash0ver @dingsdax thoughts?
| <ItemGroup> | ||
| <!-- Version 1.6.0 is the minimum version that does not have trim warnings --> | ||
| <PackageReference Include="OpenTelemetry" Version="1.6.0" /> | ||
| <PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.6.0" /> |
There was a problem hiding this comment.
next major: consider bumping to >= 1.11.0, as the dependency graph looks cleaner
| /// </summary> | ||
| /// <exception cref="ArgumentOutOfRangeException">Can be thrown when using OpenTelemetry instrumentation and | ||
| /// System.Diagnostics.Activity.Current is null. This method should not be used when instrumenting with OTEL. | ||
| /// </exception> |
There was a problem hiding this comment.
This is problematic... the alternative would be to drop the guard here:
sentry-dotnet/src/Sentry/DynamicSamplingContext.cs
Lines 34 to 38 in 6f8312d
...or to replace it with a log rather than throwing an exception. However then I think we risk sending invalid envelopes to Sentry and losing events if Sentry drops these.
I think this is the correct approach. The only code that we have which was using this was in the SentryMessageHandler implementations.
There is one small additional risk because it's also used by the GetBaggage method, which is part of the public SDK. I added some code to log a warning if that method is called when OTEL instrumentation is being used:
sentry-dotnet/src/Sentry/Internal/Hub.cs
Lines 313 to 328 in 6f8312d
And I added some info about the possible exception to the public XML docs for that method:
sentry-dotnet/src/Sentry/IHub.cs
Lines 73 to 75 in 6f8312d
It's not great, but I can't think of any better ways to handle it. @Flash0ver thoughts?
There was a problem hiding this comment.
Hmmm .... not sure if ArgumentOutOfRangeException is the correct exception here, as it's usually used when numeric values are greater than or less than expected ... maybe just ArgumentException ... but I'm not really answering the question, aren't I?!
Another thought that I'm having is that we're quite inconsistent with Argument validation:
At some locations we throw, e.g. ArgumentNullException, at other places we don't throw but only log via the DiagnosticLogger if enabled. But again, no solution to the problem.
Perhaps we could solve this issue with an Analyzer.
Throw at run-time, and warn at build-time for invocations of this method when Sentry.OpenTelemetry is referenced. This may cause some false positives in cases where Sentry.OpenTelemetry is referenced, but actually not initialized and the Instrumenter is not Instrumenter.OpenTelemetry at runtime. But it would be a bit of a "best effort" solution. At least there shouldn't be false negatives ... unless I am missing something.
(thought: it seems that every time I'm out of ideas, I suggest some Compiler Extension to fix it 😉)
There was a problem hiding this comment.
Follow-up thought:
I suppose the best solution would be to make this method actually not invocable when the is Instrumenter.OpenTelemetry. Well, if the option is not set at runtime, we can't just remove this method at build time.
Just for some brain-storming: Maybe we could re-direct the invocation - via an Interceptor - to another method if Sentry.OpenTelemetry is referenced ... but this solution has the same issue as the Analyzer from above: at build-time, we only know whether Sentry.OpenTelemetry is referenced, not if OpenTelemetry is actually initialized .. well, we could scan for the respective AddSentry/UseSentry invocation, but what if there are more than one Hubs ... we can't really draw the connection there unambiguously.
There was a problem hiding this comment.
we're quite inconsistent with Argument validation:
At some locations we throw, e.g.ArgumentNullException, at other places we don't throw but only log via theDiagnosticLoggerif enabled. But again, no solution to the problem.
I don't really want to throw an exception here... it's not by design. However:
- I can't change the signature for the public API (it has a public
GetBaggagemethod that returns a non-nullable type) - The only other option would be to suppress the error and return an empty
SentryIdrepresenting theTraceId... but then we risk Sentry dropping events.
Maybe, if the GetBaggage() method is called when OTEL tracing is enabled, we log a warning and simply return an emply baggage header? That way we avoid the possibility of our code erroring out (we never call this method if OTEL tracing is enabled... we can but an assert in there to ensure this) but we do make it really clear the method shouldn't be used with OTEL tracing?
There was a problem hiding this comment.
Hmmm .... not sure if
ArgumentOutOfRangeExceptionis the correct exception here, as it's usually used when numeric values are greater than or less than expected ... maybe justArgumentException... but I'm not really answering the question, aren't I?!
We could change that... I think the traceId is essentially a GUID and you get an out of range if this is empty (i.e. all zeros)... if you squint, a guid is kind of a number but you have to squint 😜
Resolves: #4859