Bug Report
Describe the bug
If you collect all systemd journal logs and write them to CloudWatch Logs, some UNIT values contain backslashes, because systemd escapes certain characters in unit names (for example "-" becomes \x2d). Records for those units then fail to be written to CloudWatch.
For example, this is an actual journal entry from an affected host:
"Starting systemd-fsck@dev-disk-by\x2dlabel-BOOT.service - File System Check on /dev/disk/by-label/BOOT..."
With dynamic log stream creation (log_stream_prefix), the fluent-bit pod logs then show the following error:
"[2026/06/08 15:27:19.955] [ info] [output:cloudwatch_logs:cloudwatch_logs.3] Creating log stream simon-a-database-dev-001-journal.systemd-fsck@dev-disk-by\x2dlabel-BOOT.service in log group /simon-dev/journal
[2026/06/08 15:27:19.957] [error] [output:cloudwatch_logs:cloudwatch_logs.3] CreateLogStream API responded with error='SerializationException'"
To Reproduce
Deploy the fluent-bit Helm chart using the following values for the INPUT and OUTPUT sections:
#
# Full journald logs
#
[INPUT]
    Name systemd
    Tag journal.*
    read_from_tail false
    max_entries 200
    Strip_Underscores On
    DB /var/fluent-bit/state/flb_journal.db
    # Fixed: Protects RAM from generic operating system log spikes
    storage.type filesystem
#
# journal logs
#
[OUTPUT]
    Name cloudwatch_logs
    Match journal.*
    region eu-west-2
    log_group_name /simon-dev/journal
    log_stream_prefix ${NODE_NAME}-
    auto_create_group true
    log_retention_days 1
    Retry_Limit False
    Workers 2
NODE_NAME is defined by the following environment setting:
env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
- Steps to reproduce the problem:
Deploy fluent-bit v5.0.6 using helm and ensure the inputs/outputs are specified in your values file as above.
Expected behavior
Writing logs to CloudWatch Logs should not fail. Any characters that cannot be used as-is in a log stream name should be escaped or corrected before the request is sent.
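One reading of "corrected" is to rewrite the characters that CloudWatch itself rejects in stream names. As far as can be told from the CloudWatch Logs API documentation, those are ':' and '*'; treat that list as an assumption to verify. The backslash is not among them, which is why the patch further down escapes it for JSON rather than replacing it. The sketch below shows the replacement approach only; sanitize_stream_name is a hypothetical helper, not an existing Fluent Bit function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Fluent Bit).
 *
 * Replaces, in place, every character assumed to be rejected by CloudWatch
 * in a log stream name (':' and '*') with '_'. Backslashes are left alone:
 * they only need JSON escaping when the request body is built.
 *
 * Returns the number of characters that were replaced.
 */
static size_t sanitize_stream_name(char *name)
{
    size_t replaced = 0;
    char *p;

    assert(name != NULL);

    for (p = name; *p != '\0'; p++) {
        if (*p == ':' || *p == '*') {
            *p = '_';
            replaced++;
        }
    }

    return replaced;
}
```

Called on the unit name from this report, the helper changes nothing and returns 0, so replacement alone would not have avoided the SerializationException.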
Screenshots
N/A as the pod errors have been provided.
Your Environment
- Version used: Fluent Bit v5.0.6 from the repository cr.fluentbit.io/fluent/fluent-bit
- Configuration: Kubernetes v1.34.6 on EC2 instances running Ubuntu 24.04
- Environment name and version (e.g. Kubernetes? What version?): Kubernetes v1.34.6
- Server type and version: EC2 instances - Ubuntu 24.04.4 LTS
- Operating System and version: Ubuntu 24.04.4 LTS
- Filters and plugins: systemd input, cloudwatch_logs output (configuration above)
Additional context
We were attempting to choose a generic logging tool that would allow us to capture all logs from our deployments so we can interface with various log analysis tools.
We have patched the code so that writing to CloudWatch works for now, but we are concerned that we may hit the same issue with outputs for other logging solutions.
The patched code is given below:
--- cloudwatch_api.c
+++ cloudwatch_api.c
@@ -201,6 +201,124 @@
     return FLB_TRUE;
 }
+/*
+ * Escape a string so it can be safely embedded inside an AWS JSON request.
+ *
+ * CloudWatch log group/stream names may contain characters that are meaningful
+ * to JSON, such as a backslash. The value sent to CloudWatch must preserve
+ * the original bytes while serialising them as valid JSON. For example, a
+ * stream name containing the four bytes "\x2d" must be sent in JSON text
+ * as "\\x2d".
+ */
+static size_t json_escaped_len(const char *str)
+{
+    size_t i;
+    size_t len = 0;
+    unsigned char c;
+
+    if (str == NULL) {
+        return 0;
+    }
+
+    for (i = 0; str[i] != '\0'; i++) {
+        c = (unsigned char) str[i];
+
+        switch (c) {
+        case '"':
+        case '\\':
+        case '\b':
+        case '\f':
+        case '\n':
+        case '\r':
+        case '\t':
+            len += 2;
+            break;
+        default:
+            if (c < 0x20) {
+                len += 6; /* \u00XX */
+            }
+            else {
+                len++;
+            }
+        }
+    }
+
+    return len;
+}
+
+static char *json_escape_string(const char *str)
+{
+    static const char hex[] = "0123456789abcdef";
+    size_t i;
+    size_t out_size;
+    char *out;
+    char *p;
+    unsigned char c;
+
+    if (str == NULL) {
+        return flb_strdup("");
+    }
+
+    out_size = json_escaped_len(str) + 1;
+    out = flb_malloc(out_size);
+    if (out == NULL) {
+        flb_errno();
+        return NULL;
+    }
+
+    p = out;
+    for (i = 0; str[i] != '\0'; i++) {
+        c = (unsigned char) str[i];
+
+        switch (c) {
+        case '"':
+            *p++ = '\\';
+            *p++ = '"';
+            break;
+        case '\\':
+            *p++ = '\\';
+            *p++ = '\\';
+            break;
+        case '\b':
+            *p++ = '\\';
+            *p++ = 'b';
+            break;
+        case '\f':
+            *p++ = '\\';
+            *p++ = 'f';
+            break;
+        case '\n':
+            *p++ = '\\';
+            *p++ = 'n';
+            break;
+        case '\r':
+            *p++ = '\\';
+            *p++ = 'r';
+            break;
+        case '\t':
+            *p++ = '\\';
+            *p++ = 't';
+            break;
+        default:
+            if (c < 0x20) {
+                *p++ = '\\';
+                *p++ = 'u';
+                *p++ = '0';
+                *p++ = '0';
+                *p++ = hex[(c >> 4) & 0x0f];
+                *p++ = hex[c & 0x0f];
+            }
+            else {
+                *p++ = (char) c;
+            }
+        }
+    }
+
+    *p = '\0';
+
+    return out;
+}
+
 static int entity_add_key_attributes(struct flb_cloudwatch *ctx, struct cw_flush *buf,
                                      struct log_stream *stream, int *offset)
 {
@@ -378,13 +496,22 @@
                             struct log_stream *stream, int *offset)
 {
     int ret;
+    char *escaped_group = NULL;
+    char *escaped_name = NULL;
+
+    escaped_group = json_escape_string(stream->group);
+    escaped_name = json_escape_string(stream->name);
+    if (escaped_group == NULL || escaped_name == NULL) {
+        goto error;
+    }
+
     if (!try_to_write(buf->out_buf, offset, buf->out_buf_size,
                       "{\"logGroupName\":\"", 17)) {
         goto error;
     }
     if (!try_to_write(buf->out_buf, offset, buf->out_buf_size,
-                      stream->group, 0)) {
+                      escaped_group, 0)) {
         goto error;
     }
@@ -394,7 +521,7 @@
     }
     if (!try_to_write(buf->out_buf, offset, buf->out_buf_size,
-                      stream->name, 0)) {
+                      escaped_name, 0)) {
         goto error;
     }
@@ -443,9 +570,14 @@
         goto error;
     }
+    flb_free(escaped_group);
+    flb_free(escaped_name);
+
     return 0;
 error:
+    flb_free(escaped_group);
+    flb_free(escaped_name);
     return -1;
 }
@@ -759,8 +891,8 @@
     buf->event_index = 0;
     buf->data_size = PUT_LOG_EVENTS_HEADER_LEN + PUT_LOG_EVENTS_FOOTER_LEN;
     if (buf->current_stream != NULL) {
-        buf->data_size += strlen(buf->current_stream->name);
-        buf->data_size += strlen(buf->current_stream->group);
+        buf->data_size += json_escaped_len(buf->current_stream->name);
+        buf->data_size += json_escaped_len(buf->current_stream->group);
     }
 }
@@ -1722,18 +1854,26 @@
     struct flb_aws_client *cw_client;
     flb_sds_t body;
     flb_sds_t tmp;
+    char *escaped_group = NULL;
     flb_plg_info(ctx->ins, "Setting retention policy on log group %s to %dd", stream->group, ctx->log_retention_days);
-    body = flb_sds_create_size(68 + strlen(stream->group));
+    escaped_group = json_escape_string(stream->group);
+    if (escaped_group == NULL) {
+        return -1;
+    }
+
+    body = flb_sds_create_size(68 + strlen(escaped_group));
     if (!body) {
+        flb_free(escaped_group);
         flb_sds_destroy(body);
         flb_errno();
         return -1;
     }
-    /* construct CreateLogGroup request body */
-    tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\",\"retentionInDays\":%d}", stream->group, ctx->log_retention_days);
+    /* construct PutRetentionPolicy request body */
+    tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\",\"retentionInDays\":%d}", escaped_group, ctx->log_retention_days);
+    flb_free(escaped_group);
     if (!tmp) {
         flb_sds_destroy(body);
         flb_errno();
@@ -1786,20 +1926,28 @@
     flb_sds_t body;
     flb_sds_t tmp;
     flb_sds_t error;
+    char *escaped_group = NULL;
     int ret;
     flb_plg_info(ctx->ins, "Creating log group %s", stream->group);
+    escaped_group = json_escape_string(stream->group);
+    if (escaped_group == NULL) {
+        return -1;
+    }
+
     /* construct CreateLogGroup request body */
     if (ctx->log_group_class_type == LOG_CLASS_DEFAULT_TYPE) {
-        body = flb_sds_create_size(30 + strlen(stream->group));
+        body = flb_sds_create_size(30 + strlen(escaped_group));
         if (!body) {
+            flb_free(escaped_group);
             flb_sds_destroy(body);
             flb_errno();
             return -1;
         }
-        tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\"}", stream->group);
+        tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\"}", escaped_group);
+        flb_free(escaped_group);
         if (!tmp) {
             flb_sds_destroy(body);
             flb_errno();
@@ -1807,15 +1955,17 @@
         }
         body = tmp;
     } else {
-        body = flb_sds_create_size(37 + strlen(stream->group) + strlen(ctx->log_group_class));
+        body = flb_sds_create_size(37 + strlen(escaped_group) + strlen(ctx->log_group_class));
         if (!body) {
+            flb_free(escaped_group);
             flb_sds_destroy(body);
             flb_errno();
             return -1;
         }
         tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\", \"logGroupClass\":\"%s\"}",
-                             stream->group, ctx->log_group_class);
+                             escaped_group, ctx->log_group_class);
+        flb_free(escaped_group);
         if (!tmp) {
             flb_sds_destroy(body);
             flb_errno();
@@ -1897,14 +2047,26 @@
     flb_sds_t body;
     flb_sds_t tmp;
     flb_sds_t error;
+    char *escaped_group = NULL;
+    char *escaped_name = NULL;
     int ret;
     flb_plg_info(ctx->ins, "Creating log stream %s in log group %s",
                  stream->name, stream->group);
-    body = flb_sds_create_size(50 + strlen(stream->group) +
-                               strlen(stream->name));
+    escaped_group = json_escape_string(stream->group);
+    escaped_name = json_escape_string(stream->name);
+    if (escaped_group == NULL || escaped_name == NULL) {
+        flb_free(escaped_group);
+        flb_free(escaped_name);
+        return -1;
+    }
+
+    body = flb_sds_create_size(50 + strlen(escaped_group) +
+                               strlen(escaped_name));
     if (!body) {
+        flb_free(escaped_group);
+        flb_free(escaped_name);
         flb_sds_destroy(body);
         flb_errno();
         return -1;
@@ -1913,8 +2075,10 @@
     /* construct CreateLogStream request body */
     tmp = flb_sds_printf(&body,
                          "{\"logGroupName\":\"%s\",\"logStreamName\":\"%s\"}",
-                         stream->group,
-                         stream->name);
+                         escaped_group,
+                         escaped_name);
+    flb_free(escaped_group);
+    flb_free(escaped_name);
     if (!tmp) {
         flb_sds_destroy(body);
         flb_errno();
We would like an official fix that covers all output plugins where possible. The patch above only addresses CloudWatch Logs.
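For checking whether other outputs are exposed to the same problem, a small predicate can flag values that would produce invalid JSON if copied verbatim between two double quotes, which is what the unpatched CloudWatch code appears to do with the stream name. The following is only a sketch: breaks_json_string is a hypothetical helper written for this report, not a Fluent Bit function, and it checks JSON string syntax only (it does not validate UTF-8).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Fluent Bit).
 *
 * Returns 1 if `raw` would yield an invalid JSON string when placed
 * verbatim between two double quotes, and 0 otherwise.
 */
static int breaks_json_string(const char *raw)
{
    size_t i;
    size_t k;
    unsigned char c;

    assert(raw != NULL);

    for (i = 0; raw[i] != '\0'; i++) {
        c = (unsigned char) raw[i];

        /* A bare quote ends the string early; raw control characters
         * are not allowed inside a JSON string. */
        if (c == '"' || c < 0x20) {
            return 1;
        }

        if (c == '\\') {
            /* JSON accepts only these characters after a backslash. */
            c = (unsigned char) raw[i + 1];
            if (c == '\0' || strchr("\"\\/bfnrtu", c) == NULL) {
                return 1;
            }
            if (c == 'u') {
                /* \u must be followed by exactly four hex digits. The loop
                 * stops at the first non-hex byte, including the
                 * terminator, so it never reads past the end. */
                for (k = 2; k <= 5; k++) {
                    if (!isxdigit((unsigned char) raw[i + k])) {
                        return 1;
                    }
                }
                i += 4;
            }
            i++; /* skip the character that follows the backslash */
        }
    }

    return 0;
}
```

For the unit name in this report, written as the C literal "systemd-fsck@dev-disk-by\\x2dlabel-BOOT.service", the function returns 1 because \x is not a JSON escape sequence. A name without backslashes, such as "kubelet.service", returns 0.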