Journal logs containing backslashes fail to get written to CloudWatch Logs #11931

@simonjpickering

Bug Report

Describe the bug
When all systemd journal logs are collected and written to CloudWatch Logs, some UNIT values contain backslashes, because systemd escapes certain characters in unit names (for example "-" as "\x2d"). Records carrying these values fail to be written to CloudWatch.

For example, this journal entry comes from a host that has the issue:

"Starting systemd-fsck@dev-disk-by\x2dlabel-BOOT.service - File System Check on /dev/disk/by-label/BOOT..."

With dynamic log stream creation in use, the fluent-bit pod logs then show the following error:

[2026/06/08 15:27:19.955] [ info] [output:cloudwatch_logs:cloudwatch_logs.3] Creating log stream simon-a-database-dev-001-journal.systemd-fsck@dev-disk-by\x2dlabel-BOOT.service in log group /simon-dev/journal
[2026/06/08 15:27:19.957] [error] [output:cloudwatch_logs:cloudwatch_logs.3] CreateLogStream API responded with error='SerializationException'

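The SerializationException is consistent with the request body being invalid JSON: the stream name appears to be copied into the body verbatim, and "\x" is not an escape sequence that JSON defines. Below is a minimal sketch of that check, following the string grammar in RFC 8259. The helper json_string_body_is_valid is hypothetical and is not a Fluent Bit function.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Fluent Bit: returns true when `s` could be
 * placed between double quotes in JSON text without any further escaping. */
bool json_string_body_is_valid(const char *s)
{
    size_t i = 0;

    while (s[i] != '\0') {
        unsigned char c = (unsigned char) s[i];

        if (c < 0x20 || c == '"') {
            return false;       /* raw control character or bare quote */
        }
        if (c != '\\') {
            i++;
            continue;
        }

        /* A backslash must start one of the escapes listed in RFC 8259. */
        switch (s[i + 1]) {
        case '"': case '\\': case '/': case 'b':
        case 'f': case 'n': case 'r': case 't':
            i += 2;
            break;
        case 'u':
            /* \u must be followed by exactly four hex digits. */
            for (int k = 2; k < 6; k++) {
                if (!isxdigit((unsigned char) s[i + k])) {
                    return false;
                }
            }
            i += 6;
            break;
        default:
            return false;       /* e.g. "\x", which JSON does not define */
        }
    }
    return true;
}
```

Passing the unit name from the journal entry above (with its single backslash) returns false, while the same name with the backslash doubled returns true.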
To Reproduce
Deploy the fluent-bit Helm chart using the following values for INPUT and OUTPUT:

    #
    # Full journald logs
    #
    [INPUT]
        Name                systemd
        Tag                 journal.*
        read_from_tail      false
        max_entries         200
        Strip_Underscores   On
        DB                  /var/fluent-bit/state/flb_journal.db
        # Fixed: Protects RAM from generic operating system log spikes
        storage.type        filesystem

    #
    # journal logs
    #
    [OUTPUT]
        Name                cloudwatch_logs
        Match               journal.*
        region              eu-west-2
        log_group_name      /simon-dev/journal
        log_stream_prefix   ${NODE_NAME}-
        auto_create_group   true
        log_retention_days  1
        Retry_Limit         False
        Workers             2

NODE_NAME is defined by the following environment setting:
env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  • Steps to reproduce the problem:

Deploy Fluent Bit v5.0.6 using Helm, with the inputs and outputs above specified in your values file.

Expected behavior

Logs sent to CloudWatch Logs should not fail. Any characters that are disallowed in log stream names should be corrected automatically.
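To illustrate what correcting disallowed characters could look like: as far as the CreateLogStream documentation describes, only ':' and '*' are rejected in log stream names. A backslash is not on that list, which suggests the failure here lies in how the request is serialised rather than in name validation. The sketch below uses a hypothetical helper, sanitize_stream_name, that is not part of Fluent Bit, and the '_' replacement character is an assumption.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Fluent Bit: rewrites `name` in place,
 * replacing ':' and '*' (documented as not allowed in CloudWatch log stream
 * names) with '_'. Returns the number of bytes that were replaced. */
size_t sanitize_stream_name(char *name)
{
    size_t replaced = 0;

    for (char *p = name; *p != '\0'; p++) {
        if (*p == ':' || *p == '*') {
            *p = '_';
            replaced++;
        }
    }
    return replaced;
}
```

Applied to the systemd-fsck unit name from this report, the function changes nothing, so sanitising alone would not avoid the error. The backslash still has to be escaped when the name is written into the JSON request.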

Screenshots

N/A as the pod errors have been provided.

Your Environment

  • Version used: Fluent Bit v5.0.6 from the repository: cr.fluentbit.io/fluent/fluent-bit
  • Configuration: Kubernetes v1.34.6 on EC2 instances running Ubuntu 24.04
  • Environment name and version (e.g. Kubernetes? What version?): v1.34.6
  • Server type and version: EC2 instances - Ubuntu 24.04.4 LTS
  • Operating System and version: Ubuntu 24.04.4 LTS
  • Filters and plugins: systemd input, cloudwatch_logs output (as configured above)

Additional context
We are looking for a generic logging tool that captures all logs from our deployments so that we can interface with various log analysis tools.

We have patched the code so that writes to CloudWatch work for now, but we are concerned that we may run into the same issue with outputs to other logging solutions.

The patch is given below:

--- cloudwatch_api.c
+++ cloudwatch_api.c
@@ -201,6 +201,124 @@
     return FLB_TRUE;
 }
 
+/*
+ * Escape a string so it can be safely embedded inside an AWS JSON request.
+ *
+ * CloudWatch log group/stream names may contain characters that are meaningful
+ * to JSON, such as a backslash. The value sent to CloudWatch must preserve
+ * the original bytes while serialising them as valid JSON. For example, a
+ * stream name containing the four bytes "\x2d" must be sent in JSON text
+ * as "\\x2d".
+ */
+static size_t json_escaped_len(const char *str)
+{
+    size_t i;
+    size_t len = 0;
+    unsigned char c;
+
+    if (str == NULL) {
+        return 0;
+    }
+
+    for (i = 0; str[i] != '\0'; i++) {
+        c = (unsigned char) str[i];
+
+        switch (c) {
+        case '"':
+        case '\\':
+        case '\b':
+        case '\f':
+        case '\n':
+        case '\r':
+        case '\t':
+            len += 2;
+            break;
+        default:
+            if (c < 0x20) {
+                len += 6; /* \u00XX */
+            }
+            else {
+                len++;
+            }
+        }
+    }
+
+    return len;
+}
+
+static char *json_escape_string(const char *str)
+{
+    static const char hex[] = "0123456789abcdef";
+    size_t i;
+    size_t out_size;
+    char *out;
+    char *p;
+    unsigned char c;
+
+    if (str == NULL) {
+        return flb_strdup("");
+    }
+
+    out_size = json_escaped_len(str) + 1;
+    out = flb_malloc(out_size);
+    if (out == NULL) {
+        flb_errno();
+        return NULL;
+    }
+
+    p = out;
+    for (i = 0; str[i] != '\0'; i++) {
+        c = (unsigned char) str[i];
+
+        switch (c) {
+        case '"':
+            *p++ = '\\';
+            *p++ = '"';
+            break;
+        case '\\':
+            *p++ = '\\';
+            *p++ = '\\';
+            break;
+        case '\b':
+            *p++ = '\\';
+            *p++ = 'b';
+            break;
+        case '\f':
+            *p++ = '\\';
+            *p++ = 'f';
+            break;
+        case '\n':
+            *p++ = '\\';
+            *p++ = 'n';
+            break;
+        case '\r':
+            *p++ = '\\';
+            *p++ = 'r';
+            break;
+        case '\t':
+            *p++ = '\\';
+            *p++ = 't';
+            break;
+        default:
+            if (c < 0x20) {
+                *p++ = '\\';
+                *p++ = 'u';
+                *p++ = '0';
+                *p++ = '0';
+                *p++ = hex[(c >> 4) & 0x0f];
+                *p++ = hex[c & 0x0f];
+            }
+            else {
+                *p++ = (char) c;
+            }
+        }
+    }
+
+    *p = '\0';
+
+    return out;
+}
+
 static int entity_add_key_attributes(struct flb_cloudwatch *ctx, struct cw_flush *buf,
                                      struct log_stream *stream, int *offset)
 {
@@ -378,13 +496,22 @@
                             struct log_stream *stream, int *offset)
 {
     int ret;
+    char *escaped_group = NULL;
+    char *escaped_name = NULL;
+
+    escaped_group = json_escape_string(stream->group);
+    escaped_name = json_escape_string(stream->name);
+    if (escaped_group == NULL || escaped_name == NULL) {
+        goto error;
+    }
+
     if (!try_to_write(buf->out_buf, offset, buf->out_buf_size,
                       "{\"logGroupName\":\"", 17)) {
         goto error;
     }
 
     if (!try_to_write(buf->out_buf, offset, buf->out_buf_size,
-                      stream->group, 0)) {
+                      escaped_group, 0)) {
         goto error;
     }
 
@@ -394,7 +521,7 @@
     }
 
     if (!try_to_write(buf->out_buf, offset, buf->out_buf_size,
-                      stream->name, 0)) {
+                      escaped_name, 0)) {
         goto error;
     }
 
@@ -443,9 +570,14 @@
         goto error;
     }
 
+    flb_free(escaped_group);
+    flb_free(escaped_name);
+
     return 0;
 
 error:
+    flb_free(escaped_group);
+    flb_free(escaped_name);
     return -1;
 }
 
@@ -759,8 +891,8 @@
     buf->event_index = 0;
     buf->data_size = PUT_LOG_EVENTS_HEADER_LEN + PUT_LOG_EVENTS_FOOTER_LEN;
     if (buf->current_stream != NULL) {
-        buf->data_size += strlen(buf->current_stream->name);
-        buf->data_size += strlen(buf->current_stream->group);
+        buf->data_size += json_escaped_len(buf->current_stream->name);
+        buf->data_size += json_escaped_len(buf->current_stream->group);
     }
 }
 
@@ -1722,18 +1854,26 @@
     struct flb_aws_client *cw_client;
     flb_sds_t body;
     flb_sds_t tmp;
+    char *escaped_group = NULL;
 
     flb_plg_info(ctx->ins, "Setting retention policy on log group %s to %dd", stream->group, ctx->log_retention_days);
 
-    body = flb_sds_create_size(68 + strlen(stream->group));
+    escaped_group = json_escape_string(stream->group);
+    if (escaped_group == NULL) {
+        return -1;
+    }
+
+    body = flb_sds_create_size(68 + strlen(escaped_group));
     if (!body) {
+        flb_free(escaped_group);
         flb_sds_destroy(body);
         flb_errno();
         return -1;
     }
 
-    /* construct CreateLogGroup request body */
-    tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\",\"retentionInDays\":%d}", stream->group, ctx->log_retention_days);
+    /* construct PutRetentionPolicy request body */
+    tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\",\"retentionInDays\":%d}", escaped_group, ctx->log_retention_days);
+    flb_free(escaped_group);
     if (!tmp) {
         flb_sds_destroy(body);
         flb_errno();
@@ -1786,20 +1926,28 @@
     flb_sds_t body;
     flb_sds_t tmp;
     flb_sds_t error;
+    char *escaped_group = NULL;
     int ret;
 
     flb_plg_info(ctx->ins, "Creating log group %s", stream->group);
 
+    escaped_group = json_escape_string(stream->group);
+    if (escaped_group == NULL) {
+        return -1;
+    }
+
     /* construct CreateLogGroup request body */
     if (ctx->log_group_class_type == LOG_CLASS_DEFAULT_TYPE) {
-        body = flb_sds_create_size(30 + strlen(stream->group));
+        body = flb_sds_create_size(30 + strlen(escaped_group));
         if (!body) {
+            flb_free(escaped_group);
             flb_sds_destroy(body);
             flb_errno();
             return -1;
         }
 
-        tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\"}", stream->group);
+        tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\"}", escaped_group);
+        flb_free(escaped_group);
         if (!tmp) {
             flb_sds_destroy(body);
             flb_errno();
@@ -1807,15 +1955,17 @@
         }
         body = tmp;
     } else {
-        body = flb_sds_create_size(37 + strlen(stream->group) + strlen(ctx->log_group_class));
+        body = flb_sds_create_size(37 + strlen(escaped_group) + strlen(ctx->log_group_class));
         if (!body) {
+            flb_free(escaped_group);
             flb_sds_destroy(body);
             flb_errno();
             return -1;
         }
 
         tmp = flb_sds_printf(&body, "{\"logGroupName\":\"%s\", \"logGroupClass\":\"%s\"}",
-                             stream->group, ctx->log_group_class);
+                             escaped_group, ctx->log_group_class);
+        flb_free(escaped_group);
         if (!tmp) {
             flb_sds_destroy(body);
             flb_errno();
@@ -1897,14 +2047,26 @@
     flb_sds_t body;
     flb_sds_t tmp;
     flb_sds_t error;
+    char *escaped_group = NULL;
+    char *escaped_name = NULL;
     int ret;
 
     flb_plg_info(ctx->ins, "Creating log stream %s in log group %s",
                  stream->name, stream->group);
 
-    body = flb_sds_create_size(50 + strlen(stream->group) +
-                               strlen(stream->name));
+    escaped_group = json_escape_string(stream->group);
+    escaped_name = json_escape_string(stream->name);
+    if (escaped_group == NULL || escaped_name == NULL) {
+        flb_free(escaped_group);
+        flb_free(escaped_name);
+        return -1;
+    }
+
+    body = flb_sds_create_size(50 + strlen(escaped_group) +
+                               strlen(escaped_name));
     if (!body) {
+        flb_free(escaped_group);
+        flb_free(escaped_name);
         flb_sds_destroy(body);
         flb_errno();
         return -1;
@@ -1913,8 +2075,10 @@
     /* construct CreateLogStream request body */
     tmp = flb_sds_printf(&body,
                          "{\"logGroupName\":\"%s\",\"logStreamName\":\"%s\"}",
-                         stream->group,
-                         stream->name);
+                         escaped_group,
+                         escaped_name);
+    flb_free(escaped_group);
+    flb_free(escaped_name);
     if (!tmp) {
         flb_sds_destroy(body);
         flb_errno();

We would like an official fix that covers all output plugins where possible. The patch above currently addresses only CloudWatch Logs.
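The escaping logic in the patch does not depend on CloudWatch, so it can be exercised outside Fluent Bit. The following is a condensed standalone sketch of the same idea, not the patch itself: malloc() stands in for flb_malloc(), the name json_escape_standalone is hypothetical, and the two lookup strings are an implementation choice made here to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Standalone adaptation of the patch's json_escape_string(). Returns a newly
 * allocated, NUL-terminated copy of `str` with JSON string escaping applied,
 * or NULL on allocation failure. The caller must free() the result. */
char *json_escape_standalone(const char *str)
{
    static const char hex[]   = "0123456789abcdef";
    static const char plain[] = "\"\\\b\f\n\r\t";   /* bytes with a short escape */
    static const char code[]  = "\"\\bfnrt";        /* matching escape letters   */
    size_t len = 0;
    const char *s;
    const char *hit;
    char *out;
    char *p;

    if (str == NULL) {
        str = "";
    }

    /* First pass: compute the escaped length. */
    for (s = str; *s != '\0'; s++) {
        unsigned char c = (unsigned char) *s;
        if (strchr(plain, c) != NULL) {
            len += 2;               /* e.g. \\ or \n */
        }
        else if (c < 0x20) {
            len += 6;               /* \u00XX */
        }
        else {
            len += 1;
        }
    }

    out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }

    /* Second pass: write the escaped bytes. */
    p = out;
    for (s = str; *s != '\0'; s++) {
        unsigned char c = (unsigned char) *s;
        hit = strchr(plain, c);
        if (hit != NULL) {
            *p++ = '\\';
            *p++ = code[hit - plain];
        }
        else if (c < 0x20) {
            *p++ = '\\';
            *p++ = 'u';
            *p++ = '0';
            *p++ = '0';
            *p++ = hex[(c >> 4) & 0x0f];
            *p++ = hex[c & 0x0f];
        }
        else {
            *p++ = (char) c;
        }
    }
    *p = '\0';

    /* Both passes must agree on the output size. */
    assert((size_t) (p - out) == len);
    return out;
}
```

A helper of this shape could live in shared code and be called by any output plugin that builds JSON request bodies by string formatting. Whether other plugins need it depends on how each one serialises its requests.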
