Mutate string processors

You can change the way that a string appears by using a mutate string processor. For example, you can use the uppercase_string processor to convert a string to uppercase, and you can use the lowercase_string processor to convert a string to lowercase. The following processors allow you to mutate a string: substitute_string, split_string, uppercase_string, lowercase_string, and trim_string.

substitute_string

The substitute_string processor matches a key’s value against a regular expression (regex) and replaces all returned matches with a replacement string.

Configuration

You can configure the substitute_string processor with the following options.

Option | Required | Description
--- | --- | ---
entries | Yes | A list of entries to add to an event.
source | Yes | The key to be modified.
from | Yes | The regex string to be replaced. Special regex characters such as [ and ] must be escaped using \\ when the value is in double quotes and \ when it is in single quotes. For more information, see Class Pattern in the Java documentation.
to | Yes | The string that replaces each match of from.

Usage

To get started, create the following pipeline.yaml file:

    pipeline:
      source:
        file:
          path: "/full/path/to/logs_json.log"
          record_type: "event"
          format: "json"
      processor:
        - substitute_string:
            entries:
              - source: "message"
                from: ":"
                to: "-"
      sink:
        - stdout:

Next, create a log file named logs_json.log. After that, replace the path of the file source in your pipeline.yaml file with your file path. For more detailed information, see Configuring Data Prepper.

Before you run Data Prepper, the source appears in the following format:

    {"message": "ab:cd:ab:cd"}

After you run Data Prepper, the source is converted to the following format:

    {"message": "ab-cd-ab-cd"}

The from option defines the regex to match, and the to option defines the string that replaces each match. In the preceding example, the string ab:cd:ab:cd becomes ab-cd-ab-cd. If the from regex does not return a match, the value of the key is left unchanged.
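The replacement behavior described above can be approximated outside Data Prepper with a few lines of Python. This is only a sketch for reasoning about a configuration, not Data Prepper's implementation: the function name substitute_string and the rule that non-string values pass through untouched are assumptions, and Python's re syntax differs from Java's Pattern class in some details.

```python
import re

def substitute_string(event, entries):
    """Return a copy of event with every regex match replaced (illustrative only)."""
    result = dict(event)
    for entry in entries:
        key = entry["source"]
        value = result.get(key)
        # Assumption: only string values are rewritten; anything else passes through.
        if isinstance(value, str):
            # re.sub replaces every non-overlapping match, not just the first one.
            result[key] = re.sub(entry["from"], entry["to"], value)
    return result

entries = [{"source": "message", "from": ":", "to": "-"}]
print(substitute_string({"message": "ab:cd:ab:cd"}, entries))
# {'message': 'ab-cd-ab-cd'}

# No match: the value comes back unchanged.
print(substitute_string({"message": "abcd"}, entries))
# {'message': 'abcd'}

# A special regex character such as [ must be escaped to match it literally.
bracket_entries = [{"source": "message", "from": r"\[", "to": "("}]
print(substitute_string({"message": "a[1]"}, bracket_entries))
# {'message': 'a(1]'}
```

Running a pattern through a sketch like this before adding it to pipeline.yaml can help catch an unescaped special character early.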

split_string

The split_string processor splits the value of a field into an array using a delimiter character or a regex.

Configuration

You can configure the split_string processor with the following options.

Option | Required | Description
--- | --- | ---
entries | Yes | A list of entries to add to an event.
source | Yes | The key to be split.
delimiter | No | The separator character responsible for the split. Cannot be defined at the same time as delimiter_regex. Either delimiter or delimiter_regex must be defined.
delimiter_regex | No | A regex string responsible for the split. Cannot be defined at the same time as delimiter. Either delimiter or delimiter_regex must be defined.

Usage

To get started, create the following pipeline.yaml file:

    pipeline:
      source:
        file:
          path: "/full/path/to/logs_json.log"
          record_type: "event"
          format: "json"
      processor:
        - split_string:
            entries:
              - source: "message"
                delimiter: ","
      sink:
        - stdout:

Next, create a log file named logs_json.log. After that, replace the path in the file source of your pipeline.yaml file with your file path. For more detailed information, see Configuring Data Prepper.

Before you run Data Prepper, the source appears in the following format:

    {"message": "hello,world"}

After you run Data Prepper, the source is converted to the following format:

    {"message":["hello","world"]}
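To see how the delimiter and delimiter_regex options relate, the following Python sketch mimics the split under stated assumptions. The function name split_string, the ValueError raised when the options are misconfigured, and the handling of non-string values are all hypothetical. Edge cases such as empty trailing elements may also behave differently in Data Prepper.

```python
import re

def split_string(event, entries):
    """Return a copy of event with each source value split into a list (illustrative only)."""
    result = dict(event)
    for entry in entries:
        delimiter = entry.get("delimiter")
        delimiter_regex = entry.get("delimiter_regex")
        # Exactly one of the two options may be set, as the options table requires.
        if bool(delimiter) == bool(delimiter_regex):
            raise ValueError("set exactly one of delimiter or delimiter_regex")
        key = entry["source"]
        value = result.get(key)
        # Assumption: non-string values are left untouched.
        if not isinstance(value, str):
            continue
        if delimiter_regex:
            result[key] = re.split(delimiter_regex, value)
        else:
            # A plain delimiter is matched literally, not as a regex.
            result[key] = value.split(delimiter)
    return result

print(split_string({"message": "hello,world"},
                   [{"source": "message", "delimiter": ","}]))
# {'message': ['hello', 'world']}

# A regex delimiter can absorb several separators and optional whitespace.
print(split_string({"message": "a, b;c"},
                   [{"source": "message", "delimiter_regex": r"[,;]\s*"}]))
# {'message': ['a', 'b', 'c']}
```

The regex form is worth the extra complexity only when the separators vary. For a single fixed character, delimiter is simpler and needs no escaping.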

uppercase_string

The uppercase_string processor converts the value (a string) of a key from its current case to uppercase.

Configuration

You can configure the uppercase_string processor with the following options.

Option | Required | Description
--- | --- | ---
with_keys | Yes | A list of keys to convert to uppercase.

Usage

To get started, create the following pipeline.yaml file:

    pipeline:
      source:
        file:
          path: "/full/path/to/logs_json.log"
          record_type: "event"
          format: "json"
      processor:
        - uppercase_string:
            with_keys:
              - "uppercaseField"
      sink:
        - stdout:

Next, create a log file named logs_json.log. After that, replace the path in the file source of your pipeline.yaml file with the correct file path. For more detailed information, see Configuring Data Prepper.

Before you run Data Prepper, the source appears in the following format:

    {"uppercaseField": "hello"}

After you run Data Prepper, the source is converted to the following format:

    {"uppercaseField": "HELLO"}
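The with_keys option names the only keys the processor touches. A minimal Python sketch of that idea follows. The function name uppercase_string is hypothetical, and the sketch assumes that keys that are missing or hold non-string values are skipped, which the text above does not state.

```python
def uppercase_string(event, with_keys):
    """Return a copy of event with the listed string values uppercased (illustrative only)."""
    result = dict(event)
    for key in with_keys:
        value = result.get(key)
        # Assumption: missing keys and non-string values are skipped silently.
        if isinstance(value, str):
            result[key] = value.upper()
    return result

event = {"uppercaseField": "hello", "other": "unchanged", "count": 3}
print(uppercase_string(event, ["uppercaseField", "count", "missing"]))
# {'uppercaseField': 'HELLO', 'other': 'unchanged', 'count': 3}
```

Keys that are not listed in with_keys, such as other in this example, keep their original case.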

lowercase_string

The lowercase_string processor converts the value (a string) of a key from its current case to lowercase.

Configuration

You can configure the lowercase_string processor with the following options.

Option | Required | Description
--- | --- | ---
with_keys | Yes | A list of keys to convert to lowercase.

Usage

To get started, create the following pipeline.yaml file:

    pipeline:
      source:
        file:
          path: "/full/path/to/logs_json.log"
          record_type: "event"
          format: "json"
      processor:
        - lowercase_string:
            with_keys:
              - "lowercaseField"
      sink:
        - stdout:

Next, create a log file named logs_json.log. After that, replace the path in the file source of your pipeline.yaml file with the correct file path. For more detailed information, see Configuring Data Prepper.

Before you run Data Prepper, the source appears in the following format:

    {"lowercaseField": "TESTmeSSage"}

After you run Data Prepper, the source is converted to the following format:

    {"lowercaseField": "testmessage"}

trim_string

The trim_string processor removes whitespace from the beginning and end of a key's value.

Configuration

You can configure the trim_string processor with the following options.

Option | Required | Description
--- | --- | ---
with_keys | Yes | A list of keys from which to trim the whitespace.

Usage

To get started, create the following pipeline.yaml file:

    pipeline:
      source:
        file:
          path: "/full/path/to/logs_json.log"
          record_type: "event"
          format: "json"
      processor:
        - trim_string:
            with_keys:
              - "trimField"
      sink:
        - stdout:

Next, create a log file named logs_json.log. After that, replace the path in the file source of your pipeline.yaml file with the correct file path. For more detailed information, see Configuring Data Prepper.

Before you run Data Prepper, the source appears in the following format:

    {"trimField": " Space Ship "}

After you run Data Prepper, the source is converted to the following format:

    {"trimField": "Space Ship"}
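Note that only leading and trailing whitespace is removed; the space inside Space Ship is kept. The following Python sketch shows the same effect. The function name trim_string is hypothetical, non-string values are assumed to be skipped, and Python's str.strip may classify some characters as whitespace differently than Data Prepper does.

```python
def trim_string(event, with_keys):
    """Return a copy of event with the listed string values trimmed (illustrative only)."""
    result = dict(event)
    for key in with_keys:
        value = result.get(key)
        # Assumption: missing keys and non-string values are skipped silently.
        if isinstance(value, str):
            # strip() removes whitespace at both ends and leaves interior spaces alone.
            result[key] = value.strip()
    return result

print(trim_string({"trimField": " Space Ship "}, ["trimField"]))
# {'trimField': 'Space Ship'}
```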