Audit Logging
This feature requires an Enterprise license for self-managed deployments. To upgrade, contact Redpanda sales.
Many scenarios for streaming data include the need for fine-grained auditing of user activity related to the system. This is especially true for regulated industries such as finance, healthcare, and the public sector. Complying with PCI DSS v4 standards, for example, requires verbose and detailed activity auditing, alerting, and analysis capabilities.
Redpanda’s auditing capabilities support recording both administrative and operational interactions with topics and with users. Redpanda complies with the Open Cybersecurity Schema Framework (OCSF), providing a predictable and extensible solution that works seamlessly with industry standard tools.
With audit logging enabled, there should be no noticeable changes in performance other than slightly elevated CPU usage.
Audit logging is configured at the cluster level. Redpanda supports excluding specific topics or principals from auditing to help reduce noise in the log. Audit logging is disabled by default.
Audit log flow
The Redpanda audit log mechanism functions similarly to the Kafka flow you may be familiar with. When a user interacts with another user or with a topic, Redpanda writes an event to a specialized audit topic. The audit topic is immutable: only Redpanda can write to it. Users cannot write to the audit topic directly, and the Kafka API cannot create or delete it.
By default, management and authentication actions performed on the cluster produce messages in the audit log topic, and those messages are retained for seven days. Interactions with all topics by all principals are audited. Actions performed using the Kafka API and Admin API are all audited, as are actions performed directly through rpk.
Messages recorded to the audit log topic comply with the Open Cybersecurity Schema Framework (OCSF). Any number of analytics frameworks, such as Splunk or Sumo Logic, can receive and process these messages. Using an open standard ensures Redpanda’s audit logs coexist with those produced by other IT assets, powering holistic monitoring and analysis of your assets.
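To get a feel for these records, you can read the audit topic like any other topic. The following is a minimal sketch, assuming rpk is already configured to reach the cluster and that your principal is permitted to read _redpanda.audit_log. The record count of 5 is arbitrary.

```shell
# Read the five oldest records still retained in the audit log topic.
# Assumes rpk already points at the cluster and the current principal
# is allowed to consume from this topic.
rpk topic consume _redpanda.audit_log --offset start --num 5
```

Reading the topic does not conflict with its immutability, since only writes, creation, and deletion are restricted.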
Audit log configuration options
Redpanda’s audit logging mechanism supports several options to control the volume and availability of audit records. Configuration is applied at the cluster level using the standard cluster configuration mechanism.
- audit_enabled: Boolean value to enable audit logging. When you set this to true, Redpanda checks for an existing topic named _redpanda.audit_log. If none is found, Redpanda automatically creates one for you. Default: false.
- audit_log_num_partitions: Integer value defining the number of partitions used by a newly created audit topic. This configuration applies only to the audit log topic and may be different from the cluster or other topic configurations. This cannot be altered for an existing audit log topic. Default: 12.
- audit_log_replication_factor: Optional integer value defining the replication factor for a newly created audit log topic. This configuration applies only to the audit log topic and may be different from the cluster or other topic configurations. This cannot be altered for existing audit log topics. If a value is not provided, Redpanda uses the internal_topic_replication_factor cluster config value. Default: null.
- audit_client_max_buffer_size: Integer value defining the number of bytes allocated by the internal audit client for audit messages. When changing this, you must disable audit logging and then re-enable it for the change to take effect. Consider increasing this if your system generates a very large number of audit records in a short amount of time. Default: 16777216.
- audit_queue_max_buffer_size_per_shard: Integer value defining the maximum amount of memory in bytes used by the audit buffer in each shard. Once this size is reached, requests to log additional audit messages return a non-retryable error. You must restart the cluster when changing this value. Default: 1048576.
- audit_enabled_event_types: List of strings in JSON style identifying the event types to include in the audit log. This may include any of the following: management, produce, consume, describe, heartbeat, authenticate, schema_registry, admin. Default: '["management","authenticate","admin"]'.
- audit_excluded_topics: List of strings in JSON style identifying the topics the audit logging system should ignore. This list cannot include the _redpanda.audit_log topic. Redpanda rejects the command if you attempt to include that topic. Default: null.
- audit_queue_drain_interval_ms: Internally, Redpanda batches audit log messages in memory and periodically writes them to the audit log topic. This defines the period in milliseconds between draining this queue to the audit log topic. Longer intervals may help prevent duplicate messages, especially in high throughput scenarios, but they also increase the risk of data loss during hard shutdowns where the queue is lost. Default: 500.
- audit_excluded_principals: List of strings in JSON style identifying the principals the audit logging system should ignore. Principals can be listed as User:name or name; both are accepted. Default: null.
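As an illustration of the disable and re-enable requirement for audit_client_max_buffer_size, the sequence might look like the sketch below. The new size of 33554432 bytes (32 MiB) is an arbitrary example value, not a recommendation, and the commands assume rpk is already configured for the cluster.

```shell
# Inspect the current buffer size (documented default: 16777216 bytes).
rpk cluster config get audit_client_max_buffer_size

# The new value only takes effect after auditing is disabled and re-enabled.
rpk cluster config set audit_enabled false
rpk cluster config set audit_client_max_buffer_size 33554432   # 32 MiB, illustrative
rpk cluster config set audit_enabled true
```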
Even though audited event messages are stored to a specialized immutable topic, standard topic settings still apply. For example, you can apply the same Tiered Storage, retention time, and replication settings available to normal topics. These particular options are important for controlling the amount of disk space utilized by your audit topics.
You must configure certain audit logging properties before enabling audit logging because these settings impact the creation of the _redpanda.audit_log topic itself. These properties are audit_log_num_partitions and audit_log_replication_factor. The Kafka API allows you to add partitions or alter the replication factor after enabling audit logging, but Redpanda prevents you from altering these two configuration values directly.
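If the audit topic already exists, you can check how it was created, and grow it through the Kafka API if needed. This is a sketch only: the partition count is illustrative, and it assumes your principal has the permissions these operations require.

```shell
# Show the partition count and replication factor the audit topic was created with.
rpk topic describe _redpanda.audit_log

# Add 6 more partitions through the Kafka API (the count is illustrative).
rpk topic add-partitions _redpanda.audit_log --num 6
```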
Audit logging event types
Redpanda’s auditable events fall into one of eight different event types. The APIs associated with each event type are as follows.
Audit event type | Associated APIs |
---|---|
management | |
produce | |
consume | |
describe | |
heartbeat | |
authenticate | |
schema_registry | |
admin | |
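Because audit_enabled_event_types accepts only these eight names, it can be worth checking a candidate list before applying it. The helper below is a hypothetical convenience written for this page, not part of rpk; it only compares names against the event types listed above.

```shell
# Hypothetical helper: succeeds (exit status 0) only for the eight
# event types listed above.
is_valid_event_type() {
  case "$1" in
    management|produce|consume|describe|heartbeat|authenticate|schema_registry|admin)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Check a candidate list before passing it to rpk.
for event_type in management describe authenticate; do
  if ! is_valid_event_type "$event_type"; then
    echo "unknown audit event type: $event_type" >&2
  fi
done
```

The loop prints nothing when every name is valid, so any output on stderr points at a name to fix.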
Enable audit logging
All audit log settings are applied at the cluster level. You can configure audit log settings in the Redpanda Helm chart, using Helm values or the Redpanda resource with the Redpanda Operator.
Use the rpk cluster config command to configure audit logs. Some options require a cluster restart. You can check whether a restart is needed using rpk cluster config status.
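For example, audit_queue_max_buffer_size_per_shard is documented as requiring a restart. A minimal sketch of changing it and then checking restart status follows; the value of 2097152 bytes (2 MiB) is illustrative only.

```shell
# Change a property that requires a cluster restart (value is illustrative).
rpk cluster config set audit_queue_max_buffer_size_per_shard 2097152

# Report, per broker, the applied config version and whether a restart is pending.
rpk cluster config status
```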
Some key tuning recommendations for your audit logging settings include:
- If you wish to change the number of partitions or the replication factor for your audit log topic, set the audit_log_num_partitions and audit_log_replication_factor properties respectively.
- Choose the type of events needed by setting audit_enabled_event_types to the desired list of event categories. Keep this as restrictive as possible based on your compliance and security needs to avoid excessive noise in your audit logs.
- Identify non-sensitive topics so that you can exclude them from auditing. Specify this list of topics in audit_excluded_topics.
- Identify non-sensitive principals so that you can exclude them from auditing. Specify this list of principals in audit_excluded_principals. This property accepts names in the form name or User:name.
- Set audit_enabled to true.
The sequence of commands in rpk for this audit log configuration is:

    rpk cluster config set audit_log_num_partitions 6
    rpk cluster config set audit_log_replication_factor 5
    rpk cluster config set audit_enabled_event_types '["management","describe","authenticate"]'
    rpk cluster config set audit_excluded_topics '["topic1","topic2"]'
    rpk cluster config set audit_excluded_principals '["User:principal1", "principal2"]'
    rpk cluster config set audit_enabled true
    rpk topic alter-config _redpanda.audit_log --set retention.ms=259200000
Optimize costs for audit logging
When enabled, audit logging can quickly generate a very large amount of data, especially if all event types are selected. Proper configuration of audit logging is critical to avoid filling your disk or using excess Tiered Storage. The available configuration options help ensure your audit logs contain only the volume of data necessary to meet your regulatory or legal requirements.
With audit logging, the pattern of message generation may be very different from your typical sources of data. These messages reflect usage of your system as opposed to the operational data your topics typically process. As a result, your retention, replication, and Tiered Storage requirements may differ from your other topics.
A typical scenario with audit logging is to route the messages to an analytics platform like Splunk. If your retention period is too long, you will find that you are storing excessive amounts of replicated messages in both Redpanda and in your analytics suite. Identifying the right balance of retention and replication settings minimizes this duplication while retaining your data in a system that provides actionable intelligence.
Assess the retention needs for your audit logs. You may not need to keep the logs around for the default seven days. This is controlled by setting retention.ms for the _redpanda.audit_log topic or by setting delete_retention_ms at the cluster level.
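Because retention.ms is expressed in milliseconds, it helps to derive the value from a number of days rather than type it by hand. The sketch below uses a hypothetical days_to_ms helper and only prints the resulting rpk command, so you can review it before running it. The three-day period is an example, not a recommendation.

```shell
# Hypothetical helper: convert a whole number of days to milliseconds.
days_to_ms() {
  echo $(( $1 * 24 * 60 * 60 * 1000 ))
}

# Three days of retention for the audit topic (illustrative period).
retention_ms=$(days_to_ms 3)

# Print the command for review instead of running it directly.
echo "rpk topic alter-config _redpanda.audit_log --set retention.ms=${retention_ms}"
```

For three days this yields 259200000 ms. The default seven-day retention corresponds to 604800000 ms.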