How to Migrate from Fluentd to Vector for Kubernetes Log Aggregation in 2025
Why Migrate from Fluentd to Vector for Kubernetes Logs
If you're running Fluentd as a DaemonSet in your Kubernetes cluster and experiencing high CPU usage, memory bloat, or slow log processing, migrating to Vector can deliver dramatic improvements. Vector is a high-performance observability data pipeline built in Rust that benchmarks at up to 10x the throughput of alternatives like Fluentd while consuming significantly fewer resources.
Companies like Atlassian, T-Mobile, and Discord have already made this transition. Vector now sees over 100,000 downloads daily, and some production deployments process 500TB+ of observability data per day.
This guide walks you through migrating a production Kubernetes log aggregation setup from Fluentd to Vector, preserving your existing destinations and workflows.
Prerequisites for Migration
Before starting the migration, ensure you have:
- Kubernetes cluster version 1.20 or higher
- kubectl access with cluster-admin permissions
- Current Fluentd DaemonSet configuration files
- List of all log destinations (Elasticsearch, S3, Datadog, etc.)
- Understanding of your current log parsing and transformation rules
Step 1: Audit Your Current Fluentd Configuration
First, export your existing Fluentd ConfigMap to understand what you're migrating:
```shell
kubectl get configmap fluentd-config -n kube-system -o yaml > fluentd-backup.yaml
```
Document these critical elements from your Fluentd config:
- Input sources: Container logs, systemd, custom applications
- Filters: Parsing rules, field additions, record transformations
- Output destinations: Elasticsearch endpoints, S3 buckets, monitoring services
- Performance settings: Buffer configurations, flush intervals, retry logic
Step 2: Install Vector as a DaemonSet
Create a Vector ConfigMap that mirrors your Fluentd functionality. Here's a complete example for collecting Kubernetes container logs:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-config
  namespace: kube-system
data:
  vector.toml: |
    [sources.kubernetes_logs]
    type = "kubernetes_logs"

    [transforms.parse_json]
    type = "remap"
    inputs = ["kubernetes_logs"]
    source = '''
    # Merge parsed JSON into the event so Kubernetes metadata is preserved;
    # non-JSON lines pass through unchanged
    parsed, err = parse_json(string!(.message))
    if err == null && is_object(parsed) {
      . = merge(., object!(parsed))
    }
    '''

    [transforms.add_metadata]
    type = "remap"
    inputs = ["parse_json"]
    source = '''
    .cluster = "production"
    .environment = get_env_var!("ENVIRONMENT")
    '''

    [sinks.elasticsearch]
    type = "elasticsearch"
    inputs = ["add_metadata"]
    endpoint = "https://elasticsearch.example.com:9200"
    bulk.index = "kubernetes-logs-%Y.%m.%d"

    [sinks.console]
    type = "console"
    inputs = ["add_metadata"]
    encoding.codec = "json"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: vector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: vector
  template:
    metadata:
      labels:
        name: vector
    spec:
      serviceAccountName: vector
      containers:
        - name: vector
          image: timberio/vector:0.35.0-debian
          env:
            - name: ENVIRONMENT
              value: "production"
            - name: VECTOR_CONFIG
              value: "/etc/vector/vector.toml"
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          volumeMounts:
            - name: config
              mountPath: /etc/vector
              readOnly: true
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: vector-config
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
```
Step 3: Map Fluentd Plugins to Vector Components
Vector uses a different architecture than Fluentd. Here's how common Fluentd plugins map to Vector components:
| Fluentd Component | Vector Equivalent | Notes |
|-------------------|-------------------|-------|
| in_tail | kubernetes_logs source | Native Kubernetes integration |
| filter_parser | remap transform | Uses Vector Remap Language (VRL) |
| filter_record_transformer | remap transform | More performant than Ruby |
| out_elasticsearch | elasticsearch sink | Built-in support |
| out_s3 | aws_s3 sink | Native AWS integration |
| out_datadog | datadog_logs sink | Official Datadog support |
| buffer configuration | buffer on sinks | Automatic memory management |
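As a concrete example of the mapping above, a Fluentd `out_s3` output becomes Vector's `aws_s3` sink. Here's a minimal sketch; the bucket name, region, and input name are placeholders to adapt to your setup:

```toml
[sinks.s3_archive]
type = "aws_s3"
inputs = ["add_metadata"]            # reuse your transform chain
bucket = "my-log-archive"            # placeholder bucket name
region = "us-east-1"                 # placeholder region
key_prefix = "kubernetes/%Y/%m/%d/"  # strftime-style date partitioning
compression = "gzip"
encoding.codec = "json"
framing.method = "newline_delimited" # one JSON object per line
```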
Step 4: Convert Fluentd Parsing Rules to VRL
Vector's Remap Language (VRL) replaces Ruby code in Fluentd filters. Here's a common conversion:
Fluentd filter:
```
<filter kubernetes.**>
  @type parser
  key_name log
  <parse>
    @type json
  </parse>
</filter>
```
Vector transform:
```toml
[transforms.parse_logs]
type = "remap"
inputs = ["kubernetes_logs"]
source = '''
.parsed = parse_json!(string!(.message))
. = merge(., object!(.parsed))
del(.parsed)
'''
```
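Note that `parse_json!` aborts on lines that aren't valid JSON, which can drop plain-text logs. If your pods emit mixed output, a hedged variant using the fallible form keeps unparseable lines intact (transform name is illustrative):

```toml
[transforms.parse_logs_safe]
type = "remap"
inputs = ["kubernetes_logs"]
source = '''
# Fallible parse: non-JSON lines pass through unchanged
parsed, err = parse_json(string!(.message))
if err == null && is_object(parsed) {
  . = merge(., object!(parsed))
}
'''
```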
Step 5: Create RBAC Permissions
Vector needs read access to Kubernetes API for log metadata:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vector
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: vector
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces", "nodes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: vector
subjects:
  - kind: ServiceAccount
    name: vector
    namespace: kube-system
```
Step 6: Deploy Vector in Parallel with Fluentd
Don't immediately replace Fluentd. Run both pipelines simultaneously:
```shell
kubectl apply -f vector-rbac.yaml
kubectl apply -f vector-daemonset.yaml
```
Verify Vector pods are running:
```shell
kubectl get pods -n kube-system -l name=vector
kubectl logs -n kube-system -l name=vector --tail=50
```
Step 7: Validate Data Consistency
Compare outputs from both pipelines for 24-48 hours:
- Volume comparison: Check if Vector processes similar log volumes
- Field mapping: Verify all metadata fields are preserved
- Timestamp accuracy: Ensure timestamps match source events
- Error rates: Monitor Vector's error metrics
Use Vector's built-in metrics endpoint (this assumes the internal_metrics source and prometheus_exporter sink are enabled in your Vector config):
```shell
kubectl port-forward -n kube-system daemonset/vector 9090:9090
curl http://localhost:9090/metrics | grep component_errors_total
```
Step 8: Performance Comparison and Monitoring
Monitor resource usage with:
```shell
kubectl top pods -n kube-system -l name=vector
kubectl top pods -n kube-system -l name=fluentd
```
Expected improvements with Vector:
- Memory usage: 50-70% reduction (256MB vs 512MB+ for Fluentd)
- CPU usage: 60-80% reduction
- Throughput: 10x higher events per second
- Latency: Sub-second processing vs 3-5 seconds with Fluentd
Step 9: Gradual Traffic Migration
Use node labels to control which nodes use Vector:
```shell
# Label nodes for Vector
kubectl label nodes worker-1 worker-2 log-agent=vector

# Update Vector DaemonSet with nodeSelector
kubectl patch daemonset vector -n kube-system -p '{"spec":{"template":{"spec":{"nodeSelector":{"log-agent":"vector"}}}}}'

# Update Fluentd to exclude Vector nodes
kubectl patch daemonset fluentd -n kube-system -p '{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"log-agent","operator":"NotIn","values":["vector"]}]}]}}}}}}'
```
Step 10: Decommission Fluentd
After 1-2 weeks of stable operation:
```shell
# DaemonSets cannot be scaled with --replicas; instead, pin Fluentd to a
# node label that no node carries so its pods are evicted everywhere
kubectl patch daemonset fluentd -n kube-system -p '{"spec":{"template":{"spec":{"nodeSelector":{"log-agent":"retired"}}}}}'

# Wait 7 days for data retention verification

# Remove Fluentd completely
kubectl delete daemonset fluentd -n kube-system
kubectl delete configmap fluentd-config -n kube-system
```
Common Migration Issues and Solutions
Issue: Missing Kubernetes Metadata
Symptom: Pod names, namespaces missing from logs
Solution: Verify RBAC permissions and ensure Vector ServiceAccount has cluster-wide read access to pods and namespaces.
Issue: High Memory Usage Initially
Symptom: Vector using more memory than expected
Solution: Tune buffer settings in sinks:
```toml
[sinks.elasticsearch.buffer]
type = "memory"
max_events = 500        # memory buffers are sized in events, not bytes
when_full = "block"
```
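If memory buffering still spikes under sustained load, a disk buffer trades some throughput for durability and a bounded memory footprint. A sketch, assuming Vector's data directory (default `/var/lib/vector`) is backed by a writable volume; the size is illustrative:

```toml
[sinks.elasticsearch.buffer]
type = "disk"
max_size = 1073741824   # 1 GiB on-disk buffer (bytes)
when_full = "block"     # apply backpressure rather than drop events
```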
Issue: VRL Syntax Errors
Symptom: Transform failures in logs
Solution: Use Vector's VRL REPL for testing:
```shell
vector vrl
```
Monitoring Vector in Production
Vector exposes Prometheus metrics. Create a Service to expose them (a Prometheus Operator ServiceMonitor can then target this Service):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: vector-metrics
  namespace: kube-system
  labels:
    app: vector
spec:
  selector:
    name: vector
  ports:
    - name: metrics
      port: 9090
      targetPort: 9090
```
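The Service only routes traffic; Vector itself must be configured to publish its internal metrics. A minimal sketch to add to vector.toml, matching the 9090 port used above:

```toml
[sources.internal_metrics]
type = "internal_metrics"

[sinks.prometheus_exporter]
type = "prometheus_exporter"
inputs = ["internal_metrics"]
address = "0.0.0.0:9090"   # must match the Service targetPort
```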
Key metrics to monitor:
- component_received_events_total: Events ingested
- component_sent_events_total: Events delivered
- component_errors_total: Error rate
- buffer_events: Buffer utilization
Cost Savings and ROI
Organizations migrating from Fluentd to Vector typically see:
- Infrastructure costs: 40-60% reduction in node count needed for log processing
- Cloud egress: Lower costs due to efficient compression and batching
- Developer time: Less time debugging log pipeline issues
- Vendor flexibility: Easier to change observability providers without rewriting configs
Next Steps
After successful migration:
- Enable metrics collection: Vector supports Prometheus scraping and StatsD
- Add distributed tracing: Vector's trace support is in beta
- Implement log sampling: Reduce costs with intelligent sampling
- Deploy aggregators: Use Vector aggregators for centralized processing
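Log sampling from the list above can be sketched with Vector's sample transform; the transform name, input, and rate are illustrative values to tune for your volume:

```toml
[transforms.sample_logs]
type = "sample"
inputs = ["kubernetes_logs"]
rate = 10   # keep roughly 1 in every 10 events
```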
For teams running observability infrastructure, consider hosting Vector aggregators on platforms like DigitalOcean or Render for centralized log processing before sending to final destinations. Supabase also offers Vector integration for application logging if you're building on their platform.
Vector's unified approach to logs, metrics, and traces makes it a future-proof choice for Kubernetes observability in 2025.