Architecture
Tomcat(catalina.out) → Filebeat → Kafka → Logstash → Elasticsearch → Kibana
                                                           ↓
                                                  Cerebro (ES monitoring)
Environment
| Node | IP | Components |
|---|---|---|
| data1 | 10.0.15.201 | ES/ZK/Kafka/Logstash |
| data2 | 10.0.15.202 | ES/ZK/Kafka/Logstash |
| data3 | 10.0.15.203 | ES/ZK/Kafka/Logstash |
1. Installing JDK 1.8
yum install wget which -y
cd /usr/local/src
wget http://mrchi-data.oss-cn-beijing.aliyuncs.com/Software/Jdk/jdk1.8.0_191.tar.gz
tar zxf jdk1.8.0_191.tar.gz
mv jdk1.8.0_191 /usr/local/
cat > /etc/profile.d/java.sh <<'EOF'
export JAVA_HOME=/usr/local/jdk1.8.0_191
export JRE_HOME=/usr/local/jdk1.8.0_191/jre
export PATH=$PATH:$JAVA_HOME/bin
EOF
source /etc/profile
java -version
2. Filebeat configuration (collecting Java application logs)
Configuration file
filebeat.inputs:
  # Tomcat catalina.out: stack traces span several lines, so multiline merging is configured.
  - type: log
    enabled: true
    paths:
      - /data/tomcats/YXHBInsurance_hainan/logs/catalina.out
    fields:
      type: catalina
    # NOTE: multiline.pattern is required for these settings to take effect; the source does not show it.
    multiline.negate: true
    multiline.match: after
    multiline.max_lines: 20
    multiline.timeout: 10s
  # System log
  - type: log
    enabled: true
    paths:
      - /var/log/messages
    fields:
      type: syslog
    max_bytes: 1048576
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 3
setup.kibana:
output.kafka:
  enabled: true
  hosts: ["192.168.1.111:9092", "192.168.1.112:9092", "192.168.1.113:9092"]
  # One topic per log type: catalina-logs, syslog-logs
  topic: '%{[fields][type]}-logs'
  partition.round_robin:
    reachable_only: false
  compression: gzip
  # Events larger than this are dropped by the Kafka output
  max_message_bytes: 100000
processors:
  - add_cloud_metadata: ~
  - drop_fields:
      fields: ["prospector", "source", "input", "beat", "offset", "tags"]
xpack.monitoring:
  enabled: true
  elasticsearch:
    hosts: ["192.168.1.111:9500", "192.168.1.112:9500", "192.168.1.113:9500"]
    username: beats_system
    password: Miao2019f
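To make the multiline settings above more concrete, here is a minimal Python sketch of how `negate: true` combined with `match: after` groups lines into events. It is an approximation for illustration, not Filebeat's implementation. The start-of-entry pattern (a leading date) is an assumption, since the configuration above does not show `multiline.pattern`, and `group_multiline` is a hypothetical helper name. The `multiline.timeout` setting is not modelled.

```python
import re
from typing import Iterable, List

# Assumption: a new log entry starts with a date such as 2019-05-01.
# The source configuration does not show multiline.pattern.
START_OF_ENTRY = re.compile(r"^\d{4}-\d{2}-\d{2}")


def group_multiline(lines: Iterable[str], max_lines: int = 20) -> List[str]:
    """Approximate negate: true + match: after.

    A line that matches the pattern starts a new event. A line that does not
    match is appended to the current event. Lines beyond max_lines are dropped.
    """
    events: List[List[str]] = []
    for line in lines:
        if START_OF_ENTRY.search(line) or not events:
            events.append([line])
        elif len(events[-1]) < max_lines:
            events[-1].append(line)
        # Otherwise the event already holds max_lines lines; the extra line is discarded.
    return ["\n".join(event) for event in events]


if __name__ == "__main__":
    sample = [
        "2019-05-01 10:00:00 ERROR request failed",
        "java.lang.NullPointerException",
        "\tat com.example.Demo.run(Demo.java:42)",
        "2019-05-01 10:00:01 INFO recovered",
    ]
    for event in group_multiline(sample):
        print(repr(event))
```

With this sample, the exception and its stack frame end up in the same event as the ERROR line, which is the behaviour wanted for catalina.out.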
Managing Filebeat with Supervisor
[program:filebeat]
command=/usr/local/filebeat/filebeat -e -c /usr/local/filebeat/filebeat.yml -d "publish"
directory=/usr/local/filebeat
redirect_stderr=true
stdout_logfile=/usr/local/filebeat/filebeat.log
autostart=true
autorestart=true
supervisord -c /etc/supervisord.conf
supervisorctl status
# filebeat RUNNING pid 31124, uptime 00:00:46
3. Logstash configuration
input {
  kafka {
    bootstrap_servers => "192.168.1.111:9092,192.168.1.112:9092,192.168.1.113:9092"
    group_id => "catalinalog"
    topics => ["catalina-logs"]
    codec => json
  }
}
output {
  elasticsearch {
    hosts => ["192.168.1.111:9500", "192.168.1.112:9500", "192.168.1.113:9500"]
    index => "%{[fields][type]}-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "Miao2019"
  }
  stdout {
    codec => rubydebug
  }
}
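The Kafka topic and the Elasticsearch index are both derived from the `fields.type` value that Filebeat attaches to each event. The following Python sketch mirrors the two templates used in this setup, `'%{[fields][type]}-logs'` and `"%{[fields][type]}-%{+YYYY.MM.dd}"`, to show which names an event would resolve to. The helper names and the sample event are hypothetical, and the sketch assumes the date part is taken from `@timestamp` in UTC. Note that the pipeline shown here subscribes only to `catalina-logs`, so events sent to `syslog-logs` would need their own Kafka input.

```python
from datetime import datetime, timezone


def kafka_topic(event: dict) -> str:
    """Mirror Filebeat's topic template '%{[fields][type]}-logs'."""
    return f"{event['fields']['type']}-logs"


def es_index(event: dict) -> str:
    """Mirror the Logstash index template "%{[fields][type]}-%{+YYYY.MM.dd}".

    Assumes the date is formatted from @timestamp in UTC.
    """
    ts = datetime.fromisoformat(event["@timestamp"].replace("Z", "+00:00"))
    day = ts.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"{event['fields']['type']}-{day}"


if __name__ == "__main__":
    # Hypothetical event, shaped like the JSON that Filebeat publishes to Kafka.
    event = {
        "@timestamp": "2019-05-01T23:30:00.000Z",
        "fields": {"type": "catalina"},
        "message": "2019-05-01 23:30:00 INFO Server startup",
    }
    print(kafka_topic(event))
    print(es_index(event))
```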
Supervisor configuration
[program:logstash-catalina]
command=/usr/local/logstash-6.7.1/bin/logstash -f /usr/local/logstash-6.7.1/config/catalina.yml --path.data /usr/local/logstash-6.7.1/catalinadata
autostart=true
autorestart=true
startretries=3
user=root
priority=10
redirect_stderr=true
stdout_logfile=/data/elk6.7/logs/logstash_supervisor.log
4. Monitoring with Metricbeat
Installation and setup
cd /usr/local/src
curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.13.1-linux-x86_64.tar.gz
tar -xzvf metricbeat-7.13.1-linux-x86_64.tar.gz -C ../
cd /usr/local/metricbeat-7.13.1-linux-x86_64
Environment variables
cat >> /etc/profile <<'EOF'
export MB_HOME=/usr/local/metricbeat-7.13.1-linux-x86_64
export PATH=$MB_HOME:$PATH
EOF
source /etc/profile
Monitoring Elasticsearch with the module
# Disable the system module and enable Elasticsearch monitoring
./metricbeat modules disable system
./metricbeat modules enable elasticsearch-xpack
cat > $MB_HOME/modules.d/elasticsearch-xpack.yml <<'EOF'
- module: elasticsearch
  metricsets:
    - ccr
    - cluster_stats
    - enrich
    - index
    - index_recovery
    - index_summary
    - ml_job
    - node_stats
  xpack.enabled: true
  period: 10s
  hosts: ["http://data1:9200", "http://data2:9200", "http://data3:9200"]
  username: "elastic"
  password: "elastic"
EOF
# Start Metricbeat
./metricbeat -e &
Keystore for credentials (recommended)
cd $MB_HOME
./metricbeat keystore create
./metricbeat keystore add ES_USER
./metricbeat keystore add ES_PASSWD
./metricbeat keystore list
Keys stored this way can be referenced by name in the Metricbeat configuration, for example ${ES_USER} and ${ES_PASSWD}, instead of writing the credentials in plain text.
5. Elasticsearch monitoring switches
# Enable collection of monitoring data
PUT _cluster/settings
{
"persistent": {
"xpack.monitoring.collection.enabled": true
}
}
# Disable internal collection of Elasticsearch metrics (Metricbeat is recommended instead)
PUT _cluster/settings
{
"persistent": {
"xpack.monitoring.elasticsearch.collection.enabled": false
}
}
# Check the current settings
GET _cluster/settings
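The same two settings can also be applied from a script rather than the Kibana console. The sketch below only builds the HTTP requests with Python's standard library and prints them; it does not send anything unless the last line is uncommented. The helper name `cluster_settings_request` is hypothetical, the endpoint `http://data1:9200` is assumed from the Metricbeat hosts listed earlier, and authentication is not handled, so a secured cluster would need credentials added.

```python
import json
import urllib.request


def cluster_settings_request(base_url: str, settings: dict) -> urllib.request.Request:
    """Build a PUT _cluster/settings request carrying persistent settings."""
    body = json.dumps({"persistent": settings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumption: data1:9200 is reachable without authentication.
ES_URL = "http://data1:9200"

# Enable collection of monitoring data.
enable_collection = cluster_settings_request(
    ES_URL, {"xpack.monitoring.collection.enabled": True}
)
# Turn off internal collection of Elasticsearch metrics in favour of Metricbeat.
disable_internal = cluster_settings_request(
    ES_URL, {"xpack.monitoring.elasticsearch.collection.enabled": False}
)

if __name__ == "__main__":
    for req in (enable_collection, disable_internal):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
        # urllib.request.urlopen(req)  # uncomment to actually send the request
```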
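Retrospective
Root cause: Filebeat initially wrote directly to Elasticsearch, and high log volume put excessive load on the cluster.
Fix: introduce Kafka between Filebeat and Logstash to absorb traffic peaks and decouple producers from consumers.
Improvements:
- Set Kafka partition counts per log type
- Tune Logstash filter rules (drop unused fields)
- Use Metricbeat instead of internal collection to reduce load on Elasticsearch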
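The peak-absorbing effect of a queue between producer and consumer can be illustrated with a toy simulation. This is a sketch with made-up numbers, not a model of Kafka itself: `arrivals` is a hypothetical count of log events produced per tick, and `drain_rate` is a hypothetical fixed number of events the downstream side can index per tick.

```python
from typing import List, Tuple


def simulate_buffer(arrivals: List[int], drain_rate: int) -> List[Tuple[int, int]]:
    """Return (delivered, backlog) per tick for a queue placed before the consumer.

    The producer may burst, but the consumer never receives more than
    drain_rate events in one tick; the remainder waits in the backlog.
    """
    backlog = 0
    history = []
    for count in arrivals:
        backlog += count                      # producer side: events published
        delivered = min(drain_rate, backlog)  # consumer side: steady pull rate
        backlog -= delivered
        history.append((delivered, backlog))
    return history


if __name__ == "__main__":
    # Hypothetical burst: 900 events arrive in one tick, the consumer handles 300 per tick.
    for delivered, backlog in simulate_buffer([100, 900, 50, 0, 0], drain_rate=300):
        print(f"delivered={delivered} backlog={backlog}")
```

In this run the downstream side never sees more than 300 events per tick even though 900 arrive at once; the burst is spread over the following ticks instead of hitting Elasticsearch directly.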
This article was first published on wr.mrchi.cn. Please credit the source when reposting.