<h1><center>Logstash Data Filtering</center></h1>

Author: 行癫 (unauthorized copies will be prosecuted)

------
## I: The grok Plugin

#### 1. Introduction
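The grok plugin is very powerful and can match almost any text, but its performance and resource consumption are frequently criticized.

The grok filter is currently the best way in Logstash to parse unstructured log data into structured fields.

Grok is built on top of regular expressions, so any regular expression is also valid in grok.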
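Because grok accepts plain regular expressions, a named capture group can be written directly in a `match` expression, alongside or instead of the predefined patterns. The following is a minimal sketch rather than part of the original material: the field name `queue_id` and the assumed input (a line containing a 10 or 11 character uppercase hexadecimal ID) are hypothetical.

```shell
filter {
  grok {
    # Plain regex with a named group: (?<field_name>pattern).
    # "queue_id" is a hypothetical field name chosen for illustration.
    match => { "message" => "(?<queue_id>[0-9A-F]{10,11})" }
  }
}
```

If the expression matches, the captured text is added to the event under the group's name, just as with a predefined pattern.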
#### 2. Syntax

```shell
%{SYNTAX:SEMANTIC}
```

Note:

`SYNTAX` is the name of the pattern used to match the text, and `SEMANTIC` is the field name under which the matched text is stored.

For example, the `NUMBER` pattern matches numbers, and the `IP` pattern matches IP addresses such as 127.0.0.1.
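As a small illustration of the syntax format above, the sketch below combines the two patterns just mentioned. It assumes a hypothetical input line such as `127.0.0.1 0.043`; the field names `client` and `duration` are illustrative choices. Grok stores captured values as strings by default, so an optional third part (`:int` or `:float`) is appended here to convert the value.

```shell
filter {
  grok {
    # IP     -> matches an address such as 127.0.0.1, stored in "client"
    # NUMBER -> matches a number such as 0.043, stored in "duration" as a float
    match => { "message" => "%{IP:client} %{NUMBER:duration:float}" }
  }
}
```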
#### 3. Example

Test data: Nginx access log

Logstash configuration file (input, filter, and output):
```shell
# Read events from standard input (paste Nginx access log lines here).
input {
  stdin {
  }
}
# Extract the first IP address in each line into the "client" field.
filter {
  grok {
    match => { "message" => "%{IP:client}" }
  }
}
# Print the resulting events to standard output.
output {
  stdout {
  }
}
```
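The configuration above extracts only the client IP. To parse the whole line, one option is the predefined `COMBINEDAPACHELOG` pattern, which should also fit Nginx access logs as long as Nginx writes its default `combined` log format. That format is an assumption here, since the original does not show the log lines. A minimal sketch:

```shell
filter {
  grok {
    # Assumes Nginx uses its default "combined" log_format.
    # Events that do not match are tagged with _grokparsefailure.
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
```

To try a configuration, save it to a file and start Logstash with `bin/logstash -f <file>`, then paste a log line on standard input.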