Logstash - Plugins

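Logstash offers various plugins for all three stages of its pipeline (input, filter, and output). These plugins help the user capture logs from various sources such as web servers, databases, and network protocols.

After capturing, Logstash can parse the data and transform it into meaningful information as required by the user. Lastly, Logstash can send or store that information in various destinations such as Elasticsearch, AWS CloudWatch, etc.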

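Input Plugins

Input plugins in Logstash help the user extract and receive logs from various sources. The syntax for using an input plugin is as follows: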

input {
   Plugin name {
      Setting 1 ...
      Setting 2 ...
   }
}

You can download an input plugin by using the following command:

>logstash-plugin install logstash-input-<plugin name>

The logstash-plugin utility is present in the bin folder of the Logstash installation directory. The following table lists the input plugins offered by Logstash.

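Sr.No.	Plugin Name & Description
1	beats To get the logging data or events from the Elastic Beats framework.
2	cloudwatch To extract events from CloudWatch, an API offered by Amazon Web Services.
3	couchdb_changes To capture events from the _changes URI of CouchDB.
4	drupal_dblog To extract Drupal's watchdog logging data with DBLog enabled.
5	elasticsearch To retrieve the results of queries performed in an Elasticsearch cluster.
6	eventlog To get the events from the Windows event log.
7	exec To get the output of a shell command as an input in Logstash.
8	file To get the events from an input file. This is useful when Logstash is installed locally with the input source and has access to the input source logs.
9	generator It is used for testing purposes; it creates random events.
10	github To capture events from a GitHub webhook.
11	graphite To get metrics data from the Graphite monitoring tool.
12	heartbeat It is also used for testing; it produces heartbeat-like events.
13	http To collect log events over two network protocols, HTTP and HTTPS.
14	http_poller It is used to decode HTTP API output into events.
15	jdbc It converts JDBC transactions into events in Logstash.
16	jmx To extract metrics from remote Java applications using JMX.
17	log4j To capture events from the SocketAppender object of Log4j over a TCP socket.
18	rss To capture events from RSS feeds as input events in Logstash.
19	tcp To capture events over a TCP socket.
20	twitter To collect events from the Twitter Streaming API.
21	unix To collect events over a UNIX socket.
22	websocket To capture events over the WebSocket protocol.
23	xmpp To read events over the Jabber/XMPP protocol.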

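Plugin Settings

All plugins have their own specific settings, which help specify important fields such as port, path, etc. We will discuss the settings of some of the input plugins.

File

This input plugin is used to extract events directly from log or text files present in the input source. It works like the tail command in UNIX: it saves the last read position and reads only newly appended data from the input file, although this can be changed with the start_position setting. The following are the settings of this input plugin.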

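Setting Name	Default Value	Description
add_field	{}	Appends a new field to the input event.
close_older	3600	Files whose last read time (in seconds) is older than this value are closed.
codec	"plain"	Used to decode the data before it enters the Logstash pipeline.
delimiter	"\n"	Used to specify the newline delimiter.
discover_interval	15	The time interval (in seconds) between discovering new files in the specified path.
enable_metric	true	Used to enable or disable the reporting and collection of metrics for this plugin.
exclude		Specifies filenames or patterns that should be excluded from the input plugin.
id		Specifies a unique identity for this plugin instance.
max_open_files		Specifies the maximum number of input files Logstash keeps open at any one time.
path		Specifies the path of the files; it can contain patterns for filenames.
start_position	"end"	Can be changed to "beginning" if Logstash should initially read files from the start rather than only new log events.
stat_interval	1	The time interval (in seconds) after which Logstash checks for modified files.
tags		Adds any additional information to the event; for example, Logstash adds "_grokparsefailure" to tags when a log event fails to match the specified grok filter.
type		A special field that you can add to an input event; it is useful in filters and Kibana.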
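
As an illustration of how the settings in the table fit together, here is a minimal sketch of a file input. The log directory, the exclusion pattern, and the type and tag values are assumptions chosen for the example, not values from this tutorial.

```conf
input {
  file {
    # Hypothetical location of the application logs
    path => ["/var/log/myapp/*.log"]
    # Skip rotated, compressed files
    exclude => ["*.gz"]
    # Read existing content first instead of only newly appended lines
    start_position => "beginning"
    # Check known files for changes every second, look for new files every 15 seconds
    stat_interval => 1
    discover_interval => 15
    # Labels that later filters and Kibana queries can rely on
    type => "myapp"
    tags => ["file_input"]
  }
}
```

Note that start_position only affects files Logstash has not seen before; for files it already tracks, reading resumes from the saved position.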

Elasticsearch

This plugin is used to read the results of search queries from an Elasticsearch cluster. The following are the settings used in this plugin:

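Setting Name	Default Value	Description
add_field	{}	Same as in the file plugin, it is used to append a field to the input event.
ca_file		Used to specify the path of the SSL certificate authority file.
codec	"plain"	Used to decode the input events from Elasticsearch before they enter the Logstash pipeline.
docinfo	false	Can be changed to true if you want to extract additional information such as index, type, and id from Elasticsearch.
docinfo_fields	["_index", "_type", "_id"]	You can remove any field that you do not want in your Logstash input.
enable_metric	true	Used to enable or disable the reporting and collection of metrics for this plugin instance.
hosts		Used to specify the addresses of all Elasticsearch nodes that will be the input source of this Logstash instance. The syntax is host:port or IP:port.
id		Used to give a unique identity to this specific input plugin instance.
index	"logstash-*"	Used to specify the index name or pattern that Logstash will query for input.
password		Used for authentication.
query	'{"sort": ["_doc"]}'	The query to execute.
ssl	false	Enables or disables Secure Sockets Layer.
tags		Adds any additional information to the input events.
type		Used to classify the input events so that they are easy to search in later stages.
user		Used for authentication.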
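
The following sketch shows how these settings might be combined to pull documents out of an existing index. The host address and the query are assumptions for illustration; setting names follow the table above, and newer releases of the plugin may name some of them differently.

```conf
input {
  elasticsearch {
    # Hypothetical local cluster
    hosts => ["localhost:9200"]
    # Read from every index matching the default Logstash pattern
    index => "logstash-*"
    # Fetch all documents, sorted in index order
    query => '{ "query": { "match_all": {} }, "sort": ["_doc"] }'
    # Also capture where each document came from
    docinfo => true
    docinfo_fields => ["_index", "_id"]
  }
}
```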

eventlog

This input plugin reads data from the win32 API of Windows servers. The following are the settings of this plugin:

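Setting Name	Default Value	Description
add_field	{}	Same as in the file plugin, it is used to append a field to the input event
codec	"plain"	Used to decode the input events from Windows before they enter the Logstash pipeline
logfile	["Application", "Security", "System"]	The event logs from which input events are read
interval	1000	In milliseconds; defines the interval between two consecutive checks for new event logs
tags		Adds any additional information to the input events
type		Used to classify the input from this plugin as a given type so that all input events are easy to search in later stages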
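
A minimal sketch of an eventlog input based on the table above is shown below. It assumes the plugin version described here, in which logfile accepts a list of log names; the type value is a hypothetical label.

```conf
input {
  eventlog {
    # Read only the Application and System logs
    logfile => ["Application", "System"]
    # Poll for new entries once per second (value is in milliseconds)
    interval => 1000
    # Hypothetical label used to find these events later
    type => "windows_eventlog"
  }
}
```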

Twitter

This input plugin is used to collect the Twitter feed from its Streaming API. The following table describes the settings of this plugin.

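Setting Name	Default Value	Description
add_field	{}	Same as in the file plugin, it is used to append a field to the input event
codec	"plain"	Used to decode the input events from Twitter before they enter the Logstash pipeline
consumer_key		Contains the Twitter app's consumer key. For more information, visit https://dev.twitter.com/apps/new
consumer_secret		Contains the Twitter app's consumer secret. For more information, visit https://dev.twitter.com/apps/new
enable_metric	true	Used to enable or disable the reporting and collection of metrics for this plugin instance
follows		Specifies the comma-separated user IDs whose statuses Logstash checks on Twitter. For more information, visit https://dev.twitter.com
full_tweet	false	Can be changed to true if you want Logstash to read the full object returned by the Twitter API
id		Used to give a unique identity to this specific input plugin instance
ignore_retweets	false	Can be set to true to ignore retweets in the input Twitter feed
keywords		An array of keywords to be tracked in the Twitter input feed
languages		Defines the languages of the tweets Logstash needs from the input Twitter feed. This is an array of identifiers, each of which defines a specific language on Twitter
locations		Filters the tweets in the input feed according to the specified location. This is an array containing the longitude and latitude of the location
oauth_token		A required field that contains the user's OAuth token. For more information, visit https://dev.twitter.com/apps
oauth_token_secret		A required field that contains the user's OAuth token secret. For more information, visit https://dev.twitter.com/apps
tags		Adds any additional information to the input events
type		Used to classify the input from this plugin as a given type so that all input events are easy to search in later stages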
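
To make the required and optional settings concrete, here is a hedged sketch of a twitter input. The four credential values are placeholders that must be replaced with the values of your own Twitter app, and the keywords and language are assumptions chosen for the example.

```conf
input {
  twitter {
    # Placeholders: substitute the credentials of your own Twitter app
    consumer_key => "YOUR_CONSUMER_KEY"
    consumer_secret => "YOUR_CONSUMER_SECRET"
    oauth_token => "YOUR_OAUTH_TOKEN"
    oauth_token_secret => "YOUR_OAUTH_TOKEN_SECRET"
    # Hypothetical terms to track
    keywords => ["logstash", "elasticsearch"]
    # Only English tweets, without retweets
    languages => ["en"]
    ignore_retweets => true
    # Keep the reduced event instead of the full API object
    full_tweet => false
  }
}
```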

TCP

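The TCP plugin is used to get events over a TCP socket. It can read from client connections or from a server, as specified by the mode setting. The following table describes the settings of this plugin: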

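Setting Name	Default Value	Description
add_field	{}	Same as in the file plugin, it is used to append a field to the input event
codec	"plain"	Used to decode the input events received over the TCP socket before they enter the Logstash pipeline
enable_metric	true	Used to enable or disable the reporting and collection of metrics for this plugin instance
host	"0.0.0.0"	In server mode, the address to listen on; in client mode, the address of the server to connect to
id		Used to give a unique identity to this specific input plugin instance
mode	"server"	Used to specify whether the input source is a server or a client.
port		Defines the port number
ssl_cert		Used to specify the path of the SSL certificate
ssl_enable	false	Enables or disables SSL
ssl_key		Specifies the path of the SSL key file
tags		Adds any additional information to the input events
type		Used to classify the input from this plugin as a given type so that all input events are easy to search in later stages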
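
The sketch below shows one way the mode, host, and port settings could be used to accept newline-delimited JSON from remote clients. The port number and type label are assumptions, and the SSL lines are left commented out because the certificate paths are hypothetical.

```conf
input {
  tcp {
    # Listen on all interfaces for incoming client connections
    mode => "server"
    host => "0.0.0.0"
    # Hypothetical port; port has no default and must be set
    port => 5000
    # Treat each received line as one JSON document
    codec => json_lines
    type => "tcp_json"
    # Uncomment to encrypt the connection (paths are placeholders)
    # ssl_enable => true
    # ssl_cert => "/path/to/server.crt"
    # ssl_key => "/path/to/server.key"
  }
}
```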

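Logstash - Output Plugins

Logstash supports various output destinations and technologies such as databases, files, email, standard output, etc.

The syntax for using an output plugin is as follows: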

output {
   Plugin name {
      Setting 1 ...
      Setting 2 ...
   }
}

You can download an output plugin by using the following command:

>logstash-plugin install logstash-output-<plugin name>

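The logstash-plugin utility is present in the bin folder of the Logstash installation directory. The following table describes the output plugins offered by Logstash.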

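Sr.No.	Plugin Name & Description
1	cloudwatch This plugin is used to send aggregated metric data to CloudWatch of Amazon Web Services.
2	csv It is used to write the output events in a comma-separated format.
3	elasticsearch It is used to store the output logs in an Elasticsearch index.
4	email It is used to send a notification email when output is generated. The user can add information about the output to the email.
5	exec It is used to run a command for a matching output event.
6	ganglia It writes the metrics to Ganglia's gmond.
7	gelf It is used to produce output in GELF format for Graylog2.
8	google_bigquery It outputs the events to Google BigQuery.
9	google_cloud_storage It stores the output events in Google Cloud Storage.
10	graphite It is used to store the output events in Graphite.
11	graphtastic It is used to send the output metrics to GraphTastic.
12	hipchat It is used to send the output log events to HipChat.
13	http It is used to send the output log events to HTTP or HTTPS endpoints.
14	influxdb It is used to store the output events in InfluxDB.
15	irc It is used to write the output events to IRC.
16	mongodb It stores the output data in MongoDB.
17	nagios It is used to notify Nagios with passive check results.
18	nagios_nsca It is used to notify Nagios with passive check results over the NSCA protocol.
19	opentsdb It stores the Logstash output events in OpenTSDB.
20	pipe It streams the output events to the standard input of another program.
21	rackspace It is used to send the output log events to the queue service of Rackspace Cloud.
22	redis It uses the rpush command to send the output logging data to a Redis queue.
23	riak It is used to store the output events in the Riak distributed key/value store.
24	s3 It stores the output logging data in Amazon Simple Storage Service.
25	sns It is used to send the output events to Amazon's Simple Notification Service.
26	solr_http It indexes and stores the output logging data in Solr.
27	sqs It is used to send the events to the Simple Queue Service of AWS.
28	statsd It is used to send the metrics data to the statsd network daemon.
29	stdout It is used to show the output events on the standard output of the CLI, such as the command prompt.
30	syslog It is used to ship the output events to a syslog server.
31	tcp It is used to send the output events to a TCP socket.
32	udp It is used to push the output events over UDP.
33	websocket It is used to push the output events over the WebSocket protocol.
34	xmpp It is used to push the output events over the XMPP protocol.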

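All plugins have their own specific settings, which help specify important fields such as port, path, etc. We will discuss the settings of some of the output plugins.

Elasticsearch

The Elasticsearch output plugin enables Logstash to store the output in specific clusters of the Elasticsearch engine. It is one of the most popular choices among users because it ships as part of the ELK Stack and therefore provides an end-to-end solution for DevOps. The following table describes the settings of this output plugin.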

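Setting Name	Default Value	Description
action	index	It is used to define the action performed in the Elasticsearch engine. Other values for this setting are delete, create, update, etc.
cacert		It contains the path of the .cer or .pem file used for validating the server's certificate.
codec	"plain"	It is used to encode the output logging data before sending it to the destination.
doc_as_upsert	false	This setting is used with the update action. It creates the document in the Elasticsearch engine if the specified document ID does not exist yet.
document_type		It is used to store the same type of events in the same document type. If it is not specified, the event type is used.
flush_size	500	It is used to improve the performance of bulk uploads to Elasticsearch
hosts	["127.0.0.1"]	It is an array of destination addresses for the output logging data
idle_flush_time	1	It defines the time limit (in seconds) between two flushes; Logstash forces a flush after the time limit specified in this setting
index	"logstash-%{+YYYY.MM.dd}"	It is used to specify the index in the Elasticsearch engine
manage_template	true	It is used to apply the default template in Elasticsearch
parent	nil	It is used to specify the ID of the parent document in Elasticsearch
password		It is used to authenticate requests to a secure cluster in Elasticsearch
path		It is used to specify the HTTP path of Elasticsearch.
pipeline	nil	It is used to set the ingest pipeline the user wishes to execute for an event
proxy		It is used to specify the HTTP proxy
retry_initial_interval	2	It is used to set the initial time interval (in seconds) between bulk retries. It doubles after each retry until it reaches retry_max_interval
retry_max_interval	64	It is used to set the maximum time interval (in seconds) between bulk retries
retry_on_conflict	1	It is the number of retries by Elasticsearch to update a document
ssl		It is used to enable or disable SSL/TLS secured communication with Elasticsearch
template		It contains the path of the customized template in Elasticsearch
template_name	"logstash"	It is used to name the template in Elasticsearch
timeout	60	It is the timeout (in seconds) for network requests to Elasticsearch
upsert	""	It updates the document or, if the document_id does not exist, creates a new document in Elasticsearch
user		It contains the user used to authenticate the Logstash request in a secure Elasticsearch cluster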
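
As a rough sketch of how the update-related settings interact, the following output updates a document when one with the same ID already exists and creates it otherwise. The host, the credentials, and the use of a taskid event field as the document ID are assumptions for illustration; document_id is a setting of this plugin that the table above does not list.

```conf
output {
  elasticsearch {
    # Hypothetical single-node cluster
    hosts => ["127.0.0.1:9200"]
    # One index per day, following the default naming pattern
    index => "logstash-%{+YYYY.MM.dd}"
    # Update existing documents, create them when missing
    action => "update"
    doc_as_upsert => true
    # Assumes each event carries a taskid field usable as a unique ID
    document_id => "%{taskid}"
    # Placeholder credentials for a secured cluster
    user => "logstash_writer"
    password => "CHANGE_ME"
  }
}
```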

Email

The email output plugin is used to notify the user when Logstash generates output. The following table describes the settings of this plugin.

Setting Name	Default Value	Description
address	"localhost"	It is the address of the mail server
attachments	[]	It contains the names and locations of the attached files
body	""	It contains the body of the email and should be plain text
cc		It contains the comma-separated email addresses for the cc field of the email
codec	"plain"	It is used to encode the output logging data before sending it to the destination.
contenttype	"text/html; charset=UTF-8"	It is used to set the content-type of the email
debug	false	It is used to execute the mail relay in debug mode
domain	"localhost"	It is used to set the domain used to send the email messages
from	"logstash.alert@nowhere.com"	It is used to specify the email address of the sender
htmlbody	""	It is used to specify the body of the email in HTML format
password		It is used to authenticate with the mail server
port	25	It is used to define the port used to communicate with the mail server
replyto		It is used to specify the email address for the reply-to field of the email
subject	""	It contains the subject line of the email
use_tls	false	Enables or disables TLS for the communication with the mail server
username		It contains the username for authentication with the mail server
via	"smtp"	It defines the method Logstash uses to send the email
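
The sketch below sends a notification only for error events. It assumes the events carry a loglevel field (for example, extracted by a grok filter), and the mail server and recipient are placeholders. The to setting, which names the recipient, is not listed in the table above but is needed for the mail to be delivered.

```conf
output {
  # Only notify for events whose (assumed) loglevel field is ERROR
  if [loglevel] == "ERROR" {
    email {
      # Placeholder mail server and recipient
      address => "smtp.example.com"
      port => 25
      via => "smtp"
      to => "ops@example.com"
      from => "logstash.alert@nowhere.com"
      # Field references are filled in from the event
      subject => "Logstash alert: %{loglevel}"
      body => "Event received: %{message}"
    }
  }
}
```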

Http

This plugin is used to send the output events over HTTP to the destination. It has the following settings:

Setting Name	Default Value	Description
automatic_retries	1	It is used to set the number of HTTP request retries made by Logstash
cacert		It contains the path of the file used for validating the server's certificate
codec	"plain"	It is used to encode the output logging data before sending it to the destination.
content_type		It specifies the content type of the HTTP request to the destination server
cookies	true	It is used to enable or disable cookies
format	"json"	It is used to set the format of the HTTP request body
headers		It contains the HTTP header information
http_method	""	It is used to specify the HTTP method used in the request by Logstash; the values can be "put", "post", "patch", "delete", "get", "head"
request_timeout	60	It is the timeout (in seconds) for the HTTP request
url		It is a required setting for this plugin, used to specify the HTTP or HTTPS endpoint
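
A minimal sketch of posting each event as JSON to a web service might look like the following. The endpoint URL and the custom header are assumptions chosen for the example.

```conf
output {
  http {
    # Hypothetical endpoint that accepts JSON documents
    url => "https://collector.example.com/logs"
    http_method => "post"
    # Send the whole event as a JSON body
    format => "json"
    # Hypothetical header identifying the sender
    headers => { "X-Log-Source" => "logstash" }
    # Retry once and give up on a request after 60 seconds
    automatic_retries => 1
    request_timeout => 60
  }
}
```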

stdout

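The stdout output plugin is used to write the output events to the standard output of the command line interface, which is the command prompt in Windows and the terminal in UNIX. This plugin has the following settings: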

Setting Name	Default Value	Description
codec	"plain"	It is used to encode the output logging data before sending it to the destination source.
workers	1	It is used to specify number of workers for the output
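
A common use of this plugin is to inspect events while developing a pipeline. The sketch below prints every event in a readable form by pairing stdout with the rubydebug codec.

```conf
output {
  stdout {
    # Pretty-print each event with all of its fields
    codec => rubydebug
  }
}
```

Because a pipeline can contain several outputs, a block like this can sit next to the real destination during testing and be removed afterwards.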

statsd

statsd is a network daemon. This plugin is used to send metrics data over UDP to the destination backend services. It has the following settings:

Setting Name	Default Value	Description
codec	"plain"	It is used to encode the output logging data before sending it to the destination source.
count	{}	It is used to define the count to be used in metrics
decrement	[]	It is used to specify the decrement metric names
host	"localhost"	It contains the address of statsd server
increment	[]	It is used to specify the increment metric names
port	8125	It contains the port of statsd server
sample_rate	1	It is used to specify the sample rate of the metric
sender	"%{host}"	It specifies the name of the sender
set	{}	It is used to specify a set metric
timing	{}	It is used to specify a timing metric
workers	1	It is used to specify number of workers for the output
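
To show how the metric settings are filled in, here is a hedged sketch that counts events per log level and records a duration. It assumes the events carry loglevel and duration fields; the metric names are hypothetical.

```conf
output {
  statsd {
    # Default address and port of a local statsd daemon
    host => "localhost"
    port => 8125
    # Use the event's host field as the sender name
    sender => "%{host}"
    # Increase a counter named after the (assumed) loglevel field
    increment => ["events.%{loglevel}"]
    # Report the (assumed) duration field as a timing metric
    timing => { "task.duration" => "%{duration}" }
    sample_rate => 1
  }
}
```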

Filter Plugins

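Logstash supports various filter plugins to parse input logs and transform them into a structured, easily queried format.

The syntax for using a filter plugin is as follows: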

  filter {
     Plugin name {
        Setting 1 ...
        Setting 2 ...
     }
  }

You can download a filter plugin by using the following command:

  >logstash-plugin install logstash-filter-<plugin name>

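The logstash-plugin utility is present in the bin folder of the Logstash installation directory. The following table describes the filter plugins offered by Logstash.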

Sr.No.	Plugin Name & Description
1	aggregate This plugin collects or aggregates the data from various events of the same type and processes them in the final event
2	alter It allows the user to alter fields of log events in ways that the mutate filter does not handle
3	anonymize It is used to replace the values of fields with a consistent hash
4	cipher It is used to encrypt the output events before storing them in the destination
5	clone It is used to create duplicates of the output events in Logstash
6	collate It merges the events from different logs by their time or count
7	csv This plugin parses data from input logs according to the separator
8	date It parses the dates from the fields in the event and sets the result as the timestamp of the event
9	dissect This plugin helps the user extract fields from unstructured data and makes it easy for the grok filter to parse them correctly
10	drop It is used to drop all the events of the same type or with any other similarity
11	elapsed It is used to compute the time between the start and end events
12
13	extractnumbers It is used to extract the numbers from strings in the log events
14	geoip It adds a field to the event, which contains the latitude and longitude of the location of the IP present in the log event
15	grok It is the commonly used filter plugin to parse the event to get the fields
16	i18n It deletes the special characters from a field in the log event
17	json It is used to create a structured JSON object in the event or in a specific field of an event
18	kv This plugin is useful in parsing key-value pairs in the logging data
19	metrics It is used to aggregate metrics, such as counting the time duration in each event
20	multiline It is also one of the commonly used filter plugins; it helps the user convert multiline logging data to a single event.
21	mutate This plugin is used to rename, remove, replace, and modify fields in your events
22	range It is used to check the numerical values of fields in events against an expected range, and the length of strings against a range.
23	ruby It is used to run arbitrary Ruby code
24	sleep This makes Logstash sleep for a specified amount of time
25	split It is used to split a field of an event and place all the split values in clones of that event
26	xml It is used to create events by parsing the XML data present in the logs
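
Filters are applied in the order in which they appear, so several of the plugins above are often chained. The following sketch, which assumes log lines of the form "INFO - 48566 - TRANSACTION_START - start", parses each line with grok, tidies the result with mutate, and discards debug events with drop. The new field name phase is a hypothetical choice.

```conf
filter {
  # Split the line into named fields
  grok {
    match => { "message" => "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}" }
  }
  # Rename one field and turn the task ID into a number
  mutate {
    rename => { "label" => "phase" }
    convert => { "taskid" => "integer" }
  }
  # Throw away events that are only useful for debugging
  if [loglevel] == "DEBUG" {
    drop { }
  }
}
```

Events that do not match the grok pattern are not dropped; they continue through the pipeline tagged with "_grokparsefailure".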

Codec plugins

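Codec plugins can be part of input or output plugins. They are used to change or format the representation of the logging data. Logstash offers multiple codec plugins, which are as follows: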

Sr.No.	Plugin Name & Description
1	avro This plugin serializes Logstash events to Avro datums or decodes Avro records to Logstash events
2	cloudfront This plugin reads the encoded data from AWS CloudFront
3	cloudtrail This plugin is used to read the data from AWS CloudTrail
4	collectd This reads data from the binary protocol of collectd over UDP
5	compress_spooler It is used to compress the log events in Logstash to spooled batches
6	dots This is used for performance tracking by writing a dot to stdout for every event
7	es_bulk This is used to convert the bulk data from Elasticsearch into Logstash events, including Elasticsearch metadata
8	graphite This codec reads data from Graphite into events and changes events into Graphite-formatted records
9	gzip_lines This plugin is used to handle gzip-encoded data
10	json This is used to convert a single element in a JSON array to a single Logstash event
11	json_lines It is used to handle JSON data with a newline delimiter
12	line This plugin reads and writes one event per line, which means a new event starts after each newline delimiter
13	multiline It is used to convert multiline logging data into a single event
14	netflow This plugin is used to convert NetFlow v5/v9 data to Logstash events
15	nmap It parses Nmap result data in XML format into Logstash events
16	plain This reads text without delimiters
17	rubydebug This plugin writes the output Logstash events using the Ruby awesome print library
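
A codec is set inside an input or output plugin rather than in a section of its own. The sketch below, with a hypothetical log path, uses the multiline codec on input to join indented continuation lines (such as stack traces) to the line before them, and the json_lines codec on output to print one JSON document per event.

```conf
input {
  file {
    # Hypothetical application log containing stack traces
    path => "/var/log/myapp/app.log"
    codec => multiline {
      # A line starting with whitespace belongs to the previous line
      pattern => "^\s"
      what => "previous"
    }
  }
}

output {
  stdout {
    # Emit each event as a single line of JSON
    codec => json_lines
  }
}
```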

Build Your Own Plugin

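You can also create your own plugins in Logstash to suit your requirements. The logstash-plugin utility is used to create custom plugins. Here, we will create a filter plugin that adds a custom message to the events.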

Generate the Base Structure

Users can generate the necessary files by using the generate option of the logstash-plugin utility; the base structure is also available on GitHub.

  >logstash-plugin generate --type filter --name myfilter --path c:/tpwork/logstash/lib

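Here, the type option is used to specify whether the plugin is an input, output, or filter. In this example, we create a filter plugin named myfilter. The path option is used to specify the path where the plugin directory should be created. After executing the above command, you will see that a directory structure has been created.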

Develop the Plugin

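You can find the code file of the plugin in the \lib\logstash\filters folder of the plugin directory. The file extension is .rb.

In our case, the code file is located at the following path: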

  C:\tpwork\logstash\lib\logstash-filter-myfilter\lib\logstash\filters\myfilter.rb

We change the message to default => "Hi, You are learning this on it1352.com" and save the file.

Install the Plugin

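To install this plugin, the Gemfile of Logstash needs to be modified. You can find this file in the installation directory of Logstash. In our case, it is in C:\tpwork\logstash. Edit this file using any text editor and add the following text to it.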

gem "logstash-filter-myfilter",:path => "C:/tpwork/logstash/lib/logstash-filter-myfilter"

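In the above line, we specify the name of the plugin along with where it can be found for installation. Then, run the logstash-plugin utility to install this plugin.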

  >logstash-plugin install --no-verify

Testing

Here, we add myfilter to one of the previous examples:

logstash.conf

This Logstash config file contains myfilter in the filter section, after the grok filter plugin.

  input {
     file {
        path => "C:/tpwork/logstash/bin/log/input1.log"
     }
  }
  filter {
     grok {
        match => [
           "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
     }
     myfilter{}
  }
  output {
     file {
        path => "C:/tpwork/logstash/bin/log/output1.log"
        codec => rubydebug
     }
  }
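
Instead of editing the default in the .rb file, the text could probably also be supplied from the pipeline configuration. The sketch below assumes that the generated filter declares a string setting named message, as the default edited in the development step suggests; if the generated code uses a different setting name, this block would need to be adjusted.

```conf
filter {
  myfilter {
    # Assumes the generated plugin exposes a "message" setting
    message => "Hi, You are learning this on it1352.com"
  }
}
```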

Run logstash

We can run Logstash by using the following command.

  >logstash -f logstash.conf

input.log

The following code block shows the input log data.

  INFO - 48566 - TRANSACTION_START - start

output.log

The following code block shows the output log data.

  {
     "path" => "C:/tpwork/logstash/bin/log/input.log",
     "@timestamp" => 2017-01-07T06:25:25.484Z,
     "loglevel" => "INFO",
     "logger" => "TRANSACTION_END",
     "@version" => "1",
     "host" => "Dell-PC",
     "label" => "end",
     "message" => "Hi, You are learning this on it1352.com",
     "taskid" => "48566",
     "tags" => []
  }

Publish it on Logstash

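Developers can also publish their custom plugins to Logstash by uploading them to GitHub and following the standardized steps defined by Elastic.

Please refer to the following URL for more information on publishing: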

https://www.elastic.co/guide/en/logstash/current/contributing-to-logstash.html