How to install Elastic Stack

Your logs are trying to talk to you! The problem is that reading through logs is like trying to pick out one conversation in a crowded, noisy room: some people talk loudly while others speak softly. With all this noise, how can you pick out the critical information? This is where the Elastic Stack can help!

Elastic Stack is a group of open source products from Elastic designed to help users take data from any type of source, in any format, and search, analyze, and visualize that data in real time. It is commonly referred to as the ELK stack (Elasticsearch, Logstash, and Kibana).

Setting up the Elastic Stack can be quite confusing as there are several moving parts. As a very basic primer: Logstash is the workhorse that applies various filters to parse the incoming logs, then forwards the parsed logs to Elasticsearch for indexing. Kibana allows you to visualize the data stored in Elasticsearch.

Server Installation

This guide is based on CentOS/RHEL 7. Elasticsearch needs at least 2 GB of memory, so for the entire stack (Elasticsearch, Logstash, and Kibana) to work, plan on an absolute minimum of around 4 GB. Anything less may cause the services to become unstable or not start up at all.
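
You can quickly confirm how much memory the server has available by:

[root@elk01 ~]# free -h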

Elastic Stack relies on Java, so install Java 1.8.0 by:

[root@elk01 ~]# yum install java-1.8.0-openjdk
[root@elk01 ~]# java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

Elastic provides all of the needed software in its own repos, so set up the repo by:

[root@elk01 ~]# rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
[root@elk01 ~]# echo '[elasticstack-6.x]
name=Elastic Stack repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md' > /etc/yum.repos.d/elasticstack.repo

Now install the needed packages for Elastic Stack and set them to start on boot:

[root@elk01 ~]# yum install elasticsearch kibana logstash filebeat
[root@elk01 ~]# systemctl daemon-reload
[root@elk01 ~]# systemctl enable elasticsearch kibana logstash filebeat

Server Configuration

Set up Elasticsearch to listen for connections on the public IP of the server. Mine is also configured to listen on localhost since I am monitoring logs on the Elastic Stack server itself:

[root@elk01 ~]# vim /etc/elasticsearch/elasticsearch.yml
...
network.host: ["123.123.123.123", "localhost"]
...
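
If firewalld is running and your clients will ship logs directly to Elasticsearch, also open port 9200 (consider restricting the source addresses as appropriate for your environment):

[root@elk01 ~]# firewall-cmd --zone=public --add-port=9200/tcp --permanent
[root@elk01 ~]# firewall-cmd --reload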

Set up Elasticsearch to be able to use GeoIP and user-agent processing by installing the ingest plugins:

[root@elk01 ~]# /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-geoip
[root@elk01 ~]# /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-user-agent
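
You can verify that both plugins installed by listing them:

[root@elk01 ~]# /usr/share/elasticsearch/bin/elasticsearch-plugin list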

Configure Logstash with a basic configuration that accepts logs from Filebeat and forwards them to Elasticsearch by:

[root@elk01 ~]# echo 'input {
  beats {
    port => 5044
  }
}

# The filter section is optional. This one parses Apache access
# events with a grok pattern.
filter {
  if [type] == "apache-access" {
    # This will parse the apache access event
    grok {
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
    }
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}" 
    document_type => "%{[@metadata][type]}" 
  }
}' > /etc/logstash/conf.d/logstash.conf
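
Before starting Logstash, you can have it validate the configuration file. This assumes the default settings directory of /etc/logstash:

[root@elk01 ~]# /usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit -f /etc/logstash/conf.d/logstash.conf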

Start and test services by:

[root@elk01 ~]# systemctl start kibana elasticsearch logstash filebeat

Elasticsearch can take 15 seconds or more to start. To ensure Elasticsearch is running, check that you get output similar to the following:

[root@elk01 ~]# curl -XGET 'localhost:9200/?pretty'
{
  "name" : "Cp8oag6",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "AT69_T_DTp-1qgIJlatQqA",
  "version" : {
    "number" : "6.0.1",
    "build_hash" : "f27399d",
    "build_date" : "2016-03-30T09:51:41.449Z",
    "build_snapshot" : false,
    "lucene_version" : "7.0.1",
    "minimum_wire_compatibility_version" : "1.2.3",
    "minimum_index_compatibility_version" : "1.2.3"
  },
  "tagline" : "You Know, for Search"
}

Then log into Kibana by navigating your browser to:

http://localhost:5601

If this is installed on a remote server, then you can easily install Nginx to act as a front end for Kibana by:

# Install Nginx
[root@elk01 ~]# yum install nginx httpd-tools

# Setup username/password
[root@elk01 ~]# htpasswd -c /etc/nginx/htpasswd.users kibanaadmin

# Create Nginx vhost
[root@elk01 ~]# echo 'server {
    listen 80;

    server_name kibana.yourdomain.com;

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;        
    }
}' > /etc/nginx/conf.d/kibana.conf
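
# Test the Nginx configuration for syntax errors
[root@elk01 ~]# nginx -t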

# Set services to start on boot and start nginx
[root@elk01 ~]# systemctl daemon-reload
[root@elk01 ~]# systemctl enable nginx
[root@elk01 ~]# systemctl start nginx

# Open up the firewall to allow inbound port 80 traffic from anywhere
[root@elk01 ~]# firewall-cmd --zone=public --add-port=80/tcp --permanent
[root@elk01 ~]# firewall-cmd --reload

# Allow nginx to connect to Kibana port 5601 if you’re using SELinux:
[root@elk01 ~]# semanage port -a -t http_port_t -p tcp 5601
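
# Note: if the semanage tool is missing, it is provided by the
# policycoreutils-python package
[root@elk01 ~]# yum install policycoreutils-python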

# Navigate your browser to the new domain you set up, assuming you have already set up DNS for it:
http://kibana.yourdomain.com

Client installation – Filebeat

The question now becomes: how can I get the log messages from other servers into our Elastic Stack server? As my needs are basic and I am not doing any manipulation of the log data, I can make use of Filebeat and its associated modules to get the Apache, Nginx, MySQL, syslog, etc. data I need over to the Elasticsearch server.

Assuming Filebeat is not already installed, ensure that you have the repo set up for it:

[root@web01 ~]# rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
[root@web01 ~]# echo '[elasticstack-6.x]
name=Elastic Stack repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md' > /etc/yum.repos.d/elasticstack.repo

Then install filebeat by:

[root@web01 ~]# yum install filebeat
[root@web01 ~]# systemctl daemon-reload
[root@web01 ~]# systemctl enable filebeat

Set up Filebeat to send the logs over to your Elastic Stack server:

[root@web01 ~]# vim /etc/filebeat/filebeat.yml
...
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["123.123.123.123:9200"]
...
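
Filebeat includes a test subcommand you can use to verify that the config file parses and that the Elasticsearch output is reachable:

[root@web01 ~]# filebeat test config
[root@web01 ~]# filebeat test output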

Now set up the modules for Filebeat. Only enable the ones you need, and be sure to restart Filebeat after you have your desired modules enabled.
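
To see the full listing of available modules and what is currently enabled:

[root@web01 ~]# filebeat modules list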

To send over Apache logs:

[root@web01 ~]# filebeat modules enable apache2
[root@web01 ~]# filebeat setup -e
[root@web01 ~]# systemctl restart filebeat

Note: you may need to modify the Filebeat apache2 module to pick up your logs. In my case, I had to set ‘var.paths’ for both the access and error logs by:

[root@web01 ~]# vim /etc/filebeat/modules.d/apache2.yml
- module: apache2
  # Access logs
  access:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/httpd/*access.log*"]

  # Error logs
  error:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/httpd/*error.log*"]

[root@web01 ~]# systemctl restart filebeat

To send over syslog data:

[root@web01 ~]# filebeat modules enable system
[root@web01 ~]# filebeat setup -e
[root@web01 ~]# systemctl restart filebeat

To handle MySQL data:

[root@web01 ~]# filebeat modules enable mysql
[root@web01 ~]# filebeat setup -e
[root@web01 ~]# systemctl restart filebeat

To send over auditd logs:

[root@web01 ~]# filebeat modules enable auditd
[root@web01 ~]# filebeat setup -e
[root@web01 ~]# systemctl restart filebeat

To send over Nginx logs:

[root@web01 ~]# filebeat modules enable nginx
[root@web01 ~]# filebeat setup -e
[root@web01 ~]# systemctl restart filebeat

You can also enable Docker log shipping to Elasticsearch. There is no module for this, but it's easy enough to configure:
Reference: https://www.elastic.co/blog/enrich-docker-logs-with-filebeat

[root@web01 ~]# vim /etc/filebeat/filebeat.yml
filebeat.prospectors:
...
- type: log
  paths:
    - '/var/lib/docker/containers/*/*.log'
  json.message_key: log
  json.keys_under_root: true
  processors:
  - add_docker_metadata: ~
...

[root@web01 ~]# systemctl restart filebeat

Then browse to the Kibana dashboard to view the available dashboards for Filebeat, or create your own!

Client installation – Metricbeat

What about shipping metrics and statistics over to the Elastic Stack server? This is where Metricbeat comes into play. Metricbeat is a lightweight shipper that you can install on your client nodes to collect metrics and ship them to Elasticsearch. There are modules for Apache, HAProxy, MySQL, Nginx, PostgreSQL, Redis, System and more. This can be installed on your client servers or on the ELK server itself if you like.

Assuming Metricbeat is not already installed, ensure that you have the repo set up for it:

[root@web01 ~]# rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
[root@web01 ~]# echo '[elasticstack-6.x]
name=Elastic Stack repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md' > /etc/yum.repos.d/elasticstack.repo

Then install Metricbeat by:

[root@web01 ~]# yum install metricbeat
[root@web01 ~]# systemctl daemon-reload
[root@web01 ~]# systemctl enable metricbeat

Set up Metricbeat to send the metrics over to your Elastic Stack server:

[root@web01 ~]# vim /etc/metricbeat/metricbeat.yml
...
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["123.123.123.123:9200"]
...
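
As with Filebeat, you can verify the configuration and connectivity to Elasticsearch by:

[root@web01 ~]# metricbeat test config
[root@web01 ~]# metricbeat test output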

Now set up the modules for Metricbeat. Only enable the ones you need, and be sure to restart Metricbeat after you have your desired modules enabled.

To see the full listing of available modules and what is currently enabled:

[root@web01 ~]# metricbeat modules list

To send over Apache, MySQL, Nginx and System metrics:

[root@web01 ~]# metricbeat modules enable apache mysql nginx system
[root@web01 ~]# metricbeat setup -e

After enabling each one, be sure to check out the module's associated config file, as you may need to make changes to it so it will work with your environment. The module config files can be found in:

[root@web01 ~]# cd /etc/metricbeat/modules.d
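
For example, here is a minimal sketch of what apache.yml might look like. It assumes Apache is exposing mod_status locally; the host URL is an assumption, so adjust it for your environment:

[root@web01 ~]# vim /etc/metricbeat/modules.d/apache.yml
- module: apache
  metricsets: ["status"]
  period: 10s
  # URL where mod_status is reachable (assumed; adjust for your server)
  hosts: ["http://127.0.0.1"]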

Once the configurations are updated accordingly, restart Metricbeat by:

[root@web01 ~]# systemctl restart metricbeat

Then browse to the Kibana dashboard to view the available dashboards for Metricbeat, or create your own!

Quick Kibana Primer

Now that you have data coming into Elasticsearch, you can use Kibana to generate some quick searches and visualizations. This is not meant to be a full-fledged tutorial on how to use Kibana, just a way to jump-start the learning process, as Kibana can be somewhat complicated if you have never seen it.

To search for failed SSH logins, type the following in the Discover search box:

system.auth.ssh.event:Failed OR system.auth.ssh.event:Invalid

To see what packages have been recently installed, type the following in the Discover search box:

source:"/var/log/messages" AND system.syslog.message:*Install*

What about visualizations? To see the top 5 countries accessing Apache:

- Click 'Visualizations' over on the left
- Select 'Vertical bar chart'
- Select 'filebeat-*' from the existing index
- Click 'X-Axis'
Aggregation:  Terms
Field:  apache2.access.geoip.country_iso_code
Order by:  metric: Count
Order:  Descending
Size:  5

To break it down further by city:

- Click 'Add sub-buckets'
- Select 'Split Series'
Sub Aggregation:  Terms
Field:  apache2.access.geoip.city_name
Order by:  metric: Count
Order:  Descending
Size:  5
Click run

View the top 5 remote IPs hitting Apache:

- Click 'Visualizations' over on the left
- Select 'Vertical bar chart'
- Select 'filebeat-*' from the existing index
- Click 'X-Axis'
Aggregation:  Terms
Field:  apache2.access.remote_ip
Size:  5

- Click 'Add sub-buckets'
- Select 'Split Series'
Sub Aggregation:  Terms
Field:  apache2.access.remote_ip
Order by:  metric: Count
Order:  Descending
Size:  5

View the top 10 requested URLs in Apache:

- Click 'Visualizations' over on the left
- Select 'Data Table'
- Select 'filebeat-*' from the existing index
- Under Buckets, click 'Split Rows'
Aggregation:  Terms
Field:  apache2.access.url
Order by:  metric: Count
Order:  Descending
Size:  10
Custom Label:  URL

- Click 'Split Rows' again
Sub Aggregation:  Terms
Field:  apache2.access.body_sent_bytes
Order by:  metric: Count
Order:  Descending
Size:  10
Custom Label:  Size
Click run

Create a line chart for Apache response codes:

- Click 'Visualizations' over on the left
- Select 'Line chart'
- Select 'filebeat-*' from the existing index
- Click 'X-Axis'
Aggregation:  Date Histogram
Field:  @timestamp
Interval:  Minute

- Click 'Split Series'
Sub Aggregation:  Terms
Field:  apache2.access.response_code
Order by:  metric: Count
Order:  Descending
Size:  5
Click run

See which logs are receiving a lot of activity:

- Click 'Visualizations' over on the left
- Select 'Pie Chart'
- Select 'filebeat-*' from the existing index
- Click 'Split Slices'
Aggregation:  Terms
Field:  source
Order by:  metric: Count
Order: Descending
Size: 5

Show total traffic by domains:

- Click 'Visualizations' over on the left
- Select 'Line Chart'
- Select 'filebeat-*' from the existing index
- Click 'X-Axis'
Aggregation:  Date Histogram
Field:  @timestamp
Interval:  Auto

- Click 'Split Series'
Sub Aggregation:  Filters
Filter 1:  apache2.access.method:* AND source:"/var/log/httpd/domain01.com-access.log"
Filter 2:  apache2.access.method:* AND source:"/var/log/httpd/domain02.com-access.log"

Show GET request counts:

- Click 'Visualizations' over on the left
- Select 'Metrics'
- Select 'filebeat-*' from the existing index
- Click 'Metrics'
Aggregation:  Count

- Click Buckets
- Click 'Split Group'
Aggregation:  Filters
Filter 1 - GET:  apache2.access.method:"GET" AND source:"/var/log/httpd/domain01.com-access.log"

Show POST request counts:

- Click 'Visualizations' over on the left
- Select 'Metrics'
- Select 'filebeat-*' from the existing index
- Click 'Metrics'
Aggregation:  Count

- Click Buckets
- Select 'Split Group'
Aggregation:  Filters
Filter 1 - POST:  apache2.access.method:"POST" AND source:"/var/log/httpd/domain01.com-access.log"

Show GET vs POST requests by domain:

- Click 'Visualizations' over on the left
- Select 'Line chart'
- Select 'filebeat-*' from the existing index
- Click X-Axis
Aggregation:  Date Histogram
Field:  @timestamp
Interval:  Auto

Click 'Split Series'
Sub Aggregation:  Filters
Filter 1:  apache2.access.method:"GET" AND source:"/var/log/httpd/domain01.com-access.log"
Filter 2:  apache2.access.method:"POST" AND source:"/var/log/httpd/domain01.com-access.log"

Show total requests on domain:

- Click 'Visualizations' over on the left
- Select 'Line chart'
- Select 'filebeat-*' from the existing index
- Click Y-Axis
Aggregation:  Count

- Click 'Add sub-buckets'
Aggregation:  Date Histogram
Field:  @timestamp
Interval:  Auto

- Click 'Split Series'
Sub Aggregation:  Filters
Filter:  apache2.access.method:* AND source:"/var/log/httpd/domain01.com-access.log"

Display the current Apache error logs:

- Click 'Visualizations' over on the left
- Select 'Data Table'
- Select 'Apache error log [Filebeat Apache2]' from Saved Search

View top 10 WordPress posts:

- Click 'Visualizations' over on the left
- Select 'Data Table'
- Select 'filebeat-*' from the existing index
In the search bar above, type a query that matches your permalink structure, something like:  apache2.access.url:\/20*

- Under Buckets, click 'Split Rows'
Aggregation:  Terms
Field:  apache2.access.url
Order By:  metric: Count
Order:  Descending
Size:  10
Custom Label:  Posts

Purging all data from Elasticsearch indexes

Whatever the reason for wanting to completely purge the Elasticsearch indexes, it's really simple to do, as shown below. Just keep in mind that you will lose all the data collected in those indexes! This example will clear out the Filebeat indexes:

[root@elk01 ~]# curl -XDELETE 'http://localhost:9200/filebeat-*'
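
You can also list all of the indexes at any time, for example to confirm what exists or to verify the deletion, by:

[root@elk01 ~]# curl -XGET 'localhost:9200/_cat/indices?v'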