Loading the data using Logstash

To import the data, follow the instructions in this book's accompanying source code repository on GitHub, at https://github.com/pranav-shukla/learningelasticstack. The code for this edition is in the v7.0 branch.

Clone or download the repository from GitHub. If you cloned it, check out the v7.0 branch. The instructions for importing the data are at the following path within the project: chapter-04/README.md.

Once the import has finished, verify that the data has been loaded by running the following query:

GET /bigginsight/_search
{
  "query": {
    "match_all": {}
  },
  "size": 1
}

You should see a response similar to the following:

{
  ...
  "hits": {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "bigginsight",
        "_type": "_doc",
        "_id": "AV7Sy4FofN33RKOLlVH0",
        "_score": 1,
        "_source": {
          "inactiveMs": 1316,
          "bandwidth": 51.03333333333333,
          "signalStrength": -58,
          "accessPointId": "AP-1D7F0",
          "usage": 3062,
          "downloadCurrent": 39.93333333333333,
          "uploadCurrent": 11.1,
          "mac": "d2:a1:74:28:c0:5a",
          "tags": [],
          "@timestamp": "2017-09-30T12:38:25.867Z",
          "application": "Dropbox",
          "downloadTotal": 2396,
          "@version": "1",
          "networkId": "Guest",
          "location": "23.102900,72.595611",
          "time": 1506164775655,
          "band": "2.4 GHz",
          "department": "HR",
          "category": "File Sharing",
          "uploadTotal": 666,
          "username": "Cheryl Stokes",
          "customer": "Microsoft"
        }
      }
    ]
  }
}

Now that the data is loaded, we can start learning about the different types of aggregations by running them against it. You can find all the queries used in this chapter in the accompanying source code in the GitHub repository, at chapter-04/queries.txt. The queries can be run directly in Kibana Dev Tools, as we have seen previously in this book.

One thing to note here is hits.total in the response. It has a value of 10,000 and a relation of "gte" (greater than or equal to). There are actually 242,835 documents in the index, all of which we have just loaded. Before version 7.0, hits.total always represented the exact count of documents that matched the query criteria. Starting with Elasticsearch 7.0, the count is tracked exactly only up to 10,000 hits by default; when more documents match, hits.total reports 10,000 as a lower bound with the relation "gte". This avoids the unnecessary overhead of counting every matching document for a given query. We can force the calculation of the exact hit count by passing track_total_hits=true as a request parameter.
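
As a minimal sketch, the verification query from earlier can be rerun with track_total_hits=true appended to the URL as a query string parameter:

GET /bigginsight/_search?track_total_hits=true
{
  "query": {
    "match_all": {}
  },
  "size": 1
}

With this parameter set, hits.total should report the exact count with "relation": "eq"; assuming the import completed as described above, "value" would be 242835. The same setting can also be given in the request body as "track_total_hits": true.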
