rmorgan

rmorgan is a pretty cool user. Here are a bunch of searches created by rmorgan

Combine dbinspect and REST api data for buckets

| dbinspect index=*
| foreach * [eval dbinspect_<<FIELD>> = '<<FIELD>>']
| table dbinspect_*
| append [
  | rest splunk_server_group=dmc_group_cluster_master "/services/cluster/master/buckets"
  | foreach * [eval rest_api_<<FIELD>> = '<<FIELD>>']
  | table rest_api_* 
  ]
| eval bucketId=if(isNull(rest_api_title),dbinspect_bucketId,rest_api_title)
| stats values(*) as * by bucketId
| foreach rest_api_peers.*.* [eval rest_api_<<MATCHSEG2>>=""]
| foreach rest_api_peers.*.* [eval rest_api_<<MATCHSEG2>>=if("<<MATCHSEG1>>"=dbinspect_bucketId,'<<FIELD>>','<<MATCHSEG2>>')]
| fields - rest_api_peers.*

purpose:

requirements:

Needs to be executed on a search head that can query the cluster master REST API

comments:

The dbinspect API doesn't return consistent information about the size of buckets.

Work out data volumes by source type

| metadata type=sourcetypes
| noop sample_ratio=1000
| append [ search index=*  
   | eval size=length(_raw) 
   | stats avg(size) as average_event_size by sourcetype index
   ]
| stats values(totalCount) as total_events values(average_event_size) as average_event_size by sourcetype
| addinfo
| eval period_days=(info_max_time-info_min_time)/(24*60*60)
| eval totalMB_per_day=floor(total_events*average_event_size/period_days/1024/1024)
| table sourcetype totalMB_per_day

purpose:

Efficiently calculate how much data is being indexed per day by source type. Very useful for calculating enterprise security data volumes

requirements:

Requires 6.3.x or later for the event sampling feature

comments:

Combines results from | metadata for counts and then multiplies this by the average event size. Automatically accounts for time ranges. You will need to modify the sample rate to be suitable for your data volume. The metadata search turns out to be very approximate and counts the values associate with buckets, if you have buckets which are open for a very long time it will take the value for the entire period of the bucket, not the period of your search time range. Consider using tstats if this is an issue in your environment.

How many days in this month?

 | makeresults 
 | eval days_in_month=mvindex(split(if(tonumber(strftime(_time,"%y"))%4=0,"31,29,31,30,31,30,31,31,30,31,30,31","31,28,31,30,31,30,31,31,30,31,30,31"),","),tonumber(strftime(_time,"%m"))-1)

purpose:

Given _time how many days are in this month, 31? 30? 28? 29? This eval statement uses a lookup held in a multi-value array to pull out the value in a computationally efficient manor.

requirements:

comments:

The eval expression is the solution, and the example only uses makeresults to fake create sample data

Size distribution of my auto_high_volume buckets

| dbinspect [
  | rest /services/data/indexes      
  | eval index=title      
  | stats values(maxDataSize) as maxDataSize by index      
  | where maxDataSize="auto_high_volume"      
  | eval index="index=".index      
  | stats values(index) as indexes      
  | mvcombine delim=" " indexes     
  | eval search=indexes ] 
| bin sizeOnDiskMB span=2log4 
| chart limit=0 count by sizeOnDiskMB index

purpose:

requirements:

comments:

This search was developed to visualise if buckets were being rolled early.