Work out data volumes by source type

| metadata type=sourcetypes
| noop sample_ratio=1000
| append [ search index=*  
   | eval size=length(_raw) 
   | stats avg(size) as average_event_size by sourcetype index
| stats values(totalCount) as total_events values(average_event_size) as average_event_size by sourcetype
| addinfo
| eval period_days=(info_max_time-info_min_time)/(24*60*60)
| eval totalMB_per_day=floor(total_events*average_event_size/period_days/1024/1024)
| table sourcetype totalMB_per_day


Efficiently calculate how much data is being indexed per day by source type. Very useful for calculating enterprise security data volumes


Requires 6.3.x or later for the event sampling feature


Combines results from | metadata for counts and then multiplies this by the average event size. Automatically accounts for time ranges. You will need to modify the sample rate to be suitable for your data volume. The metadata search turns out to be very approximate and counts the values associate with buckets, if you have buckets which are open for a very long time it will take the value for the entire period of the bucket, not the period of your search time range. Consider using tstats if this is an issue in your environment.