streamstats

streamstats is a popular search command that computes running statistics over events in the order they are seen, so it turns up in all sorts of situations. Below are some searches that combine streamstats with other search commands.

More than a day between events

<search>
| sort _time
| streamstats current=f global=f window=1 last(_time) as last_ts
| eval time_since_last = _time - last_ts
| fieldformat time_since_last = tostring(time_since_last, "duration")
| where time_since_last > 60*60*24

purpose:

find situations where there is more than a day between two consecutive events

requirements:

any events. the only field dependency is _time

comments:
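To try the search without any indexed data, here is a minimal sketch that fabricates four events with makeresults. The field n and the offsets in the case() call are made up for illustration; only the last four lines come from the search above.

```spl
| makeresults count=4
| streamstats count as n
``` hypothetical timestamps: one gap of 192800 seconds, then two gaps of 3600 seconds ```
| eval _time = now() - case(n==1, 200000, n==2, 7200, n==3, 3600, n==4, 0)
| sort _time
| streamstats current=f global=f window=1 last(_time) as last_ts
| eval time_since_last = _time - last_ts
| fieldformat time_since_last = tostring(time_since_last, "duration")
| where time_since_last > 60*60*24
```

Only the event with n=2 should survive the where clause, because it is the only one that follows a gap longer than a day. The first event has no predecessor, so its time_since_last is null and it is dropped as well. fieldformat changes only how the value is displayed, so the numeric comparison in where still works. On large result sets, note that sort keeps 10000 results by default; `sort 0 _time` removes that limit.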

Time between events

<search>
| sort _time 
| streamstats current=f global=f window=1 last(_time) as last_ts 
| eval time_since_last = _time - last_ts 
| fieldformat time_since_last = tostring(time_since_last, "duration")

purpose:

add a field to each event holding the time elapsed since the previous event, displayed as a duration

requirements:

any data. the only field requirement in this search is _time

comments:
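A possible variation, sketched here with made-up data: adding a by clause gives each group its own window, which is what global=f is for when window is set. The user field and its values are assumptions for the demo, not part of the original search.

```spl
| makeresults count=4
| streamstats count as n
``` hypothetical data: events alternate between two users, 600 seconds apart ```
| eval user = if(n % 2 == 1, "alice", "bob")
| eval _time = relative_time(now(), "@d") + n * 600
| sort _time
``` with global=f and a by clause, each user gets a separate one-event window ```
| streamstats current=f global=f window=1 last(_time) as last_ts by user
| eval time_since_last = _time - last_ts
| table _time, user, last_ts, time_since_last
```

The first event of each user has no previous event, so last_ts and time_since_last are empty there. The second event of each user should show 1200 seconds, the gap to that same user's previous event rather than to the immediately preceding event.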

Speed / Distance Login Anomaly

index=geod
| iplocation clientip 
| sort _time 
| strcat lat "," lon latlon 
| streamstats current=f global=f window=1 last(latlon) as last_latlon
| eval last_latlon=if(isnull(last_latlon), latlon, last_latlon)
| streamstats current=f global=f window=1 last(_time) as last_ts
| eval time_since_last = _time - last_ts
| eval time_since_last=if(isnull(time_since_last), 0, time_since_last)
| haversine originField=last_latlon outputField=distance units=mi latlon
| eval speed=if(time_since_last==0, 0, (distance/(time_since_last/60/60)))
| where speed > 500
| strcat speed " MPH" speed
| table user, distance, _time, time_since_last, speed, _raw

purpose:

Find pairs of consecutive events where the speed needed to cover the distance between their locations in the time between the events is greater than 500 MPH

requirements:

the haversine app; events with a clientip field containing the client IP address

comments:
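The speed arithmetic can be checked on its own, without the haversine app. In this sketch the distance and time_since_last values are invented numbers, chosen so the result is easy to verify by hand.

```spl
| makeresults
``` hypothetical values: 2450 miles covered in 7200 seconds (2 hours) ```
| eval distance = 2450, time_since_last = 7200
``` seconds are converted to hours before dividing, so speed is in miles per hour ```
| eval speed = if(time_since_last==0, 0, (distance/(time_since_last/60/60)))
| where speed > 500
```

Here 7200/60/60 is 2 hours, so speed is 2450/2 = 1225 and the row passes the filter. Also note that the streamstats calls in the search above have no by clause, so each event is compared with the previous event from any user. If the intent is to compare logins of the same user, adding `by user` to both streamstats calls is one way to do that.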

Auth anomaly basic with haversine

index=geod 
| iplocation clientip 
| sort _time 
| strcat lat "," lon latlon 
| streamstats current=f global=f window=1 last(latlon) as last_latlon
| eval last_latlon=if(isnull(last_latlon), latlon, last_latlon)
| streamstats current=f global=f window=1 last(_time) as last_ts
| eval time_since_last = _time - last_ts
| eval time_since_last=if(isnull(time_since_last), 0, time_since_last)
| haversine originField=last_latlon outputField=distance units=mi latlon
| eval speed=if(time_since_last==0, 0, (distance/(time_since_last/60/60)))
| strcat speed " MPH" speed
| table user, distance, _time, time_since_last, speed, _raw

purpose:

Find the speed needed to cover the distance between the IP locations of two consecutive login events

requirements:

the haversine app; events with a clientip field containing the client IP address

comments:

Cumulative distribution function

| stats count by X
| eventstats sum(count) as total
| eval probXi=count/total
| sort X
| streamstats sum(probXi) as CDF

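purpose:

compute the empirical cumulative distribution function (CDF) of a field X: for each value of X, the fraction of events whose X is less than or equal to that value

requirements:

any events with a sortable field X

comments: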

props to Pierre Brunel
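A small worked example, assuming ten fabricated events: the field n and the case() mapping to X are made up so that the counts per value are 1, 3, 4 and 2.

```spl
| makeresults count=10
| streamstats count as n
``` hypothetical distribution: X=1 once, X=2 three times, X=3 four times, X=4 twice ```
| eval X = case(n<=1, 1, n<=4, 2, n<=8, 3, true(), 4)
| stats count by X
| eventstats sum(count) as total
| eval probXi=count/total
| sort X
| streamstats sum(probXi) as CDF
```

With these counts, probXi is 0.1, 0.3, 0.4 and 0.2, so CDF should climb through roughly 0.1, 0.4, 0.8 and 1.0 (up to floating-point rounding). The sort before streamstats matters: the running sum is only a CDF if the rows arrive in ascending order of X.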

Time Travel or How to move a field through time for prediction purposes

| inputlookup app_usage.csv
| reverse
| streamstats window=1 current=f first(RemoteAccess) as RemoteAccessFromFuture
| reverse
| ...

purpose:

Align a future value with the features in the past based on some time delta (Time to Decision, Time to Action) for machine learning or predictive analytics in general.

requirements:

comments:

Props to Tom LaGatta.

Be careful and check: 1) that current=f is set; 2) that your time frame is correct for the | reverse step; 3) if you are confused about first() versus last(), plot a line chart and check.
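To see the shift in isolation, here is a sketch on four fabricated rows instead of app_usage.csv. The field t and the RemoteAccess values are invented for the demo.

```spl
| makeresults count=4
| streamstats count as t
``` hypothetical values: RemoteAccess is 10, 20, 30, 40 for t = 1..4 ```
| eval RemoteAccess = t * 10
| reverse
``` after reverse, the "previous" row is the one that comes later in the original order ```
| streamstats window=1 current=f first(RemoteAccess) as RemoteAccessFromFuture
| reverse
| table t, RemoteAccess, RemoteAccessFromFuture
```

Each row should end up carrying the RemoteAccess value of the next row: 20, 30 and 40 for t = 1, 2 and 3, and an empty value for t = 4, which has no future row. With window=1 and current=f the window holds a single event, so first() and last() return the same value here.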