Introduction:

The associate command identifies correlations between fields by looking at fields and field-value pairs. For example, if one event has a referrer domain of “http://www.splunk.com” and another event has a referrer domain with the same URL value, then the two events are associated.

Definition:

The associate command tries to find a relationship between pairs of fields by calculating the change in entropy based on their values. Entropy is a measure of the uncertainty in a field's values; here it indicates whether knowing the value of one field helps to predict the value of another field. If a field has only one unique value, the field has an entropy of zero. If the field has multiple values, the more evenly those values are distributed, the higher the entropy.
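To make the entropy idea concrete, here is a minimal Python sketch of Shannon entropy for the values of a single field. It is not Splunk's implementation; the function name `shannon_entropy` is made up for illustration, and the base-2 logarithm is an assumption (it is consistent with the 1.585 value shown in the example further down).

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy, in bits, of the values observed for one field."""
    counts = Counter(values)
    total = sum(counts.values())
    # Sum p * log2(1/p) over every distinct value, where p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

single = shannon_entropy(["A", "A", "A", "A"])  # only one unique value
skewed = shannon_entropy(["A", "A", "A", "B"])  # multiple values, unevenly spread
even = shannon_entropy(["A", "B", "C", "D"])    # multiple values, evenly spread

print(single, skewed, even)
```

The single-value field comes out at zero, and the evenly distributed field scores higher than the skewed one, which matches the description above.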

Syntax:

associate [<associate-option>...] [field-list]

associate-option: supcnt | supfreq | improv

field-list: <field> ...

(A list of one or more fields.)

supcnt – The minimum number of times that the “reference key=reference value” combination must appear. Must be a non-negative integer. Default: 100

supfreq – The minimum frequency of the “reference key=reference value” combination, as a fraction of the total number of events. Default: 0.1

improv – The limit (minimum) for the entropy improvement of the target key. The calculated entropy improvement must be greater than or equal to this limit. Default: 0.5
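One way to picture the three options is as filters on each candidate pair. The sketch below is only an illustration: the function `passes_thresholds` is hypothetical, and treating the three limits as a combined AND condition is an assumption, not something this article or the sketch can confirm about Splunk's internals.

```python
def passes_thresholds(pair_count, total_events, entropy_improvement,
                      supcnt=100, supfreq=0.1, improv=0.5):
    """Return True if a "reference key=reference value" pair clears all limits.

    Assumption: the three limits are combined with AND.
    """
    frequency = pair_count / total_events if total_events else 0.0
    return (pair_count >= supcnt
            and frequency >= supfreq
            and entropy_improvement >= improv)

# Hypothetical numbers: a pair seen in 340 of 1000 events.
with_defaults = passes_thresholds(340, 1000, 1.58)
too_rare = passes_thresholds(40, 1000, 1.58)                    # count is below supcnt=100
lowered_improv = passes_thresholds(340, 1000, 0.2, improv=0.05) # passes only because improv is lowered

print(with_defaults, too_rare, lowered_improv)
```

Lowering a limit, as the example query later does with improv=0.05, lets weaker associations through.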

The associate command returns the following fields by default:

Reference_Key – The first field in a pair of fields.

Reference_Value – The value of the first field in a pair of fields.

Target_Key – The second field in a pair of fields.

Unconditional_Entropy – The entropy of the target key (the second field).

Conditional_Entropy – The entropy of the target key (the second field) when the reference key (the first field) is equal to the reference value.

Entropy_Improvement – The difference between the unconditional entropy and the conditional entropy.

Description – A summary of the relationship between the field values, based on the entropy calculations.

Support – How often the reference field is equal to the reference value, relative to the total number of events.

The Top_Conditional_Value field contains three pieces of information:

  • The most common value of the Target_Key for the given Reference_Value.
  • The frequency of the Reference_Value for that field in the dataset.
  • The frequency of that most common Target_Key value among the events that have the specific Reference_Value in the Reference_Key.
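The output fields described above can be sketched in Python for a single reference pair. This is a rough illustration under stated assumptions, not Splunk's code: `associate_row`, the toy `method`/`status` events, and the exact text layout of Top_Conditional_Value are all invented here, and the Description field is left out.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy in bits (base 2 is an assumption)."""
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def associate_row(events, ref_key, ref_value, target_key):
    """Build one illustrative output row for ref_key=ref_value vs. target_key.

    Assumes at least one event matches ref_key=ref_value.
    """
    all_targets = [e[target_key] for e in events]
    matching = [e[target_key] for e in events if e[ref_key] == ref_value]
    support = len(matching) / len(events)
    unconditional = entropy(all_targets)
    conditional = entropy(matching)
    # Most common target value among the events that match the reference pair.
    top_value, top_count = Counter(matching).most_common(1)[0]
    top_share = top_count / len(matching)
    return {
        "Reference_Key": ref_key,
        "Reference_Value": ref_value,
        "Target_Key": target_key,
        "Unconditional_Entropy": unconditional,
        "Conditional_Entropy": conditional,
        "Entropy_Improvement": unconditional - conditional,
        "Support": support,
        "Top_Conditional_Value": f"{top_value} ({support:.2%} -> {top_share:.2%})",
    }

# Hypothetical toy events, not taken from any real dataset.
events = [
    {"method": "GET", "status": "200"},
    {"method": "GET", "status": "200"},
    {"method": "POST", "status": "500"},
    {"method": "POST", "status": "200"},
]
row = associate_row(events, "method", "GET", "status")
print(row)
```

In this toy data every GET event has status 200, so the conditional entropy drops to zero and the whole unconditional entropy shows up as the improvement.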

 

Consider the example below to see the fields that the associate command returns.

Example:

Query:

index=main sourcetype=csv source="supermarket.csv" | fields Branch City | associate improv=0.05

Result:

Explanation:

In the main index, we use the associate command to look for a relationship between the pair of fields Branch and City. The Branch field has three values: A, B and C. The City field has three values: Yangon, Mandalay and Naypyitaw.

The associate command adds many columns to the output by default; you can use the ‘table’ command to display only selected columns. The improv=0.05 argument sets the minimum entropy improvement for the target key (City).

In the first row, the first field, Branch, is the Reference key and its value A is the Reference value. The second field, City, is the Target key. Support=34.00% is the percentage of the total events in which Branch=A.

Unconditional_Entropy=1.585 is the entropy of the target field City. Conditional_Entropy=0.000 is the entropy of the target field City when the reference key Branch is equal to the reference value A. Entropy_Improvement=1.584801 is the difference between the unconditional entropy and the conditional entropy.

Top_Conditional_Value is Yangon, which means that when Branch=A the City is Yangon approximately 100% of the time. In Top_Conditional_Value = Yangon(34.00% -> 100%), 34.00% is the percentage occurrence of the reference value (A), and 100% is the percentage occurrence of Yangon when Branch=A.

The Reference_Key (Branch) and Reference_Value (A) are correlated with the Target_Key (City).

When Branch has the value A, the entropy of City decreases from the Unconditional_Entropy (1.585) to the Conditional_Entropy (0.000).
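As a sanity check on these numbers, the small Python sketch below recomputes them from event counts. The counts are an assumption: only the 34.00% share for Branch=A (Yangon) comes from the result above, while the 332 and 328 figures are illustrative guesses for the other two cities, and the base-2 logarithm is likewise assumed.

```python
import math

def entropy_from_counts(counts):
    """Shannon entropy in bits, given how many events carry each value."""
    counts = list(counts)
    total = sum(counts)
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)

# Assumed events per city; only the 340 (34%) figure is backed by the article.
city_counts = {"Yangon": 340, "Mandalay": 332, "Naypyitaw": 328}

unconditional = entropy_from_counts(city_counts.values())
# Every Branch=A event is in Yangon, so the conditional distribution has one value.
conditional = entropy_from_counts([340])
improvement = unconditional - conditional
support = 340 / sum(city_counts.values())

print(f"{unconditional:.3f} {conditional:.3f} {improvement:.3f} {support:.2%}")
```

Under these assumed counts the unconditional entropy rounds to 1.585, which is close to log2(3), the maximum for three values, because the three cities are almost evenly represented.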
