Associativity Computation (Khattak)
How can the existence of relationship(s) between variables in a dataset be determined?
Understanding a dataset, and developing a model from it, requires finding the connections between its variables. Failing to do so produces ineffective models that use irrelevant variables as predictors.
The connection between variables is expressed as a relationship between them and is quantified by applying established statistical techniques.
The numerical variables in the dataset are taken in pairs, and the measures of association (covariance and correlation) are calculated for each pair.
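The text does not say which form of each measure is intended. As a sketch, assuming the sample covariance (with the n - 1 divisor) and the Pearson correlation coefficient, the two measures for a pair of variables X and Y with n observations each can be written as:

```latex
\operatorname{cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
\qquad
r_{XY} = \frac{\operatorname{cov}(X, Y)}{s_X \, s_Y}
```

Here \bar{x} and \bar{y} are the sample means, and s_X and s_Y are the sample standard deviations. The sign of the covariance gives the direction of the relationship, but its size depends on the units of X and Y. The correlation rescales it to the range -1 to 1, so it can be read as the strength of a linear relationship.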
A dataset contains the number of ice creams sold at different temperature readings recorded over three days (1). The measures of association are calculated to determine whether the number of ice creams sold is related to the temperature readings (2). Based on the correlation value, it is concluded that there is a strong positive relationship between the number of ice creams sold and the temperature readings: as the temperature increases, more ice cream is sold, and as it decreases, less is sold (3).
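The ice cream scenario can be sketched in a few lines of Python. The temperature and sales figures below are made up for illustration and are not taken from the original example. The function names are hypothetical, and the sketch assumes the sample covariance and the Pearson correlation coefficient.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 divisor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the variance of X
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)

# Hypothetical readings for three days (illustrative values only).
temperatures = [20, 25, 30]     # degrees Celsius
ice_creams_sold = [40, 55, 80]  # units sold per day

cov = sample_covariance(temperatures, ice_creams_sold)
corr = pearson_correlation(temperatures, ice_creams_sold)

print(f"covariance:  {cov:.2f}")
print(f"correlation: {corr:.2f}")
```

With these made-up values the correlation comes out close to 1, which matches the "strong positive relationship" reading in the example. Three observations are far too few to draw a reliable conclusion in practice, so a real analysis would use many more readings.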