## Leptonica: Cosine Similarity for Pix comparison

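Lately I have been researching various methods of image comparison and classification. One simple method that is often used for comparing documents is cosine similarity. The cosine similarity of two vectors is their dot product divided by the product of their Euclidean norms. The result is the cosine of the angle between the vectors, so for vectors with non-negative entries (such as pixel values) it ranges from 0 (nothing in common) to 1 (same direction). Wikipedia has more information on the formal definition. The function below applies this to two Leptonica Pix images, treating the low bit of each pixel as a vector element and comparing only the region both images share.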
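Before applying this to images, the definition can be sketched on plain vectors. This is a minimal illustration only; `cosineSimilarityVec` is a hypothetical helper name, not part of Leptonica, and returning 0.0 for a zero-length vector is a convention assumed here (the ratio is undefined in that case).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Leptonica function): cosine similarity of two
// equal-length vectors. Returns 0.0 if either vector has zero norm, which is
// an assumed convention since the ratio is undefined there.
double cosineSimilarityVec(const std::vector<double>& a,
                           const std::vector<double>& b) {
    assert(a.size() == b.size());
    double dot = 0.0;    // sum of a[i] * b[i]
    double normA = 0.0;  // sum of a[i]^2
    double normB = 0.0;  // sum of b[i]^2
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    const double denom = std::sqrt(normA) * std::sqrt(normB);
    return denom > 0.0 ? dot / denom : 0.0;
}
```

Vectors that point in the same direction, such as `{1, 2, 3}` and `{2, 4, 6}`, score approximately 1, while vectors with no overlapping non-zero entries, such as `{1, 0}` and `{0, 1}`, score 0.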

``````
#include <algorithm>               // std::min
#include <cmath>                   // sqrt
#include <leptonica/allheaders.h>  // include path may differ on your install

// Forward declaration; defined below.
void getSmallestDimensions(Pix* pixA, Pix* pixB, int &width, int &height);

// Cosine similarity over the region both images share.
// Assumes both Pix are 8 bpp; only the low bit of each pixel is used.
double cosineSimilarity(Pix* pixA, Pix* pixB) {
    double numerator = 0.0;
    double denominator_A = 0.0;
    double denominator_B = 0.0;
    int width = 0;
    int height = 0;
    getSmallestDimensions(pixA, pixB, width, height);

    // pixGetLinePtrs allocates the pointer arrays; they are freed below.
    l_uint8** linePtrs_A = (l_uint8**) pixGetLinePtrs(pixA, NULL);
    l_uint8** linePtrs_B = (l_uint8**) pixGetLinePtrs(pixB, NULL);
    if (linePtrs_A == NULL || linePtrs_B == NULL) {
        lept_free(linePtrs_A);
        lept_free(linePtrs_B);
        return 0.0;
    }

    for (int i = 0; i < height; ++i) {
        l_uint8 *line_A = linePtrs_A[i];
        l_uint8 *line_B = linePtrs_B[i];
        for (int k = 0; k < width; ++k) {
            // GET_DATA_BYTE handles the byte order inside each 32-bit word.
            int val_A = GET_DATA_BYTE(line_A, k) & 0x1;
            int val_B = GET_DATA_BYTE(line_B, k) & 0x1;
            numerator += val_A * val_B;      // sum of A * B
            denominator_A += val_A * val_A;  // sum of A^2
            denominator_B += val_B * val_B;  // sum of B^2
        }
    }
    lept_free(linePtrs_A);
    lept_free(linePtrs_B);

    double denominator = sqrt(denominator_A) * sqrt(denominator_B);
    if (denominator == 0.0) {
        return 0.0;  // at least one image has no set pixels; ratio undefined
    }
    return numerator / denominator;
}

// Stores the smaller width and the smaller height of the two images.
void getSmallestDimensions(Pix* pixA, Pix* pixB, int &width, int &height) {
    int w1 = 0;
    int w2 = 0;
    int h1 = 0;
    int h2 = 0;

    pixGetDimensions(pixA, &w1, &h1, NULL);
    pixGetDimensions(pixB, &w2, &h2, NULL);

    width = std::min(w1, w2);
    height = std::min(h1, h2);
}

``````
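Because each pixel is masked with `& 0x1`, every vector element is either 0 or 1, and squaring a 0 or a 1 changes nothing. The two sums of squares are therefore just counts of set pixels, and the numerator is the count of pixels set in both images. A minimal sketch of that simplification on plain byte buffers follows; `binaryCosineSimilarity` is a hypothetical name, the flat `std::vector` input is an assumption for illustration rather than how Leptonica stores pixels, and returning 0.0 for an empty image is an assumed convention.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Leptonica function): cosine similarity of two
// equal-length buffers where only the low bit of each byte is considered.
// For 0/1 values this reduces to |A and B| / sqrt(|A| * |B|).
double binaryCosineSimilarity(const std::vector<std::uint8_t>& a,
                              const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    std::size_t both = 0;    // pixels set in both buffers
    std::size_t countA = 0;  // pixels set in a
    std::size_t countB = 0;  // pixels set in b
    for (std::size_t i = 0; i < a.size(); ++i) {
        const unsigned bitA = a[i] & 0x1;
        const unsigned bitB = b[i] & 0x1;
        both += bitA & bitB;
        countA += bitA;
        countB += bitB;
    }
    if (countA == 0 || countB == 0) {
        return 0.0;  // assumed convention: undefined ratio reported as 0
    }
    return static_cast<double>(both) /
           (std::sqrt(static_cast<double>(countA)) *
            std::sqrt(static_cast<double>(countB)));
}
```

For example, buffers `{1, 1, 0, 0}` and `{1, 0, 1, 0}` share one set pixel out of two each, giving 1 / sqrt(2 * 2) = 0.5.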