As we saw earlier, it can be useful to think of vectors as positions in space. If we want to compare vectors, we can measure the distance between these points, differences in their direction, or a combination of these components. The most common comparison operation involving both magnitude and direction is the dot product.
Dot product
The dot product is an element-wise product and then a summation of these products:
$$\mathbf{a} \cdot \mathbf{b} = \sum_{i=0}^{|\mathbf{a}|} a_i b_i$$
where $|\mathbf{a}| = |\mathbf{b}|$.
Note that a zero in one vector will cancel out the contribution of the corresponding dimension in the other vector. Let's look at an example ...
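$$
\begin{aligned}
\mathbf{x} &= \begin{bmatrix} 0 & 2 & 4 \end{bmatrix} \\
\mathbf{y} &= \begin{bmatrix} 3 & 4 & 3 \end{bmatrix} \\
\mathbf{x} \cdot \mathbf{y} &= (0 \times 3) + (2 \times 4) + (4 \times 3) \\
\mathbf{x} \cdot \mathbf{y} &= 20
\end{aligned}
$$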
Note that in the example above $x_0 = 0$, which means the first dimension of both $\mathbf{x}$ and $\mathbf{y}$ does not contribute to $\mathbf{x} \cdot \mathbf{y}$.
# run using the following command:
# docker run -it "parsertongue/python:latest" ipython
import numpy as np
x = np.array([0,2,4])
y = np.array([3,4,3])
x.dot(y)
The result:
20
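To make the definition concrete, here is a minimal sketch that spells out the two steps (element-wise product, then summation) using the same `x` and `y` as above. The variable names `products`, `manual_dot`, and `without_first` are illustrative only and do not come from the original text.

```python
import numpy as np

x = np.array([0, 2, 4])
y = np.array([3, 4, 3])

# Step 1: multiply matching dimensions.
products = x * y

# Step 2: sum the products.
manual_dot = int(products.sum())

# Because x[0] is 0, dropping the first dimension of both vectors
# leaves the dot product unchanged.
without_first = int(x[1:].dot(y[1:]))

print(products, manual_dot, without_first)
```

Both `manual_dot` and `without_first` should agree with `x.dot(y)`, which illustrates why a zero in one vector removes that dimension's contribution.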
Comparing lengths
There are infinitely many ways to calculate the distance between a pair of vectors. We'll focus on two methods, compare them, and see how they generalize to the same form.
Manhattan distance
Manhattan or city block distance measures distance in a manner similar to the route an efficient taxi takes through grid-like city streets:
$$d_1(\mathbf{a}, \mathbf{b}) = \sum_{i=0}^{|\mathbf{a}|} \operatorname{abs}(a_i - b_i)$$
where $\mathbf{a}$ and $\mathbf{b}$ are two vectors with the same dimensions.
The Manhattan distance between a and b is the length of the dotted line in the image above. Imagine an invisible building exists in the rectangle delimited by a and b. As an earthbound taxi driver starting at point a, you must drive around this obstacle to reach b.
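The Manhattan distance formula can be sketched in a few lines of NumPy. The function name `manhattan_distance` and the example points are assumptions chosen for illustration; they are not taken from the figure.

```python
import numpy as np

def manhattan_distance(a, b):
    """Sum of the absolute per-dimension differences between a and b."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("a and b must have the same dimensions")
    return np.abs(a - b).sum()

# Hypothetical points: 3 blocks along one axis, 4 blocks along the other.
a = np.array([1, 1])
b = np.array([4, 5])
d = manhattan_distance(a, b)
print(d)
```

For these points the taxi travels 3 blocks in one direction and 4 in the other, so the distance is 7 regardless of the order in which the turns are taken.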
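Recall that the notation used for the normalization term included a subscripted 2 (i.e., $\lVert \mathbf{x} \rVert_2$). Another name for this norm is the 2-norm. The 2-norm is simply the Euclidean distance between a vector $\mathbf{x}$ and the origin:
$$\lVert \mathbf{x} \rVert_2 = \sqrt{\sum_{i=0}^{|\mathbf{x}|} x_i^2}$$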
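The claim that the 2-norm equals the Euclidean distance to the origin can be checked numerically. This is a small sketch with a hypothetical vector; `np.linalg.norm` is NumPy's built-in norm function, which computes the 2-norm by default for one-dimensional arrays.

```python
import numpy as np

x = np.array([3.0, 4.0])          # hypothetical example vector
origin = np.zeros_like(x)

# 2-norm computed directly from its definition.
two_norm = np.sqrt(np.sum(x ** 2))

# Euclidean distance between x and the origin.
dist_to_origin = np.sqrt(np.sum((x - origin) ** 2))

# NumPy's built-in norm for comparison.
library_norm = np.linalg.norm(x)

print(two_norm, dist_to_origin, library_norm)
```

All three values should match, since subtracting the origin leaves `x` unchanged.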
Just as we saw Euclidean distance could be generalized, so can the 2-norm. The 2-norm is a special case of the more general concept of p-norm when p=2.
p-norm
The p-norm takes the following form:
$$\lVert \mathbf{x} \rVert_p = \left[ \sum_{i=0}^{|\mathbf{x}|} \operatorname{abs}(x_i)^p \right]^{\frac{1}{p}}$$
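As a rough illustration of the general form, the sketch below implements the p-norm directly from the formula and evaluates it for a few values of p. The function name `p_norm` and the example vector are assumptions for illustration; `np.linalg.norm` with its `ord` argument is used only as a cross-check.

```python
import numpy as np

def p_norm(x, p):
    """p-norm of x: (sum of abs(x_i) ** p) ** (1 / p), for p >= 1."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3, -4])             # hypothetical example vector

one_norm = p_norm(x, 1)           # sum of absolute values
two_norm = p_norm(x, 2)           # Euclidean length
three_norm = p_norm(x, 3)

# Cross-check against NumPy's implementation.
matches_numpy = all(
    np.isclose(p_norm(x, p), np.linalg.norm(x, ord=p)) for p in (1, 2, 3)
)
print(one_norm, two_norm, three_norm, matches_numpy)
```

With p=1 applied to a difference vector such as `a - b`, the same function gives the Manhattan distance between `a` and `b`; with p=2 it gives the Euclidean distance.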
Vector normalization
As we saw previously, the normalized form of a vector x is obtained by dividing each of its terms by the 2-norm of x, which converts x to a vector of unit length: