A NumPy array is an N-dimensional container of items of the same type and size.
As a computer programming data structure, it is limited by resources and dtype:
there are values which are not representable by NumPy arrays. Due to these
limitations, NumPy arrays are not exactly equivalent to the mathematical concept
of coordinate vectors. However, NumPy arrays are often used to (approximately)
represent vectors.
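As a small illustration of the dtype limitation, the sketch below adds two 8-bit unsigned arrays whose true sum does not fit in the dtype. The array names and values are chosen only for this example.

```python
import numpy as np

a = np.array([200], dtype=np.uint8)
b = np.array([100], dtype=np.uint8)

# 300 does not fit in 8 unsigned bits, so the element-wise sum wraps modulo 256.
wrapped = a + b

# Converting to a wider dtype first makes the same sum representable.
widened = a.astype(np.int64) + b.astype(np.int64)

print(wrapped, wrapped.dtype)
print(widened, widened.dtype)
```

A mathematical coordinate vector has no such restriction on its entries, which is one sense in which the array is only an approximation.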
Math also has a concept of vector spaces whose elements are called vectors. One
example of a vector is an object with direction and magnitude. A coordinate
vector is merely a representation of the vector with respect to a particular
coordinate system. So while a NumPy array can at best record the
coordinates of a vector (tacitly, with respect to a coordinate system), it
cannot capture the full abstract notion of a vector. The abstract notion of a
vector exists without any mention of a coordinate system.
Moreover, vector spaces can be collections of things other than
coordinates. For example, families of functions can form
a vector space. The functions would then be vectors. So here is another example
where NumPy arrays are not at all equivalent to vectors.
Linear algebra makes a distinction between "row vectors" and "column vectors".
There is no such distinction in NumPy. There are only n-dimensional arrays.
Keep in mind that NumPy was built around a desire to generalize array-like containers to N dimensions where N is bigger than 2. So NumPy operations are defined in ways that generalize to higher dimensions.
For example, transposing a NumPy array of shape (a,b,c,d) returns an array of shape (d,c,b,a): the axes are reversed. In two dimensions, this means an array of shape (a,b) (i.e. a rows, b columns) becomes an array of shape (b,a) (i.e. b rows, a columns). So NumPy's notion of transposition matches up nicely with the linear algebra notion for 2-dimensional arrays.
But this also means that the transpose of a 1-dimensional NumPy array of shape
(a,) still has shape (a,). Nothing changes. It is still the same
1-dimensional array. Thus there is no real distinction between "row vectors"
and "column vectors".
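The transposition behaviour described above can be checked directly. This is a minimal sketch; the shapes (2,3,4,5), (2,3) and (5,) are arbitrary choices for illustration.

```python
import numpy as np

a4 = np.zeros((2, 3, 4, 5))   # 4-dimensional array
a2 = np.zeros((2, 3))         # 2 rows, 3 columns
a1 = np.arange(5)             # 1-dimensional array of shape (5,)

shape4 = a4.T.shape           # axes reversed: (5, 4, 3, 2)
shape2 = a2.T.shape           # ordinary matrix transpose: (3, 2)
shape1 = a1.T.shape           # unchanged: (5,)

# Transposing a 1-dimensional array leaves both shape and contents alone.
same_values = np.array_equal(a1, a1.T)

print(shape4, shape2, shape1, same_values)
```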
NumPy apes the concept of row and column vectors using 2-dimensional arrays.
An array of shape (5,1) has 5 rows and 1 column. You can sort of think of this as a column vector, and wherever you would need a column vector in linear algebra, you could use an array of shape (n,1). Similarly, wherever you see a row vector in linear algebra you could use an array of shape (1,n).
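To see 2-dimensional arrays standing in for row and column vectors, the sketch below multiplies a (1,3) "row" and a (3,1) "column" in both orders. The names row and col and their values are made up for this example.

```python
import numpy as np

col = np.array([[1], [2], [3]])   # shape (3, 1), plays the role of a column vector
row = np.array([[4, 5, 6]])       # shape (1, 3), plays the role of a row vector

# (1,3) times (3,1) gives a 1x1 result, the inner product wrapped in a 2-D array.
inner = np.dot(row, col)

# (3,1) times (1,3) gives a 3x3 result, the outer product.
outer = np.dot(col, row)

print(inner)
print(outer)
```

The shapes follow the usual matrix multiplication rules, which is what makes the (n,1) and (1,n) convention workable.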
However, NumPy also has a concept of broadcasting and one of the rules of broadcasting is that extra axes will be automatically added to any array on the left-hand side of its shape whenever an operation requires it. So, a 1-dimensional NumPy array of shape (5,) can broadcast to a 2-dimensional array of shape (1,5) (or 3-dimensional array of shape (1,1,5), etc).
This means a 1-dimensional array of shape (5,) can be thought of as a row vector since it will automatically broadcast up to an array of shape (1,5) whenever necessary.
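A short sketch of this left-hand broadcasting, using np.broadcast_to to make the implicit step visible. The arrays row and M are placeholders invented for the example.

```python
import numpy as np

row = np.arange(5)        # shape (5,)
M = np.ones((3, 5))       # shape (3, 5)

# row is treated as shape (1, 5) and then repeated along the first axis,
# so it is added to every row of M.
total = M + row

# The same promotion, spelled out explicitly.
as_2d = np.broadcast_to(row, (1, 5))
as_3d = np.broadcast_to(row, (1, 1, 5))

print(total.shape, as_2d.shape, as_3d.shape)
```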
On the other hand, broadcasting never adds extra axes on the right-hand side of the shape. You must do so explicitly. So if theta is an array of shape (5,), to create a "column vector" of shape (5,1) you must explicitly add the new axis yourself by using theta[:, np.newaxis] or the shorthand theta[:, None].
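The following sketch shows the explicit axis being added, and what happens without it. The array X of shape (5,3) is an assumption made only for this illustration.

```python
import numpy as np

theta = np.arange(5)             # shape (5,)
X = np.ones((5, 3))              # shape (5, 3)

col = theta[:, np.newaxis]       # shape (5, 1)
col_alt = theta[:, None]         # same thing; np.newaxis is an alias for None

# (5,1) broadcasts against (5,3): row i of X is scaled by theta[i].
scaled = col * X

# Without the new axis, shapes (5,) and (5,3) are aligned on the right,
# 5 is compared with 3, and broadcasting fails.
try:
    theta * X
    failed = False
except ValueError:
    failed = True

print(col.shape, scaled.shape, failed)
```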
What would be the correct numpy equivalent of $\theta^TX$?
If, for example,
In [4]: import numpy as np
In [5]: theta = np.array([1,2,3,4,5])[:, np.newaxis]
In [7]: X = np.random.randint(10, size=(5,3))
In [8]: X
Out[8]:
array([[4, 0, 3],
       [6, 9, 1],
       [7, 8, 7],
       [4, 2, 6],
       [7, 7, 2]])
then you could compute $\theta^TX$ using
In [18]: np.dot(theta.T, X)
Out[18]: array([[88, 85, 60]])
Note that np.dot is defined so that:
    For N dimensions it is a sum product over the last axis of a and
    the second-to-last of b:
    dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
This has the property that:
    For 2-D arrays it is equivalent to matrix multiplication, and for 1-D
    arrays to inner product of vectors (without complex conjugation).
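The quoted definition can be checked numerically. This is only a sketch: the shapes (2,3,4) and (5,4,6), the fixed seed, and the index (1,2,3,4) are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=(2, 3, 4))
b = rng.integers(0, 10, size=(5, 4, 6))

# Last axis of a (length 4) is summed against the second-to-last axis of b.
out = np.dot(a, b)                      # shape (2, 3, 5, 6)

# Compare one entry against the formula from the documentation.
i, j, k, m = 1, 2, 3, 4
manual = np.sum(a[i, j, :] * b[k, :, m])

# Special cases: 2-D inputs give matrix multiplication, 1-D inputs an inner product.
A = a[0]                                # shape (3, 4)
B = b[0]                                # shape (4, 6)
matrix_product = np.dot(A, B)           # shape (3, 6)
inner = np.dot(np.array([1, 2, 3]), np.array([4, 5, 6]))

print(out.shape, out[i, j, k, m] == manual, matrix_product.shape, inner)
```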
Note that NumPy also has a matrix subclass of ndarray whose multiplication operator is defined to match 2-dimensional matrix multiplication. So if theta and X were NumPy matrices, then you could write theta.T * X instead of np.dot(theta.T, X). This can make translating math into NumPy code a bit more readable, though the NumPy documentation no longer recommends the matrix class and suggests regular arrays instead.
Or, if you have Python 3.5 or newer (and NumPy 1.10 or newer), you can use regular NumPy arrays and write theta.T @ X.
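As a sketch, here is the @ form applied to the theta and X from the session above. X is typed in literally because the original was generated randomly. The last line, using a 1-dimensional theta, is an extra illustration and not part of the original session.

```python
import numpy as np

theta = np.array([1, 2, 3, 4, 5])[:, np.newaxis]   # shape (5, 1)
X = np.array([[4, 0, 3],
              [6, 9, 1],
              [7, 8, 7],
              [4, 2, 6],
              [7, 7, 2]])                            # shape (5, 3)

via_dot = np.dot(theta.T, X)    # shape (1, 3)
via_at = theta.T @ X            # same result using the @ operator

# With a 1-dimensional theta no transpose is needed; the result is 1-dimensional.
flat = theta.ravel() @ X        # shape (3,)

print(via_dot, via_at, flat)
```

Which form to prefer is largely a matter of whether you want the result to keep its 2-dimensional (1,3) shape or collapse to shape (3,).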