Introduction
- A tensor operation is an operation that either:
  - Uses a tensor as one or more of its inputs
  - Or produces a tensor as its output
Component-wise Operations
- A component-wise (or element-wise) operation acts on each value individually.
- Component-wise operations usually produce tensors of the same shape as the input tensor.
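- For example, element-wise multiplication (a small supplementary sketch, not one of the operations worked through below) keeps the shape of its inputs:
  import numpy as np

  a = np.array([[1, 2], [3, 4]])
  b = np.array([[5, 6], [7, 8]])

  # Each output value is the product of the input values at the same
  # location, so the (2, 2) shape of the inputs is preserved.
  c = a * b
  print(c)
  # prints [[ 5 12]
  #         [21 32]]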
Addition
- Tensor addition operates on two tensors of the same shape, and produces a tensor also of the same shape.
- Each value in the output tensor is the sum of the values in the input tensors at the same location.
- Code (slow):
  import numpy as np

  def add(a, b):
      assert a.shape == b.shape
      # Allocate the output with the same shape (and dtype) as the inputs.
      c = np.empty(a.shape, dtype=a.dtype)
      for index in np.ndindex(a.shape):
          c[index] = a[index] + b[index]
      return c

  a = np.array([[1, 2], [3, 4]])
  b = np.array([[5, 6], [7, 8]])
  c = add(a, b)
  print(c)
  # prints [[ 6  8]
  #         [10 12]]
- Code (fast):
  import numpy as np

  a = np.array([[1, 2], [3, 4]])
  b = np.array([[5, 6], [7, 8]])
  c = a + b
  print(c)
  # prints [[ 6  8]
  #         [10 12]]
ReLU
- ReLU stands for rectified linear unit.
- If an element is positive, it is used as-is.
- If an element is negative, 0 is used.
- Code (slow):
  import numpy as np

  def relu(a):
      r = np.empty(a.shape, dtype=a.dtype)
      for index in np.ndindex(a.shape):
          r[index] = max(a[index], 0)
      return r

  a = np.array([[1, -2], [3, -4]])
  b = relu(a)
  print(b)
  # prints [[1 0]
  #         [3 0]]
- Code (fast):
  import numpy as np

  a = np.array([[1, -2], [3, -4]])
  b = np.maximum(a, 0)
  print(b)
  # prints [[1 0]
  #         [3 0]]
Broadcasting
- Component-wise operations such as addition rely on the two input tensors having the same shape.
- However, sometimes it is necessary to perform operations on tensors with different shapes.
- Broadcasting is an operation that increases the rank of the smaller tensor so that its shape matches the larger one.
- To broadcast a tensor:
  - Add a new axis with a length of 1.
  - Extend the length of the new axis by cloning the values along it.
- Code:
  import numpy as np

  a = np.array([[1, 2], [3, 4]])
  print(a.shape)
  # prints (2, 2)

  # Step 1: add a new axis of length 1 at the front.
  a = np.expand_dims(a, axis=0)
  print(a.shape)
  # prints (1, 2, 2)

  # Step 2: extend the new axis by cloning the values.
  a = np.concatenate([a] * 3, axis=0)
  print(a.shape)
  # prints (3, 2, 2)

  print(a)
  # prints:
  # [[[1 2]
  #   [3 4]]
  #
  #  [[1 2]
  #   [3 4]]
  #
  #  [[1 2]
  #   [3 4]]]
- NumPy operations automatically perform broadcasting if their inputs have different shapes:
  import numpy as np

  a = np.array([[1, 1], [3, 2]])
  b = np.array([2.5, 1.5])
  # b is broadcast across each row of a before the element-wise maximum.
  c = np.maximum(a, b)
  print(c)
  # prints:
  # [[2.5 1.5]
  #  [3.  2. ]]
- For broadcasting to be successful, the shape of the higher-ranked tensor should end with the shape of the lower-ranked tensor:
  import numpy as np

  def can_broadcast(a, b):
      # The trailing axes of the higher-ranked tensor must match the
      # shape of the lower-ranked tensor.
      smallest_rank = min(a.ndim, b.ndim)
      return a.shape[-smallest_rank:] == b.shape[-smallest_rank:]

  a = np.array([[1, 2], [3, 4]])
  print(can_broadcast(np.array([5, 6]), a))     # prints True
  print(can_broadcast(np.array([5, 6, 7]), a))  # prints False
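- When the shapes are not compatible, NumPy refuses to broadcast and raises an error instead. A small sketch (the exact error wording may vary between NumPy versions):
  import numpy as np

  a = np.array([[1, 2], [3, 4]])
  b = np.array([5, 6, 7])

  # The trailing axis lengths (2 and 3) do not match, so broadcasting fails.
  try:
      np.maximum(a, b)
  except ValueError as e:
      print(e)
  # prints an error message along the lines of:
  # operands could not be broadcast together with shapes (2,2) (3,)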
Tensor Product
- Tensor product of two 1D tensors:
- When both tensors are vectors, their product produces a scalar.
- To find the product, multiply the corresponding elements and sum the results.
- Code (slow):
  import numpy as np

  def product(a, b):
      assert a.shape == b.shape
      assert a.ndim == 1
      total = 0
      for i in range(len(a)):
          total += a[i] * b[i]
      return total

  a = np.array([3, 6, 2])
  b = np.array([1, 3, 8])
  c = product(a, b)
  print(c)
  # prints 37
- Code (fast):
  import numpy as np

  a = np.array([3, 6, 2])
  b = np.array([1, 3, 8])
  c = np.dot(a, b)
  print(c)
  # prints 37
- Tensor product of two 2D tensors:
- When both tensors are matrices, their product produces a matrix.
- The new matrix has the number of rows from the first matrix, and the number of columns from the second matrix.
- Each value is the result of applying the tensor product of a row from the first matrix with a column from the second matrix.
- Code (slow):
  import numpy as np

  def product(a, b):
      assert a.ndim == 2
      assert b.ndim == 2
      assert a.shape[1] == b.shape[0]
      c = np.empty((a.shape[0], b.shape[1]))
      for index in np.ndindex(c.shape):
          # Each output value is the 1D tensor product of a row of a
          # with a column of b.
          row = a[index[0], :]
          column = b[:, index[1]]
          c[index] = np.dot(row, column)
      return c

  a = np.array([[1, 4], [3, 7], [8, 4]])
  b = np.array([[3, 6, 2], [5, 1, 10]])
  c = product(a, b)
  print(c)
  # prints
  # [[23. 10. 42.]
  #  [44. 25. 76.]
  #  [44. 52. 56.]]
- Code (fast):
  import numpy as np

  a = np.array([[1, 4], [3, 7], [8, 4]])
  b = np.array([[3, 6, 2], [5, 1, 10]])
  c = np.dot(a, b)
  print(c)
  # prints
  # [[23 10 42]
  #  [44 25 76]
  #  [44 52 56]]
- The tensor product also works for tensors with higher ranks.
- If the ranks of the two tensors are different, then the normal broadcasting rules apply.
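- As a rough sketch of the higher-rank case, np.matmul (the @ operator) treats the last two axes as matrices and broadcasts any leading axes; np.dot follows a slightly different rule for ranks above 2:
  import numpy as np

  # A "stack" of two 3x4 matrices multiplied by a single 4x5 matrix.
  a = np.arange(24).reshape((2, 3, 4))
  b = np.arange(20).reshape((4, 5))

  # np.matmul multiplies over the last two axes and broadcasts b
  # across the leading axis of a.
  c = np.matmul(a, b)
  print(c.shape)
  # prints (2, 3, 5)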
Reshaping
- Reshaping a tensor produces a new tensor with the same number of values, but a different shape:
  import numpy as np

  x = np.array([[1, 2], [3, 4], [5, 6]])
  x = x.reshape((2, 3))
  print(x)
  # prints:
  # [[1 2 3]
  #  [4 5 6]]

  x = x.reshape((1, 6))
  print(x)
  # prints:
  # [[1 2 3 4 5 6]]
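- Because the number of values must stay the same, reshaping to an incompatible shape fails. A small sketch (the exact error wording may vary between NumPy versions):
  import numpy as np

  x = np.array([[1, 2], [3, 4], [5, 6]])  # 6 values in total

  try:
      x.reshape((4,))  # 4 values requested, but x has 6
  except ValueError as e:
      print(e)
  # prints an error message along the lines of:
  # cannot reshape array of size 6 into shape (4,)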
- Reshaping a tensor can also involve changing the rank:
  import numpy as np

  x = np.array([[1, 2], [3, 4], [5, 6]])
  x = x.reshape(6)
  print(x)
  # prints:
  # [1 2 3 4 5 6]
Transposing
- Transposing a matrix involves converting all its rows into columns:
  import numpy as np

  x = np.array([[1, 2], [3, 4], [5, 6]])
  x = x.transpose()
  print(x)
  # prints:
  # [[1 3 5]
  #  [2 4 6]]
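- As a supplementary note, transposing reverses the shape, and the .T attribute is shorthand for transpose():
  import numpy as np

  x = np.array([[1, 2], [3, 4], [5, 6]])
  print(x.shape)    # prints (3, 2)
  print(x.T.shape)  # prints (2, 3)

  # x.T is equivalent to x.transpose().
  print(x.T)
  # prints:
  # [[1 3 5]
  #  [2 4 6]]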