Min-Max Normalization

Introduction

  • Min-max normalization is an operation which rescales a set of data.
  • This can be useful when:
    • Comparing data from two different scales
    • Converting data to a new scale
  • In most situations, data is normalized to fit a target range of [0, 1]
    • The smallest value in the original set would be mapped to 0
    • The largest value in the original set would be mapped to 1
    • Every other value would be mapped to a value somewhere between these two bounds
  • It is also called:
    • Feature Scaling
    • Min-Max Scaling
    • Rescaling
    • Normalization

Normalizing to [0, 1]

  • A set of numbers will have:
    • A smallest value:
      • This is also called the lower bound or least element
      • It is denoted:
        min(x)
    • A largest value:
      • This is also called the upper bound or greatest element
      • It is denoted:
        max(x)
    • A range:
      • The difference between the smallest and largest values
      • It is denoted:
        max(x) - min(x)
  • Normalization is the process of changing the lower and upper bounds to be 0 and 1 respectively
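
As a quick illustration, the three quantities above can be computed with Python's built-in min and max functions. This is a minimal sketch: the sample list is the one the code examples on this page use, and the variable names are chosen here for illustration only.

```python
# Sample data (illustrative values)
x = [7, 21, 13, 15]

smallest = min(x)                   # min(x)
largest = max(x)                    # max(x)
value_range = largest - smallest    # max(x) - min(x)

print(smallest, largest, value_range)  # prints: 7 21 14
```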

Algorithm

  1. First we modify the data to have a lower bound of 0. To do this we subtract the minimum value from each value:
        x' = x - min(x)
  2. Then we modify the data to have an upper bound of 1. We do this by dividing each shifted value by the original range:
        x'' = x' / (max(x) - min(x))
  3. Finally, if we combine these two steps we get:
        x'' = (x - min(x)) / (max(x) - min(x))
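
The steps above can be sketched in Python to show that applying them one at a time gives the same result as the combined form. The sample data and variable names (lower, span, shifted, scaled, combined) are assumptions made for this sketch, and it assumes the values are not all equal, since the range would otherwise be zero.

```python
x = [7, 21, 13, 15]  # illustrative sample data

lower = min(x)
span = max(x) - lower  # assumed non-zero

# Step 1: shift so the lower bound becomes 0
shifted = [a - lower for a in x]

# Step 2: divide by the original range so the upper bound becomes 1
scaled = [s / span for s in shifted]

# Combined: both steps in a single expression
combined = [(a - lower) / span for a in x]

print(shifted)             # prints: [0, 14, 6, 8]
print(scaled == combined)  # prints: True
```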

Example

Normalize the following data:
        x = [7, 21, 13, 15]
  1. First we calculate the lower bound, upper bound, and range:
    • min(x) = 7
    • max(x) = 21
    • max(x) - min(x) = 14
  2. Next we subtract the lower bound from each value:
        [0, 14, 6, 8]
  3. Finally, we divide by the range:
        [0, 1, 6/14, 8/14] ≈ [0, 1, 0.429, 0.571]

Code (Python)

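import numpy as np

def normalize(x):
    # Names chosen to avoid shadowing the built-ins min, max, and range
    lower = np.min(x)
    upper = np.max(x)
    span = upper - lower  # assumes the values are not all equal (span != 0)

    # float() converts NumPy scalars to plain Python floats so the list prints cleanly
    return [float((a - lower) / span) for a in x]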


x = [7, 21, 13, 15]
normalizedX = normalize(x)

print(normalizedX) # prints: [0.0, 1.0, 0.42857142857142855, 0.5714285714285714]
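
A possible variation, sketched here under stated assumptions, uses NumPy array arithmetic instead of a list comprehension and guards against the case where every value is equal, which would make the range zero. The function name normalize_array and the choice to return all zeros in that case are assumptions for illustration, not an established convention.

```python
import numpy as np

def normalize_array(x):
    """Min-max normalize a sequence to [0, 1] using NumPy array arithmetic."""
    arr = np.asarray(x, dtype=float)
    lower = arr.min()
    span = arr.max() - lower
    if span == 0:
        # All values are equal, so the formula would divide by zero.
        # Returning zeros is one possible convention (an assumption here).
        return np.zeros_like(arr)
    return (arr - lower) / span


result = normalize_array([7, 21, 13, 15])
print(result.tolist())  # prints: [0.0, 1.0, 0.42857142857142855, 0.5714285714285714]
print(normalize_array([5, 5, 5]).tolist())  # prints: [0.0, 0.0, 0.0]
```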

Normalizing from [0, 1]

  • If our numbers are in the range [0, 1] then we can scale them to have a different lower and upper bound
  • To achieve this, we simply do the reverse of normalization:
    1. Find the new range by subtracting the new lower bound from the new upper bound
    2. Multiply each value by the new range
    3. Add the new lower bound to each value:
        x' = x * (newUpperBound - newLowerBound) + newLowerBound

Example

Normalize the following data to have a lower bound of 3 and an upper bound of 24:
        [0, 1, 3/7, 4/7]
  1. First we calculate the range:
        24 - 3 = 21
  2. Then we multiply each value by the range:
        [0, 21, 9, 12]
  3. Finally, we add the lower bound to each value:
        [3, 24, 12, 15]

Code (Python)

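def normalize(normalizedX, newLowerBound, newUpperBound):
    # Width of the target range (named to avoid shadowing the built-in range)
    newRange = newUpperBound - newLowerBound

    # Stretch each [0, 1] value to the new width, then shift it to the new lower bound
    return [a * newRange + newLowerBound for a in normalizedX]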


normalizedX = [0.0, 1.0, 3/7, 4/7]
x = normalize(normalizedX, 3, 24)

print(x) # prints: [3.0, 24.0, 12.0, 15.0]
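
Because scaling from [0, 1] reverses normalization, normalizing a data set and then scaling it back to its original bounds should recover the original values. The sketch below checks this. The helper names to_unit and from_unit are hypothetical, and the comparison uses a tolerance because floating-point rounding can leave very small differences.

```python
import math

def to_unit(x):
    # Normalize to [0, 1]; assumes the values are not all equal
    lower, upper = min(x), max(x)
    return [(a - lower) / (upper - lower) for a in x]

def from_unit(unit_values, new_lower, new_upper):
    # Scale values in [0, 1] to the range [new_lower, new_upper]
    return [u * (new_upper - new_lower) + new_lower for u in unit_values]

x = [7, 21, 13, 15]
restored = from_unit(to_unit(x), min(x), max(x))

# Compare with a tolerance, since floating-point rounding may leave tiny errors
print(all(math.isclose(a, b) for a, b in zip(x, restored)))  # prints: True
```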

Normalizing from one range to another

  • Sometimes we need to normalize data in which neither the source range nor the target range is [0, 1]
  • In these situations, we first normalize the data to a range of [0, 1], and then normalize it again to the true target range.
  • These two steps can be combined:
        x' = (x - min(x)) / (max(x) - min(x)) * (newUpperBound - newLowerBound) + newLowerBound

Code (Python)

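import numpy as np

def normalize(x, newLowerBound, newUpperBound):
    # Names chosen to avoid shadowing the built-ins min, max, and range
    lower = np.min(x)
    upper = np.max(x)
    oldRange = upper - lower  # assumes the values are not all equal
    newRange = newUpperBound - newLowerBound

    # Normalize to [0, 1], then stretch and shift to the target range
    return [float((a - lower) / oldRange * newRange + newLowerBound) for a in x]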


x = [7, 21, 13, 15]
y = normalize(x, 3, 24)

print(y) # prints: [3.0, 24.0, 12.0, 15.0]
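
When converting between two fixed scales, the source bounds may be known in advance rather than taken from the data. A minimal sketch of that case follows. The function name rescale, its parameters, and the 0 to 100 and 1 to 5 scales are all hypothetical and chosen only for illustration.

```python
def rescale(value, old_lower, old_upper, new_lower, new_upper):
    """Map a single value from [old_lower, old_upper] to [new_lower, new_upper]."""
    old_range = old_upper - old_lower  # assumed non-zero
    new_range = new_upper - new_lower
    return (value - old_lower) / old_range * new_range + new_lower


# A score of 75 on a 0 to 100 scale, converted to a 1 to 5 scale
print(rescale(75, 0, 100, 1, 5))  # prints: 4.0

# Passing the data's own min and max as the source bounds
print(rescale(13, 7, 21, 3, 24))  # prints: 12.0
```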