# Tensor Quantization: The Untold Story | by Dhruv Matani | Sep, 2023

Let’s begin with a very simple example and build it up.

## Example-1: Symmetric uint8 quantization

Let’s say we wish to map the floating point range [0.0 .. 1000.0] to the quantized range [0 .. 255]. The range [0 .. 255] is the set of values that can fit in an unsigned 8-bit integer.

To perform this transformation, we want to rescale the floating point range so that the following is true:

Floating point 0.0 = Quantized 0

Floating point 1000.0 = Quantized 255

This is called symmetric quantization because the floating point 0.0 maps to the quantized 0.

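Hence, we define a scale, which is equal to

$$\text{scale} = \frac{x_{max} - x_{min}}{q_{max} - q_{min}}$$

Where $x_{min}$ and $x_{max}$ are the limits of the floating point range (here 0.0 and 1000.0), and $q_{min}$ and $q_{max}$ are the limits of the quantized range (here 0 and 255).

In this case, scale = 1000.0 / 255 ≈ 3.9216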

To convert from a floating point value to a quantized value, we can simply divide the floating point value by the scale. For example, the floating point value 500.0 corresponds to the quantized value

$$\frac{500.0}{\text{scale}} = 500.0 \times \frac{255}{1000.0} = 127.5$$

which is then rounded to an integer (127 or 128, depending on the rounding mode).

In this simple example, the 0.0 of the floating point range maps exactly to the 0 in the quantized range, which is what makes the quantization symmetric. Let’s see what happens when this isn’t the case.
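The symmetric mapping from this example can be sketched in a few lines of Python. The function names `symmetric_scale` and `quantize_symmetric` are hypothetical, and the round-half-up rounding and the clamping step are assumptions; the article only specifies dividing by the scale.

```python
import math


def symmetric_scale(x_max: float, q_max: int = 255) -> float:
    """Scale that maps [0.0 .. x_max] onto [0 .. q_max]."""
    return x_max / q_max


def quantize_symmetric(x: float, scale: float, q_max: int = 255) -> int:
    """Divide by the scale, round to an integer, clamp to [0 .. q_max]."""
    q = math.floor(x / scale + 0.5)  # round-half-up is an assumption
    return max(0, min(q_max, q))


scale = symmetric_scale(1000.0)
print(round(scale, 4))                    # 3.9216
print(quantize_symmetric(0.0, scale))     # 0
print(quantize_symmetric(1000.0, scale))  # 255
```

The value 500.0 lands exactly halfway between 127 and 128, so the integer it maps to depends on the rounding mode and on floating point error in the division.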

## Example-2: Affine uint8 quantization

Let’s say we wish to map the floating point range [-20.0 .. 1000.0] to the quantized range [0 .. 255].

In this case, we have a different scaling factor since our $x_{min}$ is different.

$$\text{scale} = \frac{1000.0 - (-20.0)}{255 - 0} = \frac{1020.0}{255} = 4.0$$

Let’s see what the floating point number 0.0 is represented by in the quantized range if we apply the scaling factor to 0.0:

$$\frac{0.0}{4.0} = 0$$

Well, this doesn’t quite seem right, since we expected the floating point value -20.0, not 0.0, to map to the quantized value 0.

This is where the concept of the zero-point comes in. The zero-point acts as a bias that shifts the scaled floating point value, and it corresponds to the value in the quantized range that represents the floating point value 0.0. In our case, the zero-point is the negative of the scaled floating point representation of -20.0, which is -(-20.0 / 4.0) = -(-5) = 5. The zero-point is always the negative of the scaled representation of the minimum floating point value, since the minimum will always be negative or zero. We’ll find out more about why this is the case in the section that explains Example-4.

Whenever we quantize a value, we always add the zero-point to the scaled value to get the actual quantized value in the valid quantization range. To quantize the value -20.0, we compute the scaled value of -20.0 plus the zero-point, which is -5 + 5 = 0. Hence, quantized(-20.0, scale=4, zp=5) = 0.
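The scale and zero-point arithmetic above can be sketched as follows. The helper names `affine_params`, `quantize`, and `dequantize` are hypothetical, and the rounding mode and clamping are assumptions that the article does not spell out.

```python
import math


def affine_params(x_min: float, x_max: float, q_max: int = 255):
    """Return (scale, zero_point) for mapping [x_min .. x_max] onto [0 .. q_max]."""
    scale = (x_max - x_min) / q_max
    # The zero-point is the negative of the scaled minimum value.
    zero_point = -math.floor(x_min / scale + 0.5)
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int, q_max: int = 255) -> int:
    """Scale, round, shift by the zero-point, then clamp to [0 .. q_max]."""
    q = math.floor(x / scale + 0.5) + zero_point
    return max(0, min(q_max, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map a quantized value back to its floating point representative."""
    return (q - zero_point) * scale


scale, zp = affine_params(-20.0, 1000.0)
print(scale, zp)                    # 4.0 5
print(quantize(-20.0, scale, zp))   # 0
print(quantize(0.0, scale, zp))     # 5
print(quantize(1000.0, scale, zp))  # 255
print(dequantize(zp, scale, zp))    # 0.0
```

Note that the zero-point itself dequantizes to exactly 0.0, which is the property the later sections rely on.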

## Example-3: Affine int8 quantization

What happens if our quantized range is a signed 8-bit integer instead of an unsigned 8-bit integer? Well, the range is now [-128 .. 127].

In this case, -20.0 in the float range maps to -128 in the quantized range, and 1000.0 in the float range maps to 127 in the quantized range.

To calculate the zero-point, we compute it as if the quantized range were [0 .. 255] and then offset it by -128, so the zero-point in the new range is

$$5 - 128 = -123$$

Hence, the zero-point for the new range is -123.
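As a quick check of the int8 case, the sketch below derives the zero-point by offsetting the uint8 value and verifies that the endpoints land on -128 and 127. The function name `quantize_int8` is hypothetical, and round-half-up rounding is an assumption.

```python
import math


def quantize_int8(x: float, scale: float, zero_point: int) -> int:
    """Scale, round, shift by the zero-point, then clamp to [-128 .. 127]."""
    q = math.floor(x / scale + 0.5) + zero_point
    return max(-128, min(127, q))


scale = (1000.0 - (-20.0)) / 255             # 4.0, same scale as the uint8 case
zp_uint8 = -math.floor(-20.0 / scale + 0.5)  # 5, as if the range were [0 .. 255]
zp_int8 = zp_uint8 - 128                     # offset into [-128 .. 127]

print(zp_int8)                                # -123
print(quantize_int8(-20.0, scale, zp_int8))   # -128
print(quantize_int8(0.0, scale, zp_int8))     # -123
print(quantize_int8(1000.0, scale, zp_int8))  # 127
```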

So far, we’ve looked at examples where the floating point range includes the value 0.0. In the next set of examples, we’ll take a look at what happens when the floating point range doesn’t include the value 0.0.

## The importance of 0.0

Why is it important for the floating point value 0.0 to be represented in the floating point range?

When using a padded convolution, we expect the border pixels to be padded with the value 0.0 in the most common case. Hence, it’s important for 0.0 to be represented in the floating point range. Similarly, if some value X is going to be used for padding in your network, you need to make sure that X is represented in the floating point range and that quantization is aware of this.
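To make the padding argument concrete, here is a minimal sketch of padding an already-quantized 2D input. It assumes the border is filled with the zero-point, which is the quantized value that stands for 0.0; the function name `pad_quantized` is hypothetical, and nested Python lists stand in for a tensor.

```python
def pad_quantized(rows, zero_point, pad=1):
    """Surround a quantized 2D grid with `pad` rings of the zero-point."""
    width = len(rows[0]) + 2 * pad
    top = [[zero_point] * width for _ in range(pad)]
    bottom = [[zero_point] * width for _ in range(pad)]
    body = [[zero_point] * pad + list(r) + [zero_point] * pad for r in rows]
    return top + body + bottom


# With scale=4 and zero-point 5 (Example-2), the border dequantizes to 0.0.
padded = pad_quantized([[10, 20], [30, 40]], zero_point=5)
for row in padded:
    print(row)
```

If 0.0 had no exact quantized representation, the border would dequantize to some nonzero value and padded convolutions would no longer match their floating point counterparts.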

## Example-4: The untold story - skewed floating point range

Now, let’s take a look at what happens if 0.0 isn’t part of the floating point range.

In this example, we’re trying to quantize the floating point range [40.0 .. 1000.0] into the quantized range [0 .. 255].

Since we can’t represent the value 0.0 in this floating point range, we need to extend the lower limit of the range to 0.0, so that we effectively quantize [0.0 .. 1000.0].

We can see that some part of the quantized range is wasted. To determine how much, let’s compute the quantized value that the floating point value 40.0 maps to.

$$\frac{40.0}{\text{scale}} = 40.0 \times \frac{255}{1000.0} = 10.2 \approx 10$$

Hence, we’re wasting the range [0 .. 9] in the quantized range, which is about 3.92% of the range. This could significantly affect the model’s accuracy post-quantization.

This skewing is necessary if we wish to make sure that the value 0.0 in the floating point range can be represented in the quantized range.
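The amount of wasted range can be estimated with a small helper. This is a sketch under assumptions: the name `wasted_codes` is hypothetical, round-half-up rounding is assumed, and only waste at the low end of the range (a strictly positive minimum) is considered.

```python
import math


def wasted_codes(x_min: float, x_max: float, q_max: int = 255):
    """Count quantized values below the one that x_min maps to.

    The lower limit is extended to 0.0 when x_min is positive, as in the text.
    Returns (number of unused low codes, fraction of q_max they represent).
    """
    lo = min(x_min, 0.0)
    scale = (x_max - lo) / q_max
    first_used = math.floor((x_min - lo) / scale + 0.5)
    return first_used, first_used / q_max


codes, fraction = wasted_codes(40.0, 1000.0)
print(codes)                      # 10, i.e. the codes [0 .. 9] are unused
print(round(fraction * 100, 2))   # 3.92
```

For a range that already contains 0.0, such as [-20.0 .. 1000.0], the same helper reports no wasted codes.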

Another reason for including the value 0.0 in the floating point range is that it is very useful to be able to efficiently check whether a quantized value represents 0.0. Think of operators such as ReLU, which clip all values below 0.0 in the floating point range to 0.0.

It is important for us to be able to represent the zero-point using the same data type (signed or unsigned int8) as the quantized values. This allows us to perform these comparisons quickly and efficiently.
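One way to picture this is a ReLU that works directly on quantized values by comparing against the zero-point, with no round trip through floating point. This is a minimal sketch, not a description of any particular library's kernel; the name `quantized_relu` is hypothetical and a plain list stands in for a tensor.

```python
def quantized_relu(q_values, zero_point):
    """Clamp every quantized value below the zero-point up to the zero-point.

    Because the zero-point represents floating point 0.0, this matches
    max(x, 0.0) applied in the floating point domain.
    """
    return [max(q, zero_point) for q in q_values]


# Using scale=4 and zero-point 5 from Example-2: the quantized values 0 and 3
# represent negative floats, so they are clipped to the zero-point.
print(quantized_relu([0, 3, 5, 130, 255], zero_point=5))  # [5, 5, 5, 130, 255]
```

The comparison is a single integer `max` per element, which is only possible because the zero-point has the same data type as the quantized values.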

Next, let’s take a look at how activation normalization helps with model quantization. We’ll specifically focus on how the standardization of the activation values allows us to use the entire quantized range effectively.