Discussion about this post

Calvin McCarter:

Normalizing data (i.e., min-max scaling) has certain uses. Back when people used SVMs for things like text classification on n-gram counts, it was almost always best to do min-max scaling, either to [0, 1] or [-1, 1] (cf. https://neerajkumar.org/writings/svm/). Another way to think about it is that min-max scaling to [0, 1] can be used in place of quantile normalization.
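
For concreteness, here is a minimal scikit-learn sketch of that kind of preprocessing for an SVM (the counts and labels below are made up; the point is just that the scaler is fit on the training data and reused at prediction time):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import LinearSVC

# Toy per-document n-gram counts (illustrative values only).
X_train = np.array([[0., 3., 10.],
                    [2., 1.,  4.],
                    [1., 0.,  7.]])
y_train = np.array([0, 1, 0])

scaler = MinMaxScaler(feature_range=(0, 1))    # use (-1, 1) for the other convention
X_scaled = scaler.fit_transform(X_train)       # each column mapped to [0, 1]

clf = LinearSVC().fit(X_scaled, y_train)
# At prediction time, reuse the scaler fitted on the training data:
# clf.predict(scaler.transform(X_test))
```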

For example, you may want highly skewed features to be transformed to have lower variance. Another example is discretization (i.e., quantization / binning / 1-d clustering): you might want uniform constant-width bins, which follow from min-max scaling, instead of equally-populated bins, which follow from quantile normalization. Of course, discretization is usually not a good idea, but it sometimes is, particularly for dependent variables (e.g., whenever quantile regression is valuable).
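
A quick NumPy illustration of the two binning choices (my own sketch, using synthetic right-skewed data):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1000)        # a right-skewed feature

n_bins = 4
# Constant-width bins: split [min, max] evenly (the min-max view).
width_edges = np.linspace(x.min(), x.max(), n_bins + 1)
# Equally-populated bins: split at empirical quantiles (the quantile-normalization view).
quant_edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))

width_bins = np.digitize(x, width_edges[1:-1])   # bin index per sample, 0..n_bins-1
quant_bins = np.digitize(x, quant_edges[1:-1])

print(np.bincount(width_bins))   # very uneven counts for skewed data
print(np.bincount(quant_bins))   # roughly 250 per bin
```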

Also -- apologies for a bit of self-promotion -- you can often obtain even better results by using a tunable "interpolation" between min-max normalization and quantile normalization, as I showed in "The Kernel Density Integral Transformation", C. McCarter, TMLR 2023. It acts as a nonparametric alternative to variance-stabilizing transforms, and it works even for left-skewed data, for example. It also produces far more intuitive discretization results than various clustering techniques, without requiring the user to specify the number of clusters.
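
To make the idea concrete, here is a rough NumPy/SciPy sketch of a Gaussian-KDE CDF transform (my own simplification, not the paper's reference implementation): the bandwidth acts as the interpolation knob, with a tiny bandwidth approaching the empirical-quantile transform and a very large bandwidth approaching an affine map, i.e. min-max scaling.

```python
import numpy as np
from scipy.stats import norm

def kde_integral_transform(x, bandwidth):
    """Map x -> Gaussian-KDE CDF evaluated at x, rescaled to [0, 1]."""
    x = np.asarray(x, dtype=float)
    # KDE CDF: average of kernel CDFs centered at each data point.
    cdf = norm.cdf((x[:, None] - x[None, :]) / bandwidth).mean(axis=1)
    # Rescale so the observed min and max map to 0 and 1.
    return (cdf - cdf.min()) / (cdf.max() - cdf.min())

rng = np.random.default_rng(0)
x = rng.lognormal(size=500)                                   # right-skewed data

z_quantile_like = kde_integral_transform(x, bandwidth=1e-3)   # ~ quantile normalization
z_minmax_like = kde_integral_transform(x, bandwidth=1e3)      # ~ min-max scaling
z_between = kde_integral_transform(x, bandwidth=0.5)          # the tunable middle ground
```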
