What is a Box-Cox Transformation?

  • Data transformations are usually applied so that the data more closely meet the assumptions of the statistical inference model to be applied, or to improve the interpretability or appearance of graphs.
  • Power transformations are a class of transformation functions that raise the response to some power. For example, a square root transformation converts X to X^(1/2).
  • The Box-Cox transformation is a popular power transformation method developed by George E. P. Box and David Cox.

Box-Cox Transformation Formula

The formula of the Box-Cox transformation is:

$$
y = \begin{cases} \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \ln x, & \lambda = 0 \end{cases}
$$

Where:

  • y is the transformation result
  • x is the variable under transformation
  • λ is the transformation parameter
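
For readers who want to see the arithmetic, here is a minimal Python sketch of the transformation. It assumes the standard Box-Cox form, y = (x^λ − 1)/λ for λ ≠ 0 and y = ln x for λ = 0, and it assumes NumPy is available. The function name `box_cox` is hypothetical and is not part of Minitab.

```python
import numpy as np

def box_cox(x, lam):
    """Apply the standard Box-Cox transformation to strictly positive data.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # Limiting case of (x**lam - 1) / lam as lam approaches 0
        return np.log(x)
    return (x**lam - 1.0) / lam

# lam = 0.5 is a shifted and scaled square root transformation
print(box_cox([1.0, 4.0, 9.0], 0.5))  # → [0. 2. 4.]
```

Note that the transformation is only defined for positive x, which is why the sketch rejects zero or negative values.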

Use Minitab to Perform a Box-Cox Transformation

Minitab searches for the λ that minimizes the model SSE (sum of squared errors) and applies the corresponding Box-Cox transformation. The example below transforms a non-normally distributed response into approximately normal data using the Box-Cox method.
Data File: “Box-Cox” tab in “Sample Data.xlsx”

Step 1: Test the normality of the original data set.

  1. Click Stat → Basic Statistics → Normality Test.
  2. A new window named “Normality Test” pops up.
  3. Select “Y” as “Variable.”
  4. Click “OK.”
  5. The normality test results are shown automatically in the new window.

Normality Test:

  • H0: The data are normally distributed.
  • H1: The data are not normally distributed.

If p-value > alpha level (0.05), we fail to reject the null hypothesis. Otherwise, we reject the null. In this example, p-value = 0.029 < alpha level (0.05). The data are not normally distributed.
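
Outside Minitab, the same decision can be sketched in Python with SciPy's Anderson-Darling test. This is an illustration under assumptions, not a reproduction of Minitab's output: `scipy.stats.anderson` reports a test statistic and critical values rather than a p-value, so the sketch rejects normality when the statistic exceeds the 5% critical value, which corresponds to p-value < 0.05. The helper name and the sample data are hypothetical; substitute the real "Y" column.

```python
import numpy as np
from scipy import stats

def anderson_darling_rejects(values, alpha_percent=5.0):
    """Return True if normality is rejected at the given level (in percent)."""
    result = stats.anderson(np.asarray(values, dtype=float), dist="norm")
    # Significance levels are reported in percent, e.g. 15, 10, 5, 2.5, 1
    levels = list(result.significance_level)
    critical = result.critical_values[levels.index(alpha_percent)]
    return bool(result.statistic > critical)

# Illustrative data (an assumption): evenly spaced normal quantiles,
# and their exponentials, which are strongly right-skewed.
n = 200
probs = (np.arange(1, n + 1) - 0.5) / n
normal_like = stats.norm.ppf(probs)
skewed = np.exp(normal_like)

print("Reject normality (skewed):     ", anderson_darling_rejects(skewed))
print("Reject normality (normal-like):", anderson_darling_rejects(normal_like))
```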

Step 2: Run the Box-Cox Transformation:

  1. Click Stat → Control Charts → Box-Cox Transformation.
  2. A new window named “Box-Cox Transformation” pops up.
  3. Click into the blank list box below “All observations for a chart are in one column.”
  4. Select “Y” as the variable.
  5. Enter “Run” in the box next to “Subgroup sizes (enter a number or ID column).”
  6. Click “OK.”
  7. The analysis results are shown automatically in the new window.
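
A comparable search for λ can be sketched in Python with `scipy.stats.boxcox`. Be aware that this is a swapped-in technique: SciPy picks the λ that maximizes the Box-Cox log-likelihood, which is not necessarily the same criterion Minitab applies, so the two tools may report slightly different values. The data below are illustrative only.

```python
import numpy as np
from scipy import stats

# Illustrative, strictly positive, right-skewed values (an assumption);
# replace with the real "Y" column.
n = 200
probs = (np.arange(1, n + 1) - 0.5) / n
y = np.exp(stats.norm.ppf(probs))

# With lmbda omitted, SciPy returns the transformed data together with
# the maximum-likelihood estimate of lambda.
y_transformed, lam = stats.boxcox(y)
print(f"Estimated lambda: {lam:.2f}")
```

Because the illustrative data are log-normal in shape, the estimate should land near 0, the case in which the transformation reduces to a natural log.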

Minitab looks for the value of λ that minimizes the SSE (sum of squared errors). In this case, that value is λ = 0.12. The transformed Y can also be saved in another column.

  1. Create a new column named “Y1” in the data table.
  2. Click Stat → Control Charts → Box-Cox Transformation.
  3. Again, a window named “Box-Cox Transformation” pops up.
  4. As before, select “Y” as the variable.
  5. Enter “Run” in the box next to “Subgroup sizes (enter a number or ID column).”
  6. Now, click on the “Options” button in the “Box-Cox Transformation” window.
  7. A new window named “Box-Cox Transformation – Options” appears.
  8. Click in the blank box under “Store transformed data in” and all the columns pop up in the list box on the left.
  9. Select “Y1” in “Store transformed data in.”
  10. Click “OK” in the window “Box-Cox Transformation – Options.”
  11. Click “OK” in the window “Box-Cox Transformation.”
  12. The transformed column is stored in the column “Y1.”
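
The storing step can be mimicked in Python by applying the transformation with a fixed λ and keeping the result alongside the original values. This sketch assumes the standard (x^λ − 1)/λ form used by `scipy.stats.boxcox`; the values Minitab stores are not guaranteed to use the same scaling. The dictionary `table` is a hypothetical stand-in for the worksheet, and the "Y" values are illustrative.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the worksheet: one illustrative "Y" column
n = 200
probs = (np.arange(1, n + 1) - 0.5) / n
table = {"Y": np.exp(stats.norm.ppf(probs))}

# Lambda taken from the example above; passing lmbda makes SciPy apply
# the transformation with that fixed value instead of estimating it.
lam = 0.12
table["Y1"] = stats.boxcox(table["Y"], lmbda=lam)

print(table["Y1"][:3])
```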

Run the normality test to check whether the transformed data are normally distributed.

Use the Anderson–Darling test to check the normality of the transformed data.

  • H0: The data are normally distributed.
  • H1: The data are not normally distributed.

If p-value > alpha level (0.05), we fail to reject the null hypothesis. Otherwise, we reject the null. In this example, p-value = 0.327 > alpha level (0.05), so we fail to reject the null hypothesis: the transformed data are consistent with a normal distribution.
