Chemicals in Cosmetics

Introduction

Today, it is almost impossible to go through a single day without seeing cosmetics advertised in some form. But while cosmetics offer a wide range of beauty benefits, they can also contain ingredients that are potentially hazardous to our health.

In this project, I analyzed the chemicals most commonly found in our cosmetics.

Tools Used

1. Microsoft SQL Server

2. Tableau

Data Source

The data was obtained from kaggle.com. Kaggle is an online community platform for data scientists and machine learning enthusiasts, with a large repository of community-published data.

Data Cleaning

The data was cleaned from start to finish using only Microsoft SQL Server. I cleaned only the columns needed for the analysis and dropped the rest. The columns needed were the Brandname column, the chemical name column, and all date columns.

The following steps were involved:

1. The Brandname column had many misspelt values, some with more punctuation marks than needed, e.g. "L'oreal" and "L''oreal".

2. The chemical name column had the same kinds of errors as the Brandname column.

3. On inspecting the Brandname column, I found that some of its values were missing. A closer look showed that the Brandname column and the chemical name column held similar data, so I self-joined the table, using the "id" column as the unique key, and used the chemical name column to fill the gaps in the Brandname column.

4. All date columns were standardized to their respective date formats.

5. Rows with missing data were filtered out to keep the analysis as accurate as possible.

6. The dataset had no duplicate values; all records were unique.
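The cleaning itself was done in Microsoft SQL Server, and the queries are not reproduced here. As a rough illustration of steps 1, 2, 4 and 5, the same logic can be sketched in Python. Everything in this sketch is an assumption made for illustration: the column names (`brand_name`, `chemical_name`, `reported_date`), the two input date formats, the choice to upper-case names, and the sample rows are hypothetical and not taken from the dataset or the project's SQL.

```python
import re
from datetime import datetime

# Assumed input date formats; the real dataset may use different ones.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def normalize_name(raw):
    """Trim whitespace, collapse repeated punctuation, and unify casing."""
    if raw is None:
        return None
    text = raw.strip()
    # Collapse a run of the same punctuation mark: "L''oreal" -> "L'oreal".
    text = re.sub(r"([^\w\s])\1+", r"\1", text)
    # Collapse runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    # Upper-casing is an illustrative choice so that spelling variants match.
    return text.upper() if text else None


def standardize_date(raw):
    """Return the date as an ISO string (YYYY-MM-DD), or None if unparseable."""
    if not raw:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(rows):
    """Normalize names and dates, then drop rows with any missing value."""
    cleaned = []
    for row in rows:
        brand = normalize_name(row.get("brand_name"))
        chemical = normalize_name(row.get("chemical_name"))
        reported = standardize_date(row.get("reported_date"))
        if brand is None or chemical is None or reported is None:
            continue  # step 5: filter out rows with missing data
        cleaned.append({
            "id": row["id"],
            "brand_name": brand,
            "chemical_name": chemical,
            "reported_date": reported,
        })
    return cleaned


# Made-up sample rows, for illustration only.
sample = [
    {"id": 1, "brand_name": "L''oreal", "chemical_name": "Titanium dioxide",
     "reported_date": "06/17/2009"},
    {"id": 2, "brand_name": " L'oreal ", "chemical_name": "titanium  dioxide",
     "reported_date": "2009-07-01"},
    {"id": 3, "brand_name": None, "chemical_name": "Talc",
     "reported_date": "08/28/2009"},
]
cleaned = clean_rows(sample)
```

With these sample rows, the two spellings of the brand collapse to one value and the row with a missing brand is dropped. The sketch does not attempt step 3 (filling brand names through the self-join), since that depends on how the two columns relate in the actual data.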

Conclusion

I know it's a vague idea to start searching for the chemicals in the cosmetics we use. However, it's good to have an idea of what our skin is absorbing.

Click here for the GitHub repository, which has the answers to the questions taken directly from Kaggle.