Sachin Patel Logo Image
Sachin Patel

Etsy Multiclass Classification Model

It is a multiclass classification model for classifying different Products on the basis of top category wise, sub-category wise, primary color wise.

Project Image

Project Overview

The project aims to develop a multiclass classification machine learning model for predicting top category ID, bottom category ID, and primary color ID of products on the Etsy e-commerce platform, utilizing text data from product titles, tags, types, and craft types. The model undergoes preprocessing using techniques like CountVectorizer and TfidfTransformer, followed by training with Support Vector Machine (SVM) algorithm with a linear kernel. The project contributes to enhancing product categorization accuracy and efficiency on Etsy, potentially leading to better user experience and increased customer satisfaction in the e-commerce domain.

The dataset provided for this project comprises Parquet and TFRecords files, totaling 14GB when zipped. It contains 245,000 records in the training dataset and 27,000 records in the test dataset. The dataset includes various product attributes such as titles, descriptions, tags, types, and craft types, with the target attributes having no missing values.

Tools Used

Python
SVM
Scikit Learn
NumPy
Pandas
Transformer
Regularization
Classification
Multiclass
Machine Learning
Model Development
CountVectorizer