Hate Speech Detection

A machine learning system for classifying text as hate speech or non-hate speech

Overview

The Hate Speech Detection project is a machine learning system that classifies text as either hate speech or non-hate speech using Natural Language Processing (NLP) techniques. It is trained on a balanced dataset from Kaggle and combines TF-IDF vectorization with a Logistic Regression classifier. The goal is to support online content moderation by flagging harmful language. The project is implemented in Python and covers text preprocessing, model training, and evaluation.
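
As a rough illustration of the approach described above, TF-IDF features can be fed into a Logistic Regression classifier in a few lines. This is a minimal sketch, not the project's actual code: it assumes scikit-learn (the page does not name the library used), the example sentences and labels are made up for illustration, and the vectorizer and classifier settings are arbitrary choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up toy examples for illustration only; the real project
# trains on a balanced Kaggle dataset.
texts = [
    "I hate those people, they are vermin",
    "those people are vermin and should disappear",
    "I hate them all, get rid of them",
    "what a lovely day for a walk",
    "I love this recipe, thanks for sharing",
    "thanks, the meeting went well today",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = hate speech, 0 = non-hate speech

# Assumed settings: lowercase the text, drop English stop words,
# then fit a Logistic Regression model on the TF-IDF vectors.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Classify unseen text and inspect the class probabilities.
new_texts = ["those people are vermin", "lovely recipe, thanks"]
predictions = model.predict(new_texts)
probabilities = model.predict_proba(new_texts)
```

Wrapping both steps in a single pipeline means the vectorizer's vocabulary is learned only from the training text, and the same transformation is applied automatically at prediction time.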

Data Sources

Technologies Used

Key Features

Model Training Process

Model Performance

The model was trained on 80% of the dataset and evaluated on the remaining 20%, yielding the following performance metrics:
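
For reference, an 80/20 split and the usual binary classification metrics (accuracy, precision, recall, F1) can be computed along the following lines. This is a standard-library sketch under stated assumptions: the helper names `split_80_20` and `binary_metrics` are hypothetical, the seed is arbitrary, and the toy labels are made up, so the numbers it produces are not the project's results.

```python
import random


def split_80_20(rows, seed=42):
    """Shuffle a copy of rows and return (train, test) in an 80/20 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy illustration: ten placeholder rows split into 8 for training, 2 for testing.
train_rows, test_rows = split_80_20(range(10))

# Made-up labels and predictions, only to exercise the metric helper.
example_metrics = binary_metrics(y_true=[1, 1, 0, 0], y_pred=[1, 0, 0, 0])
```

Precision and recall are worth reporting alongside accuracy for moderation tasks, since a missed hateful post and a wrongly flagged harmless one have different costs.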

Challenges Overcome

Benefits

Future Improvements

Try It

Clone the repository, install the dependencies, download the dataset, and run hate_speech_detection.py. See the GitHub repository for setup instructions.

