Sentiment Analysis - Coding Classification Algorithms from Scratch

Project Overview

This project implements several linear classification algorithms from scratch and applies them to sentiment analysis. It is part of the “Machine Learning with Python - From Linear Models to Deep Learning” course offered by MIT. The project is hosted on GitHub Pages, and its source code is also available.

Algorithms Overview

  1. Perceptron Algorithm
    • Cycles through the training set multiple times and updates the classification parameters only when an example is misclassified. It converges to a separating hyperplane if the data are linearly separable.
  2. Average Perceptron Algorithm
    • Runs the same updates as the Perceptron algorithm but returns the average of the parameters over every step of every pass, which yields a more stable model.
  3. Pegasos Algorithm
    • Trains a linear classifier by stochastic sub-gradient descent on the hinge loss with a decaying learning rate; an L2 regularization term shrinks the parameters to reduce overfitting.
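
The three update rules can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the function names, the default values of T and lam, the fixed (unshuffled) example order, and the step size 1/sqrt(t) in Pegasos are all assumptions made for the example. Labels are assumed to be +1 or -1.

```python
import math

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def predict(theta, theta_0, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if dot(theta, x) + theta_0 > 0 else -1

def perceptron(X, y, T=10):
    """Perceptron: update only when an example is misclassified."""
    theta, theta_0 = [0.0] * len(X[0]), 0.0
    for _ in range(T):
        for x, label in zip(X, y):
            if label * (dot(theta, x) + theta_0) <= 0:
                theta = [t + label * xi for t, xi in zip(theta, x)]
                theta_0 += label
    return theta, theta_0

def average_perceptron(X, y, T=10):
    """Same updates, but return the parameters averaged over all steps."""
    theta, theta_0 = [0.0] * len(X[0]), 0.0
    theta_sum, theta_0_sum = [0.0] * len(X[0]), 0.0
    for _ in range(T):
        for x, label in zip(X, y):
            if label * (dot(theta, x) + theta_0) <= 0:
                theta = [t + label * xi for t, xi in zip(theta, x)]
                theta_0 += label
            # Accumulate after every step, whether or not an update occurred.
            theta_sum = [s + t for s, t in zip(theta_sum, theta)]
            theta_0_sum += theta_0
    n_steps = T * len(X)
    return [s / n_steps for s in theta_sum], theta_0_sum / n_steps

def pegasos(X, y, T=10, lam=0.1):
    """Pegasos-style sub-gradient steps on the regularized hinge loss."""
    theta, theta_0 = [0.0] * len(X[0]), 0.0
    step = 0
    for _ in range(T):
        for x, label in zip(X, y):
            step += 1
            eta = 1.0 / math.sqrt(step)  # assumed decaying step size
            margin = label * (dot(theta, x) + theta_0)
            # The regularization term shrinks theta on every step.
            theta = [(1 - eta * lam) * t for t in theta]
            if margin <= 1:
                # The hinge loss is active: move toward the example.
                theta = [t + eta * label * xi for t, xi in zip(theta, x)]
                theta_0 += eta * label
    return theta, theta_0

# Tiny linearly separable toy set (made up for illustration).
X = [[2, 1], [1, 2], [-1, -2], [-2, -1]]
y = [1, 1, -1, -1]
for train in (perceptron, average_perceptron, pegasos):
    theta, theta_0 = train(X, y)
    print(train.__name__, [predict(theta, theta_0, x) for x in X])
```

The offset theta_0 is left unregularized in the Pegasos sketch, which is one common convention. In practice the examples are usually visited in random order rather than in a fixed sequence.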

Bag of Words (BoW)

Bag of Words (BoW) is a technique used in Natural Language Processing (NLP) to represent text data. It treats each document as an unordered collection of words, ignoring grammar and word order, and encodes it as a fixed-length vector with one entry per vocabulary word (its count, or simply whether it is present). This representation is useful for tasks like text classification and sentiment analysis.
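
As a rough illustration of the idea, the sketch below builds a vocabulary from a few texts and turns each text into a count vector. The function names, the whitespace tokenization, and the sample reviews are assumptions made for the example, not the project's actual feature extraction.

```python
def build_vocabulary(texts):
    """Map each distinct lowercase word to a column index."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def bag_of_words(text, vocab, binary=False):
    """Encode a text as word counts (or 0/1 presence flags if binary)."""
    vector = [0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:  # words outside the vocabulary are ignored
            index = vocab[word]
            vector[index] = 1 if binary else vector[index] + 1
    return vector

# Made-up sample reviews.
reviews = ["great great movie", "terrible movie"]
vocab = build_vocabulary(reviews)
features = [bag_of_words(review, vocab) for review in reviews]
print(vocab)
print(features)
```

Because word order is discarded, "movie great" and "great movie" map to the same vector. The resulting vectors can serve as the inputs to a linear classifier.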