This project is part of the “Machine Learning with Python - From Linear Models to Deep Learning” course offered by MIT.
This project implements several classification algorithms for machine learning tasks. The algorithms included are:
Perceptron Algorithm: This algorithm updates the classification parameters (theta and theta_0) one training example at a time, changing them only when the current example is misclassified. It passes through the dataset T times; if the data is linearly separable, it converges to a separating hyperplane.
Average Perceptron Algorithm: Similar to the Perceptron algorithm, but it returns the average of the parameters over every step of every pass rather than their final values, which typically gives a more stable model.
Pegasos Algorithm: This algorithm trains a linear classifier with stochastic sub-gradient descent on the hinge loss. It includes an L2 regularization term to reduce overfitting.
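The update rules of the three algorithms above can be sketched in NumPy as follows. This is a minimal illustration, not the code in project1.py: the function names (perceptron_sketch, average_perceptron_sketch, pegasos_sketch) are hypothetical, labels are assumed to be +1/-1, and the Pegasos step size 1/sqrt(t) and the random visiting order are assumptions made for this example.

```python
import numpy as np


def perceptron_sketch(X, y, T=10):
    """Plain perceptron: change the parameters only on a mistake."""
    n, d = X.shape
    theta, theta_0 = np.zeros(d), 0.0
    for _ in range(T):
        for i in range(n):
            # A non-positive margin means the point is misclassified
            # (or lies exactly on the boundary).
            if y[i] * (X[i] @ theta + theta_0) <= 0:
                theta = theta + y[i] * X[i]
                theta_0 += y[i]
    return theta, theta_0


def average_perceptron_sketch(X, y, T=10):
    """Same updates, but return the parameters averaged over all n*T steps."""
    n, d = X.shape
    theta, theta_0 = np.zeros(d), 0.0
    theta_sum, theta_0_sum = np.zeros(d), 0.0
    for _ in range(T):
        for i in range(n):
            if y[i] * (X[i] @ theta + theta_0) <= 0:
                theta = theta + y[i] * X[i]
                theta_0 += y[i]
            # Accumulate after every step, whether or not an update happened.
            theta_sum += theta
            theta_0_sum += theta_0
    return theta_sum / (n * T), theta_0_sum / (n * T)


def pegasos_sketch(X, y, T=10, L=0.1, seed=0):
    """Stochastic sub-gradient descent on hinge loss plus (L/2)*||theta||^2.

    The step size eta = 1/sqrt(t) is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta, theta_0 = np.zeros(d), 0.0
    t = 0
    for _ in range(T):
        for i in rng.permutation(n):  # visit points in random order
            t += 1
            eta = 1.0 / np.sqrt(t)
            if y[i] * (X[i] @ theta + theta_0) <= 1:
                # Inside the margin: shrink theta and step toward the point.
                theta = (1 - eta * L) * theta + eta * y[i] * X[i]
                theta_0 += eta * y[i]
            else:
                # Outside the margin: only the regularization shrinkage applies.
                theta = (1 - eta * L) * theta
    return theta, theta_0


# Tiny linearly separable toy set, made up for illustration.
X_toy = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-3.0, -1.0]])
y_toy = np.array([1, 1, -1, -1])
print(perceptron_sketch(X_toy, y_toy, T=5))
print(average_perceptron_sketch(X_toy, y_toy, T=5))
print(pegasos_sketch(X_toy, y_toy, T=5, L=0.1))
```

In this sketch the offset theta_0 is not regularized, so only theta is shrunk in the Pegasos step.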
Bag of Words (BoW) is a technique used in Natural Language Processing (NLP) to represent text data. It treats each document as a collection of words, ignoring grammar and word order. This representation is useful for tasks like text classification and sentiment analysis.
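As a rough illustration of the Bag of Words idea, the sketch below builds a vocabulary from a list of texts and turns each text into a vector of word counts. The helper names (tokenize, build_vocabulary, bow_feature_matrix) and the tokenization rule (lower-casing and treating punctuation as whitespace) are assumptions made for this example and may differ from the helper functions in project1.py.

```python
import string

import numpy as np


def tokenize(text):
    """Lower-case the text, replace punctuation with spaces, split on whitespace."""
    table = str.maketrans({c: " " for c in string.punctuation})
    return text.lower().translate(table).split()


def build_vocabulary(texts):
    """Map each distinct word to a column index, in order of first appearance."""
    vocab = {}
    for text in texts:
        for word in tokenize(text):
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def bow_feature_matrix(texts, vocab):
    """Return a (num_texts, vocab_size) matrix of word counts."""
    X = np.zeros((len(texts), len(vocab)))
    for i, text in enumerate(texts):
        for word in tokenize(text):
            if word in vocab:  # words outside the vocabulary are ignored
                X[i, vocab[word]] += 1
    return X


reviews = ["Great movie, great cast!", "Terrible movie."]
vocab = build_vocabulary(reviews)
print(vocab)
print(bow_feature_matrix(reviews, vocab))
```

Word order is discarded: only how often each vocabulary word occurs in a text is kept. A binary variant would set each entry to 1 instead of incrementing it.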
The Python file project1.py contains implementations of these algorithms, along with helper functions for feature extraction and classification.
To use these algorithms, import the necessary functions from project1.py and pass the required inputs according to each function’s documentation.
For example:
from project1 import perceptron, classify, classifier_accuracy
# Load your data and feature matrices
# ...
# Train the perceptron algorithm
theta, theta_0 = perceptron(feature_matrix, labels, T=10)
# Classify new data points
predictions = classify(new_feature_matrix, theta, theta_0)
# Compute classifier accuracy
train_accuracy, val_accuracy = classifier_accuracy(perceptron, train_feature_matrix, val_feature_matrix,
                                                   train_labels, val_labels, T=10)
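For readers who want to see what the classification and accuracy steps amount to for a linear classifier, here is a minimal sketch. The names classify_sketch and accuracy_sketch are hypothetical and are not the functions in project1.py; labels are assumed to be +1/-1, and treating a score of exactly zero as -1 is an assumption of this example.

```python
import numpy as np


def classify_sketch(X, theta, theta_0):
    """Predict +1 where the linear score is strictly positive, otherwise -1."""
    scores = X @ theta + theta_0
    return np.where(scores > 0, 1, -1)


def accuracy_sketch(predictions, targets):
    """Fraction of predictions that match the targets."""
    return float(np.mean(predictions == targets))


# Toy parameters and points, made up for illustration.
theta_demo = np.array([2.0, 1.0])
X_demo = np.array([[2.0, 1.0], [-1.0, -2.0], [0.0, 0.0]])
preds = classify_sketch(X_demo, theta_demo, 0.0)
print(preds, accuracy_sketch(preds, np.array([1, -1, 1])))
```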