Category Archives for "Tutorials"

Jan 09

Mastering Support Vector Machine Algorithm: A Comprehensive Guide

By ganpati | Tutorials


Here are a few things we need to understand before getting into the Support Vector Machine (SVM) model.

Feature Vector

In pattern recognition and machine learning, a feature vector is an n-dimensional vector of numerical features that represent some object. Many algorithms in machine learning require a numerical representation of objects, since such representations facilitate processing and statistical analysis. When representing images, the feature values might correspond to the pixels of an image; when representing texts, they might be term occurrence frequencies. Feature vectors are equivalent to the vectors of explanatory variables used in statistical procedures such as linear regression. Feature vectors are often combined with weights using a dot product in order to construct a linear predictor function that is used to determine a score for making a prediction.

The vector space associated with these vectors is often called the feature space. In order to reduce the dimensionality of the feature space, a number of dimensionality reduction techniques can be employed.
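The linear predictor described above can be sketched in a few lines of Python. The feature values, weights, and bias below are hypothetical numbers chosen only for illustration; in practice a learning algorithm would fit the weights.

```python
# Hypothetical feature vector describing one object with three measurements.
x = [5.1, 3.5, 1.4]

# Hypothetical weights and bias (assumed here, normally learned from data).
w = [0.4, -0.2, 1.0]
b = -0.5


def linear_score(weights, features, bias=0.0):
    """Return the dot product of weights and features, plus a bias term."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(wi * xi for wi, xi in zip(weights, features)) + bias


score = linear_score(w, x, b)
print(score)  # roughly 2.24 for the numbers above
```

The score is a single number; a classifier would typically compare it against a threshold to make a prediction.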

Separating Hyperplane


Just by looking at a plot of two classes of points, we can often see that it is possible to separate the data. For instance, we could trace a line so that all the data points of one class lie above the line and all the data points of the other class lie below it. Such a line is called a separating hyperplane.
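As a minimal sketch of this idea, a line in two dimensions can be written as w·x + b = 0, and the sign of w·x + b tells us on which side of the line a point falls. The line and the points below are hypothetical values chosen for illustration.

```python
def side_of_hyperplane(w, b, x):
    """Return +1 if x lies on or above the hyperplane w.x + b = 0, else -1."""
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if value >= 0 else -1


# Hypothetical line x2 = x1, written as -x1 + x2 = 0.
w, b = [-1.0, 1.0], 0.0

# Two points above the line and two points below it.
points = [(1.0, 3.0), (2.0, 5.0), (3.0, 1.0), (4.0, 2.0)]
labels = [side_of_hyperplane(w, b, p) for p in points]
print(labels)  # [1, 1, -1, -1]
```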

The first thing to note is that an SVM needs training data, which means it is a supervised learning algorithm. It is also important to know that SVM is a classification algorithm, which means we will use it to predict whether something belongs to a particular class.

Goal of SVM

The goal of a support vector machine is to find the optimal separating hyperplane which maximizes the margin of the training data.
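To make the notion of margin concrete, the sketch below compares two candidate separating lines on a tiny hypothetical dataset. It assumes the margin is measured as the distance from the hyperplane to the closest training point; some texts, including parts of the tutorial series, define it as twice that distance, which does not change which hyperplane wins.

```python
import math


def margin(w, b, data):
    """Smallest signed distance y * (w.x + b) / ||w|| over labeled points.

    The result is positive only if every point is on its correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in data
    )


# Hypothetical training data: (point, label) with labels +1 and -1.
data = [((0.0, 2.0), 1), ((1.0, 3.0), 1), ((2.0, 0.0), -1), ((3.0, 1.0), -1)]

# Two hypothetical candidate hyperplanes that both separate the data.
margin_a = margin([-1.0, 1.0], 0.0, data)   # the line x2 = x1
margin_b = margin([0.0, 1.0], -1.5, data)   # the line x2 = 1.5

print(margin_a, margin_b)
print("A is better" if margin_a > margin_b else "B is better")
```

Both lines classify the four points correctly, but the first one keeps the closest points farther away, which is the property an SVM optimizes for.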

Here is a series of tutorials that will take you through the concept of SVM in detail.

The tutorials help you answer questions such as:

Part 1: What is the goal of the Support Vector Machine (SVM)?
Part 2: How to compute the margin?
Part 3: How to find the optimal hyperplane?
Part 4: Unconstrained minimization
Part 5: Convex functions
Part 6: Duality and Lagrange multipliers

Aug 07

Performance Estimation: Generalization Performance Vs. Model Selection

By ganpati | Tutorials

“How do we estimate the performance of a machine learning model?”
Author: Sebastian Raschka

“First, we feed the training data to our learning algorithm to learn a model. Second, we predict the labels of our test set. Third, we count the number of wrong predictions on the test dataset to compute the model’s error rate.”

Not so fast! Depending on our goal, estimating the performance of a model is not that trivial, unfortunately. Maybe we should address the previous question from another angle: “Why do we care about performance estimates at all?” Ideally, the estimated performance of a model tells how well it performs on unseen data – making predictions on future data is often the main problem we want to solve in applications of machine learning or the development of novel algorithms. Typically, machine learning involves a lot of experimentation, though — for example, the tuning of the internal knobs of a learning algorithm, the so-called hyperparameters. Running a learning algorithm over a training dataset with different hyperparameter settings will result in different models. Since we are typically interested in selecting the best-performing model from this set, we need to find a way to estimate their respective performances in order to rank them against each other. Going one step beyond mere algorithm fine-tuning, we are usually not only experimenting with the one single algorithm that we think would be the “best solution” under the given circumstances. More often than not, we want to compare different algorithms to each other, oftentimes in terms of predictive and computational performance.
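The three-step procedure quoted earlier, combined with different hyperparameter settings, can be sketched as follows. This is only an illustration under stated assumptions: the learning algorithm is a hand-written k-nearest-neighbor classifier on one-dimensional data, the hyperparameter is k, and the training and test points are hypothetical.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def error_rate(train, test, k):
    """Fraction of test points whose predicted label is wrong."""
    wrong = sum(1 for x, y in test if knn_predict(train, x, k) != y)
    return wrong / len(test)


# Hypothetical data; the point (4.5, "a") acts as a noisy training example.
train = [(1.0, "a"), (2.0, "a"), (3.0, "a"), (4.5, "a"),
         (5.0, "b"), (6.0, "b"), (7.0, "b")]
test = [(1.5, "a"), (2.5, "a"), (4.6, "b"), (5.5, "b"), (6.5, "b")]

# Each hyperparameter setting yields a different model and error rate.
errors = {k: error_rate(train, test, k) for k in (1, 3, 7)}
best_k = min(errors, key=errors.get)
print(errors)   # {1: 0.2, 3: 0.0, 7: 0.6}
print(best_k)   # 3
```

Here the error rates are used only to rank the three settings against each other.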

Let us summarize the main points why we evaluate the predictive performance of a model:

  • We want to estimate the generalization error, the predictive performance of our model on future (unseen) data.
  • We want to increase the predictive performance by tweaking the learning algorithm and selecting the best-performing model from a given hypothesis space.
  • We want to identify the machine learning algorithm that is best suited for the problem at hand; thus, we want to compare different algorithms, selecting the best-performing one as well as the best-performing model from the algorithm’s hypothesis space.

Although the three sub-tasks listed above all have in common that we want to estimate the performance of a model, they all require different approaches. We will discuss some of the different methods for tackling these sub-tasks in this article.

Of course, we want to estimate the future performance of a model as accurately as possible. However, if there’s one key take-away message from this article, it is that biased performance estimates are perfectly okay in model selection and algorithm selection if the bias affects all models equally. If we rank different models or algorithms against each other in order to select the best-performing one, we only need to know the “relative” performance. For example, if all our performance estimates are pessimistically biased, and we underestimate their performances by 10%, it wouldn’t affect the ranking order. More concretely, if we have three models with prediction accuracy estimates such as

M2: 75% > M1: 70% > M3: 65%,

we would still rank them the same way if we add a 10% pessimistic bias:

M2: 65% > M1: 60% > M3: 55%.

On the contrary, if we report the future performance of the best ranked model (M2) to be 65%, this would obviously be quite inaccurate. Estimating the absolute performance of a model is probably one of the most challenging tasks in machine learning.
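The ranking argument can be checked with a few lines of Python. The sketch reads the "10%" as 10 percentage points subtracted from every estimate, which matches the numbers in the example above.

```python
# Accuracy estimates from the example, as fractions.
estimates = {"M1": 0.70, "M2": 0.75, "M3": 0.65}


def ranking(scores):
    """Return model names ordered from best to worst estimated accuracy."""
    return sorted(scores, key=scores.get, reverse=True)


# Apply the same pessimistic bias of 10 percentage points to every model.
biased = {name: acc - 0.10 for name, acc in estimates.items()}

print(ranking(estimates))  # ['M2', 'M1', 'M3']
print(ranking(biased))     # ['M2', 'M1', 'M3']
```

The order is unchanged because the same constant is subtracted from each estimate, while the absolute values no longer reflect the models' actual accuracy.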


Jan 30

A Simple Step by Step Guide to WEKA

By ganpati | Tutorials

Many people find data mining mysterious, especially because of the coding involved. WEKA takes that mystery away by providing a graphical interface where you can do most of your work with the click of a mouse, without writing any code. WEKA’s GUI allows you to:

  • Preprocess data
  • Choose learning algorithms
  • Evaluate the results
  • Build simple visualizations
  • Form an interpretation of the results
  • Export some output

So, if you want to start using machine learning algorithms without much of a coding background, WEKA is the tool for you.

Here is a list of lectures that will ease you into the world of machine learning using WEKA. I’ve listed the steps below for your convenience.

Step 1:


You may visit this link and download the stable book 3rd ed version for Windows or Mac based on your preference.

Step 2:

The arff data format
WEKA uses the arff data format for most of the tutorials. An arff file is similar to CSV: a header declares the relation and its attributes, and the data rows follow as comma-separated values.
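As a rough illustration of the format, the sketch below embeds a small hypothetical arff file in a Python string and parses it with a hand-written helper. The dataset and the `parse_arff` function are assumptions made for this example; the parser handles only comments, `@relation`, `@attribute`, and `@data`, and ignores features such as quoted values or sparse rows.

```python
# Hypothetical toy dataset written in arff syntax.
ARFF_TEXT = """\
% Lines starting with a percent sign are comments
@relation toy_weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}

@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def parse_arff(text):
    """Parse a small subset of arff into (relation, attributes, rows)."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation)
print(attributes)
print(rows)
```

Compared with CSV, the main difference is that the header states each attribute's name and type, so nominal attributes list their allowed values explicitly.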

Step 3:

Downloading toy data files for practice:

Here is a list of files that can be downloaded from the internet in order to proceed with the YouTube tutorials smoothly.

Step 4:

The best way to learn WEKA is by watching the YouTube video tutorials offered by the University of Waikato. The official link of this course can be viewed here. The page also has links to the data sets that will be used during the course.
