Here are few things we need to understand before getting into the understanding the Naive Bayes model.
In pattern recognition and machine learning, a feature vector is an n-dimensional vector of numerical features that represent some object. Many algorithms in machine learning require a numerical representation of objects, since such representations facilitate processing and statistical analysis. When representing images, the feature values might correspond to the pixels of an image, when representing texts perhaps term occurrence frequencies. Feature vectors are equivalent to the vectors of explanatory variables used in statistical procedures such as linear regression. Feature vectors are often combined with weights using a dot product in order to construct a linear predictor function that is used to determine a score for making a prediction.
The vector space associated with these vectors is often called the feature space. In order to reduce the dimensionality of the feature space, a number of dimensionality reduction techniques can be employed.
Just by looking at a plot, we can see that it is possible to separate the data. For instance, we could trace a line and then classify all the data points above the line, and all the other data points below the line. Such a line is called a separating hyperplane.
The first thing we can see from this definition, is that a SVM needs training data. Which means it is a supervised learning algorithm. It is also important to know that SVM is a classification algorithm. Which means we will use it to predict if something belongs to a particular class.
The goal of a support vector machine is to find the optimal separating hyperplane which maximizes the margin of the training data.
Here is the series of tutorials that will take you through the concept of SVM in details.
The tutorial helps you answer several questions like:
Part 1: What is the goal of the Support Vector Machine (SVM)?
Part 2: How to compute the margin?
Part 3: How to find the optimal hyperplane?
Part 4: Unconstrained minimization
Part 5: Convex functions
Part 6: Duality and Lagrange multipliers