Pegasos is an acronym for Primal Estimated sub-GrAdient Solver. This algorithm is used to solve optimization problems in Support Vector Machines (SVMs). It uses a form of stochastic gradient descent.
The number of iterations required is determined by the accuracy we want, not the size of the dataset.
In SVMs we are trying to find a separating hyperplane. In 2-dimensional problems we try to find a line that properly classifies the two classes of data.
The Pegasos algorithm works as “A set of randomly selected points from our training data is added to a batch. Each of these data points is tested to see if classification is proper. If true, then these data points are ignored. If false, then data points added to the update set. After iteration completes, the weights vector is updated with the improperly classified vectors. The loop is repeated.”
Pseudocode:
Set w to all zeros
For each batch
Choose k data vectors randomly
For each vector
If vector classification is incorrect
Change the weight vectors : w
Accumulate the changes to w
Algorithm:
This is a sequential Pegasos algorithm. The inputs T and k set the number of iterations and batch size respectively. eta determines the learning rate based on T iterations and lam.
def predict(w,x):
return w*x.T
def Pegasos(dataset,labels,lam,T,k):
m,n=shape(dataset); w=zeroes(n)
dataBatch=range(m)
for t in range(1,T+1):
wc = mat(zeros(n))
eta=1.0/(lam*t)
random.shuffle(dataBatch)
for j in range(k):
i=dataIndex[j]
p=predict(w,dataset[i,:])
if labels[i]*p<1:
wDelta += labels[i]*dataset[i,:].A #accumulate changes
w=(1.0-1/t)*w+(eta/k)*wc
return w
Try it out for your Dataset.
Hope it Helps!