
Build a Neural Net to solve the Exclusive OR (XOR) problem

Sep 10, 2016

Let's assume the marketing manager of a shoe company wants to categorize customers into promising and unpromising segments.

[Image: inspiration, nytimes]

His problem:

His data points are not linearly separable. The company's loyal demographics are teenage boys and middle-aged women. Young is good, and female is good, but both together is not. This is a classic XOR problem: there is no single line capable of separating the promising examples from the unpromising ones.
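To make the XOR structure concrete, here is a minimal NumPy sketch (the 0/1 encodings of age and gender are made up for illustration) showing that the label is exactly the XOR of the two features, so no single linear boundary can separate the classes:

    import numpy as np

    # Hypothetical encoding: feature 0 = young (1) vs. middle-aged (0),
    #                        feature 1 = female (1) vs. male (0)
    X = np.array([[1, 0],   # teenage boy       -> promising
                  [0, 1],   # middle-aged woman -> promising
                  [1, 1],   # young woman       -> unpromising
                  [0, 0]])  # middle-aged man   -> unpromising
    y = np.array([1, 1, 0, 0])

    # The label equals XOR of the two features: no single line w.x + b = 0
    # can put both 1-labelled points on one side and both 0s on the other.
    print (X[:, 0] ^ X[:, 1]) == y   # [ True  True  True  True]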

Inspiration: example from The Master Algorithm by Pedro Domingos

Resources required to get started:

  • A Unix-like computer (Linux, e.g. Ubuntu, or macOS) with Python installed

Among several deep learning frameworks, such as TensorFlow, Theano, and Torch, Nervana's Neon is one option that can be used to solve the XOR problem.

  • 1

    Install Neon

    	git clone https://github.com/NervanaSystems/neon.git
    	cd neon
    	make
    	. .venv/bin/activate
    		    
    Additional installation instructions are available in the Neon repository's README.
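    A quick sanity check, once the .venv created by make has been activated as above, is that the package imports cleanly (a minimal sketch, nothing Neon-specific beyond the import):

    	# run inside the activated .venv; a clean import means the build succeeded
    	import neon
    	print "neon imported from", neon.__file__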
  • 2

    Import required libraries

    	import cPickle, numpy as np 
    	from neon.initializers import Uniform, Gaussian 
    	from neon.layers import GeneralizedCost, Affine 
    	from neon.models import Model 
    	from neon.optimizers import GradientDescentMomentum, Schedule 
    	from neon.transforms import Softmax, CrossEntropyMulti, SumSquared, Rectlin 
    	from neon.callbacks.callbacks import Callbacks 
    	from neon.data import ArrayIterator 
    	from neon.util.argparser import NeonArgparser 
    	from neon.transforms import Misclassification 
    	from numpy import genfromtxt 
    	parser = NeonArgparser(__doc__) 
    	args = parser.parse_args(gen_be=True)  # gen_be=True also generates the compute backend
    			
  • 3

    Create training and test dataset

    X_data = genfromtxt("inputXOR.csv",delimiter=',') 
    Y_data = genfromtxt("outputXOR.csv",delimiter=',',dtype=None) 
    
    training_data = ArrayIterator(X=X_data[0:600], y=Y_data[0:600],nclass=2,make_onehot=True) 
    test_data = ArrayIterator(X=X_data[600:800], y=Y_data[600:800],nclass=2,make_onehot=True)
    
    Download: inputXOR.csv, outputXOR.csv (a sketch of how similar data could be generated appears at the end of this step)

    Why ArrayIterator?

    • The ArrayIterator object returns one (input, label) tuple at a time

    • This iterator supports classification, regression, and autoencoder tasks
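    The exact contents of the two CSV files are whatever you download above. As a rough sketch of the assumed format (800 rows of two binary features plus a matching file of 0/1 labels; the feature semantics are illustrative only), similar files could be generated like this:

    import numpy as np

    np.random.seed(0)
    n = 800
    # two binary features (e.g. young?, female?) drawn at random
    X = np.random.randint(0, 2, size=(n, 2))
    # the label is the XOR of the two features
    y = np.logical_xor(X[:, 0], X[:, 1]).astype(int)

    np.savetxt("inputXOR.csv", X, delimiter=",", fmt="%d")
    np.savetxt("outputXOR.csv", y, delimiter=",", fmt="%d")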

  • 4

    Build a network by concatenating layers

    init = Gaussian() 
    layers = [] 
    # Hidden layer: two rectified-linear units
    layers.append(Affine(nout=2, init=init, bias=init, activation=Rectlin())) 
    # Output layer: two softmax units, one per class (promising / unpromising)
    layers.append(Affine(nout=2, init=init, bias=init, activation=Softmax()))
    
    
  • 5

    Pass layers in to Model class

    mlp = Model(layers=layers) 
    	

    Why the Rectified Linear Unit (Rectlin)?

    • It does not saturate, so it is far less prone to vanishing-gradient problems than logistic or tanh activations

    • Cross-validation with the XOR data suggested the rectified linear function is the most suitable

    • Other transforms available in Neon: Identity, Explin, Normalizer, Tanh, Logistic

    Why Softmax?

    • The softmax function gives the network's outputs a probabilistic interpretation

    • Just like the logistic function, the derivative of the softmax can be expressed in terms of the function's own outputs (see the sketch below)
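    To make the last point concrete, here is a small NumPy sketch, independent of Neon, showing the softmax and the fact that its Jacobian can be written purely in terms of the softmax outputs, s_i * (delta_ij - s_j):

    import numpy as np

    def softmax(z):
        e = np.exp(z - np.max(z))   # shift for numerical stability
        return e / e.sum()

    z = np.array([1.0, 2.0])
    s = softmax(z)

    # Jacobian expressed only through the outputs s, not the raw inputs z
    jacobian = np.diag(s) - np.outer(s, s)
    print jacobian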

  • 6

    Select a loss function

    cost = GeneralizedCost(costfunc=SumSquared()) 
    	
    Other cost functions are available in neon.transforms.
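    For example, since CrossEntropyMulti is already imported in step 2 and pairs naturally with a Softmax output layer, swapping it in is a one-line change (this post keeps sum-of-squares; the line below is only an alternative):

    cost = GeneralizedCost(costfunc=CrossEntropyMulti())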
  • 7

    Implement Stochastic Gradient Descent optimizer

    optimizer = GradientDescentMomentum(0.1, momentum_coef=0.2) 
    callbacks = Callbacks(mlp, eval_set=training_data, **args.callback_args) 
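    The Schedule class imported in step 2 can be used to decay the learning rate during training. A minimal sketch (the step epochs and decay factor below are arbitrary, and the schedule keyword is assumed from Neon's optimizer API):

    # halve the learning rate at epochs 20 and 60 (arbitrary choices)
    schedule = Schedule(step_config=[20, 60], change=0.5)
    optimizer = GradientDescentMomentum(0.1, momentum_coef=0.2, schedule=schedule)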
    
    	
  • 8

    Train the Model

    mlp.fit(training_data, optimizer=optimizer, num_epochs=100, cost=cost,callbacks=callbacks) 
    
    	
  • 9

    Get predictions and measure accuracy

    results = mlp.get_outputs(test_data) 
    prediction = results.argmax(1) 
    print "Evaluating the model" 
    error_pct = 100 * mlp.eval(test_data, metric=Misclassification()) 
    print 'Misclassification error = %.1f%%' % error_pct
    
    np.savetxt("predictions.csv", prediction, delimiter=",")
    
    # download the weights if required
    desc = mlp.get_description(get_weights=True) 
    print desc['model']['config']['layers'][0]['params']['W'] 
    print desc['model']['config']['layers'][1]['params']['W'] 
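    As a cross-check, the misclassification error reported by Neon can be recomputed directly from the predictions with NumPy, and the trained parameters can be serialized for reuse (save_params is assumed from Neon's Model API; the 600/800 split matches step 3):

    # recompute accuracy on the held-out rows used for test_data in step 3
    accuracy = np.mean(prediction == Y_data[600:800])
    print 'Accuracy = %.1f%%' % (100 * accuracy)

    # serialize the trained model for later reuse (assumed Neon Model API)
    mlp.save_params("xor_model.p")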
    	

In a follow-up blog post, let's trace end-to-end weight updates with backpropagation using Keras/TensorFlow.
