Neural Network craziness
Neural networks are a cool tool for what amounts to regression of data. While implementing a simple feed-forward, three-layer neural network and trying to get it to learn the majority function (are more than half of the inputs active?), I made a few interesting observations. I still have a lot to learn about neural networks, but in my novice jump into them, I've found that:
- The simple back-propagation algorithm is susceptible to bad starts, at least with a non-linear search space. When it does get off to a bad start, there's no turning back. So far, I've found that the best bet is to restart with random weights.
- In the back-propagation algorithm, decaying the learning rate (alpha) exponentially over time (similar to the cooling schedule in simulated annealing) yields better results than keeping it constant.
- If the sum of the weight changes (deltas) is really large, relatively speaking, then the network will likely suck.
- If the weight changes keep increasing, it will be difficult to turn back. The algorithm has probably fallen into a spiraling cycle.
- Having an imbalance of negative vs. positive examples for a binary output variable for some reason makes the back-propagation algorithm converge more quickly, at least on the majority problem.
- Even though the majority problem can be represented with a single hidden node (actually, just a two-layer network) with a bias of -n/2 and all weights set to 1, the backprop algorithm as I have implemented it fails miserably with just one hidden node and a large number of inputs. It has greater success with several hidden nodes. This could just be an implementation issue, but my implementation follows textbook examples pretty closely.
- Neural networks aren't all that hot for general classification problems, but they're a good complement at times.
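A few of the observations above can be sketched in code. This is a minimal sketch in plain Python, not the original implementation: every function name is hypothetical, and the choices of 5 inputs, 3 hidden nodes, 200 epochs, a starting alpha of 0.5, a decay constant of 0.01, and 3 restarts are illustrative assumptions only. It checks that a single threshold unit with all weights 1 and bias -n/2 computes the majority function, then trains a small sigmoid network with an exponentially decayed alpha and keeps the best of several random restarts.

```python
import itertools
import math
import random


def majority(bits):
    """Target function: 1 if more than half of the inputs are active."""
    return 1 if sum(bits) > len(bits) / 2 else 0


def threshold_unit(bits):
    """Single unit with all weights 1 and bias -n/2, using a hard step."""
    return 1 if sum(1.0 * b for b in bits) - len(bits) / 2 > 0 else 0


def decayed_alpha(alpha0, decay, epoch):
    """Exponentially decayed learning rate: alpha0 * exp(-decay * epoch)."""
    return alpha0 * math.exp(-decay * epoch)


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def forward(hidden, output, x):
    xs = list(x) + [1.0]  # constant 1.0 acts as the bias input
    hs = [sigmoid(sum(w * v for w, v in zip(row, xs))) for row in hidden] + [1.0]
    y = sigmoid(sum(w * v for w, v in zip(output, hs)))
    return xs, hs, y


def train_once(data, n_hidden, epochs, alpha0, decay, rng):
    """One run of online back-propagation from fresh random weights."""
    n_in = len(data[0][0])
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for epoch in range(epochs):
        alpha = decayed_alpha(alpha0, decay, epoch)
        for x, t in data:
            xs, hs, y = forward(hidden, output, x)
            delta_out = (t - y) * y * (1 - y)
            # Update hidden weights first so they use the old output weights.
            for j, row in enumerate(hidden):
                delta_h = delta_out * output[j] * hs[j] * (1 - hs[j])
                for i in range(len(row)):
                    row[i] += alpha * delta_h * xs[i]
            for j in range(len(output)):
                output[j] += alpha * delta_out * hs[j]
    mse = sum((t - forward(hidden, output, x)[2]) ** 2 for x, t in data) / len(data)
    return mse, hidden, output


def train_with_restarts(data, restarts=3, n_hidden=3, epochs=200,
                        alpha0=0.5, decay=0.01, seed=0):
    """Restart from random weights several times and keep the best run."""
    rng = random.Random(seed)
    best, errors = None, []
    for _ in range(restarts):
        result = train_once(data, n_hidden, epochs, alpha0, decay, rng)
        errors.append(result[0])
        if best is None or result[0] < best[0]:
            best = result
    return best, errors


all_inputs = list(itertools.product([0, 1], repeat=5))
unit_matches = all(threshold_unit(x) == majority(x) for x in all_inputs)
data = [(x, majority(x)) for x in all_inputs]
best, errors = train_with_restarts(data)
print("threshold unit matches majority:", unit_matches)
print("mean squared error per restart:", errors)
```

Comparing the per-restart errors is one rough way to see how much the starting weights matter. Setting `decay=0.0` keeps alpha constant, which allows a side-by-side comparison with the decayed schedule.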
2 Comments:
Uh say what?? You lost me at, "back-propagation algorithm".