(For those used to the European date format, that’s Dec. 11, not Nov. 12.)
Today was my first actual work day after the end of NIPS. I spent most of it debugging code. A known issue was that logistic regression was overfitting by sending some of the weights off to infinity, and because of a non-convexity in the search problem this actually led to bad training set accuracy. I fixed this by re-parameterizing things to place an upper bound of 0 on some of the weights.
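A minimal sketch of one such reparameterization (the post doesn't say which mapping was used; `-exp` is just one possible choice that keeps the constrained weights strictly below 0 while letting the optimizer work in an unconstrained space):

```python
import numpy as np

def bounded_weights(u):
    # Map unconstrained parameters u to weights w = -exp(u), so w < 0 always.
    # This is a hypothetical choice; any smooth map into (-inf, 0] would do.
    return -np.exp(u)

def grad_wrt_u(grad_w, u):
    # Chain rule: dL/du = (dL/dw) * (dw/du), with dw/du = -exp(u).
    # Lets a standard unconstrained optimizer (e.g. L-BFGS) handle the bound.
    return grad_w * (-np.exp(u))
```

The optimizer then never sees the constraint explicitly; it just optimizes over `u`, and the weights it induces can never cross 0.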
Then I found a bunch of bugs and fixed them:
1. I was not clearing my cache after parameters changed, so I incorrectly computed some parameter scores.
2. L-BFGS was not always correctly informing other parts of the code when the current point changed.
3. I was accidentally using the same approximate Hessian for multiple L-BFGS instances, which led to lots of numerical instabilities, not to mention just bad output.
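Bug 1 above boils down to cache invalidation. A hypothetical sketch of the fix (class and function names are mine, not from the actual codebase): clear the cache whenever the parameter vector changes, so stale scores are never reused.

```python
import numpy as np

class ScoreCache:
    """Cache of per-item scores, invalidated whenever the parameters change.

    Illustrative sketch of the fix for bug 1; the original code kept cached
    scores across parameter updates, yielding stale values.
    """

    def __init__(self, score_fn):
        self.score_fn = score_fn      # score_fn(params, item) -> float
        self.params = None
        self._params_key = None       # fingerprint of the current params
        self._cache = {}

    def set_params(self, params):
        key = params.tobytes()
        if key != self._params_key:   # parameters changed: drop stale entries
            self._cache.clear()
            self._params_key = key
        self.params = params

    def score(self, item):
        if item not in self._cache:
            self._cache[item] = self.score_fn(self.params, item)
        return self._cache[item]
```

The fingerprint-and-clear step in `set_params` is the part that was missing: without it, `score` keeps returning values computed under old parameters.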
Additionally, at Percy’s suggestion I kept logical forms found during previous iterations around for future optimization iterations; this, together with fixing the above 3 bugs, led to code that was at least able to get 100% training set accuracy on a set of size 5. However, the logical forms getting the right answer were not the “true” logical forms; in particular, they had lots of spurious nodes.

To fix this, I added an L1 penalty to the weights in the log-linear model, but this led to poor test set accuracy (presumably because some of the nodes labelled “spurious” were actually correct, but the model couldn’t tell yet because not enough “good” logical forms were on the top-level beam). So, I removed the L1 penalty for enough iterations to get 100% training accuracy, then added it back in. This seemed to help, but the model still gets lots of the logical forms wrong (despite getting the answer right).

I’m running on a larger training set overnight to see what difference that makes, and will examine the right and wrong answers in more detail tomorrow. There is also still one more known bug that I need to track down (also to be done tomorrow).
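The two-phase penalty schedule described above can be sketched as follows (a hypothetical formulation; the penalty strength and the exact trigger are mine, not from the post): train with no L1 until training accuracy first reaches 100%, then keep the penalty on for all later iterations.

```python
class L1Schedule:
    """Hypothetical two-phase L1 schedule: zero penalty until training
    accuracy first hits 100%, then a fixed penalty thereafter (even if
    accuracy later dips)."""

    def __init__(self, base_penalty=0.1):
        self.base_penalty = base_penalty
        self.reached_full_accuracy = False

    def strength(self, train_acc):
        if train_acc >= 1.0:
            self.reached_full_accuracy = True
        return self.base_penalty if self.reached_full_accuracy else 0.0
```

Keeping the penalty on after the trigger (rather than gating on current accuracy) matches the "then added it back in" step: the goal is to prune spurious nodes once enough good logical forms are on the beam.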