If the Mayans are right, this will be the 4th to last (or perhaps 3rd to last) post on this blog. Hopefully they aren’t right.
Stuff I did today:
1. Examined the output of the scripts I ran over the weekend: a reasonable number of the weights for word->predicate features look correct, although a lot are also wrong, and the weights on predicate<->predicate features look completely wrong. It’s clear that the model is overfitting most of the time. I probably need to regularize throughout training; right now I only regularize at the end, not in the early iterations, because regularizing early kept the model from exploring enough. I think the solution to this is to have a separate model specifically for exploration.
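One way to split the difference would be an iteration-dependent penalty that only kicks in after a warm-up period. A minimal sketch (all names here are hypothetical, not from my actual codebase):

```java
// Hypothetical sketch: delay the L2 penalty until after a warm-up
// period, so early iterations can still explore freely.
public class ScheduledL2 {
    private final double lambda;    // L2 strength once active
    private final int warmupIters;  // iterations with no penalty

    public ScheduledL2(double lambda, int warmupIters) {
        this.lambda = lambda;
        this.warmupIters = warmupIters;
    }

    // Gradient contribution of the penalty for a single weight:
    // zero during warm-up, lambda * w afterwards.
    public double penaltyGradient(double weight, int iter) {
        return iter < warmupIters ? 0.0 : lambda * weight;
    }
}
```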
2. Tried to use Java’s built-in serialization so that I could save/load computation state in a more fine-grained way than I currently can. It turned out to be a terrible idea, and I ended up just writing my own serialization (using the Fig parser to do a lot of the heavy lifting).
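For something like a weight vector, a hand-rolled format really can be this simple — a plain java.io sketch for illustration only (my actual code leans on Fig, and these class/method names are made up):

```java
import java.io.*;

// Minimal hand-rolled serialization for a weight vector, as an
// alternative to ObjectOutputStream: write the length, then the values.
public class WeightIO {
    public static void save(double[] weights, File f) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(f)))) {
            out.writeInt(weights.length);
            for (double w : weights) out.writeDouble(w);
        }
    }

    public static double[] load(File f) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(f)))) {
            double[] weights = new double[in.readInt()];
            for (int i = 0; i < weights.length; i++) {
                weights[i] = in.readDouble();
            }
            return weights;
        }
    }
}
```

The upside over built-in serialization is that the format is explicit and stable: nothing breaks when a class changes shape, and you only persist exactly the state you chose to.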
3. Set up my code on the NLP machines and ran a bunch of jobs in parallel with different early stopping parameters for L-BFGS (which was dominating the runtime before, for reasons that we aren’t sure of yet).
4. Met with Percy to try to debug things. Current issues are: incorrect feature weights, slow convergence of L-BFGS, and overfitting to the current beam. Tomorrow I hope to better understand what is causing each of these (for instance, is L-BFGS just slow because the problem is non-convex, or for some other reason?).