News

The code has been re-factored and integrated into the new repo: https://github.com/taolei87/rcnn/tree/master/code/sentiment

The new repo is recommended: it is more modular and supports more running options, model types, etc.

CNNs with non-linear and non-consecutive feature maps

This repo contains an implementation of CNNs described in the paper Molding CNNs for text: non-linear, non-consecutive convolutions by Tao Lei, Regina Barzilay and Tommi Jaakkola.
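To give a feel for the "non-linear, non-consecutive" feature maps, here is a simplified NumPy sketch of an order-2 (bigram) feature map with exponential decay, written in the spirit of the paper: each position aggregates all bigrams ending there, including non-consecutive ones, with gaps penalized by a decay factor. This is an illustrative reconstruction, not the repo's actual implementation; see model.py and the paper for the exact formulation.

```python
import numpy as np

def ngram_feature_maps(x, W1, W2, decay=0.5):
    """Order-2 non-consecutive feature maps with exponential decay.

    x: (seq_len, d) word vectors; W1, W2: (d, h) filter matrices.
    Each position aggregates all (possibly non-consecutive) bigrams
    ending there, with skipped words penalized by `decay`.
    """
    seq_len = x.shape[0]
    h = W1.shape[1]
    f1 = np.zeros(h)            # running sum of (decayed) unigram features
    f2 = np.zeros(h)            # running sum of (decayed) bigram features
    out = np.empty((seq_len, h))
    for t in range(seq_len):
        # extend bigrams using unigram features from positions < t
        f2 = decay * f2 + f1 * (x[t] @ W2)
        # then start new unigrams at position t
        f1 = decay * f1 + x[t] @ W1
        out[t] = np.tanh(f2)    # non-linearity on the aggregated features
    return out
```

Note that f2 is updated before f1, so a bigram always pairs the current word with a strictly earlier one.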

Dependencies

Data

Results

The directory trained_models contains saved models for the sentiment classification task. We reproduced the results reported in our paper by setting the random seed explicitly. The performance of each trained model is listed below:

Fine-grained models             Dev acc.   Test acc.
stsa.root.fine.pkl.gz           49.5       50.6
stsa.phrases.fine.pkl.gz        53.4       51.2
stsa.phrases.fine.2.pkl.gz **   53.5       52.7

Binary models                   Dev acc.   Test acc.
stsa.root.binary.pkl.gz         87.0       87.0
stsa.phrases.binary.pkl.gz      88.9       88.6

** Note: a more recent run (stsa.phrases.fine.2.pkl.gz) achieves better results than those reported in our paper.
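The .pkl.gz extension suggests the saved models are gzip-compressed pickles. Assuming that is the case, a minimal loading sketch looks like the following; check the repo's own loading code for the exact layout of the object inside.

```python
import gzip
import pickle

def load_trained_model(path):
    """Load a saved model, e.g. 'trained_models/stsa.root.binary.pkl.gz'.

    Assumes the file is a gzip-compressed pickle (implied by the
    .pkl.gz extension); the structure of the returned object is
    defined by the repo's saving code.
    """
    with gzip.open(path, "rb") as f:
        return pickle.load(f)
```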


Usage

Our model is implemented in model.py. The command python model.py --help will list all the parameters and corresponding descriptions.

Here is an example command to train a model on the binary sentiment classification task:

python model.py --embedding word_vectors/stsa.glove.840B.d300.txt.gz  \
    --train data/stsa.binary.phrases.train  \
    --dev data/stsa.binary.dev  --test data/stsa.binary.test  \
    --model output_model

We can optionally specify Theano configs via the THEANO_FLAGS environment variable, set inline on the same line as the command so that it is visible to the Python process:

THEANO_FLAGS='device=cpu,floatX=float32' python model.py ...

Another example with more hyperparameter settings:

export OMP_NUM_THREADS=1    # specify the number of cores

THEANO_FLAGS='device=cpu,floatX=float64' python model.py  \
    --embedding word_vectors/stsa.glove.840B.d300.txt.gz  \
    --train data/stsa.binary.phrases.train  \
    --dev data/stsa.binary.dev  --test data/stsa.binary.test  \
    --model output_model  \
    --depth 3  --order 3  --decay 0.5  --hidden_dim 200  \
    --dropout_rate 0.3  --l2_reg 0.00001  --act relu  \
    --learning adagrad  --learning_rate 0.01
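For reference, the --learning adagrad option refers to the AdaGrad optimizer. A standard-formulation sketch of a single AdaGrad step is below; this is the textbook update, not code copied from the repo, and the repo's implementation may differ in details.

```python
import numpy as np

def adagrad_update(param, grad, accum, lr=0.01, eps=1e-8):
    """One AdaGrad step: the per-parameter learning rate shrinks as
    squared gradients accumulate, so frequently-updated parameters
    receive progressively smaller steps."""
    accum += grad ** 2
    param -= lr * grad / (np.sqrt(accum) + eps)
    return param, accum
```

The lr=0.01 default mirrors the --learning_rate 0.01 setting in the example command above.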