Code for training the transformer models of the 6th-place solution to the Google QUEST Q&A Labeling Kaggle competition.
A detailed description of the solution is posted in the Kaggle discussion section.
Running the code requires the competition data, which can be downloaded from https://www.kaggle.com/c/google-quest-challenge/data.
Models are trained with the train.py script. To reproduce all 4 transformer models, run the following commands:
python train.py -model_name=siamese_roberta && python finetune.py -model_name=siamese_roberta
python train.py -model_name=siamese_bert && python finetune.py -model_name=siamese_bert
python train.py -model_name=siamese_xlnet && python finetune.py -model_name=siamese_xlnet
python train.py -model_name=double_albert && python finetune.py -model_name=double_albert
The notebooks folder contains two notebooks. stacking.ipynb
implements our weighted-ensembling and post-processing grid search, and oof_cvs.ipynb
shows the CV scores of our models under various settings (e.g. ignoring hard targets or ignoring duplicate question rows).
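The weighted-ensembling grid search blends out-of-fold predictions from several models and picks the blend weights with the best validation score. The sketch below illustrates the idea for two models; the function names, the two-model restriction, and the use of mean column-wise Spearman correlation (the competition metric) are assumptions for illustration, not the notebook's exact code.

```python
import numpy as np
from scipy.stats import spearmanr

def blend_score(weights, preds, targets):
    """Mean column-wise Spearman correlation of the weighted blend."""
    blend = sum(w * p for w, p in zip(weights, preds))
    scores = [spearmanr(blend[:, i], targets[:, i]).correlation
              for i in range(targets.shape[1])]
    return float(np.mean(scores))

def grid_search_weights(preds, targets, step=0.1):
    """Exhaustive search over blend weights (w, 1 - w) for two models."""
    best_w, best_score = 0.0, -1.0
    for w in np.arange(0.0, 1.0 + 1e-9, step):
        score = blend_score([w, 1.0 - w], preds, targets)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score
```

With more than two models the same search runs over a grid of weight vectors that sum to one; the actual notebook may also fold the post-processing step into the same search.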