Conversation

@niklasmei
Collaborator

This adds a module for unsupervised pretraining using BERT-style mask prediction.
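For context, a minimal sketch of what BERT-style mask prediction over sequence features could look like. The class, attribute names and loss choice here are illustrative placeholders, not the module's actual API; it assumes a PyTorch encoder mapping `(batch, seq_len, feature_dim)` to `(batch, seq_len, latent_dim)`:

```python
import torch
import torch.nn as nn


class MaskedPretrainer(nn.Module):
    """Sketch: wraps an encoder and reconstructs randomly masked sequence elements."""

    def __init__(self, encoder: nn.Module, feature_dim: int, latent_dim: int,
                 mask_prob: float = 0.15):
        super().__init__()
        self.encoder = encoder  # assumed to map (B, L, F) -> (B, L, D)
        self.mask_prob = mask_prob
        # Learned embedding that replaces the features of masked elements.
        self.mask_token = nn.Parameter(torch.zeros(feature_dim))
        # Head that predicts the original features from the latent view.
        self.reconstruction_head = nn.Linear(latent_dim, feature_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, feature_dim)
        mask = torch.rand(x.shape[:2], device=x.device) < self.mask_prob
        x_masked = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        latent = self.encoder(x_masked)                # (batch, seq_len, latent_dim)
        prediction = self.reconstruction_head(latent)
        # The reconstruction loss is only taken over the masked positions.
        return nn.functional.mse_loss(prediction[mask], x[mask])
```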

I originally wrote it to pretrain a model that returns a latent view of the input sequence along with a single vector representing the sequence as a whole. This was because I used a model based on DeepIce, which provides both the cls-token and the processed sequence. Some variable names still reflect this original use.

In the version here, providing a vector that summarizes the input data is optional; it is only used to predict a summary feature (by default, the total charge in an event).

It is important that the model to be pretrained does not change the number of sequence elements, beyond optionally providing an additional summary vector such as a cls-token. Other than that, this pretraining module should be agnostic to the model being pretrained; a rough sketch of both points follows below.
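Here is a rough sketch of how the optional summary vector and the sequence-length requirement could fit together. Names and the total-charge target are placeholders; it assumes the wrapped model returns either the processed sequence alone or a `(sequence, summary_vector)` pair, e.g. from a cls-token:

```python
import torch
import torch.nn as nn


class SummaryChargeHead(nn.Module):
    """Sketch: predicts a scalar summary feature (e.g. total charge in an event)
    from an optional summary vector returned by the wrapped model."""

    def __init__(self, model: nn.Module, latent_dim: int):
        super().__init__()
        self.model = model
        self.summary_head = nn.Linear(latent_dim, 1)

    def forward(self, x: torch.Tensor, total_charge: torch.Tensor) -> torch.Tensor:
        out = self.model(x)
        # The wrapped model may return the processed sequence alone, or a
        # (sequence, summary_vector) pair, e.g. via a cls-token.
        sequence, summary = out if isinstance(out, tuple) else (out, None)
        # The model must not change the number of sequence elements; an extra
        # summary vector is only allowed through the tuple return above.
        assert sequence.shape[1] == x.shape[1], "sequence length must be preserved"
        if summary is None:
            return torch.zeros((), device=x.device)
        return nn.functional.mse_loss(self.summary_head(summary).squeeze(-1), total_charge)
```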
