Train a basic LDA model using the NIPS corpus #8

Train an LDA model on the corpus from the NIPS dataset.

Implementations of LDA already exist:
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html
https://radimrehurek.com/gensim/models/ldamodel.html

Create scripts (under src/papers/models/) that expose a function which uses one of these packages to train a model on a corpus passed in as a parameter.
Also expose a function for extracting topics from new, unseen documents.
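
As a starting point, the two functions could look roughly like the sketch below. It assumes the scikit-learn implementation and a corpus given as an iterable of plain strings; the names `train_lda` and `extract_topics`, the default of 10 topics, and the use of `CountVectorizer` are illustrative choices, not part of the task description.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer


def train_lda(corpus, n_topics=10, random_state=0):
    """Fit a bag-of-words vectorizer and an LDA model on an iterable of strings.

    Hypothetical helper: the name and defaults are assumptions.
    """
    vectorizer = CountVectorizer()
    counts = vectorizer.fit_transform(corpus)  # document-term count matrix
    model = LatentDirichletAllocation(
        n_components=n_topics, random_state=random_state
    )
    model.fit(counts)
    return vectorizer, model


def extract_topics(vectorizer, model, documents):
    """Return one topic distribution per document (each row sums to 1)."""
    # Reuse the vocabulary learned during training; unknown words are ignored.
    counts = vectorizer.transform(documents)
    return model.transform(counts)
```

The vectorizer is returned together with the model because unseen documents have to be mapped onto the same vocabulary that was used for training.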

Create a notebook for the process: load the NIPS corpus and call the train and predict functions. Remember to split the dataset before training, and test the prediction part on the held-out, unseen documents.
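
One way to do the split in the notebook is scikit-learn's `train_test_split`. This is a minimal sketch; the wrapper name `split_corpus` and the 80/20 ratio are assumptions, and the documents are assumed to be held in a list-like of strings.

```python
from sklearn.model_selection import train_test_split


def split_corpus(documents, test_size=0.2, random_state=0):
    """Split documents into a training part and a held-out part.

    Hypothetical helper: name and default ratio are assumptions.
    """
    train_docs, test_docs = train_test_split(
        list(documents), test_size=test_size, random_state=random_state
    )
    return train_docs, test_docs
```

Fixing `random_state` keeps the split reproducible between notebook runs, so the same documents stay unseen each time.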

The notebook should print the topics extracted from the preprocessed documents alongside those extracted from the unprocessed ones, so the two can be compared.
