Preparing documents and labels

Please read Preparation for details on reading documents and creating a list of keywords. We use the bills data we prepared there (documents and keywords).

library(keyATM)
library(quanteda)

data(keyATM_data_bills)
bills_keywords <- keyATM_data_bills$keywords
bills_dfm <- keyATM_data_bills$doc_dfm  # quanteda dfm object
keyATM_docs <- keyATM_read(bills_dfm)

The Label model needs a vector of labels with one element per document. Each value of the vector identifies a topic. If a document does not have a label, the corresponding element of the label vector should be NA.

labels_use <- keyATM_data_bills$labels
labels_use
##   [1] NA NA NA NA NA NA  3 NA NA  3 NA  3 NA NA NA NA  2 NA NA NA NA NA NA NA NA
##  [26] NA NA  2 NA NA NA  3 NA NA NA NA NA NA  3 NA NA  3 NA NA  1 NA NA NA NA NA
##  [51] NA NA NA NA  3 NA NA NA  3  1 NA NA NA NA  3 NA NA NA NA NA  1 NA  3 NA NA
##  [76] NA NA NA  3 NA  2 NA  3 NA NA  2  3 NA NA NA NA NA NA NA NA  1 NA NA NA NA
## [101] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  2 NA NA NA  2 NA NA NA NA  1
## [126]  3 NA  1 NA NA  1 NA NA NA NA  3 NA NA NA NA
length(labels_use)  # should be the same as the number of documents
## [1] 140

Please make sure that the labels are in the same order as the documents.
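When labels are stored separately from the documents, they usually have to be reordered before use. The following is a minimal sketch in base R, not part of keyATM: the document IDs, the `coded` data frame, and `n_keyword_topics` are hypothetical, and the range check assumes that label k refers to the k-th entry of the keyword list.

```r
# Hypothetical document IDs, in the same order as the documents passed to the model
doc_ids <- c("bill_01", "bill_02", "bill_03", "bill_04", "bill_05")

# Hypothetical hand-coded labels: only some documents are coded,
# and they are stored in a different order than the documents
coded <- data.frame(
  id    = c("bill_04", "bill_02"),
  label = c(3L, 1L)
)

# match() gives, for each document, its row in `coded` (NA if uncoded),
# so indexing by it returns labels in document order with NA elsewhere
labels_aligned <- coded$label[match(doc_ids, coded$id)]
labels_aligned

# Sanity checks before fitting (assumption: labels index keyword topics)
n_keyword_topics <- 4L  # hypothetical; use length() of your keyword list
stopifnot(
  length(labels_aligned) == length(doc_ids),
  all(labels_aligned %in% c(NA, seq_len(n_keyword_topics)))
)

# How many documents carry each label, including unlabeled ones
table(labels_aligned, useNA = "ifany")
```

Building the vector with match() keeps unlabeled documents as NA automatically, which is the form the label vector is expected to take.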

Fitting the model

out <- keyATM(
              docs              = keyATM_docs,                # text input
              no_keyword_topics = 3,                          # number of topics without keywords
              keywords          = bills_keywords,             # keywords
              model             = "label",                    # select the model
              model_settings    = list(labels = labels_use),  # set labels
              options           = list(seed = 250)
             )
## Label model is an experimental function.

Saving the model

Once you fit the model, you can save it with save() for replication. This works the same way as in the Base model.
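The save-and-reload round trip can be sketched with base R alone. This is an illustration under assumptions, not the package's documented workflow: `fit_stub` is a hypothetical stand-in for the fitted object `out`, the file name is made up, and saveRDS()/readRDS() are base R functions used here in place of keyATM's own save().

```r
# Hypothetical stand-in for a fitted model object such as `out`
fit_stub <- list(model = "label", seed = 250)

# Write to a temporary location; choose a permanent path in real use
path <- file.path(tempdir(), "label_model.rds")
saveRDS(fit_stub, file = path)

# Later, or in another session, read the object back
fit_loaded <- readRDS(path)

# The reloaded object should be identical to the one that was saved
identical(fit_stub, fit_loaded)
```

Keeping the seed used for fitting next to the saved file makes it easier to refit the same model later.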

Checking top words

top_words(out)

##      1_Education       2_Law     3_Health        4_Drug       Other_1
## 1  education [✓]     law [✓]   health [✓]      drug [✓]    commission
## 2         school   court [✓]         care       project      facility
## 3    educational      action   public [✓]         grant    electronic
## 4          grant        code         date       control   information
## 5    student [✓] enforcement subparagraph administrator communication
## 6          local      person       report        report          rule
## 7       describe     chapter     describe           air         level
## 8        library    district       senate    assistance       require
## 9           part       crime   individual     substance         party
## 10        follow       civil       follow         abuse         water
##        Other_2      Other_3
## 1     research     wildlife
## 2     national   management
## 3       center         land
## 4   technology     national
## 5     director      coastal
## 6     standard conservation
## 7    establish         fish
## 8     activity       refuge
## 9  development     resource
## 10   committee       system