Please read Preparation for how to read documents and create a list of keywords. We use the bills data we prepared (documents and keywords).
library(keyATM)
library(quanteda)
data(keyATM_data_bills)
bills_keywords <- keyATM_data_bills$keywords
bills_dfm <- keyATM_data_bills$doc_dfm # quanteda dfm object
keyATM_docs <- keyATM_read(bills_dfm)
The Label model needs a vector of labels. The values of the vector represent
topics. If a document does not have a label, the corresponding element
of the label vector should be NA.
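If your labels start out as topic names rather than integers, a label vector of this shape can be built with base R. The following is a minimal sketch, not part of keyATM: the object names `doc_labels`, `topic_names`, and `labels_example` are hypothetical, and it assumes that label k refers to the k-th keyword topic.

```r
# Hypothetical hand-coded labels for six documents; NA marks unlabeled documents.
doc_labels <- c(NA, "Health", NA, "Education", "Law", NA)

# Keyword topics, in the order assumed to match the keyword list.
topic_names <- c("Education", "Law", "Health", "Drug")

# match() returns the position of each label in topic_names and keeps NA as NA.
labels_example <- match(doc_labels, topic_names)
labels_example  # NA 3 NA 1 2 NA
```

The result has one element per document, with an integer topic index where a label exists and NA elsewhere.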
labels_use <- keyATM_data_bills$labels
labels_use
## [1] NA NA NA NA NA NA 3 NA NA 3 NA 3 NA NA NA NA 2 NA NA NA NA NA NA NA NA
## [26] NA NA 2 NA NA NA 3 NA NA NA NA NA NA 3 NA NA 3 NA NA 1 NA NA NA NA NA
## [51] NA NA NA NA 3 NA NA NA 3 1 NA NA NA NA 3 NA NA NA NA NA 1 NA 3 NA NA
## [76] NA NA NA 3 NA 2 NA 3 NA NA 2 3 NA NA NA NA NA NA NA NA 1 NA NA NA NA
## [101] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2 NA NA NA 2 NA NA NA NA 1
## [126] 3 NA 1 NA NA 1 NA NA NA NA 3 NA NA NA NA
length(labels_use) # should be the same as the number of documents
## [1] 140
Please make sure that the labels are in the same order as the documents.
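One way to guarantee matching order is to look labels up by document ID instead of relying on position. The sketch below uses base R only. The IDs and the `label_table` data frame are hypothetical stand-ins; in practice the IDs might come from something like `docnames()` on the quanteda dfm.

```r
# Hypothetical document IDs, in the order the documents are passed to the model.
doc_ids <- c("bill_01", "bill_02", "bill_03", "bill_04")

# Hypothetical label table, stored in a different order and covering only some documents.
label_table <- data.frame(
  id    = c("bill_03", "bill_01"),
  label = c(2L, 3L)
)

# Look up each document's label by ID; documents without a label get NA.
labels_aligned <- label_table$label[match(doc_ids, label_table$id)]

# The label vector must have exactly one element per document.
stopifnot(length(labels_aligned) == length(doc_ids))
labels_aligned  # 3 NA 2 NA
```

Because the lookup is keyed on IDs, reordering or subsetting the documents does not silently shift labels onto the wrong documents.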
out <- keyATM(
  docs              = keyATM_docs,                  # text input
  no_keyword_topics = 3,                            # number of topics without keywords
  keywords          = bills_keywords,               # keywords
  model             = "label",                      # select the model
  model_settings    = list(labels = labels_use),    # set labels
  options           = list(seed = 250)
)
## Label model is an experimental function.
Once you fit the model, you can save the model with save() for replication. This is the same as the Base model.
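To illustrate the general idea of storing a fitted object and restoring it later, here is a generic base R round trip with `saveRDS()` and `readRDS()`. It is a sketch only: `fit_stub` is a hypothetical stand-in list, not a real keyATM output, and the Base model page remains the reference for how keyATM's own `save()` is called.

```r
# Stand-in for a fitted model object (hypothetical; not a real keyATM output).
fit_stub <- list(model = "label", seed = 250)

# Write the object to a temporary file and read it back.
path <- tempfile(fileext = ".rds")
saveRDS(fit_stub, file = path)
restored <- readRDS(path)

# The restored object is identical to the original.
identical(fit_stub, restored)  # TRUE
```

Keeping the seed with the saved object, as in the `options` list above, makes it easier to rerun the same fit later.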
top_words(out)
##     1_Education    2_Law        3_Health      4_Drug         Other_1
## 1   education [✓]  law [✓]      health [✓]    drug [✓]       commission
## 2   school         court [✓]    care          project        electronic
## 3   grant          code         date          control        facility
## 4   educational    action       public [✓]    administrator  information
## 5   student [✓]    person       subparagraph  grant          communication
## 6   local          enforcement  report        air            party
## 7   public [3]     district     congress      assistance     level
## 8   describe       chapter      follow        water          waste
## 9   library        crime        describe      substance      compact
## 10  eligible       term         individual    report         low
##     Other_2      Other_3
## 1   research     management
## 2   national     wildlife
## 3   center       land
## 4   technology   coastal
## 5   director     conservation
## 6   activity     national
## 7   establish    fish
## 8   standard     system
## 9   development  refuge
## 10  committee    resource