
Commit 49f5fad

Update docs
1 parent d9d0119 · commit 49f5fad

7 files changed (112 additions, 72 deletions)


README.md

Lines changed: 8 additions & 8 deletions
@@ -89,13 +89,13 @@ The following configuration options can be passed both directly to a new [Model]
 
 #### Options
 
-| Property | Description | Default |
-| --- | --- | --- |
-| **nGramMin** | Minimum n-gram size | `1` |
-| **nGramMax** | Maximum n-gram size | `1` |
-| **minimumConfidence** | Minimum confidence required for predictions | `0.2` |
-| **vocabulary** | Terms mapped to indexes in the model data entries, set to `false` to store terms directly in the data entries | `[]` |
-| **data** | Object literal containing all training data | `{}` |
+| Property | Type | Default | Description |
+| --- | --- | --- | --- |
+| **nGramMin** | `int` | `1` | Minimum n-gram size |
+| **nGramMax** | `int` | `1` | Maximum n-gram size |
+| **minimumConfidence** | `int` \| `float` | `0.2` | Minimum confidence required for predictions |
+| **vocabulary** | `Array` \| `Set` \| `false` | `[]` | Terms mapped to indexes in the model data, set to `false` to store terms directly in the data entries |
+| **data** | `Object` | `{}` | Key-value store of labels and training data vectors |
 
 ### Using n-grams
 
@@ -113,7 +113,7 @@ const classifier = new Classifier({
   nGramMax: 2
 })
 
-let tokens = tokenize('I really dont like it')
+let tokens = classifier.tokenize('I really dont like it')
 
 console.log(tokens)
 ```
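For reference, the corrected n-gram example can be run roughly as follows. This is a minimal sketch: it assumes `Classifier` is the default export of `src/classifier.js`, and the exact token counts shown in the comment depend on the tokenizer itself.

```js
import Classifier from './src/classifier'

// nGramMax: 2 makes tokenize() count bigrams as well as single words
const classifier = new Classifier({
  nGramMin: 1,
  nGramMax: 2
})

let tokens = classifier.tokenize('I really dont like it')

// Expected shape: unique n-grams mapped to their occurrence counts, e.g.
// { i: 1, really: 1, dont: 1, like: 1, it: 1, 'i really': 1, 'really dont': 1, ... }
console.log(tokens)
```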

docs/classifier.md

Lines changed: 58 additions & 14 deletions
@@ -7,42 +7,86 @@
 * [.model](#Classifier+model) : <code>Model</code>
 * [.train([input], label)](#Classifier+train) ⇒ <code>this</code>
 * [.predict(input, [maxMatches], [minimumConfidence])](#Classifier+predict) ⇒ <code>Array</code>
+* [.splitWords(input)](#Classifier+splitWords) ⇒ <code>Array</code>
+* [.tokenize(input)](#Classifier+tokenize) ⇒ <code>Object</code>
+* [.vectorize(tokens)](#Classifier+vectorize) ⇒ <code>Object</code>
+* [.cosineSimilarity(v1, v2)](#Classifier+cosineSimilarity) ⇒ <code>float</code>
 
 <a name="new_Classifier_new"></a>
 
 ### new Classifier([model])
 
 | Param | Type | Default | Description |
 | --- | --- | --- | --- |
-| [model] | <code>Model</code> \| <code>Object</code> | | |
-| [model.nGramMin] | <code>int</code> | <code>1</code> | Minimum n-gram size |
-| [model.nGramMax] | <code>int</code> | <code>1</code> | Maximum n-gram size |
-| [model.minimumConfidence] | <code>int</code> \| <code>float</code> | <code>0.2</code> | Minimum confidence required for predictions |
-| [model.vocabulary] | <code>Array</code> \| <code>Set</code> \| <code>false</code> | <code>[]</code> | Terms mapped to indexes in the model data entries, set to false to store terms directly in the data entries |
-| [model.data] | <code>int</code> | <code>{}</code> | Key-value store containing all training data |
+| [model] | `Model` \| `Object` | | |
+| [model.nGramMin] | `int` | `1` | Minimum n-gram size |
+| [model.nGramMax] | `int` | `1` | Maximum n-gram size |
+| [model.minimumConfidence] | `int` \| `float` | `0.2` | Minimum confidence required for predictions |
+| [model.vocabulary] | `Array` \| `Set` \| `false` | `[]` | Terms mapped to indexes in the model data, set to `false` to store terms directly in the data entries |
+| [model.data] | `Object` | `{}` | Key-value store of labels and training data vectors |
 
 <a name="Classifier+model"></a>
 
-### classifier.model : <code>Model</code>
+### classifier.model : `Model`
 Model instance
 
 <a name="Classifier+train"></a>
 
-### classifier.train([input], label) ⇒ <code>this</code>
+### classifier.train([input], label) ⇒ `this`
 Train the current model using an input string (or array of strings) and a corresponding label
 
 | Param | Type | Description |
 | --- | --- | --- |
-| [input] | <code>string</code> \| <code>Array.&lt;string&gt;</code> | String, or an array of strings |
-| label | <code>string</code> | Corresponding label |
+| input | `string` \| `Array` | String, or an array of strings |
+| label | `string` | Corresponding label |
 
 <a name="Classifier+predict"></a>
 
-### classifier.predict(input, [maxMatches], [minimumConfidence]) ⇒ <code>Array</code>
+### classifier.predict(input, [maxMatches], [minimumConfidence]) ⇒ `Array`
 Return an array of one or more Prediction instances
 
 | Param | Type | Default | Description |
 | --- | --- | --- | --- |
-| input | <code>string</code> | | Input string to make a prediction from |
-| [maxMatches] | <code>int</code> | <code>1</code> | Maximum number of predictions to return |
-| [minimumConfidence] | <code>float</code> | <code>null</code> | Minimum confidence required to include a prediction |
+| input | `string` | | Input string to make a prediction from |
+| [maxMatches] | `int` | `1` | Maximum number of predictions to return |
+| [minimumConfidence] | `float` | `null` | Minimum confidence required to include a prediction |
+
+<a name="Classifier+splitWords"></a>
+
+### classifier.splitWords(input) ⇒ `Array`
+Split a string into an array of lowercase words, with all non-letter characters removed
+
+| Param | Type |
+| --- | --- |
+| input | `string` |
+
+<a name="Classifier+tokenize"></a>
+
+### classifier.tokenize(input) ⇒ `Object`
+Create an object literal of unique tokens (n-grams) as keys, and their
+respective occurrences as values based on an input string, or array of words
+
+| Param | Type |
+| --- | --- |
+| input | `string` \| `Array` |
+
+<a name="Classifier+vectorize"></a>
+
+### classifier.vectorize(tokens) ⇒ `Object`
+Convert a tokenized object into a new object with all keys (terms)
+translated to their index in the vocabulary (adding all terms to
+the vocabulary that do not already exist)
+
+| Param | Type |
+| --- | --- |
+| tokens | `Object` |
+
+<a name="Classifier+cosineSimilarity"></a>
+
+### classifier.cosineSimilarity(v1, v2) ⇒ `float`
+Return the cosine similarity between two vectors
+
+| Param | Type |
+| --- | --- |
+| v1 | `Object` |
+| v2 | `Object` |
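As a quick orientation to the methods documented here, the following sketch runs the full train/predict cycle. The labels and sample phrases are invented for illustration, and `Classifier` is assumed to be the default export of `src/classifier.js`.

```js
import Classifier from './src/classifier'

const classifier = new Classifier()

// train() accepts a string or an array of strings plus a label
classifier.train(['Great service, really friendly staff'], 'positive')
classifier.train(['Terrible experience, would not recommend'], 'negative')

// predict() returns an array of Prediction instances
const predictions = classifier.predict('Really great experience', 1)

predictions.forEach((prediction) => {
  // Each Prediction exposes a label and a confidence (see docs/prediction.md)
  console.log(prediction.label, prediction.confidence)
})

// The lower-level steps are also public now: splitWords -> tokenize -> vectorize
const vector = classifier.vectorize(classifier.tokenize('great service'))
```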

docs/model.md

Lines changed: 18 additions & 18 deletions
@@ -4,54 +4,54 @@
 
 * [Model](#Model)
 * [new Model([config])](#new_Model_new)
-* [.nGramMin](#Model+nGramMin) : <code>number</code>
-* [.nGramMax](#Model+nGramMax) : <code>number</code>
-* [.minimumConfidence](#Model+minimumConfidence) : <code>number</code>
-* [.vocabulary](#Model+vocabulary) : <code>Vocabulary</code> \| <code>false</code>
-* [.data](#Model+data) : <code>Object</code>
-* [.serialize()](#Model+serialize) ⇒ <code>Object</code>
+* [.nGramMin](#Model+nGramMin) : `int`
+* [.nGramMax](#Model+nGramMax) : `int`
+* [.minimumConfidence](#Model+minimumConfidence) : `float`
+* [.vocabulary](#Model+vocabulary) : `Vocabulary` \| `false`
+* [.data](#Model+data) : `Object`
+* [.serialize()](#Model+serialize) ⇒ `Object`
 
 <a name="new_Model_new"></a>
 
 ### new Model([config])
 
 | Param | Type | Default | Description |
 | --- | --- | --- | --- |
-| [config] | <code>Object</code> | | |
-| [config.nGramMin] | <code>int</code> | <code>1</code> | Minimum n-gram size |
-| [config.nGramMax] | <code>int</code> | <code>1</code> | Maximum n-gram size |
-| [config.minimumConfidence] | <code>int</code> \| <code>float</code> | <code>0.2</code> | Minimum confidence required for predictions |
-| [config.vocabulary] | <code>Array</code> \| <code>Set</code> \| <code>false</code> | <code>[]</code> | Terms mapped to indexes in the model data entries, set to false to store terms directly in the data entries |
-| [config.data] | <code>Object</code> | <code>{}</code> | Key-value store containing all training data |
+| [config] | `Object` | | |
+| [config.nGramMin] | `int` | `1` | Minimum n-gram size |
+| [config.nGramMax] | `int` | `1` | Maximum n-gram size |
+| [config.minimumConfidence] | `int` \| `float` | `0.2` | Minimum confidence required for predictions |
+| [config.vocabulary] | `Array` \| `Set` \| `false` | `[]` | Terms mapped to indexes in the model data entries, set to false to store terms directly in the data entries |
+| [config.data] | `Object` | `{}` | Key-value store containing all training data |
 
 <a name="Model+nGramMin"></a>
 
-### model.nGramMin : <code>number</code>
+### model.nGramMin : `int`
 Minimum n-gram size
 
 <a name="Model+nGramMax"></a>
 
-### model.nGramMax : <code>number</code>
+### model.nGramMax : `int`
 Maximum n-gram size
 
 <a name="Model+minimumConfidence"></a>
 
-### model.minimumConfidence : <code>number</code>
+### model.minimumConfidence : `float`
 Minimum confidence required for predictions
 
 <a name="Model+vocabulary"></a>
 
-### model.vocabulary : <code>Vocabulary</code> \| <code>false</code>
+### model.vocabulary : `Vocabulary` \| `false`
 Vocabulary instance
 
 <a name="Model+data"></a>
 
-### model.data : <code>Object</code>
+### model.data : `Object`
 Model data
 
 <a name="Model+serialize"></a>
 
-### model.serialize() ⇒ <code>Object</code>
+### model.serialize() ⇒ `Object`
 Return the model in its current state for storing, including the configured
 n-gram min/max values, the minimum confidence required for predictions,
 the vocabulary as an array (if any, otherwise false), and an object literal
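A brief sketch of how `serialize()` could be used to persist and later restore a trained model. Storing the result with `JSON.stringify` is an assumption for illustration, not something this commit prescribes, and `Classifier` is assumed to be the default export of `src/classifier.js`.

```js
import Classifier from './src/classifier'

const classifier = new Classifier({ nGramMax: 2 })
classifier.train('Fast shipping and great support', 'positive')

// serialize() returns a plain object: the n-gram min/max, the minimum
// confidence, the vocabulary as an array (or false), and the training data
const stored = classifier.model.serialize()
const json = JSON.stringify(stored)

// Later: rebuild a classifier from the stored state, since the constructor
// accepts a plain config/model object as well as a Model instance
const restored = new Classifier(JSON.parse(json))
```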

docs/prediction.md

Lines changed: 4 additions & 4 deletions
@@ -3,15 +3,15 @@
 ## Prediction
 
 * [Prediction](#Prediction)
-* [.label](#Prediction+label) : <code>string</code>
-* [.confidence](#Prediction+confidence) : <code>number</code>
+* [.label](#Prediction+label) : `string`
+* [.confidence](#Prediction+confidence) : `number`
 
 <a name="Prediction+label"></a>
 
-### prediction.label : <code>string</code>
+### prediction.label : `string`
 Label of the prediction
 
 <a name="Prediction+confidence"></a>
 
-### prediction.confidence : <code>number</code>
+### prediction.confidence : `number`
 Confidence of the prediction

docs/vocabulary.md

Lines changed: 17 additions & 17 deletions
@@ -4,64 +4,64 @@
 
 * [Vocabulary](#Vocabulary)
 * [new Vocabulary(terms)](#new_Vocabulary_new)
-* [.size](#Vocabulary+size) : <code>number</code>
-* [.terms](#Vocabulary+terms) : <code>Array</code> \| <code>Set</code>
-* [.add(terms)](#Vocabulary+add) ⇒ <code>this</code>
-* [.remove(terms)](#Vocabulary+remove) ⇒ <code>this</code>
-* [.has(term)](#Vocabulary+has) ⇒ <code>bool</code>
-* [.indexOf(term)](#Vocabulary+indexOf) ⇒ <code>number</code>
+* [.size](#Vocabulary+size) : `number`
+* [.terms](#Vocabulary+terms) : `Array` \| `Set`
+* [.add(terms)](#Vocabulary+add) ⇒ `this`
+* [.remove(terms)](#Vocabulary+remove) ⇒ `this`
+* [.has(term)](#Vocabulary+has) ⇒ `bool`
+* [.indexOf(term)](#Vocabulary+indexOf) ⇒ `number`
 
 <a name="new_Vocabulary_new"></a>
 
 ### new Vocabulary(terms)
 
 | Param | Type |
 | --- | --- |
-| terms | <code>Array</code> \| <code>Set</code> |
+| terms | `Array` \| `Set` |
 
 <a name="Vocabulary+size"></a>
 
-### vocabulary.size : <code>number</code>
+### vocabulary.size : `number`
 Vocabulary size
 
 <a name="Vocabulary+terms"></a>
 
-### vocabulary.terms : <code>Array</code> \| <code>Set</code>
+### vocabulary.terms : `Array` \| `Set`
 Vocabulary terms
 
 <a name="Vocabulary+add"></a>
 
-### vocabulary.add(terms) ⇒ <code>this</code>
+### vocabulary.add(terms) ⇒ `this`
 Add one or more terms to the vocabulary
 
 | Param | Type |
 | --- | --- |
-| terms | <code>string</code> \| <code>Array</code> \| <code>Set</code> |
+| terms | `string` \| `Array` \| `Set` |
 
 <a name="Vocabulary+remove"></a>
 
-### vocabulary.remove(terms) ⇒ <code>this</code>
+### vocabulary.remove(terms) ⇒ `this`
 Remove one or more terms from the vocabulary
 
 | Param | Type |
 | --- | --- |
-| terms | <code>string</code> \| <code>Array</code> \| <code>Set</code> |
+| terms | `string` \| `Array` \| `Set` |
 
 <a name="Vocabulary+has"></a>
 
-### vocabulary.has(term) ⇒ <code>bool</code>
+### vocabulary.has(term) ⇒ `bool`
 Return whether the vocabulary contains a certain term
 
 | Param | Type |
 | --- | --- |
-| term | <code>string</code> |
+| term | `string` |
 
 <a name="Vocabulary+indexOf"></a>
 
-### vocabulary.indexOf(term) ⇒ <code>number</code>
+### vocabulary.indexOf(term) ⇒ `number`
 Return the index of a term in the vocabulary (returns -1 if not found)
 
 | Param | Type |
 | --- | --- |
-| term | <code>string</code> |
+| term | `string` |
 
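A short sketch of the Vocabulary API documented above. It assumes `Vocabulary` is the default export of a `src/vocabulary.js` module (that file is not part of this diff), and the terms are arbitrary examples; the indexes shown depend on insertion order.

```js
import Vocabulary from './src/vocabulary'

const vocabulary = new Vocabulary(['good', 'bad'])

vocabulary.add(['great', 'terrible'])

console.log(vocabulary.size)               // 4
console.log(vocabulary.has('great'))       // true
console.log(vocabulary.indexOf('great'))   // 2 (insertion order assumed)
console.log(vocabulary.indexOf('missing')) // -1

vocabulary.remove('terrible')
```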

src/classifier.js

Lines changed: 4 additions & 8 deletions
@@ -3,12 +3,12 @@ import Model from './model'
 import Prediction from './prediction'
 
 /**
- * @param {Model|Object} [model]
+ * @param {(Model|Object)} [model]
  * @param {int} [model.nGramMin=1] - Minimum n-gram size
  * @param {int} [model.nGramMax=1] - Maximum n-gram size
  * @param {(int|float)} [model.minimumConfidence=0.2] - Minimum confidence required for predictions
  * @param {(Array|Set|false)} [model.vocabulary=[]] - Terms mapped to indexes in the model data entries, set to false to store terms directly in the data entries
- * @param {int} [model.data={}] - Key-value store containing all training data
+ * @param {Object} [model.data={}] - Key-value store containing all training data
  * @constructor
  */
 class Classifier {
@@ -144,7 +144,6 @@ class Classifier {
    *
    * @param {string} input
    * @return {Array}
-   * @private
    */
  splitWords(input) {
    if (typeof input !== 'string') {
@@ -166,7 +165,6 @@ class Classifier {
    *
    * @param {(string|string[])} input
    * @return {Object}
-   * @private
    */
  tokenize(input) {
    let words = typeof input === 'string' ? this.splitWords(input) : input
@@ -210,9 +208,8 @@ class Classifier {
    * translated to their index in the vocabulary (adding all terms to
    * the vocabulary that do not already exist)
    *
-   * @param {object} tokens
-   * @return {object}
-   * @private
+   * @param {Object} tokens
+   * @return {Object}
    */
  vectorize(tokens) {
    if (!(tokens instanceof Object) || tokens.constructor !== Object) {
@@ -247,7 +244,6 @@ class Classifier {
    * @param {Object} v1
    * @param {Object} v2
    * @return {float}
-   * @private
    */
  cosineSimilarity(v1, v2) {
    if (!(v1 instanceof Object) || v1.constructor !== Object) {
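For context on what `cosineSimilarity(v1, v2)` computes over these `{ index: count }` vector objects, here is an illustrative standalone sketch of the formula, not the library's actual implementation:

```js
// Illustrative only: cosine similarity between two sparse vectors
// represented as plain objects mapping term index to occurrence count
function cosineSimilarity(v1, v2) {
  let dotProduct = 0
  let norm1 = 0
  let norm2 = 0

  for (const key of Object.keys(v1)) {
    norm1 += v1[key] * v1[key]

    if (key in v2) {
      dotProduct += v1[key] * v2[key]
    }
  }

  for (const key of Object.keys(v2)) {
    norm2 += v2[key] * v2[key]
  }

  // Empty vectors (or vectors with no shared terms) yield 0
  if (norm1 === 0 || norm2 === 0) {
    return 0
  }

  return dotProduct / (Math.sqrt(norm1) * Math.sqrt(norm2))
}

console.log(cosineSimilarity({ 0: 1, 1: 2 }, { 0: 1, 2: 1 })) // ≈ 0.316
```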

src/model.js

Lines changed: 3 additions & 3 deletions
@@ -74,7 +74,7 @@ class Model {
   /**
    * Minimum n-gram size
    *
-   * @type {number}
+   * @type {int}
    */
  get nGramMin() {
    return this._nGramMin
@@ -91,7 +91,7 @@ class Model {
   /**
    * Maximum n-gram size
    *
-   * @type {number}
+   * @type {int}
    */
  get nGramMax() {
    return this._nGramMax
@@ -108,7 +108,7 @@ class Model {
   /**
    * Minimum confidence required for predictions
    *
-   * @type {number}
+   * @type {float}
    */
  get minimumConfidence() {
    return this._minimumConfidence
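To round off the Model-level changes, a small sketch of constructing a `Model` directly and handing it to a `Classifier`, which the README notes is possible. The import paths assume default exports from `src/model.js` and `src/classifier.js`.

```js
import Model from './src/model'
import Classifier from './src/classifier'

// The same options accepted by Classifier can be passed straight to Model
const model = new Model({
  nGramMin: 1,
  nGramMax: 3,
  minimumConfidence: 0.3
})

console.log(model.nGramMin)          // 1
console.log(model.nGramMax)          // 3
console.log(model.minimumConfidence) // 0.3

// A Model instance is a valid argument to the Classifier constructor
const classifier = new Classifier(model)
```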
