Commit d82c2e4

Removed minimumConfidence from Model
1 parent 513af38 commit d82c2e4

9 files changed: 47 additions and 135 deletions


CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -2,8 +2,15 @@
 
 All notable changes to this project will be documented in this file.
 
+## [2.0.0] - 2020-08-28
+
+### Breaking changes
+
+* Removed minimumConfidence from Model
+
 ## [1.0.0] - 2020-08-26
 
 Initial release
 
+[2.0.0]: https://github.com/andreekeberg/ml-classify-text-js/releases/tag/2.0.0
 [1.0.0]: https://github.com/andreekeberg/ml-classify-text-js/releases/tag/1.0.0
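
Migration note: model objects serialized by 1.0.0 still carry a `minimumConfidence` key, while 2.0.0 expects the threshold as the third argument to `predict`. The sketch below shows one way a caller might split a stored 1.x object into those two parts. The helper name `splitLegacyModel` is hypothetical and not part of ml-classify-text, and the `0.2` fallback is taken from the default that this commit removes from `Model`.

```javascript
// Hypothetical helper, not part of ml-classify-text: separates the stored
// threshold from the rest of a model object serialized by 1.x.
function splitLegacyModel(serialized) {
    // 0.2 mirrors the default that Model used before this commit
    const { minimumConfidence = 0.2, ...config } = serialized

    return { config, minimumConfidence }
}

// Shape of a 1.x serialized model, as shown in the old README example
const legacy = {
    nGramMin: 1,
    nGramMax: 1,
    minimumConfidence: 0.5,
    vocabulary: [],
    data: {}
}

const { config, minimumConfidence } = splitLegacyModel(legacy)

// config now holds only nGramMin, nGramMax, vocabulary and data;
// the threshold would be passed per call instead, for example
// classifier.predict(input, 1, minimumConfidence)
console.log(config, minimumConfidence)
```

Keeping the threshold outside the model means the same trained data can be queried with different thresholds per call.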

README.md

Lines changed: 0 additions & 2 deletions
@@ -93,7 +93,6 @@ The following configuration options can be passed both directly to a new [Model]
 | --- | --- | --- | --- |
 | **nGramMin** | `int` | `1` | Minimum n-gram size |
 | **nGramMax** | `int` | `1` | Maximum n-gram size |
-| **minimumConfidence** | `int` \| `float` | `0.2` | Minimum confidence required for predictions |
 | **vocabulary** | `Array` \| `Set` \| `false` | `[]` | Terms mapped to indexes in the model data, set to `false` to store terms directly in the data entries |
 | **data** | `Object` | `{}` | Key-value store of labels and training data vectors |
 
@@ -147,7 +146,6 @@ Returning:
 {
     nGramMin: 1,
     nGramMax: 1,
-    minimumConfidence: 0.2,
     vocabulary: [
         'this', 'is', 'great',
         'so', 'cool', 'wow',

docs/classifier.md

Lines changed: 1 addition & 2 deletions
@@ -21,7 +21,6 @@
 | [model] | `Model` \| `Object` | | |
 | [model.nGramMin] | `int` | `1` | Minimum n-gram size |
 | [model.nGramMax] | `int` | `1` | Maximum n-gram size |
-| [model.minimumConfidence] | `int` \| `float` | `0.2` | Minimum confidence required for predictions |
 | [model.vocabulary] | `Array` \| `Set` \| `false` | `[]` | Terms mapped to indexes in the model data, set to `false` to store terms directly in the data entries |
 | [model.data] | `Object` | `{}` | Key-value store of labels and training data vectors |
 
@@ -49,7 +48,7 @@ Return an array of one or more Prediction instances
 | --- | --- | --- | --- |
 | input | `string` | | Input string to make a prediction from |
 | [maxMatches] | `int` | `1` | Maximum number of predictions to return |
-| [minimumConfidence] | `float` | `null` | Minimum confidence required to include a prediction |
+| [minimumConfidence] | `float` | `0.2` | Minimum confidence required to include a prediction |
 
 <a name="Classifier+splitWords"></a>

docs/model.md

Lines changed: 3 additions & 11 deletions
@@ -6,7 +6,6 @@
 * [new Model([config])](#new_Model_new)
 * [.nGramMin](#Model+nGramMin) : `int`
 * [.nGramMax](#Model+nGramMax) : `int`
-* [.minimumConfidence](#Model+minimumConfidence) : `float`
 * [.vocabulary](#Model+vocabulary) : `Vocabulary` \| `false`
 * [.data](#Model+data) : `Object`
 * [.serialize()](#Model+serialize) ⇒ `Object`
@@ -20,7 +19,6 @@
 | [config] | `Object` | | |
 | [config.nGramMin] | `int` | `1` | Minimum n-gram size |
 | [config.nGramMax] | `int` | `1` | Maximum n-gram size |
-| [config.minimumConfidence] | `int` \| `float` | `0.2` | Minimum confidence required for predictions |
 | [config.vocabulary] | `Array` \| `Set` \| `false` | `[]` | Terms mapped to indexes in the model data entries, set to false to store terms directly in the data entries |
 | [config.data] | `Object` | `{}` | Key-value store containing all training data |
 
@@ -34,11 +32,6 @@ Minimum n-gram size
 ### model.nGramMax : `int`
 Maximum n-gram size
 
-<a name="Model+minimumConfidence"></a>
-
-### model.minimumConfidence : `float`
-Minimum confidence required for predictions
-
 <a name="Model+vocabulary"></a>
 
 ### model.vocabulary : `Vocabulary` \| `false`
@@ -52,7 +45,6 @@ Model data
 <a name="Model+serialize"></a>
 
 ### model.serialize() ⇒ `Object`
-Return the model in its current state for storing, including the configured
-n-gram min/max values, the minimum confidence required for for predictions,
-the vocabulary as an array (if any, otherwise false), and an object literal
-with all the training data
+Return the model in its current state an an object literal, including the
+configured n-gram min/max values, the vocabulary as an array (if any,
+otherwise false), and an object literal with all the training data

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
     "name": "ml-classify-text",
-    "version": "1.0.0",
+    "version": "2.0.0",
     "description": "Text classification using n-grams and cosine similarity",
     "module": "./lib",
     "main": "./lib",

src/classifier.js

Lines changed: 14 additions & 7 deletions
@@ -6,7 +6,6 @@ import Prediction from './prediction'
  * @param {(Model|Object)} [model]
  * @param {int} [model.nGramMin=1] - Minimum n-gram size
  * @param {int} [model.nGramMax=1] - Maximum n-gram size
- * @param {(int|float)} [model.minimumConfidence=0.2] - Minimum confidence required for predictions
  * @param {(Array|Set|false)} [model.vocabulary=[]] - Terms mapped to indexes in the model data entries, set to false to store terms directly in the data entries
  * @param {Object} [model.data={}] - Key-value store containing all training data
  * @constructor
@@ -94,24 +93,32 @@ class Classifier {
      *
      * @param {string} input - Input string to make a prediction from
      * @param {int} [maxMatches=1] Maximum number of predictions to return
-     * @param {float} [minimumConfidence=null] Minimum confidence required to include a prediction
+     * @param {float} [minimumConfidence=0.2] Minimum confidence required to include a prediction
      * @return {Array}
      */
-    predict(input, maxMatches = 1, minimumConfidence = null) {
+    predict(input, maxMatches = 1, minimumConfidence = 0.2) {
         if (typeof input !== 'string') {
             throw new Error('input must be a string')
         }
 
+        if (typeof minimumConfidence !== 'number') {
+            throw new Error('minimumConfidence must be a number')
+        }
+
+        if (minimumConfidence < 0) {
+            throw new Error('minimumConfidence can not be lower than 0')
+        }
+
+        if (minimumConfidence > 1) {
+            throw new Error('minimumConfidence can not be higher than 1')
+        }
+
         let tokens = this.tokenize(input)
 
         if (this.vocabulary !== false) {
             tokens = this.vectorize(tokens)
         }
 
-        if (minimumConfidence === null) {
-            minimumConfidence = this.model.minimumConfidence
-        }
-
         let predictions = []
 
         Object.keys(this._model.data).forEach(label => {
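
Review note: with this change `null` is no longer a "use the model default" marker for `minimumConfidence`. JavaScript default parameters only replace `undefined`, so an explicit `null` now reaches the `typeof` check and throws. The sketch below isolates the three added checks in a stand-in function to make that visible. The name `resolveMinimumConfidence` is hypothetical, and no classification happens here.

```javascript
// Stand-in for the argument checks added to Classifier.predict in this
// commit. The function name is hypothetical; only the checks are copied.
function resolveMinimumConfidence(minimumConfidence = 0.2) {
    if (typeof minimumConfidence !== 'number') {
        throw new Error('minimumConfidence must be a number')
    }

    if (minimumConfidence < 0) {
        throw new Error('minimumConfidence can not be lower than 0')
    }

    if (minimumConfidence > 1) {
        throw new Error('minimumConfidence can not be higher than 1')
    }

    return minimumConfidence
}

// Omitted or undefined arguments fall back to the default
console.log(resolveMinimumConfidence())          // 0.2
console.log(resolveMinimumConfidence(undefined)) // 0.2
console.log(resolveMinimumConfidence(0.75))      // 0.75

// null is passed through as-is and fails the typeof check
try {
    resolveMinimumConfidence(null)
} catch (error) {
    console.log(error.message) // minimumConfidence must be a number
}
```

Callers that passed `null` under 1.0.0 would need to omit the argument or pass `undefined` instead. Note also that `NaN` passes all three checks as written, since `typeof NaN` is `'number'` and both comparisons evaluate to `false`.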

src/model.js

Lines changed: 3 additions & 45 deletions
@@ -4,7 +4,6 @@ import Vocabulary from './vocabulary'
  * @param {Object} [config]
  * @param {int} [config.nGramMin=1] - Minimum n-gram size
  * @param {int} [config.nGramMax=1] - Maximum n-gram size
- * @param {(int|float)} [config.minimumConfidence=0.2] - Minimum confidence required for predictions
  * @param {(Array|Set|false)} [config.vocabulary=[]] - Terms mapped to indexes in the model data entries, set to false to store terms directly in the data entries
  * @param {Object} [config.data={}] - Key-value store containing all training data
  * @constructor
@@ -18,7 +17,6 @@ class Model {
         config = {
             nGramMin: 1,
             nGramMax: 1,
-            minimumConfidence: 0.2,
             vocabulary: [],
             data: {},
             ...config
@@ -40,18 +38,6 @@ class Model {
             throw new Error('Config value nGramMax must be at least 1')
         }
 
-        if (typeof config.minimumConfidence !== 'number') {
-            throw new Error('Config value minimumConfidence must be a number')
-        }
-
-        if (config.minimumConfidence < 0) {
-            throw new Error('Config value minimumConfidence can not be lower than 0')
-        }
-
-        if (config.minimumConfidence > 1) {
-            throw new Error('Config value minimumConfidence can not be higher than 1')
-        }
-
         if (config.nGramMax < config.nGramMin) {
             throw new Error('Invalid nGramMin/nGramMax combination in config')
         }
@@ -66,7 +52,6 @@ class Model {
 
         this._nGramMin = config.nGramMin
         this._nGramMax = config.nGramMax
-        this._minimumConfidence = config.minimumConfidence
         this._vocabulary = config.vocabulary
         this._data = {...config.data}
     }
@@ -105,31 +90,6 @@ class Model {
         this._nGramMax = size
     }
 
-    /**
-     * Minimum confidence required for predictions
-     *
-     * @type {float}
-     */
-    get minimumConfidence() {
-        return this._minimumConfidence
-    }
-
-    set minimumConfidence(confidence) {
-        if (typeof confidence !== 'number') {
-            throw new Error('minimumConfidence must be a number')
-        }
-
-        if (confidence < 0) {
-            throw new Error('minimumConfidence can not be lower than 0')
-        }
-
-        if (confidence > 1) {
-            throw new Error('minimumConfidence can not be higher than 1')
-        }
-
-        this._minimumConfidence = confidence
-    }
-
     /**
      * Vocabulary instance
      *
@@ -165,18 +125,16 @@ class Model {
     }
 
     /**
-     * Return the model in its current state for storing, including the configured
-     * n-gram min/max values, the minimum confidence required for for predictions,
-     * the vocabulary as an array (if any, otherwise false),and an object literal
-     * with all the training data
+     * Return the model in its current state an an object literal, including the
+     * configured n-gram min/max values, the vocabulary as an array (if any,
+     * otherwise false), and an object literal with all the training data
      *
      * @return {Object}
      */
     serialize() {
         return {
             nGramMin: this._nGramMin,
             nGramMax: this._nGramMax,
-            minimumConfidence: this._minimumConfidence,
             vocabulary: Array.from(this._vocabulary.terms),
             data: this._data
         }

test/classifier.js

Lines changed: 18 additions & 0 deletions
@@ -328,6 +328,24 @@ describe('Classifier', () => {
             expect(() => classifier.predict([])).to.throw(Error)
         })
 
+        it('should throw an error if minimumConfidence is not a number', () => {
+            const classifier = new Classifier()
+
+            expect(() => classifier.predict('', null, '')).to.throw(Error)
+        })
+
+        it('should throw an error if minimumConfidence is lower than 0', () => {
+            const classifier = new Classifier()
+
+            expect(() => classifier.predict('', null, -1)).to.throw(Error)
+        })
+
+        it('should throw an error if minimumConfidence is higher than 1', () => {
+            const classifier = new Classifier()
+
+            expect(() => classifier.predict('', null, 2)).to.throw(Error)
+        })
+
         it('should return an array', () => {
             const classifier = new Classifier()

test/model.js

Lines changed: 0 additions & 67 deletions
@@ -39,24 +39,6 @@ describe('Model', () => {
             })).to.throw(Error)
         })
 
-        it('should throw an error if minimumConfidence is not a number', () => {
-            expect(() => new Model({
-                minimumConfidence: 'test'
-            })).to.throw(Error)
-        })
-
-        it('should throw an error if minimumConfidence is lower than 0', () => {
-            expect(() => new Model({
-                minimumConfidence: -1
-            })).to.throw(Error)
-        })
-
-        it('should throw an error if minimumConfidence is higher than 1', () => {
-            expect(() => new Model({
-                minimumConfidence: 2
-            })).to.throw(Error)
-        })
-
         it('should throw an error if data is not an object literal', () => {
             expect(() => new Model({
                 data: []
@@ -129,54 +111,6 @@ describe('Model', () => {
         })
     })
 
-    describe('minimumConfidence', () => {
-        it('should return a number', () => {
-            const model = new Model()
-
-            expect(model.minimumConfidence).to.be.a('number')
-        })
-
-        it('should return the current minimumConfidence value', () => {
-            const model = new Model({
-                minimumConfidence: 0.5
-            })
-
-            expect(model.minimumConfidence).to.equal(0.5)
-        })
-
-        it('should set the minimumConfidence value', () => {
-            const model = new Model()
-
-            model.minimumConfidence = 0.1
-
-            expect(model.minimumConfidence).to.equal(0.1)
-        })
-
-        it('should throw an error if confidence is not a number', () => {
-            const model = new Model()
-
-            expect(() => {
-                model.minimumConfidence = 'test'
-            }).to.throw(Error)
-        })
-
-        it('should throw an error if confidence is lower than 0', () => {
-            const model = new Model()
-
-            expect(() => {
-                model.minimumConfidence = -1
-            }).to.throw(Error)
-        })
-
-        it('should throw an error if confidence is higher than 1', () => {
-            const model = new Model()
-
-            expect(() => {
-                model.minimumConfidence = 2
-            }).to.throw(Error)
-        })
-    })
-
     describe('vocabulary', () => {
         it('should return a vocabulary instance', () => {
             const model = new Model()
@@ -244,7 +178,6 @@ describe('Model', () => {
             expect(model.serialize()).to.eql({
                 nGramMin: 1,
                 nGramMax: 1,
-                minimumConfidence: 0.2,
                 vocabulary: [],
                 data: {}
             })
