(language-processing) Add maximum length limit for text input in SentenceExtractor
Added a new constant, MAX_TEXT_LENGTH, to the SentenceExtractor class. If the length of the text input exceeds this limit, the text is truncated to fit within the limit. This modification is designed to prevent excessive resource usage for unusually long text inputs.
This commit is contained in:
parent
4ce692ccaf
commit
32436d099c
@ -42,6 +42,7 @@ public class SentenceExtractor {
|
||||
* that might otherwise use an undue amount of processing power. 250 words is about 10X longer than
|
||||
* this comment. */
|
||||
private static final int MAX_SENTENCE_LENGTH = 250;
|
||||
private static final int MAX_TEXT_LENGTH = 65536;
|
||||
|
||||
@SneakyThrows @Inject
|
||||
public SentenceExtractor(LanguageModels models) {
|
||||
@ -136,6 +137,11 @@ public class SentenceExtractor {
|
||||
String[] sentences;
|
||||
|
||||
String textNormalizedSpaces = SentenceExtractorStringUtils.normalizeSpaces(text);
|
||||
|
||||
if (text.length() > MAX_TEXT_LENGTH) {
|
||||
textNormalizedSpaces = textNormalizedSpaces.substring(0, MAX_TEXT_LENGTH);
|
||||
}
|
||||
|
||||
try {
|
||||
sentences = sentenceDetector.sentDetect(textNormalizedSpaces);
|
||||
}
|
||||
|
Loading…
Reference in New Issue
Block a user