(converter) Refactor content type check in PlainTextDocumentProcessorPlugin
The `isApplicable` method in `PlainTextDocumentProcessorPlugin` was refactored to accept more than the exact content type "text/plain". It now also accepts any content type that starts with "text/plain;", to accommodate content types that append a charset parameter.
parent 51cdf46645
commit 41d896ba3e
1 changed file with 8 additions and 1 deletion
@@ -54,7 +54,14 @@ public class PlainTextDocumentProcessorPlugin extends AbstractDocumentProcessorP
 
     @Override
     public boolean isApplicable(CrawledDocument doc) {
-        return doc.contentType.equalsIgnoreCase("text/plain");
+        String contentType = doc.contentType.toLowerCase();
+
+        if (contentType.equals("text/plain"))
+            return true;
+        if (contentType.startsWith("text/plain;")) // charset=blabla
+            return true;
+
+        return false;
     }
 
     @Override
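
For readers who want to try the matching rule outside the plugin, the following is a minimal standalone sketch of the same check. `ContentTypeCheck` and `isPlainText` are hypothetical names that do not appear in the commit. The null guard, the `trim()` call, and the use of `Locale.ROOT` are assumptions added here for illustration; the committed code calls `toLowerCase()` without a locale and does not guard against a null content type.

```java
import java.util.Locale;

// Hypothetical helper, not part of the commit: a standalone version of the check.
final class ContentTypeCheck {
    private ContentTypeCheck() {}

    static boolean isPlainText(String contentType) {
        // Assumption: treat a missing content type as "not plain text"
        if (contentType == null) {
            return false;
        }

        // Assumption: Locale.ROOT keeps lowercasing independent of the JVM default locale
        String normalized = contentType.trim().toLowerCase(Locale.ROOT);

        // Accept the bare media type, or the media type followed by
        // parameters such as ";charset=utf-8"
        return normalized.equals("text/plain")
            || normalized.startsWith("text/plain;");
    }
}
```

With this sketch, `isPlainText("Text/Plain; charset=UTF-8")` is accepted while `isPlainText("text/plainfoo")` and `isPlainText("text/html")` are rejected, because the prefix match requires the semicolon directly after the media type.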