TokenizerBasic Class

Basic tokenizer for English text. It removes all non-Latin characters during tokenization.
Inheritance Hierarchy

Namespace:  imbNLP.Toolkit.Processing
Assembly:  imbNLP.Toolkit (in imbNLP.Toolkit.dll) Version: 0.2.30
Syntax
C#
public class TokenizerBasic : TokenizerBase

The TokenizerBasic type exposes the following members.

Constructors
Kind | Name | Description
Public method | TokenizerBasic | Initializes a new instance of the TokenizerBasic class.
Properties
Kind | Name | Description
Public property | InputReplacers | Set of replacement rules to be applied to a text before it is split into tokens. (Inherited from TokenizerBase.)
Public property | LowerCase | (Inherited from TokenizerBase.)
Public property | MinLength | (Inherited from TokenizerBase.)
Public property | tokenSelector | (Inherited from TokenizerBase.)
Public property | TokenSplitterChars | Characters used to split the text into tokens. (Inherited from TokenizerBase.)
Methods
Kind | Name | Description
Public method | Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Protected method | ExecuteInputReplacers | Executes all replacement rules from the InputReplacers collection. (Inherited from TokenizerBase.)
Protected method | Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
Public method | GetHashCode | Serves as the default hash function. (Inherited from Object.)
Public method | GetType | Gets the Type of the current instance. (Inherited from Object.)
Protected method | MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
Public method | Tokenize | Tokenizes the specified text according to the configuration of the tokenizer. (Inherited from TokenizerBase.)
Public method | ToString | Returns a string that represents the current object. (Inherited from Object.)
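
The member descriptions above suggest a pipeline: apply the replacement rules, split on the splitter characters, strip non-Latin characters, then filter the tokens. The following is a minimal, self-contained sketch of that idea. It does not use imbNLP and is not the library's implementation. The class SketchTokenizer is hypothetical, and the property types, default values, order of steps, the Tokenize signature, and the reading of "non-Latin" as "anything outside A-Z and a-z" (which also strips digits and accented letters) are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for illustration only; NOT the imbNLP TokenizerBasic class.
public class SketchTokenizer
{
    // Names mirror the documented properties; types and defaults are assumptions.
    public List<KeyValuePair<string, string>> InputReplacers { get; } =
        new List<KeyValuePair<string, string>>();

    public char[] TokenSplitterChars { get; set; } =
        new[] { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?' };

    public bool LowerCase { get; set; } = true;

    public int MinLength { get; set; } = 2;

    // Assumption: "non-Latin" means any character outside A-Z and a-z.
    private static readonly Regex NonLatin = new Regex("[^A-Za-z]");

    public List<string> Tokenize(string text)
    {
        // 1. Apply every replacement rule to the raw text before splitting.
        foreach (var rule in InputReplacers)
        {
            text = text.Replace(rule.Key, rule.Value);
        }

        var tokens = new List<string>();

        // 2. Split on the configured characters, skipping empty pieces.
        foreach (var raw in text.Split(TokenSplitterChars, StringSplitOptions.RemoveEmptyEntries))
        {
            // 3. Strip characters outside the assumed Latin range.
            var token = NonLatin.Replace(raw, "");

            // 4. Optionally normalize case.
            if (LowerCase)
            {
                token = token.ToLowerInvariant();
            }

            // 5. Keep only tokens that meet the minimum length.
            if (token.Length >= MinLength)
            {
                tokens.Add(token);
            }
        }

        return tokens;
    }
}

public static class Demo
{
    public static void Main()
    {
        var tokenizer = new SketchTokenizer();
        tokenizer.InputReplacers.Add(new KeyValuePair<string, string>("&", " and "));

        var tokens = tokenizer.Tokenize("Tokenizers & parsers: v2 rocks, Привет!");

        // Prints: tokenizers and parsers rocks
        Console.WriteLine(string.Join(" ", tokens));
    }
}
```

In this sketch "v2" shrinks to "v" and is then dropped by the minimum-length filter, while the Cyrillic word disappears entirely. The real class may order or define these steps differently, so check the library source before relying on any of this behavior.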