SHOGUN  v1.1.0
CStreamingStringFeatures< T > Class Template Reference

Detailed Description

template<class T>
class shogun::CStreamingStringFeatures< T >

This class implements streaming features as strings.

Definition at line 27 of file StreamingStringFeatures.h.

Inheritance diagram for CStreamingStringFeatures< T >:
CSGObject <- CFeatures <- CStreamingFeatures <- CStreamingStringFeatures< T >

Public Member Functions

 CStreamingStringFeatures ()
 CStreamingStringFeatures (CStreamingFile *file, bool is_labelled, int32_t size)
virtual ~CStreamingStringFeatures ()
virtual void set_vector_reader ()
virtual void set_vector_and_label_reader ()
void use_alphabet (EAlphabet alpha)
void use_alphabet (CAlphabet *alpha)
void set_remap (CAlphabet *ascii_alphabet, CAlphabet *binary_alphabet)
void set_remap (EAlphabet ascii_alphabet=DNA, EAlphabet binary_alphabet=RAWDNA)
CAlphabet * get_alphabet ()
floatmax_t get_num_symbols ()
virtual void start_parser ()
virtual void end_parser ()
virtual bool get_next_example ()
SGString< T > get_vector ()
virtual float64_t get_label ()
virtual void release_example ()
virtual int32_t get_vector_length ()
virtual EFeatureType get_feature_type ()
virtual EFeatureClass get_feature_class ()
virtual CFeatures * duplicate () const
virtual const char * get_name () const
virtual int32_t get_num_vectors () const
virtual int32_t get_size ()
virtual int32_t get_num_features ()
- Public Member Functions inherited from CStreamingFeatures
 CStreamingFeatures ()
 CStreamingFeatures (CStreamingFile *file, bool is_labelled, int32_t size)
virtual ~CStreamingFeatures ()
void set_read_functions ()
virtual bool get_has_labels ()
virtual bool is_seekable ()
virtual void reset_stream ()
- Public Member Functions inherited from CFeatures
 CFeatures (int32_t size=0)
 CFeatures (const CFeatures &orig)
 CFeatures (CFile *loader)
virtual ~CFeatures ()
virtual int32_t add_preprocessor (CPreprocessor *p)
 set preprocessor
virtual CPreprocessor * del_preprocessor (int32_t num)
 del current preprocessor
CPreprocessor * get_preprocessor (int32_t num)
 get current preprocessor
void set_preprocessed (int32_t num)
bool is_preprocessed (int32_t num)
int32_t get_num_preprocessed ()
 get whether specified preprocessor (or all if num=1) was/were already applied
int32_t get_num_preprocessors () const
void clean_preprocessors ()
int32_t get_cache_size ()
virtual bool reshape (int32_t num_features, int32_t num_vectors)
void list_feature_obj ()
virtual void load (CFile *loader)
virtual void save (CFile *writer)
bool check_feature_compatibility (CFeatures *f)
bool has_property (EFeatureProperty p)
void set_property (EFeatureProperty p)
void unset_property (EFeatureProperty p)
virtual void set_subset (CSubset *subset)
virtual void remove_subset ()
virtual void subset_changed_post ()
index_t subset_idx_conversion (index_t idx) const
bool has_subset () const
virtual CFeatures * copy_subset (SGVector< index_t > indices)
- Public Member Functions inherited from CSGObject
 CSGObject ()
 CSGObject (const CSGObject &orig)
virtual ~CSGObject ()
virtual bool is_generic (EPrimitiveType *generic) const
template<class T >
void set_generic ()
void unset_generic ()
virtual void print_serializable (const char *prefix="")
virtual bool save_serializable (CSerializableFile *file, const char *prefix="")
virtual bool load_serializable (CSerializableFile *file, const char *prefix="")
void set_global_io (SGIO *io)
SGIO * get_global_io ()
void set_global_parallel (Parallel *parallel)
Parallel * get_global_parallel ()
void set_global_version (Version *version)
Version * get_global_version ()
SGVector< char * > get_modelsel_names ()
char * get_modsel_param_descr (const char *param_name)
index_t get_modsel_param_index (const char *param_name)

Protected Attributes

CInputParser< T > parser
 The parser object, which reads from input and returns parsed example objects.
CAlphabet * alphabet
 Alphabet to use.
CAlphabet * alpha_ascii
 If remapping is enabled, this is the source alphabet.
CAlphabet * alpha_bin
 If remapping is enabled, this is the target alphabet.
CStreamingFile * working_file
 The StreamingFile object to read from.
SGString< T > current_sgstring
 The current example's string as an SGString<T>
T * current_string
 The current example's string as a T*.
int32_t current_length
 The length of the current string.
float64_t current_label
 The label of the current example, if applicable.
bool has_labels
 Whether examples are labelled or not.
bool remap_to_bin
 Whether remapping must be done.
int32_t num_symbols
 Number of symbols.
- Protected Attributes inherited from CStreamingFeatures
bool has_labels
 Whether examples are labelled or not.
CStreamingFile * working_file
 The StreamingFile object to read from.
bool seekable
 Whether the stream is seekable.
- Protected Attributes inherited from CFeatures
CSubset * m_subset

Additional Inherited Members

- Public Attributes inherited from CSGObject
SGIO * io
Parallel * parallel
Version * version
Parameter * m_parameters
Parameter * m_model_selection_parameters
- Protected Member Functions inherited from CSGObject
virtual void load_serializable_pre () throw (ShogunException)
virtual void load_serializable_post () throw (ShogunException)
virtual void save_serializable_pre () throw (ShogunException)
virtual void save_serializable_post () throw (ShogunException)

Constructor & Destructor Documentation

CStreamingStringFeatures ( )

Default constructor.

Sets the reading functions to be CStreamingFile::get_*_vector and get_*_vector_and_label depending on the type T.

Definition at line 8 of file StreamingStringFeatures.cpp.

CStreamingStringFeatures ( CStreamingFile *  file,
bool  is_labelled,
int32_t  size 
)

Constructor taking args. Initializes the parser with the given args.

Parameters
file         StreamingFile object, input file.
is_labelled  Whether examples are labelled or not.
size         Number of example objects to be stored in the parser at a time.

Definition at line 16 of file StreamingStringFeatures.cpp.

~CStreamingStringFeatures ( )
virtual

Destructor.

Ends the parsing thread and waits for pthread_join to complete.

Definition at line 27 of file StreamingStringFeatures.cpp.

Member Function Documentation

CFeatures * duplicate ( ) const
virtual

Duplicate the object.

Returns
a duplicate object as CFeatures*

Implements CFeatures.

Definition at line 83 of file StreamingStringFeatures.cpp.

void end_parser ( )
virtual

Ends the parsing thread.

Waits for the thread to join.

Implements CStreamingFeatures.

Definition at line 177 of file StreamingStringFeatures.cpp.

CAlphabet * get_alphabet ( )

Return the alphabet being used as a CAlphabet*.

Returns
The alphabet in use, as a CAlphabet*.

Definition at line 70 of file StreamingStringFeatures.cpp.

EFeatureClass get_feature_class ( )
virtual

Return the feature class

Returns
C_STREAMING_STRING

Implements CFeatures.

Definition at line 258 of file StreamingStringFeatures.cpp.

virtual EFeatureType get_feature_type ( )
virtual

Return the feature type, depending on T.

Returns
Feature type as EFeatureType

Implements CFeatures.

float64_t get_label ( )
virtual

Return the label of the current example as a float.

Examples must be labelled, otherwise an error occurs.

Returns
The label as a float64_t.

Implements CStreamingFeatures.

Definition at line 238 of file StreamingStringFeatures.cpp.

virtual const char* get_name ( ) const
virtual

Return the name.

Returns
StreamingStringFeatures

Implements CSGObject.

Definition at line 210 of file StreamingStringFeatures.h.

bool get_next_example ( )
virtual

Instructs the parser to return the next example.

This example is stored as the current_example in this object.

Returns
True on success, false if there are no more examples, or an error occurred.

Implements CStreamingFeatures.

Definition at line 183 of file StreamingStringFeatures.cpp.
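
The call sequence implied by start_parser(), get_next_example(), get_vector(), release_example() and end_parser() can be sketched as below. This is a minimal, single-threaded stand-in and not Shogun code: MockStringStream and count_symbols are hypothetical names, the examples come from an in-memory list instead of a CStreamingFile, and std::string replaces SGString<T>.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a streaming string feature object.
// It mimics only the call protocol; it is not Shogun's class.
class MockStringStream {
public:
    explicit MockStringStream(std::vector<std::string> examples)
        : examples_(std::move(examples)) {}

    void start_parser() { running_ = true; next_ = 0; has_current_ = false; }
    void end_parser() { running_ = false; has_current_ = false; }

    // Makes the next example current; false when the stream is exhausted.
    bool get_next_example() {
        if (!running_ || next_ >= examples_.size()) {
            has_current_ = false;
            return false;
        }
        current_ = examples_[next_++];
        has_current_ = true;
        return true;
    }

    // Valid only between get_next_example() and release_example().
    const std::string& get_vector() const {
        assert(has_current_);
        return current_;
    }

    // Tells the stream that the current example is no longer needed.
    void release_example() { has_current_ = false; }

private:
    std::vector<std::string> examples_;
    std::string current_;
    std::size_t next_ = 0;
    bool running_ = false;
    bool has_current_ = false;
};

// Consumes the whole stream and returns the total number of symbols seen.
std::size_t count_symbols(MockStringStream& stream) {
    std::size_t total = 0;
    stream.start_parser();
    while (stream.get_next_example()) {
        total += stream.get_vector().size();
        stream.release_example();  // release before fetching the next example
    }
    stream.end_parser();
    return total;
}
```

The loop releases each example before asking for the next one, which matches the order in which the member functions are described on this page.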

int32_t get_num_features ( )
virtual

Return the number of features in the current vector.

Returns
length of the vector

Implements CStreamingFeatures.

Definition at line 103 of file StreamingStringFeatures.cpp.

floatmax_t get_num_symbols ( )

Get the number of symbols.

Note: floatmax_t may look odd as a return type, but LONG is not wide enough to hold the value.

Returns
number of symbols

Definition at line 77 of file StreamingStringFeatures.cpp.
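
The note on the return type is easier to follow with numbers. The sketch below assumes that symbol counts can grow as alphabet_size^order when characters are combined into longer words; that growth model, and the helper names num_words and fits_in_int64, are assumptions made for illustration and are not taken from this page. A long double is used here as the wide floating-point type.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: number of distinct words of length `order`
// over an alphabet with `alphabet_size` symbols, as a long double.
long double num_words(int alphabet_size, int order) {
    return std::pow(static_cast<long double>(alphabet_size), order);
}

// True if the exact count alphabet_size^order fits into a signed 64-bit integer.
// Uses integer arithmetic only, so the answer does not depend on rounding.
bool fits_in_int64(int alphabet_size, int order) {
    assert(alphabet_size >= 1 && order >= 0);
    const std::int64_t max = std::numeric_limits<std::int64_t>::max();
    std::int64_t value = 1;
    for (int i = 0; i < order; ++i) {
        if (value > max / alphabet_size) return false;  // next multiply would overflow
        value *= alphabet_size;
    }
    return true;
}
```

Under these assumptions a four-letter alphabet stops fitting into 64 signed bits at order 32 (4^32 = 2^64), while the floating-point count is still representable.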

int32_t get_num_vectors ( ) const
virtual

Return the number of vectors stored in this object.

Returns
1 if current_vector exists, else 0.

Implements CFeatures.

Definition at line 89 of file StreamingStringFeatures.cpp.

int32_t get_size ( )
virtual

Return the size of one T object.

Returns
Size of T.

Implements CFeatures.

Definition at line 97 of file StreamingStringFeatures.cpp.

SGString< T > get_vector ( )

Return the current feature vector as an SGString<T>.

Returns
The vector as SGString<T>

Definition at line 229 of file StreamingStringFeatures.cpp.

int32_t get_vector_length ( )
virtual

Return the length of the current vector.

Returns
current vector length as int32_t

Definition at line 252 of file StreamingStringFeatures.cpp.

void release_example ( )
virtual

Release the current example, indicating to the parser that it has been processed by the learning algorithm.

The parser is then free to throw away that example.

Implements CStreamingFeatures.

Definition at line 246 of file StreamingStringFeatures.cpp.
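
One way to picture how release_example() relates to the size constructor argument is a fixed-capacity buffer whose slots are reused once released. The sketch below illustrates that idea only; it does not describe the internals of CInputParser. ExampleRing and its members are hypothetical, and in the real class the buffer is filled from a separate parsing thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity buffer of parsed examples.
class ExampleRing {
public:
    explicit ExampleRing(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Producer side: store a parsed example; false if every slot is in use.
    bool push(std::string example) {
        if (used_ == slots_.size()) return false;
        slots_[(head_ + used_) % slots_.size()] = std::move(example);
        ++used_;
        return true;
    }

    // Consumer side: the oldest example that has not been released yet.
    const std::string& current() const {
        assert(used_ > 0);
        return slots_[head_];
    }

    // Consumer side: mark the oldest example as processed so its slot can be reused.
    void release() {
        assert(used_ > 0);
        head_ = (head_ + 1) % slots_.size();
        --used_;
    }

    std::size_t in_use() const { return used_; }

private:
    std::vector<std::string> slots_;
    std::size_t head_ = 0;  // index of the oldest unreleased example
    std::size_t used_ = 0;  // number of occupied slots
};
```

With a capacity of 2, a third push() fails until release() has been called once, which is the behaviour the description of release_example() suggests.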

void set_remap ( CAlphabet *  ascii_alphabet,
CAlphabet *  binary_alphabet
)

Set whether remapping to another alphabet is required.

Call before parsing.

Parameters
ascii_alphabet   the alphabet to convert from, CAlphabet*
binary_alphabet  the alphabet to convert to, CAlphabet*

Definition at line 54 of file StreamingStringFeatures.cpp.

void set_remap ( EAlphabet  ascii_alphabet = DNA,
EAlphabet  binary_alphabet = RAWDNA 
)

Set whether remapping to another alphabet is required.

Call before parsing.

Parameters
ascii_alphabet   the alphabet to convert from, EAlphabet
binary_alphabet  the alphabet to convert to, EAlphabet

Definition at line 62 of file StreamingStringFeatures.cpp.
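
To make the remapping concrete: with the default arguments (DNA to RAWDNA), each ASCII character of a DNA string would be replaced by a small integer code. The sketch below assumes the coding A=0, C=1, G=2, T=3 and upper-case input only. remap_dna_to_raw is a hypothetical helper written for this illustration; the actual mapping tables belong to CAlphabet.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: map an upper-case DNA string to integer codes.
// The coding A=0, C=1, G=2, T=3 is an assumption made for this sketch.
std::vector<std::uint8_t> remap_dna_to_raw(const std::string& ascii) {
    std::vector<std::uint8_t> raw;
    raw.reserve(ascii.size());
    for (char c : ascii) {
        switch (c) {
            case 'A': raw.push_back(0); break;
            case 'C': raw.push_back(1); break;
            case 'G': raw.push_back(2); break;
            case 'T': raw.push_back(3); break;
            default:
                // Symbols outside the source alphabet cannot be remapped.
                throw std::invalid_argument("symbol not in DNA alphabet");
        }
    }
    return raw;
}
```

The output has the same length as the input; only the per-symbol encoding changes.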

void set_vector_and_label_reader ( )
virtual

Sets the read function (in case the examples are labelled) to get_*_vector_and_label from CStreamingFile.

The exact function depends on type T.

The parser uses the function set by this while reading labelled examples.

Implements CStreamingFeatures.

Definition at line 113 of file StreamingStringFeatures.cpp.

void set_vector_reader ( )
virtual

Sets the read function (in case the examples are unlabelled) to get_*_vector() from CStreamingFile.

The exact function depends on type T.

The parser uses the function set by this while reading unlabelled examples.

Implements CStreamingFeatures.

Definition at line 108 of file StreamingStringFeatures.cpp.
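
The phrase "depends on type T" can be illustrated with template specialization: one generic entry point resolves to a different reader for each element type. The sketch below is hypothetical throughout. ReaderTag and reader_label are invented for this example, and the strings they produce only follow the get_*_vector pattern quoted on this page; they are not claimed to be actual CStreamingFile member names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical trait: the primary template is declared but not defined,
// so an unsupported element type fails at compile time.
template <class T> struct ReaderTag;

template <> struct ReaderTag<char> {
    static const char* name() { return "char"; }
};
template <> struct ReaderTag<std::uint8_t> {
    static const char* name() { return "byte"; }
};

// Builds a placeholder reader label for element type T.
// `labelled` selects between the two reader families.
template <class T>
std::string reader_label(bool labelled) {
    std::string label = "get_";
    label += ReaderTag<T>::name();
    label += labelled ? "_vector_and_label" : "_vector";
    return label;
}
```

A real implementation would store a function pointer instead of a string, but the selection by element type works the same way.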

void start_parser ( )
virtual

Starts the parsing thread.

To be called before trying to use any feature vectors from this object.

Implements CStreamingFeatures.

Definition at line 167 of file StreamingStringFeatures.cpp.

void use_alphabet ( EAlphabet  alpha)

Set the alphabet to be used. Call before parsing.

Parameters
alpha  alphabet as an EAlphabet enum.

Definition at line 34 of file StreamingStringFeatures.cpp.

void use_alphabet ( CAlphabet *  alpha)

Set the alphabet to be used. Call before parsing.

Parameters
alpha  alphabet as a pointer to a CAlphabet object.

Definition at line 44 of file StreamingStringFeatures.cpp.

Member Data Documentation

CAlphabet* alpha_ascii
protected

If remapping is enabled, this is the source alphabet.

Definition at line 259 of file StreamingStringFeatures.h.

CAlphabet* alpha_bin
protected

If remapping is enabled, this is the target alphabet.

Definition at line 262 of file StreamingStringFeatures.h.

CAlphabet* alphabet
protected

Alphabet to use.

Definition at line 256 of file StreamingStringFeatures.h.

float64_t current_label
protected

The label of the current example, if applicable.

Definition at line 277 of file StreamingStringFeatures.h.

int32_t current_length
protected

The length of the current string.

Definition at line 274 of file StreamingStringFeatures.h.

SGString<T> current_sgstring
protected

The current example's string as an SGString<T>

Definition at line 268 of file StreamingStringFeatures.h.

T* current_string
protected

The current example's string as a T*.

Definition at line 271 of file StreamingStringFeatures.h.

bool has_labels
protected

Whether examples are labelled or not.

Definition at line 280 of file StreamingStringFeatures.h.

int32_t num_symbols
protected

Number of symbols.

Definition at line 286 of file StreamingStringFeatures.h.

CInputParser<T> parser
protected

The parser object, which reads from input and returns parsed example objects.

Definition at line 253 of file StreamingStringFeatures.h.

bool remap_to_bin
protected

Whether remapping must be done.

Definition at line 283 of file StreamingStringFeatures.h.

CStreamingFile* working_file
protected

The StreamingFile object to read from.

Definition at line 265 of file StreamingStringFeatures.h.


The documentation for this class was generated from the following files:

StreamingStringFeatures.h
StreamingStringFeatures.cpp
