jBNC Toolbox

jbnc.dataset
Class DatasetReader

java.lang.Object
  extended by jbnc.dataset.DatasetReader
Direct Known Subclasses:
DatasetReaderInt

public class DatasetReader
extends java.lang.Object

Functions for reading data sets with test cases.

Since:
June 1, 1999
Author:
Jarek Sacha
See Also:
Dataset, NamesReader

Constructor Summary
DatasetReader()
           
 
Method Summary
protected  java.util.Vector convertCase(java.util.Vector rawData, AttributeSpecs[] names)
          Converts a case from a raw format.
 boolean getDiscardIncompleteCases()
           
 java.util.Vector open(java.lang.String fileName, AttributeSpecs[] names)
          Reads a data file with cases - comma delimited, no header.
 Dataset open(java.lang.String fileName, java.lang.String className)
          Reads a data file with cases - comma delimited, with header.
 void setDiscardIncompleteCases(boolean discardIncompleteCases)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DatasetReader

public DatasetReader()
Method Detail

setDiscardIncompleteCases

public void setDiscardIncompleteCases(boolean discardIncompleteCases)
Parameters:
discardIncompleteCases - Whether incomplete cases should be discarded when a file is read.

getDiscardIncompleteCases

public boolean getDiscardIncompleteCases()
Returns:
The current value of the discardIncompleteCases setting.
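
The effect of such a flag can be illustrated with a minimal, self-contained sketch. It is not jBNC code: the class name, the helper methods, and the use of "?" as the missing-value marker (the C4.5 convention) are assumptions made for illustration, and DatasetReader may define an incomplete case differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of jBNC.
class DiscardIncompleteSketch {
    // Assumption: a missing value is written as "?" (the C4.5 convention).
    static final String MISSING = "?";

    // A case is treated as complete when no value is null, blank, or the missing marker.
    static boolean isComplete(List<String> row) {
        for (String value : row) {
            if (value == null || value.trim().isEmpty() || MISSING.equals(value.trim())) {
                return false;
            }
        }
        return true;
    }

    // Returns all cases, or only the complete ones when the flag is set.
    static List<List<String>> filter(List<List<String>> cases, boolean discardIncompleteCases) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> row : cases) {
            if (!discardIncompleteCases || isComplete(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> cases = Arrays.asList(
                Arrays.asList("sunny", "85", "no"),
                Arrays.asList("rainy", "?", "yes"));
        System.out.println(filter(cases, true).size());   // flag on: incomplete case dropped
        System.out.println(filter(cases, false).size());  // flag off: every case kept
    }
}
```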

open

public java.util.Vector open(java.lang.String fileName,
                             AttributeSpecs[] names)
                      throws java.lang.Exception
Reads a data file with cases - comma delimited, no header. This method is typically used to read files in C4.5 format. The attribute descriptions must first be read from the '.names' file using jbnc.dataset.NamesReader.open().

Parameters:
fileName - Name of the file to read data from.
names - Descriptions of the columns in the file.
Returns:
Vector of vectors representing cases. Each attribute value is stored in the type defined by the 'names' parameter.
Throws:
java.lang.Exception - If an error occurs while reading the file.
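
A headerless, comma-delimited file can be parsed roughly as in the following sketch. It is a simplified illustration and not the jBNC implementation: the class and method names are hypothetical, the expected column count stands in for the length of the 'names' array, and C4.5 details such as comments or trailing periods are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of jBNC.
class HeaderlessReadSketch {
    // Splits each non-blank line on commas and checks the number of values per case.
    // 'expectedColumns' plays the role of names.length in DatasetReader.open().
    static List<String[]> parseCases(List<String> lines, int expectedColumns) {
        List<String[]> cases = new ArrayList<>();
        int lineNo = 0;
        for (String line : lines) {
            lineNo++;
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1); // -1 keeps trailing empty values
            if (fields.length != expectedColumns) {
                throw new IllegalArgumentException("Line " + lineNo + ": expected "
                        + expectedColumns + " values, found " + fields.length);
            }
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
            cases.add(fields);
        }
        return cases;
    }

    public static void main(String[] args) {
        // In practice the lines could come from java.nio.file.Files.readAllLines(...).
        List<String> lines = Arrays.asList("sunny,85,no", "", "rainy, 70 ,yes");
        List<String[]> cases = parseCases(lines, 3);
        System.out.println(cases.size() + " cases read");
    }
}
```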

open

public Dataset open(java.lang.String fileName,
                    java.lang.String className)
             throws java.lang.Exception
Reads a data file with cases - comma delimited, with header. The first line of the file gives the names of the attributes (columns). In the current implementation, all attributes are assumed to be discrete.

Parameters:
fileName - Name of the file to read data from.
className - Name of the attribute representing the class. If it is null, the last column is assumed to represent the class.
Returns:
A Dataset containing the cases read from the file.
Throws:
java.lang.Exception - If an error occurs while reading the file.
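
The class-column rule described for className can be sketched as follows. This is an illustration only, assuming the header is a single comma-delimited line; the class and method names are hypothetical, and the behavior for an unknown attribute name is an assumption rather than documented jBNC behavior.

```java
// Hypothetical sketch, not part of jBNC.
class HeaderClassColumnSketch {
    // Returns the index of the class column in the header line.
    // A null className selects the last column, as described for open().
    static int classIndex(String headerLine, String className) {
        String[] names = headerLine.split(",", -1);
        if (className == null) {
            return names.length - 1;
        }
        for (int i = 0; i < names.length; i++) {
            if (names[i].trim().equals(className)) {
                return i;
            }
        }
        // Assumption: an unknown name is reported as an error.
        throw new IllegalArgumentException("No attribute named '" + className + "' in header");
    }

    public static void main(String[] args) {
        String header = "outlook,temperature,play";
        System.out.println(classIndex(header, null));          // last column
        System.out.println(classIndex(header, "temperature")); // named column
    }
}
```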

convertCase

protected java.util.Vector convertCase(java.util.Vector rawData,
                                       AttributeSpecs[] names)
                                throws java.lang.Exception
Converts a case from a raw format.

Parameters:
rawData - The case in raw format.
names - Descriptions of the attributes (columns).
Returns:
The converted case.
Throws:
java.lang.Exception - If the case cannot be converted.
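
One plausible reading of this conversion is that each raw value is turned into the type its column description calls for. The sketch below illustrates that idea under stated assumptions: the ColumnType enum is a hypothetical stand-in for the type information carried by AttributeSpecs, raw values are assumed to be strings, and the actual jBNC conversion may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of jBNC.
class ConvertCaseSketch {
    // Stand-in for the per-column type information; not the AttributeSpecs API.
    enum ColumnType { DISCRETE, CONTINUOUS }

    // Converts one raw case: continuous values become Double, discrete values stay String.
    static List<Object> convertCase(List<String> rawData, ColumnType[] types) {
        if (rawData.size() != types.length) {
            throw new IllegalArgumentException("Expected " + types.length
                    + " values, found " + rawData.size());
        }
        List<Object> converted = new ArrayList<>(types.length);
        for (int i = 0; i < types.length; i++) {
            String raw = rawData.get(i).trim();
            if (types[i] == ColumnType.CONTINUOUS) {
                converted.add(Double.valueOf(raw)); // NumberFormatException if not numeric
            } else {
                converted.add(raw);
            }
        }
        return converted;
    }

    public static void main(String[] args) {
        ColumnType[] types = { ColumnType.DISCRETE, ColumnType.CONTINUOUS, ColumnType.DISCRETE };
        List<Object> row = convertCase(Arrays.asList("sunny", "85", "no"), types);
        System.out.println(row);
    }
}
```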
