Package deepnetts.data
Class TabularDataSet<T extends MLDataItem>
java.lang.Object
javax.visrec.ml.data.BasicDataSet<T>
deepnetts.data.TabularDataSet<T>
-
Nested Class Summary
Nested Classes
static class
    Represents a single data set item (row) with an input tensor and a target vector.
-
Constructor Summary
Constructors
TabularDataSet(int numInputs, int numOutputs)
    Creates a new data set instance with the specified number of inputs and outputs.
-
Method Summary
Methods
int[] countMissingValues()
int countMissingValues(int colIdx)
String[] getColumnNames()
int getNumInputs()
int getNumOutputs()
String[] getTargetColumnsNames()
boolean[] hasMissingValues()
boolean hasMissingValues(int colIdx)
void setColumnNames(String[] columnNames)
void shuffle()
    Shuffles the data set items using the default random generator.
void shuffle(int seed)
    Shuffles the data set items using a Java random generator initialized with the specified seed.
javax.visrec.ml.data.DataSet[] split(double... parts)
    Splits the data set into several parts whose sizes are given by the parts parameter.
javax.visrec.ml.data.DataSet[] split(int parts)
    Splits the data set into the specified number of equal-sized parts.
trainTestSplit(double splitRatio)
Methods inherited from class javax.visrec.ml.data.BasicDataSet
getColumns, getItems, setAsTargetColumns, setAsTargetColumns, setColumns
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface javax.visrec.ml.data.DataSet
add, addAll, clear, get, isEmpty, iterator, shuffle, size, split, split, split, stream
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Details
-
TabularDataSet
public TabularDataSet(int numInputs, int numOutputs)
Creates a new data set instance with the specified number of inputs and outputs.
Parameters:
numInputs - number of input features
numOutputs - number of output features
-
-
Method Details
-
getNumInputs
public int getNumInputs()
-
getNumOutputs
public int getNumOutputs()
-
split
public javax.visrec.ml.data.DataSet[] split(int parts)
Splits the data set into the specified number of equal-sized parts. Utility method used during cross-validation. Note: this could be a default method.
Parameters:
parts - number of parts to split the data set into
Returns:
the parts of the data set
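To illustrate the idea of an equal-sized split as it might be used for cross-validation folds, here is a minimal stand-alone sketch on a plain java.util.List. It does not call Deep Netts; the class name EqualSplitSketch and the choice to give leftover items to the last part are assumptions made for this example, and the library may handle remainders differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; not the Deep Netts implementation. */
final class EqualSplitSketch {

    /**
     * Splits items into the given number of consecutive parts of equal size.
     * Assumption: any leftover items are appended to the last part.
     */
    static <T> List<List<T>> split(List<T> items, int parts) {
        if (parts <= 0 || parts > items.size()) {
            throw new IllegalArgumentException("parts must be between 1 and the number of items");
        }
        int partSize = items.size() / parts;
        List<List<T>> result = new ArrayList<>();
        for (int p = 0; p < parts; p++) {
            int from = p * partSize;
            // The last part runs to the end of the list so no item is dropped.
            int to = (p == parts - 1) ? items.size() : from + partSize;
            result.add(new ArrayList<>(items.subList(from, to)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        List<List<Integer>> folds = split(rows, 3);
        for (int k = 0; k < folds.size(); k++) {
            System.out.println("fold " + k + " has " + folds.get(k).size() + " items");
        }
    }
}
```

In a cross-validation loop, each part would serve once as the held-out fold while the remaining parts are combined for training.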
-
trainTestSplit
-
split
public javax.visrec.ml.data.DataSet[] split(double... parts)
Splits the data set into several parts whose sizes are given by the parts parameter. Each value is the size of one returned part, expressed as a fraction of the whole data set (for example, 0.7 for 70%). Values must be greater than zero, and their sum must be 1.
Specified by:
split in interface javax.visrec.ml.data.DataSet<T extends MLDataItem>
Overrides:
split in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
Parameters:
parts - sizes of the parts, as fractions of the data set
Returns:
parts of the data set of the specified sizes
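The constraints on the part sizes (each greater than zero, summing to 1) can be sketched as a stand-alone helper on a plain java.util.List. This is an illustration only, not the library's code: the class name FractionSplitSketch, the rounding rule, the floating-point tolerance, and the exception type are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; not the Deep Netts implementation. */
final class FractionSplitSketch {

    /** Splits items into consecutive parts whose sizes are fractions of the whole. */
    static <T> List<List<T>> split(List<T> items, double... parts) {
        double sum = 0;
        for (double p : parts) {
            if (p <= 0) {
                throw new IllegalArgumentException("each part size must be greater than zero");
            }
            sum += p;
        }
        // Assumed tolerance for floating-point error when checking the sum.
        if (Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("part sizes must sum to 1");
        }

        List<List<T>> result = new ArrayList<>();
        int from = 0;
        for (int i = 0; i < parts.length; i++) {
            // Assumption: the last part takes whatever remains after rounding.
            int to = (i == parts.length - 1)
                    ? items.size()
                    : Math.min(items.size(), from + (int) Math.round(parts[i] * items.size()));
            result.add(new ArrayList<>(items.subList(from, to)));
            from = to;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        List<List<Integer>> trainAndTest = split(rows, 0.7, 0.3);
        System.out.println("train size: " + trainAndTest.get(0).size());
        System.out.println("test size: " + trainAndTest.get(1).size());
    }
}
```

A call such as split(rows, 0.5, 0.6) is rejected in this sketch because the fractions do not sum to 1.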
-
shuffle
public void shuffle()
Shuffles the data set items using the default random generator. The default random generator can be initialized independently.
-
shuffle
public void shuffle(int seed)
Shuffles the data set items using a Java random generator initialized with the specified seed.
Parameters:
seed - a seed number to initialize the random generator
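The practical benefit of a seeded shuffle is reproducibility: the same seed yields the same ordering on every run. The following stand-alone sketch shows this with java.util.Random and Collections.shuffle on a plain list. The class name SeededShuffleSketch is hypothetical, and whether the library uses these exact JDK calls internally is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-alone sketch; not the Deep Netts implementation. */
final class SeededShuffleSketch {

    /** Returns a shuffled copy of items; the input list is left unchanged. */
    static <T> List<T> shuffled(List<T> items, int seed) {
        List<T> copy = new ArrayList<>(items);
        // A Random created from a fixed seed produces the same permutation each time.
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5, 6, 7, 8);
        List<Integer> first = shuffled(rows, 42);
        List<Integer> second = shuffled(rows, 42);
        System.out.println("same order for same seed: " + first.equals(second));
    }
}
```

Fixing the seed before splitting a data set is a common way to make train/test partitions repeatable across experiments.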
-
getColumnNames
Overrides:
getColumnNames in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
-
setColumnNames
Overrides:
setColumnNames in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
-
getTargetColumnsNames
Specified by:
getTargetColumnsNames in interface javax.visrec.ml.data.DataSet<T extends MLDataItem>
Overrides:
getTargetColumnsNames in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
-
hasMissingValues
public boolean hasMissingValues(int colIdx)
-
hasMissingValues
public boolean[] hasMissingValues()
-
countMissingValues
public int countMissingValues(int colIdx)
-
countMissingValues
public int[] countMissingValues()
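The four missing-value methods are undocumented here, so the following stand-alone sketch only illustrates one plausible reading of their signatures: a per-column query (colIdx) and an all-columns query returning one entry per column. It works on a plain float[][] rather than a TabularDataSet, and it assumes that a missing value is encoded as Float.NaN; both the encoding and the class name MissingValuesSketch are assumptions, not confirmed library behavior.

```java
/** Hypothetical stand-alone sketch; not the Deep Netts implementation. */
final class MissingValuesSketch {

    /** Counts entries in column colIdx that are NaN (assumed marker for "missing"). */
    static int countMissingValues(float[][] rows, int colIdx) {
        int count = 0;
        for (float[] row : rows) {
            if (Float.isNaN(row[colIdx])) {
                count++;
            }
        }
        return count;
    }

    /** Counts missing entries for every column; assumes all rows have equal length. */
    static int[] countMissingValues(float[][] rows) {
        int numCols = rows.length == 0 ? 0 : rows[0].length;
        int[] counts = new int[numCols];
        for (int col = 0; col < numCols; col++) {
            counts[col] = countMissingValues(rows, col);
        }
        return counts;
    }

    /** True if column colIdx contains at least one missing entry. */
    static boolean hasMissingValues(float[][] rows, int colIdx) {
        return countMissingValues(rows, colIdx) > 0;
    }

    /** One flag per column indicating whether that column has any missing entry. */
    static boolean[] hasMissingValues(float[][] rows) {
        int[] counts = countMissingValues(rows);
        boolean[] flags = new boolean[counts.length];
        for (int col = 0; col < counts.length; col++) {
            flags[col] = counts[col] > 0;
        }
        return flags;
    }

    public static void main(String[] args) {
        float[][] rows = {
            {1f, Float.NaN, 5f},
            {Float.NaN, Float.NaN, 6f},
            {3f, 4f, 7f}
        };
        System.out.println(java.util.Arrays.toString(countMissingValues(rows)));
        System.out.println(java.util.Arrays.toString(hasMissingValues(rows)));
    }
}
```

A check like this is typically run before training, so that columns with missing entries can be imputed or dropped.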
-