Before data can be used to design a neural network, four data preparation steps may be applied.
Figure 3.1: Data preparation for neural network design.
- Raw data is first collected.
- In the data processing step, the data-set can be cleaned by removing corrupted and incorrect records. Transformation techniques may be used to extract useful features or to reduce the dimensionality of the data. Categorical variables in the data-set are also converted to numerical values that a neural network can process.
- Data labeling may be applied to assign target labels to the data.
- The data-set is then divided into three sets: the training set is used to train the neural network, the validation set is used to monitor training and help prevent over-fitting, and the test set is used to evaluate how well the trained neural network copes with completely new data.
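The processing and splitting steps above can be illustrated with a minimal Python sketch. This is not DLHUB's implementation: the function names (`clean`, `one_hot`, `split`), the sample records, and the 70/15/15 split ratio are all assumptions chosen for illustration. One-hot encoding is used here as one common way to convert a categorical variable to numerical values.

```python
import random

def clean(records):
    """Drop records that contain a missing (None) field."""
    return [r for r in records if all(v is not None for v in r.values())]

def one_hot(records, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            row[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(row)
    return encoded

def split(records, train_pct=70, val_pct=15, seed=0):
    """Shuffle, then split into training, validation, and test sets.

    The percentages are illustrative assumptions; the test set
    receives whatever remains after the first two sets are taken.
    """
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Hypothetical raw data: 20 valid labeled records plus one corrupted record.
colors = ["red", "green", "blue"]
raw = [{"size": float(i), "color": colors[i % 3], "label": i % 2}
       for i in range(20)]
raw.append({"size": None, "color": "red", "label": 0})

prepared = one_hot(clean(raw), "color")
train_set, val_set, test_set = split(prepared)
print(len(train_set), len(val_set), len(test_set))
```

Fixing the shuffle seed keeps the split reproducible, which makes it easier to compare networks trained on the same data.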
DLHUB supports three data-set formats; please refer to the section "Load Training Data - Supported Files" for more details.
Note: ANSCENTER is continuously working on supporting new data-set formats. These formats will be introduced in future releases.