Artificial Neural Networks

Artificial neural networks (ANNs) are data analysis tools known for their ability to learn and generalize. An ANN is a mathematical model inspired by biological nervous systems. Neural networks discover relationships within complex data through a learning phase. Commonly used types include general regression, polynomial, and probabilistic neural networks. An ANN is built from processing elements called neurons, each implementing a mathematical function with specific inputs, a computational procedure, and outputs. Two ANN models used in software estimation are the radial basis function network (RBFN) and the functional link artificial neural network (FLANN). The RBFN offers a straightforward design, good generalizability, strong tolerance to noise, and learning ability; the FLANN is structurally simpler and is well suited to nonlinear data. Although an ANN is often described as an algorithmic process, the network itself is not an algorithm but a framework within which learning algorithms operate.
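As a rough illustration of the RBFN idea, the sketch below fits a one-dimensional Gaussian RBF network by fixing the basis-function centers and solving for the output weights with linear least squares; the centers, width, and data are hypothetical illustration values, not a prescribed design.

```python
import numpy as np

# Minimal RBFN sketch: Gaussian basis functions plus a linear output layer.
# Centers, width, and data below are hypothetical illustration values.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)             # hypothetical 1-D inputs
y = np.sin(X) + 0.1 * rng.normal(size=50)   # noisy target to learn

centers = np.linspace(0, 10, 8)             # fixed RBF centers
width = 1.0                                 # shared Gaussian width

def design_matrix(x):
    # Column j holds exp(-(x - c_j)^2 / (2 * width^2)) for center c_j.
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))

Phi = design_matrix(X)
weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # output-layer weights

x_new = np.array([2.5])
prediction = design_matrix(x_new) @ weights
print(prediction)
```

Fixing the centers turns training into a linear least-squares problem, which is one reason the RBFN is regarded as straightforward to design.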

A principal characteristic of ANNs is their ability to approximate nonlinear functions, which makes them comparable to traditional statistical techniques such as logistic regression, statistical regression, and discriminant analysis. The ANN approach uses machine learning and pattern recognition for estimation and can discover relationships between the dependent and independent variables. Artificial neural networks have gained popularity for software estimation because they can capture complex patterns in data while disregarding noise in the inputs. An ANN uses data from previous software projects to produce outputs by inference over what it has learned. The design, inspired by the biological nervous system, processes information using computational elements (nodes) arranged in layers and connected by weighted links to produce accurate estimates. Moreover, the larger the amount of historical data, the more accurate the estimates tend to be; ANNs are therefore most effective for software development estimation when ample historical data is available.

Artificial neural networks are a class of machine learning models inspired by the human brain. They consist of layers of interconnected nodes, or "neurons", which work together to learn from data and make predictions or decisions.
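The sketch below shows this structure at its smallest: a single hidden layer of neurons, each computing a nonlinear function of a weighted sum of its inputs. The layer sizes and random weights are hypothetical placeholders; a trained network would have learned weights instead.

```python
import numpy as np

# Minimal feed-forward pass: one hidden layer of "neurons", each applying
# a nonlinearity to a weighted sum of its inputs. All sizes and weights
# here are hypothetical placeholders, not trained values.
rng = np.random.default_rng(1)

x = rng.normal(size=4)          # 4 input features
W1 = rng.normal(size=(5, 4))    # weights: inputs -> 5 hidden neurons
b1 = np.zeros(5)
W2 = rng.normal(size=(1, 5))    # weights: hidden layer -> 1 output neuron
b2 = np.zeros(1)

hidden = np.tanh(W1 @ x + b1)   # each hidden neuron: nonlinear weighted sum
output = W2 @ hidden + b2       # output neuron: linear weighted sum
print(output)
```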

In the context of project estimation, ANNs can be used to estimate project costs, schedules, or other key parameters based on past data. For example, an ANN might be trained on historical project data that includes features like project size, complexity, team experience, technology used, and actual cost or duration. The ANN learns the relationships between these features and the cost or duration, and can then be used to make estimates for new projects.
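To make the idea concrete, the snippet below sketches what such a historical dataset might look like; every field name and value is hypothetical, invented only to mirror the features and target described above.

```python
# Hypothetical historical project records; names and values are invented
# purely to illustrate the feature/target structure described above.
historical_projects = [
    {"size_kloc": 12.0, "complexity": 3, "team_experience_yrs": 4.0,
     "technology": "java", "actual_cost_kusd": 240.0},
    {"size_kloc": 5.5,  "complexity": 2, "team_experience_yrs": 7.0,
     "technology": "python", "actual_cost_kusd": 95.0},
    {"size_kloc": 30.0, "complexity": 5, "team_experience_yrs": 3.0,
     "technology": "cpp", "actual_cost_kusd": 710.0},
]
# The first four fields are the input features; actual_cost_kusd is the
# target the network learns to predict.
```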

Here's a high-level overview of how the process works; a runnable end-to-end sketch covering all five steps follows the list:

  1. Data Collection: Historical project data is collected, including both the features (e.g., project size, complexity, etc.) and the target variable (e.g., cost or duration).

  2. Data Preprocessing: The data is cleaned and preprocessed, which might include handling missing values, encoding categorical variables, normalizing numerical variables, etc.

  3. Training the ANN: The preprocessed data is used to train the ANN. This involves using an algorithm (such as backpropagation) to adjust the weights and biases of the neurons in the network to minimize the difference between the network's predictions and the actual values.

  4. Validation and Testing: The trained ANN is validated and tested on a separate set of data to check its performance. Performance metrics like mean squared error or mean absolute error might be used.

  5. Prediction: Once the ANN has been trained and validated, it can be used to make estimates for new projects. This involves inputting the features of the new project into the ANN and using the network's output as the estimate.
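Below is a minimal end-to-end sketch of the five steps using scikit-learn. The feature set, the synthetic cost relationship, and the network architecture are all hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

# Step 1 -- Data collection: synthetic stand-in for historical projects.
rng = np.random.default_rng(42)
n = 200
size_kloc = rng.uniform(1, 50, n)
complexity = rng.integers(1, 6, n).astype(float)
experience = rng.uniform(1, 10, n)
# Hypothetical cost relationship with noise, used only for illustration.
cost = 20 * size_kloc + 15 * complexity - 5 * experience + rng.normal(0, 10, n)

X = np.column_stack([size_kloc, complexity, experience])
y = cost

# Step 2 -- Preprocessing: normalize the numerical features.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Step 3 -- Training: an MLP fitted via backpropagation-based optimization.
model = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000, random_state=0)
model.fit(X_train_s, y_train)

# Step 4 -- Validation/testing: mean absolute error on held-out data.
mae = mean_absolute_error(y_test, model.predict(X_test_s))
print(f"Test MAE: {mae:.1f} (cost units)")

# Step 5 -- Prediction: estimate a new (hypothetical) project.
new_project = scaler.transform([[20.0, 3.0, 5.0]])
print(f"Estimated cost: {model.predict(new_project)[0]:.1f}")
```

In practice, the synthetic data in step 1 would be replaced with the organization's real historical project records.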

ANNs can be powerful tools for project estimation because they can model complex, nonlinear relationships and learn from large amounts of data. However, they also have limitations: they require a lot of data to train effectively, they can be difficult to interpret and explain, and they are sensitive to the quality and relevance of the data used to train them.