Abstract:

An artificial neural network is a computational method that mirrors the way a biological nervous system processes information. Artificial neural networks are used in many fields to process large sets of data, often providing analyses that allow for prediction and identification of new data. However, neural networks struggle to provide clear explanations of why particular outcomes occur. Despite this limitation, neural networks are valuable data analysis tools applicable to a variety of fields. This paper explores the general architecture, limitations, and applications of neural networks.

Introduction:

Artificial neural networks attempt to mimic the functions of the human brain. Biological nervous systems are composed of building blocks called neurons, which communicate via axons and dendrites. When a biological neuron receives a message, it sends an electrical signal down its axon. If this signal exceeds a threshold value, it is converted into a chemical signal that is passed to nearby neurons.2 Similarly, while artificial neural networks are defined by formulas and data structures, they can be conceptualized as being composed of artificial neurons that function much like their biological counterparts. When an artificial neuron receives data, it produces an output signal that propagates to other connected artificial neurons only if the change in its activation level exceeds a defined threshold value.2 The human brain learns from past experiences and applies that knowledge in new settings. Similarly, artificial neural networks can adapt their behavior until their responses are both accurate and consistent in new situations.1
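
To make this concrete, the sketch below implements a single artificial neuron with a simple step (threshold) activation, following the description above; the inputs, weights, and threshold are illustrative values rather than values from any particular network.

    def neuron_output(inputs, weights, threshold):
        # Weighted sum of incoming signals (the neuron's activation level).
        activation = sum(x * w for x, w in zip(inputs, weights))
        # Fire (output 1) only if the activation exceeds the threshold.
        return 1 if activation > threshold else 0

    # Example: activation = 0.5*1.0 + 0.9*0.8 = 1.22 > 1.0, so the neuron fires.
    print(neuron_output([0.5, 0.9], [1.0, 0.8], threshold=1.0))  # -> 1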

While artificial neural networks are structurally similar to their biological counterparts, they differ in several ways. For example, certain artificial neural networks send signals only at fixed time intervals, unlike biological neural networks, in which neuronal activity is variable.3 Another major difference is response timing: biological neural networks often exhibit a latent period before responding, whereas artificial neural networks respond immediately.3

Neural networks are useful in a wide range of fields that involve large datasets, from biological systems to economic analysis. They are practical in problems involving pattern recognition, such as predicting data trends.3 Neural networks are also effective when data are noisy or error-prone, as in cognitive software for speech and image recognition.3

Neural Network Architecture:

One popular neural network design is the multilayer perceptron (MLP). In the MLP design, each artificial neuron outputs a weighted sum of its inputs based on the strength of its synaptic connections.1 Synaptic strength is determined by the formulaic design of the network and is expressed as a weight: stronger, more valuable connections have larger weights and are therefore more influential in the weighted sum. The neuron's output depends on whether the weighted sum exceeds its threshold value.1 The MLP design was originally composed of perceptrons, artificial neurons that produce a binary output of zero or one. Perceptrons are of limited use in a neural network model because small changes in the input can drastically alter the output of the system. Most current MLP systems therefore use sigmoid neurons instead. Sigmoid neurons produce outputs anywhere between zero and one, tolerating more variation in the inputs because small changes do not radically alter the outcome of the model.4
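
The contrast between the two neuron types can be sketched as follows; the weights and bias here are hypothetical, and the sigmoid shown is the standard logistic function:

    import math

    def perceptron(inputs, weights, bias):
        # Binary output: a tiny change in input can flip the result from 0 to 1.
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        return 1 if z > 0 else 0

    def sigmoid_neuron(inputs, weights, bias):
        # Smooth output in (0, 1): small input changes cause small output changes.
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        return 1.0 / (1.0 + math.exp(-z))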

In terms of architecture, the MLP is a feedforward neural network.1 In a feedforward design, the units are arranged so that signals travel exclusively from input to output. These networks consist of a layer of input neurons, a layer of output neurons, and a series of hidden layers in between. The hidden layers are composed of internal neurons that further process the data within the system. The complexity of the model varies with the number of hidden layers and the number of neurons in each layer.1
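
A minimal forward pass through such a layered network might look like the following sketch; the layer sizes and weights are made up for illustration:

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def feedforward(inputs, network):
        # Signals travel strictly forward: input -> hidden layer(s) -> output.
        for weights, biases in network:
            inputs = [sigmoid(sum(x * w for x, w in zip(inputs, ws)) + b)
                      for ws, b in zip(weights, biases)]
        return inputs

    # Toy network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
    network = [
        ([[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]], [0.0, 0.1, -0.1]),  # hidden layer
        ([[0.6, -0.2, 0.4]], [0.05]),                                # output layer
    ]
    print(feedforward([1.0, 0.5], network))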

In an MLP design, once the number of layers and the number of units in each layer are determined, the threshold values and synaptic weights must be set using training algorithms that minimize the system's errors.4 These training algorithms use a known dataset (the training data) to adjust the system until the differences between the expected and actual output values are minimized.4 Training produces a network with near-optimal weights, which lets it make accurate predictions when presented with new data. One such training algorithm is backpropagation, in which the algorithm follows the gradient vector down the error surface until a minimum is found.1 The difficult part of backpropagation is choosing the step size: larger steps can yield faster runtimes but can overstep the solution, while smaller steps lead to much slower runtimes but are more likely to find a correct solution.1
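
The core gradient-descent step can be illustrated on a single sigmoid neuron; the toy dataset, initial weights, and step size below are assumptions, and full backpropagation applies the same chain-rule update layer by layer through the network:

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Toy training data: (inputs, expected output).
    data = [([0.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
    w, b = [0.1, -0.1], 0.0
    step = 0.5  # larger steps run faster but risk overstepping the minimum

    for _ in range(1000):
        for x, target in data:
            out = sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)
            # Gradient of the squared error through the sigmoid (chain rule).
            delta = (out - target) * out * (1.0 - out)
            w = [wi - step * delta * xi for wi, xi in zip(w, x)]
            b -= step * delta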

While feedforward designs like the MLP are common, there are many other neural network designs. These include recurrent neural networks, which allow connections between neurons in the same layer, and self-organizing maps, in which neurons acquire weights that preserve characteristics of the input. Each of these network types also has variations within its specific framework; the Hopfield network and the Boltzmann machine, for example, both build on the recurrent design.5 While feedforward neural networks are the most common, each design is uniquely suited to solving particular problems.

Disadvantages:

One of the main problems with neural networks is that, for the most part, they have a limited ability to identify causal relationships explicitly. Developers feed these networks large amounts of data and let them determine independently which input variables are most important.10 However, it is difficult for the network to indicate to developers which variables matter most in calculating the outputs. While some techniques exist to analyze the relative importance of each neuron in a network, they still do not reveal causal relationships between variables as clearly as comparable data analysis methods such as logistic regression.10

Another problem with neural networks is their tendency to overfit. Overfitting occurs when a data analysis model such as a neural network generates good predictions for the training data but poor ones for new testing data.10 It happens because the model accounts for irregularities and outliers in the training data that may not be present across actual datasets. Developers can mitigate overfitting by penalizing large weights and limiting the number of neurons in hidden layers.10 Reducing the number of hidden neurons curbs overfitting but also limits the network's ability to model complex, nonlinear relationships.10
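
The weight-penalty idea amounts to adding a term to each update that shrinks weights toward zero; in the sketch below, grad stands for the error gradient computed during training, and step and lam are an assumed step size and penalty coefficient:

    def update_weight(w, grad, step, lam):
        # Standard gradient step plus an L2 penalty proportional to the weight,
        # discouraging the large weights associated with overfitting.
        return w - step * (grad + lam * w)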

Applications:

Artificial neural networks can process large amounts of data, making them useful tools in many fields of research. For example, the field of bioinformatics relies heavily on neural network pattern recognition to predict proteins' secondary structures. One popular algorithm used for this purpose is the Position Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST) Secondary Structure Prediction (PSIPRED) method.6 This algorithm uses a two-stage structure consisting of two three-layered feedforward neural networks. The first stage of PSIPRED takes as input a scoring matrix generated by running the PSI-BLAST algorithm on a peptide sequence. PSIPRED then takes a window of 15 positions from the scoring matrix and uses them to output three values representing the probabilities of forming the three protein secondary structures: helix, coil, and strand.6 These probabilities are then fed into the second-stage neural network along with the same 15 positions from the scoring matrix, and this second network outputs three values representing more accurate probabilities of forming helix, coil, and strand secondary structures.6
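
The two-stage flow described above can be sketched structurally as follows; this is not the actual PSIPRED code, and stage1_net, stage2_net, and window are placeholders for the two trained three-layer networks and the 15-position slice of the PSI-BLAST scoring matrix:

    def predict_secondary_structure(window, stage1_net, stage2_net):
        # Stage 1: initial probabilities of helix, coil, and strand.
        p_helix, p_coil, p_strand = stage1_net(window)
        # Stage 2: refine using both the window and the stage-1 probabilities.
        return stage2_net(window + [p_helix, p_coil, p_strand])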

Neural networks are used not only to predict protein structures but also to analyze genes associated with the development and progression of cancer. More specifically, researchers and doctors use artificial neural networks to identify the type of cancer associated with certain tumors, which aids correct diagnosis and treatment of each specific cancer.7 These networks enable researchers to match genomic characteristics from large datasets to specific types of cancer and to predict those types.7

In bioinformatics scenarios such as the two examples above, trained artificial neural networks quickly provide high-quality results for prediction tasks.6 This combination of speed and accuracy is important because bioinformatics generally involves large quantities of data that must be interpreted both effectively and efficiently.6

Artificial neural networks are also viable in fields outside the natural sciences, such as finance. These networks can be used to predict subtle trends such as variations in the stock market or when organizations will face bankruptcy.8,9 Neural networks can provide more accurate predictions more efficiently than other prediction models.9

Conclusion:

Over the past decade, artificial neural networks have become more refined and are now used in a wide variety of fields. They allow researchers to find patterns in very large datasets and to use those patterns to predict potential outcomes. Artificial neural networks provide a new computational way to learn from and understand diverse assortments of data, allowing for a more accurate and effective grasp of the world.

References:

  1. Ayodele, T. O. (2010). Types of machine learning algorithms. In Y. Zhang (Ed.), New Advances in Machine Learning. InTech. DOI: 10.5772/9385. Available from: http://www.intechopen.com/books/new-advances-in-machine-learning/types-of-machine-learning-algorithms
  2. Muller, B., & Reinhardt, J. Neural Networks: An Introduction.
  3. Urbas, J. V.
  4. Nielsen, M. A. (2015). Neural Networks and Deep Learning. Determination Press.
  5. Mehrotra, K., & Mohan, C. Elements of Artificial Neural Networks.
  6. Chen, K., & Kurgan, L. A. Neural networks in bioinformatics.
  7. Oustimov, A., & Vu, V. Artificial neural networks in the cancer genomics frontier.
  8. Ma, J. An enhanced artificial neural network for stock price predictions.
  9. Mansouri, A. A comparison of artificial neural network model and logistic regression in prediction of companies' bankruptcy.
  10. Tu, J. V. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes.
