Neural Networks

Artificial neural network
(https://en.wikipedia.org/wiki/Artificial_neural_network)

In machine learning and cognitive science, an artificial neural network (ANN) is a network inspired by biological neural networks (the central nervous systems of animals, in particular the brain) that is used to estimate or approximate functions which can depend on a large number of inputs and are generally unknown. Artificial neural networks are typically specified using three things:

  • Architecture: specifies what variables are involved in the network and their topological relationships. For example, the variables involved in a neural network might be the weights of the connections between the neurons, along with the activities of the neurons.
  • Activity Rule: most neural network models have short time-scale dynamics; local rules define how the activities of the neurons change in response to each other. Typically the activity rule depends on the weights (the parameters) in the network.
  • Learning Rule: specifies the way in which the neural network’s weights change with time. This learning is usually viewed as taking place on a longer time scale than the time scale of the dynamics under the activity rule. Usually the learning rule will depend on the activities of the neurons. It may also depend on the target values supplied by a teacher and on the current value of the weights. (A minimal code sketch follows this list.)
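
To make the three-part specification concrete, here is a minimal, purely illustrative sketch of a single artificial neuron. It is not from the Wikipedia article: it assumes Python with NumPy, and the threshold activity rule and perceptron-style learning rule are just one possible choice of each part.

```python
import numpy as np

# Architecture: three input variables connected to one neuron by weights.
w = np.zeros(3)   # connection weights (the network's parameters)
b = 0.0           # bias term

# Activity rule: the neuron's activity is a fast, local function of its
# weighted inputs (here, a simple threshold).
def activity(x):
    return 1 if np.dot(w, x) + b > 0 else 0

# Learning rule: on a slower time scale, the weights change in a way that
# depends on the neuron's activity, a teacher-supplied target, and the
# current weights (here, the classic perceptron update).
def learn(x, target, lr=0.1):
    global w, b
    error = target - activity(x)
    w = w + lr * error * x
    b = b + lr * error

learn(np.array([1.0, 0.0, 1.0]), target=1)
print(w, b)
```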

Woodford, C. (2016) Neural networks
(http://www.explainthatstuff.com/introduction-to-neural-networks.html)

[Image: a brain scan photo overlaid with dots representing connected neural units.]

Which is better—computer or brain? Ask most people if they want a brain like a computer and they’d probably jump at the chance. But look at the kind of work scientists have been doing over the last couple of decades and you’ll find many of them have been trying hard to make their computers more like brains! How? With the help of neural networks—computer programs assembled from hundreds, thousands, or millions of artificial brain cells that learn and behave in a remarkably similar way to human brains. What exactly are neural networks? How do they work? Let’s take a closer look!


Neural Networks
(https://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html)

This report is an introduction to Artificial Neural Networks. The various types of neural networks are explained and demonstrated, applications of neural networks like ANNs in medicine are described, and a detailed historical background is provided. The connection between the artificial and the real thing is also investigated and explained. Finally, the mathematical models involved are presented and demonstrated.

1. Introduction to Neural Networks
1.1 What is a neural network?
1.2 Historical background
1.3 Why use neural networks?
1.4 Neural networks versus conventional computers – a comparison
2. Human and Artificial Neurones – investigating the similarities
2.1 How the Human Brain Learns
2.2 From Human Neurones to Artificial Neurones
3. An Engineering approach
3.1 A simple neuron – description of a simple neuron
3.2 Firing rules – How neurones make decisions
3.3 Pattern recognition – an example
3.4 A more complicated neuron
4. Architecture of neural networks
4.1 Feed-forward (associative) networks
4.2 Feedback (autoassociative) networks
4.3 Network layers
4.4 Perceptrons
5. The Learning Process
5.1 Transfer Function
5.2 An Example to illustrate the above teaching procedure
5.3 The Back-Propagation Algorithm
6. Applications of neural networks
6.1 Neural networks in practice
6.2 Neural networks in medicine
6.2.1 Modelling and Diagnosing the Cardiovascular System
6.2.2 Electronic noses – detection and reconstruction of odours by ANNs
6.2.3 Instant Physician – a commercial neural net diagnostic program
6.3 Neural networks in business
6.3.1 Marketing
6.3.2 Credit evaluation
7. Conclusion
References
Appendix A – Historical background in detail
Appendix B – The back-propagation algorithm – mathematical approach
Appendix C – References used throughout the review


Nielsen, M. (2015) Neural Networks and Deep Learning, Determination Press
(http://neuralnetworksanddeeplearning.com/index.html)

Download Source Code (ZIP 18,846KB)

Neural Networks and Deep Learning is a free online book. The book will teach you about:

  • Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data
  • Deep learning, a powerful set of techniques for learning in neural networks

Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing. This book will teach you many of the core concepts behind neural networks and deep learning.

Neural networks are one of the most beautiful programming paradigms ever invented. In the conventional approach to programming, we tell the computer what to do, breaking big problems up into many small, precisely defined tasks that the computer can easily perform. By contrast, in a neural network we don’t tell the computer how to solve our problem. Instead, it learns from observational data, figuring out its own solution to the problem at hand.

Automatically learning from data sounds promising. However, until 2006 we didn’t know how to train neural networks to surpass more traditional approaches, except for a few specialized problems. What changed in 2006 was the discovery of techniques for learning in so-called deep neural networks. These techniques are now known as deep learning. They’ve been developed further, and today deep neural networks and deep learning achieve outstanding performance on many important problems in computer vision, speech recognition, and natural language processing. They’re being deployed on a large scale by companies such as Google, Microsoft, and Facebook.

The purpose of this book is to help you master the core concepts of neural networks, including modern techniques for deep learning. After working through the book you will have written code that uses neural networks and deep learning to solve complex pattern recognition problems. And you will have a foundation to use neural networks and deep learning to attack problems of your own devising.

A principle-oriented approach

One conviction underlying the book is that it’s better to obtain a solid understanding of the core principles of neural networks and deep learning, rather than a hazy understanding of a long laundry list of ideas. If you’ve understood the core ideas well, you can rapidly understand other new material. In programming language terms, think of it as mastering the core syntax, libraries and data structures of a new language. You may still only “know” a tiny fraction of the total language – many languages have enormous standard libraries – but new libraries and data structures can be understood quickly and easily.

This means the book is emphatically not a tutorial in how to use some particular neural network library. If you mostly want to learn your way around a library, don’t read this book! Find the library you wish to learn, and work through the tutorials and documentation. But be warned. While this has an immediate problem-solving payoff, if you want to understand what’s really going on in neural networks, if you want insights that will still be relevant years from now, then it’s not enough just to learn some hot library. You need to understand the durable, lasting insights underlying how neural networks work. Technologies come and technologies go, but insight is forever.

A hands-on approach

We’ll learn the core principles behind neural networks and deep learning by attacking a concrete problem: the problem of teaching a computer to recognize handwritten digits. This problem is extremely difficult to solve using the conventional approach to programming. And yet, as we’ll see, it can be solved pretty well using a simple neural network, with just a few tens of lines of code, and no special libraries. What’s more, we’ll improve the program through many iterations, gradually incorporating more and more of the core ideas about neural networks and deep learning.
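
To give a flavour of what “a few tens of lines” can look like, here is a sketch of the forward pass of such a network. This is a Python 3 paraphrase written for this reading list, not the book’s own (Python 2.7) code, and it uses only NumPy for the linear algebra; the layer sizes [784, 30, 10] reflect the conventional setup for 28×28-pixel digit images with ten output classes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Network:
    """A bare-bones feedforward network: random weights, sigmoid neurons."""
    def __init__(self, sizes):
        rng = np.random.default_rng(0)
        self.biases = [rng.normal(size=(y, 1)) for y in sizes[1:]]
        self.weights = [rng.normal(size=(y, x))
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        # Propagate an input column vector through each layer in turn.
        for w, b in zip(self.weights, self.biases):
            a = sigmoid(w @ a + b)
        return a

net = Network([784, 30, 10])                  # input, hidden, output layers
scores = net.feedforward(np.zeros((784, 1)))  # one flattened 28x28 image
print(scores.argmax())                        # most active output neuron
```

Training this network (stochastic gradient descent plus backpropagation) is what the book then builds on top of a forward pass like this one.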

This hands-on approach means that you’ll need some programming experience to read the book. But you don’t need to be a professional programmer. I’ve written the code in Python (version 2.7), which, even if you don’t program in Python, should be easy to understand with just a little effort. Through the course of the book we will develop a little neural network library, which you can use to experiment and to build understanding. All the code is available for download here. Once you’ve finished the book, or as you read it, you can easily pick up one of the more feature-complete neural network libraries intended for use in production.

On a related note, the mathematical requirements to read the book are modest. There is some mathematics in most chapters, but it’s usually just elementary algebra and plots of functions, which I expect most readers will be okay with. I occasionally use more advanced mathematics, but have structured the material so you can follow even if some mathematical details elude you. The one chapter which uses heavier mathematics extensively is Chapter 2, which requires a little multivariable calculus and linear algebra. If those aren’t familiar, I begin Chapter 2 with a discussion of how to navigate the mathematics. If you’re finding it really heavy going, you can simply skip to the summary of the chapter’s main results. In any case, there’s no need to worry about this at the outset.

It’s rare for a book to aim to be both principle-oriented and hands-on. But I believe you’ll learn best if we build out the fundamental ideas of neural networks. We’ll develop living code, not just abstract theory, code which you can explore and extend. This way you’ll understand the fundamentals, both in theory and practice, and be well set to add further to your knowledge.

On the exercises and problems

Using neural nets to recognize handwritten digits

How the backpropagation algorithm works

Improving the way neural networks learn

A visual proof that neural nets can compute any function

Why are deep neural networks hard to train?

Deep learning

Appendix: Is there a simple algorithm for intelligence?

Acknowledgements

Frequently Asked Questions


A Basic Introduction To Neural Networks
(http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html)

What Is A Neural Network?

The simplest definition of a neural network, more properly referred to as an ‘artificial’ neural network (ANN), is provided by the inventor of one of the first neurocomputers, Dr. Robert Hecht-Nielsen. He defines a neural network as:

“…a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs.” (From “Neural Network Primer: Part I” by Maureen Caudill, AI Expert, Feb. 1989.)

ANNs are processing devices (algorithms or actual hardware) that are loosely modeled after the neuronal structure of the mammalian cerebral cortex, but on much smaller scales. A large ANN might have hundreds or thousands of processor units, whereas a mammalian brain has billions of neurons, with a corresponding increase in the magnitude of their overall interaction and emergent behavior. Although ANN researchers are generally not concerned with whether their networks accurately resemble biological systems, some are. For example, researchers have accurately simulated the function of the retina and modeled the eye rather well.

Although the mathematics involved with neural networking is not a trivial matter, a user can rather easily gain at least an operational understanding of their structure and function.

The Basics of Neural Networks

Neural networks are typically organized in layers. Layers are made up of a number of interconnected ‘nodes’, each of which contains an ‘activation function’. Patterns are presented to the network via the ‘input layer’, which communicates to one or more ‘hidden layers’ where the actual processing is done via a system of weighted ‘connections’. The hidden layers then link to an ‘output layer’ where the answer is output.

[Schematic: input, hidden, and output layers joined by weighted connections.]

Most ANNs contain some form of ‘learning rule’ which modifies the weights of the connections according to the input patterns that it is presented with. In a sense, ANNs learn by example as do their biological counterparts; a child learns to recognize dogs from examples of dogs.

Although there are many different kinds of learning rules used by neural networks, this demonstration is concerned with only one: the delta rule. The delta rule is often utilized by the most common class of ANNs, called ‘backpropagational neural networks’ (BPNNs). Backpropagation is an abbreviation for the backwards propagation of error.

With the delta rule, as with other types of backpropagation, ‘learning’ is a supervised process that occurs with each cycle or ‘epoch’ (i.e. each time the network is presented with a new input pattern) through a forward activation flow of outputs and the backwards error propagation of weight adjustments. More simply, when a neural network is initially presented with a pattern it makes a random ‘guess’ as to what it might be. It then sees how far its answer was from the actual one and makes an appropriate adjustment to its connection weights.

Note also that within each hidden-layer node is a sigmoidal activation function, which polarizes network activity and helps it to stabilize.
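
The guess-and-adjust cycle described above can be sketched for a single sigmoidal node as follows. This is an illustration written for this reading list, assuming Python with NumPy; the toy patterns, learning rate, and epoch count are invented for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
w = rng.normal(size=2)   # random initial weights: the first 'guess'

# Toy supervised patterns: each input comes with a teacher-supplied target.
patterns = [(np.array([0.0, 1.0]), 1.0),
            (np.array([1.0, 0.0]), 0.0)]

for epoch in range(1000):
    for x, target in patterns:
        output = sigmoid(np.dot(w, x))   # forward activation flow
        # Delta rule: adjust weights in proportion to the error, scaled by
        # the slope of the sigmoid (the backwards error propagation of
        # weight adjustments).
        delta = (target - output) * output * (1.0 - output)
        w += 0.5 * delta * x

print(w, [round(float(sigmoid(np.dot(w, x))), 2) for x, _ in patterns])
```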

Backpropagation performs a gradient descent within the solution’s vector space towards a ‘global minimum’ along the steepest vector of the error surface. The global minimum is the theoretical solution with the lowest possible error. The error surface itself is a hyperparaboloid, but it is seldom smooth. Indeed, in most problems the solution space is quite irregular, with numerous ‘pits’ and ‘hills’ which may cause the network to settle down in a ‘local minimum’ that is not the best overall solution.

[Figure: how the delta rule finds the correct answer.]

Since the nature of the error space cannot be known a priori, neural network analysis often requires a large number of individual runs to determine the best solution. Most learning rules have built-in mathematical terms to assist in this process, controlling the ‘speed’ (beta coefficient) and the ‘momentum’ of the learning. The speed of learning is actually the rate of convergence between the current solution and the global minimum. Momentum helps the network to overcome obstacles (local minima) in the error surface and settle down at or near the global minimum.
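
The following one-dimensional sketch shows how a learning-speed coefficient and a momentum term interact on a bumpy error surface. Everything here (the toy surface, the coefficient values, the step count) is invented for illustration and is not from the page above.

```python
import math

def error(wgt):
    # A toy error surface: a broad parabola with superimposed ripples
    # ('pits' and 'hills' that create local minima).
    return wgt ** 2 + math.sin(5.0 * wgt)

def gradient(wgt, eps=1e-5):
    # Numerical slope of the error surface at the current weight.
    return (error(wgt + eps) - error(wgt - eps)) / (2.0 * eps)

wgt, velocity = 2.0, 0.0
speed, momentum = 0.05, 0.9   # the 'speed' and 'momentum' learning terms
for step in range(500):
    # Momentum accumulates recent descent directions, helping the weight
    # roll through small local pits instead of settling in them.
    velocity = momentum * velocity - speed * gradient(wgt)
    wgt += velocity

print(round(wgt, 3), round(error(wgt), 3))
```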

Once a neural network is ‘trained’ to a satisfactory level it may be used as an analytical tool on other data. To do this, the user no longer specifies any training runs and instead allows the network to work in forward-propagation mode only. New inputs are presented to the input layer, where they filter into and are processed by the middle layers as though training were taking place; however, at this point the output is retained and no backpropagation occurs. The output of a forward-propagation run is the predicted model for the data, which can then be used for further analysis and interpretation.

It is also possible to over-train a neural network, which means that the network has been trained to respond exactly to only one type of input, much like rote memorization. If this should happen then learning can no longer occur, and the network is referred to as having been “grandmothered” in neural network jargon. In real-world applications this situation is not very useful, since one would need a separate grandmothered network for each new kind of input.

How Do Neural Networks Differ From Conventional Computing?

To better understand artificial neural computing it is important first to know how a conventional ‘serial’ computer and its software process information. A serial computer has a central processor that can address an array of memory locations where data and instructions are stored. Computations are made by the processor reading an instruction, as well as any data the instruction requires, from memory addresses; the instruction is then executed and the results are saved in a specified memory location as required. In a serial system (and a standard parallel one as well) the computational steps are deterministic, sequential and logical, and the state of a given variable can be tracked from one operation to another.

In comparison, ANNs are not sequential or necessarily deterministic. There are no complex central processors; rather there are many simple ones, which generally do nothing more than take the weighted sum of their inputs from other processors. ANNs do not execute programmed instructions; they respond in parallel (either simulated or actual) to the pattern of inputs presented to them. There are also no separate memory addresses for storing data. Instead, information is contained in the overall activation ‘state’ of the network. ‘Knowledge’ is thus represented by the network itself, which is quite literally more than the sum of its individual components.

What Applications Should Neural Networks Be Used For?

Neural networks are universal approximators, and they work best if the system you are using them to model has a high tolerance for error. One would therefore not be advised to use a neural network to balance one’s cheque book! However, they work very well for:

  • capturing associations or discovering regularities within a set of patterns;
  • problems where the volume, number of variables, or diversity of the data is very great;
  • problems where the relationships between variables are only vaguely understood; or
  • problems where the relationships are difficult to describe adequately with conventional approaches.

What Are Their Limitations?

There are many advantages and limitations to neural network analysis, and to discuss this subject properly we would have to look at each individual type of network, which isn’t necessary for this general discussion. In reference to backpropagational networks, however, there are some specific issues potential users should be aware of.

  • Backpropagational neural networks (and many other types of networks) are in a sense the ultimate ‘black boxes’. Apart from defining the general architecture of a network and perhaps initially seeding it with random numbers, the user has no role other than to feed it input and watch it train and await the output. In fact, it has been said that with backpropagation, “you almost don’t know what you’re doing”. Some freely available software packages (NevProp, bp, Mactivation) do allow the user to sample the network’s ‘progress’ at regular time intervals, but the learning itself progresses on its own. The final product of this activity is a trained network that provides no equations or coefficients defining a relationship (as in regression) beyond its own internal mathematics. The network is, quite literally, the final equation of the relationship.
  • Backpropagational networks also tend to be slower to train than other types of networks and sometimes require thousands of epochs. If run on a truly parallel computer system this is not really a problem, but if the BPNN is being simulated on a standard serial machine (i.e. a single SPARC, Mac or PC) training can take some time. This is because the machine’s CPU must compute the function of each node and connection separately, which can be problematic in very large networks with a large amount of data. However, the speed of most current machines is such that this is typically not much of an issue.

What Are Their Advantages Over Conventional Techniques?

Depending on the nature of the application and the strength of the internal data patterns, you can generally expect a network to train quite well. This applies to problems where the relationships may be quite dynamic or non-linear. ANNs provide an analytical alternative to conventional techniques, which are often limited by strict assumptions of normality, linearity, variable independence, etc. Because an ANN can capture many kinds of relationships, it allows the user to quickly and relatively easily model phenomena which otherwise may have been very difficult or impossible to explain.


An Introduction to Neural Networks
(http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html)

Overview:

Why would anyone want a ‘new’ sort of computer?

What is a neural network?

Some algorithms and architectures.

Where have they been applied?

What new applications are likely?

Some useful sources of information.

Some comments added Sept 2001

NEW: questions and answers arising from this tutorial


Artificial Neural Networks for Beginners
(https://arxiv.org/ftp/cs/papers/0308/0308031.pdf)
Download PDF (PDF 196KB)

CS-449: Neural Networks
(https://www.willamette.edu/~gorr/classes/cs449/intro.html)


G5AIAI – Introduction to Artificial Intelligence – Neural Networks
(http://www.cs.nott.ac.uk/~pszgxk/courses/g5aiai/006neuralnetworks/neural-networks.htm)


The Nature of Code – Chapter 10. Neural Networks
(http://natureofcode.com/book/chapter-10-neural-networks/)


Neural networks class – Université de Sherbrooke
(https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH)


Coursera – Neural Networks for Machine Learning, University of Toronto
(https://www.coursera.org/learn/neural-networks#)

Coursera provides universal access to the world’s best education, partnering with top universities and organizations to offer courses online.

About this course: Learn about artificial neural networks and how they’re being used for machine learning, as applied to speech and object recognition, image segmentation, modeling language and human motion, etc. We’ll emphasize both the basic algorithms and the practical tricks needed to get them to work well. This course contains the same content presented on Coursera beginning in 2013. It is not a continuation or update of the original course. It has been adapted for the new platform.