Ben Goertzel
Evidence is mounting that the brain may be reasonably considered a quantum system. With this in mind, it is interesting to ask what implications this would have for neural network theory. How might a "quantum neural network" operate?
In this note I will present some ideas in this regard. My central conclusion is that, in a quantum neural network, information would tend to travel along broad channels formed from many bunched-together, intertwined pathways. Pathways that are not grouped into broad channels will tend to be ignored in quantum theory; they will "cancel out." Thus, in a sense, it may be that quantum theory simplifies the classical dynamics of the brain, rather than complicating it.
Suppose one has a neural network with N neurons, linked by weighted synapses with weights between 0 and 1. Suppose for simplicity that the neurons are binary: each one is either active or inactive at any given time. Let a(I,t) denote the activation of neuron I at time t.
For the sake of constructing a clear argument, I will assume a somewhat unusual neural network dynamics. Similar results should hold if one generalizes the model. Let us suppose, initially, that when a neuron is active, it selects only one of its links along which to fire its "quantum" of charge. The probability of selecting the link to neuron J is given by the weight of the link to neuron J.
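As a concrete illustration, here is a minimal Python sketch of this firing rule. The function name and data layout are my own, and I assume the outgoing weights of each neuron have been normalized to sum to 1 so they can be read directly as selection probabilities:

```python
import random

def fire_one_quantum(weights, i):
    """Select the single outgoing link along which neuron i fires its quantum.

    `weights[i][j]` is the synaptic weight from neuron i to neuron j.
    Assumption: the weights out of each neuron sum to 1, so they form a
    probability distribution over targets.  Returns the chosen target j.
    """
    targets = list(weights[i].keys())
    probs = [weights[i][j] for j in targets]
    # draw one target with probability proportional to its weight
    return random.choices(targets, weights=probs, k=1)[0]
```

Repeatedly applying this rule to every active neuron gives the assumed stochastic ("non-quantum") dynamics of the network.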
We will calculate the probability that a unit of charge, leaving neuron I at time t, reaches neuron J by time t+r. Let us also suppose that this "target" neuron J has no outgoing links (so that, once the unit of charge reaches J, it stays there), and that, at time t, neuron I is the only neuron active. What we are studying, then, is the probability of a single unit of charge travelling from neuron I to neuron J through the network.
In formulas, the probability we are interested in is given by P[a(J,t+r)=1|a(I,t)=1]. In classical physics, one computes this probability using the standard rules of probability theory. In quantum physics, on the other hand, there is a different method: the ordinary rules of probability theory are not correct.
The probability of this transition, in quantum mechanics, is given by a sum over all possible ways that charge can flow from I to J over the r time steps between t and t+r. This means that it is a sum over all paths from I to J, of length not greater than r, through the neural network. Each path q has a certain energy H(q), and the relevant raw Feynman sum F[a(J,t+r)=1|a(I,t)=1] is the sum, over all such paths q, of e^{i H(q)/h'}, where h' is the reduced Planck constant. The quantity P[a(J,t+r)=1|a(I,t)=1] is then obtained by normalizing F[a(J,t+r)=1|a(I,t)=1] (dividing it by the sum of itself and the corresponding quantity F[a(J,t+r)=0|a(I,t)=1]).
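The raw Feynman sum can be sketched directly in code, under the simplifying assumption (which this note itself later proposes) that the length of a path stands in for its energy H(q). The graph representation and function name here are illustrative, not part of the model as stated:

```python
import cmath

def feynman_sum(adj, i, j, r, hbar=1.0):
    """Sum e^{i H(q)/hbar} over all paths q from i to j of length <= r.

    `adj` maps each neuron to the list of neurons it links to.
    Assumption: H(q) is approximated by the path length (number of steps).
    Returns a complex amplitude; nearby paths of similar length reinforce,
    while paths of scattered lengths interfere and cancel.
    """
    total = 0.0 + 0.0j

    def walk(node, steps):
        nonlocal total
        if node == j:
            # j is absorbing: the charge stays once it arrives
            total += cmath.exp(1j * steps / hbar)
            return
        if steps == r:
            return
        for nxt in adj.get(node, []):
            walk(nxt, steps + 1)

    walk(i, 0)
    return total
```

For example, two parallel two-step routes from neuron 0 to neuron 3 contribute amplitudes of identical phase, so the magnitude of the sum is twice that of a single route; routes of very different lengths would partially cancel instead.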
What is this quantity measuring? The key point is that most of the paths from I to J cancel out, due to interference. The only ones that really contribute are those near local extrema of the energy function: i.e., those that are "clustered together" with a large number of very similar paths. The actual lengths of the paths are not important: what is important is the degree of "clustered-togetherness," the density of paths at one particular place in the network.
In more comprehensible language, one may say that, in quantum theory, the probability of neuron I stimulating neuron J is high when there is a broad, clearly-defined pathway from I to J. The understanding is that "broad, clearly-defined pathway" means "a cluster of pathways all with very similar energies to each other." It is not merely the total number of pathways from one neuron to another that matters, but the breadth of the individual "channels" within the overall collection of pathways.
I have talked about the special case where each neuron only sends charge to one other neuron at a given time, where J has no outgoing links, and where, at time t, neuron I is the only neuron active; but it is clear that the same ideas apply to the transition P[a(J,t+r)=x|a(I,t)=y] in a more general neural network model. In the general case, one still computes transition probabilities by Feynman sums, which favor clusters of paths of very similar energy. The difference is that these paths need not be direct linear paths from I to J; they may involve indirect effects, feedback dynamics and so forth. The intuitive conclusion still holds: a transition is more likely if it is carried out by a grouping of tightly-bunched-together paths.
This is, I must note, a fairly unusual argument, in that the limit as Planck's constant tends to zero does not give the "non-quantum dynamics" of the network. This is because the non-quantum behavior of the network is assumed stochastic. As Planck's constant vanishes, what we obtain instead is a kind of rough deterministic approximation to the assumed stochastic dynamics.
Using these ideas, one could fairly easily set up a formal neural network model that "emulates" the quantum physics of neural networks in an intuitive way, without simulating the specifics of energetic flow in the brain. Suppose one has a neural network, as described above; then in order to qualitatively emulate the implications of quantum theory for the network, what one has to do is to maintain a record of the trajectory of each unit of charge travelling through the network. One allows the network to evolve for a while, and then searches through these records to compute the appropriate transition probabilities. If one admits the length of a path as a decent approximation of the energy of a path, then one avoids calculational difficulties while maintaining the conceptual framework of quantum theory.
The calculations involved are straightforward. Take an arbitrary pair of neurons I and J. Each time a unit of charge has gone from I to J, we record the length L[I,J] of the path it took. Then we sum the quantity e^{i L[I,J]/h'} over all the L[I,J] from all the paths found. This gives the quantity F[I,J]. F[I,J] must then be normalized as above, yielding P[I,J].
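A sketch of this bookkeeping, in Python, under stated assumptions: recorded path lengths stand in for energies, and the normalization used here (squared magnitude of F, scaled so each neuron's outgoing values sum to 1) is one reasonable reading of "normalized as above," not a literal prescription from the text:

```python
import cmath
from collections import defaultdict

def emulate_quantum_weights(recorded_paths, hbar=1.0):
    """Compute quantum-style transition values P[I,J] from recorded runs.

    `recorded_paths[(i, j)]` is a list of lengths L of paths that a unit of
    charge actually took from i to j during the classical run.  Length is
    used as an approximation of path energy.
    """
    # F[I,J]: sum of e^{i L / hbar} over all recorded path lengths
    F = {pair: sum(cmath.exp(1j * L / hbar) for L in lengths)
         for pair, lengths in recorded_paths.items()}
    # squared magnitude, the standard quantum probability weight
    mag2 = {pair: abs(f) ** 2 for pair, f in F.items()}
    # normalize per source neuron so each row of values sums to 1
    row_sums = defaultdict(float)
    for (i, _j), m in mag2.items():
        row_sums[i] += m
    return {(i, j): m / row_sums[i]
            for (i, j), m in mag2.items() if row_sums[i] > 0}
```

On hypothetical data, a pair of neurons connected by two identical-length paths scores higher than a pair connected by two paths of very different lengths, which is exactly the "bunched-together channels" effect described above: the scattered lengths partially cancel in the complex sum.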
Suppose that one then creates a new network by setting the weights of one's network equal to the derived P[I,J]. Then one is, in a sense, deriving the quantum neural network that is implicit in one's original neural network.
The actual relevance of this construction to the brain is an open question. My point here is merely to suggest some of the possible implications of the "quantum brain" for neural processing.