Hacking the neural network

Lately, I’ve been intrigued by the idea of brain hacking, triggered in part by the talk Lee von Kraus gave at the New York Hall of Science some months ago.

By “hacking”, let’s be clear that I do not mean hardware hacking of the type Ben Krasnow seems to be so fond of. I tried that in fact, and it didn’t work.

Rather, let’s define ‘hacking’ as a form of software manipulation: reprogramming the neural network, if you will.

What is a neural network?

Our brains are amazing devices. Unlike most computers, which process data sequentially (albeit at amazing speeds), our brains work in parallel; evolution decided that route was a bit better. No doubt, in my mind, because it’s *very hard* to completely disable a huge, decentralized network. Phineas Gage is the best example of this that comes to mind, but one look at the RIAA’s attempts to destroy such networks gives a good sense of what I mean!

But I digress.

A brain is a huge parallel computation machine: one with a “clock speed” of about 10 hertz, but a node count on the order of 10^11. A massive number, yes, but what exactly does it mean?

Given its low processing rate yet amazing IO capability, a brain is an interesting contrast to a normal von Neumann machine. To make the distinction a bit clearer, I’m going to define a quantity I’ll call “processing time”, that is, the time it takes to perform a given batch of operations:

(tick time) * (operations) * (operations performed in one tick)^-1 = processing time (T)

Ticks can be thought of as the time it takes for an operation, or group of operations, to be processed: propagation delay or reaction time, if you will. In humans this works out to about 200 milliseconds on average; in a typical desktop PC, about 0.294 nanoseconds (one cycle of a 3.4 GHz clock).

Operations are just that: operations. Memory lookups, inputs, outputs… anything.

The maximum number of operations that can be performed in one tick can be thought of as the “bus width”. For humans, this is practically limitless, and can be assumed to equal the number of operations given to the machine. For your typical 4-thread CPU, one can assume it to be no more than 4.
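For the curious, here’s a minimal Python sketch of that figure of merit. The tick lengths and bus widths are the rough assumptions from above, not measured values.

```python
def processing_time(tick_s, n_ops, ops_per_tick):
    """Time to finish n_ops operations, given a tick length and a 'bus width'."""
    return tick_s * n_ops / ops_per_tick

# Rough figures assumed in this post:
HUMAN_TICK = 0.2        # seconds per tick (reaction time)
CPU_TICK   = 0.294e-9   # seconds per tick (one cycle of a ~3.4 GHz clock)
CPU_WIDTH  = 4          # a typical 4-thread CPU

# For the brain, the bus width is taken to equal the number of operations
# handed to it, so any single batch finishes in one tick:
print(processing_time(HUMAN_TICK, 1000, 1000))  # ~0.2 s, no matter how big the batch
```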

With this in mind, let’s compare some simple tasks:

Take elementary mathematics as an example: it typically deals with one input, one output, four memory/character lookups, and two memory writes at any given time (8 operations). For your brain, this becomes a problem, since the time it takes to compute such algebra works out to:

(0.2) * (8)  * (8)^-1 = 0.2 seconds

Very quickly you can see that summing an infinite series doesn’t quite work on our hardware. For a von Neumann machine, though, the opposite is true:

(0.294 * 10^-9) * (8) * (4)^-1 ≈ 0.59 nanoseconds

Obviously, humans were not designed to do simple math.
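If you want to check the arithmetic, here’s the same 8-operation comparison in Python, using the post’s rough figures:

```python
# The 8-operation algebra task, using the post's rough figures.
ops = 8
brain = 0.2 * ops / ops        # bus width == operation count for the brain
cpu   = 0.294e-9 * ops / 4     # 4-thread CPU, ~3.4 GHz clock

print(f"brain: {brain:.1f} s")        # 0.2 s
print(f"cpu:   {cpu * 1e9:.2f} ns")   # ~0.59 ns
```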

Here’s where the fun comes in, however. Each of our eyes has on the order of 190 million photoreceptors, and we have two of them. This corresponds to an image roughly 390 megapixels in size, processed in continuous time by our brain. How long, then, does it take us to process a single “frame”?

(0.2) * (390 * 10^8) * (390 * 10^8) ^-1 = 0.2 seconds

Now it’s fun to see that for a von Neumann machine:

(0.294 * 10^-9) * (390 * 10^8) * (4)^-1 ≈ 2.87 seconds

…it takes longer.
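And the frame comparison, taking the per-frame operation count of 390 * 10^8 above at face value:

```python
# One visual "frame", using the post's operation count of 390 * 10**8.
# The brain's figure is 0.2 s by construction (bus width == operation count).
ops = 390e8
cpu = 0.294e-9 * ops / 4   # 4-thread CPU at ~3.4 GHz
print(f"{cpu:.2f} s")      # ~2.87 s
```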

This is the beauty of parallel architectures. They are limited only by (A) their ability to handle IO, and (B) their total propagation delay. For operations that process a large amount of data, such as filtering, image processing, and voice recognition, they work really, really well. For operations like arithmetic, such is not the case.

It’s prudent to note, though, that neural networks are fundamentally bottlenecked by their reaction time. How is it, then, that you are able to drive a car if each visual “frame” takes a full 0.2 seconds?

Now it gets even more fun.

Our brains regularly compensate for the hardware available. This hardware runs in continuous time: time not governed by a “system clock” like every other state machine out there. An input now comes out about 200 ms later. An input at (now + 20 ms) comes out about (200 + 20) ms later. It’s a continuous signal-processing machine, which means it can do fun tricks to compensate for scarcity of free hardware.
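A toy way to see why that answers the driving question: treat the system as a pipeline. Latency stays at 200 ms, but the output rate keeps up with the input rate once the pipeline is full. A minimal sketch, with made-up timing:

```python
# Inputs arrive every 20 ms; each output appears 200 ms after its own input.
LATENCY = 0.200        # seconds from input to output (the post's reaction time)
PERIOD  = 0.020        # seconds between successive inputs (illustrative)

inputs  = [i * PERIOD for i in range(10)]
outputs = [t + LATENCY for t in inputs]

gaps = [round(b - a, 3) for a, b in zip(outputs, outputs[1:])]
print(gaps)  # every gap is 0.02 s: same rate as the input, just delayed
```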

This text that you’re reading right now is text that you are in fact reading right now. It’s visually processed, recognized as words, and sent through a specialized piece of your network trained for comprehension.

But the edge of your computer monitor, or the room in your periphery… is not.

That’s actually the world as it was some noticeable time ago, with some continually added information. That is, it’s a hallucination about how things are expected to be, one that is updated only when inputs change dramatically. And rightly so: why would you waste finite computing resources on what the back wall is doing, when you “know” with certainty that it’s not going to be doing anything interesting? Of course, this has its downsides when someone unexpectedly chucks a baseball your way, but it’s certainly nice to have enough neurons left over to process sound while watching a movie!
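One way to picture that trick in code: cache a belief about each region of the scene, and spend processing only where the input has drifted noticeably from it. A cartoon sketch, with made-up regions and thresholds, not a model of the visual system:

```python
# Cartoon of "hallucinate the periphery, reprocess only on dramatic change".
def update_scene(belief, sensed, threshold=0.2):
    """Keep the cached belief where input still roughly matches it;
    spend work only on regions whose input changed a lot."""
    updated, work = {}, 0
    for region, value in sensed.items():
        if abs(value - belief.get(region, value)) > threshold:
            updated[region] = value                       # reprocess it
            work += 1
        else:
            updated[region] = belief.get(region, value)   # keep the guess
    return updated, work

belief = {"center": 0.5, "left wall": 0.1, "back wall": 0.1}
sensed = {"center": 0.9, "left wall": 0.12, "back wall": 0.11}
belief, work = update_scene(belief, sensed)
print(work)  # 1: only the center was worth reprocessing
```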

This is one of many such tricks.

Why is it that you only notice the lead tracks in a 20-track studio production? Why is it that you don’t notice the third-chair violinist’s contribution in an orchestral rendition? Why don’t you notice the noise your computer fan makes unless it changes unexpectedly? All of this information is there; it’s just… not important.

I’m sure many of you have seen demonstrations of this sort of selective attention; they illustrate my point exactly.

I feel it’s safe to argue that a person’s experience of the world is merely a simulation: a simulation that responds to inputs in such a way as to better handle future inputs, trained on what’s familiar.

This is why it’s possible to reach for your right toe without looking and (usually) succeed. You know what degrees of freedom your body is capable of, and you know the position of your toe relative to the rest of your body. With this model in mind, it’s a simple task to tell your arm to move in such a way that all it takes is a little bit of a priori SLAM at the end to make fine adjustments. Walking, a task that took Honda decades to accomplish on von Neumann machines, is simple for us because of this.
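In control terms, that reach is an open-loop move from an internal body model, finished off with a short closed-loop correction. A one-dimensional cartoon, with made-up positions and gain:

```python
# "Move from the internal model, then fine-tune at the end."
target      = 1.00    # where the toe actually is (arbitrary units)
model_guess = 0.93    # where the internal body model thinks it is

hand = model_guess            # open loop: jump straight to the modelled position
for _ in range(5):            # closed loop: a few small corrections at the end
    error = target - hand
    hand += 0.5 * error       # proportional correction, made-up gain
print(round(hand, 3))         # ~0.998: close enough to grab the toe
```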

This is also why abstract concepts such as charged particles moving in B fields are so confusing to comprehend: your network isn’t trained to recognize that type of pattern. If I were a philosopher, I’d say it’s reasonable to argue that sentience is nothing but the ability of such a network to modify its surroundings such that the inputs it receives are familiar; self-preserving, in other words. But I’m not, so that tangent ends here.

 

The good stuff: Hacking such a network

It’s not really possible to ‘program’ a neural network. Can you imagine giving explicit instructions to 10^11 nodes in such a way that they recognize text from an image?

Code that in C. I dare you.

No. Rather, the only feasible way to perform such a task is to ‘teach’ the network. Give it inputs, monitor the outputs, then “tell it” to remember its connections if the output is good. Do this enough times, and eventually you’ll get a predictable response worth looking at. Do this continuously for 80 years, with millions of inputs and a pre-programmed ‘instinct’ to get a head start, and you get a wise old man.
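That loop of “keep your connections if the output was good” is, in spirit, how artificial networks are trained too. A minimal sketch with a single connection and random nudges kept only when they help; an illustration of the idea, not the brain’s actual learning rule:

```python
import random

# Teach one connection to map an input of 2.0 to an output of 6.0 by
# keeping random weight changes only when the output gets better.
weight, x, target = 0.0, 2.0, 6.0

for _ in range(2000):
    nudge = random.uniform(-0.1, 0.1)
    old_error = abs(weight * x - target)
    new_error = abs((weight + nudge) * x - target)
    if new_error < old_error:       # the output improved: remember the change
        weight += nudge

print(round(weight, 2))  # ~3.0: the "network" has learned the pattern
```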

With this in mind, I will postulate that ‘personality’ (habits, reactions, ideals, tastes, and preferences) is learned. These are statistical patterns formed through past experience, and by the definition of a neural network, they are patterns that can be changed. Hacked, if you will.

And how does one hack them? Tell nodes that what they’re doing is wrong, useless even. Eventually, they will change.

For the past two years I’ve been doing just that: making note of my habits and my responses, and recognizing the patterns they reveal. When I see something I don’t particularly like, I do what I can to tell myself to “change”. And quite frankly, it works.

I used to be a very depressive person. When something went wrong, the response was to feel sad about it. Hobbies, drugs, you name it, did not change that.

Of course this didn’t get much accomplished, so I told myself that. Consistently.

Now, when something breaks or goes wrong, there is no longer any depression. Maybe I’m sad or annoyed for 20 minutes, but the pattern of pouting about it for days is gone.

The fun thing is, there seems to be no “limit” to the extent to which this occurs. If I break a $2,000 laser, it doesn’t sadden me. If a family member passes on, it’s an unfortunate event, but it’s not likely to bother me for more than half an hour. If my oscilloscope breaks, rather than call up Tektronix and scream over the phone, I take the thing apart, see what went wrong, and determine whether it was my own doing or simply an inherent machine fault. I can act rationally.

Some might call that dehumanizing, but I won’t. It’s nice to be able to react to such situations, without the bias that sadness otherwise provides. That was a good hack.

 

So what else can we hack?

Well, I don’t know. This is not a science.

We’re going to try things and see what’s possible. At the time of this writing, my current goals are:

  • To respond immediately, as much as is feasible, when given tasks. I intend to start by immediately responding to emails and such until it becomes habitual.
  • To reprogram my fight-or-flight patterns, such that mathematics does not trigger such a response. We need to associate integrate(x^2+3x/e^4x) with “fun”, and do so effectively.
  • To reprogram my fight-or-flight patterns, such that exams do not trigger the response. We’re making progress here.
  • To dissociate “fear” from “unknown”. There is little good reason to be fearful of what one doesn’t yet know, and doing so only wastes time. I imagine this is why so many entrepreneurs fail in their ventures: they waste too much time planning in fear.

…and that’s it, currently.

I must stress to the reader of my words: this is hard. It is very difficult to tell billions of nodes that their connections are now wrong, and even more difficult to tell them that the new ones they make are worth keeping. Do not try to change too much at once, or it will not work.

Old patterns are strong; archaic ones even stronger. One must never accept their familiar outputs, if a hack is to succeed.

Now, go break some synapses!