Quicker and Stronger Behaviors Reinforced Using a Variable Reward System | Off Leash K9 Training

Dog Training Variable Reward


In the 1950s, a man named B.F. Skinner described a powerful cognitive quirk called a variable schedule of rewards. Skinner’s experiments found that lab mice reacted most voraciously to random, variable rewards: when a mouse pressed the lever, sometimes it got a small treat, other times a large treat, and at other times no treat at all. The mice that received variable rewards seemed to press the lever compulsively, whereas the mice that received the same treat every time seemed to press it in a more relaxed, controlled way. The same principle applies to humans…

Here is an interesting real-life example. If you have ever tried to talk to someone while he or she is playing a video game, asking questions and getting only a mumbled “yeah, mhm, ok, whatever” in reply, then you have seen this mental state. The player is so absorbed in the game that they will agree to almost anything just to get rid of the distraction and keep playing. That is why variable rewards are believed to keep the brain occupied and help it form new habits. Our brains are never satisfied and are always in search of the next reward, and we describe this absorbed state as “fun.” A recent neuroscience study has shown that our dopamine system keeps us searching for new desires rather than rewarding us for our hard efforts.

Even though this stressful, hardwired mental state pains us at times, it has kept us alive as a species for millennia.

Why does this work?

Well, there isn’t a single proven answer, but it has to do with dopamine, a neurotransmitter that is closely connected with our desires and habits. Receiving a reward increases dopamine levels in the brain, which motivates and encourages us to repeat whatever we did to get that reward (rats with no dopamine receptors tend to struggle to build habits). Studies and experiments have shown that unpredictable rewards tend to cause a greater increase in dopamine.

Let’s briefly look at how variable rewards have been applied in areas such as casinos and other forms of gambling.

Researchers have shown that dopamine levels in the brain vary most in situations where a person is unsure whether they are going to be rewarded, such as when gambling or playing the lotto. Dopamine has long been known to play an important role in how we experience rewards from a variety of natural sources, such as food and sex. “Using a combination of techniques, we were actually able to measure release of the dopamine neurotransmitter under natural conditions using monetary reward,” said David Zald, assistant professor of psychology at Vanderbilt University. This research is a way of understanding what happens in the brain during unpredictable reward situations, such as gambling.

The team studied people under three different situations. In the first situation, the individual chose one of four cards and knew a monetary reward of $1.00 was possible but did not know when it would happen. In the second situation, individuals knew they would get a predictable reward with every fourth card they selected. In the third situation, individuals chose cards but neither received nor expected any rewards at all. Zald and his team found that over the course of the study, dopamine increased more in one part of the brain in the unpredictable first situation. In contrast, the receipt of a reward in the predictable situation did not result in big increases in dopamine.

In summary, when faced with unpredictable positive responses, our internal instinct is to automatically try to make them predictable so we can control the outcome. However, once something is predictable, it oftentimes loses its appeal. This is the same reason that people love to play video games and spend hours per day doing so; however, once they beat the game or master the system, they rarely play that game anymore (and move on to a new one).

“The most interesting thing we found is that there were areas that showed increased dopamine release during the unpredictable condition,” Zald said.

To make this very simple to understand, I like to look at it like deer hunting. If every single time you went out to hunt, you actually got a deer (fixed ratio, 1:1), you wouldn’t be very motivated because you know it will always be there. If you knew that you had to wait for 4 hours and then a deer would appear (fixed interval: 4 hours), then you would not be motivated for the first 3.9 hours, because you know it will not appear anyway. If you knew it would appear only on day 3 (fixed ratio) but could appear at any time on day 3 (variable interval), then you would only be motivated to really pay attention throughout day 3!

Then, we have the reality of hunting, which is the primary reason it’s so addictive and why men and women spend countless hours and days in their tree stands every season. It’s because it works off of a variable ratio and variable interval “reward” schedule. What this means is that the deer could show up on ANY day and at ANY time; however, as all hunters “know,” you feel confident that if you “put your time in,” the reward will come. This is why we stay highly motivated to “hunt” every year, and now you understand why something such as a detection dog stays highly motivated to “hunt” for the odor, as well.
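For readers who like to see the difference between these schedules spelled out, here is a minimal Python sketch. It is only an illustration, not anything taken from Skinner's work or from our training program: the function names (fixed_ratio, variable_ratio, run) are made up for this example, it covers only the ratio schedules (counting responses, not time), and the choice to draw the variable gap evenly between 1 and 2 × mean − 1 is simply an assumption that makes the average come out to the stated mean.

```python
import random


def fixed_ratio(n):
    """Hypothetical helper: reward every n-th response (n=1 rewards every response)."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Hypothetical helper: reward after a random number of responses.

    Assumption: the gap is drawn uniformly from 1 .. 2*mean_n - 1,
    so it averages mean_n but can never be predicted exactly.
    """
    def draw():
        return rng.randint(1, 2 * mean_n - 1)

    remaining = draw()

    def respond():
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            remaining = draw()
            return True
        return False

    return respond


def run(schedule, responses):
    """Return one character per response: 'R' if rewarded, '.' if not."""
    return "".join("R" if schedule() else "." for _ in range(responses))


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so repeated runs look the same
    print("fixed ratio 1:1  ", run(fixed_ratio(1), 12))
    print("fixed ratio 1:4  ", run(fixed_ratio(4), 12))
    print("variable ratio ~4", run(variable_ratio(4, rng), 12))
```

With the fixed ratio of 1:4, the pattern is always "...R...R...R", so the learner can tell exactly which responses will not pay off. With the variable ratio, the rewards land at unpredictable spots, so every single response might be the one that pays, which is the hunter in the tree stand or the detection dog searching for odor.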

Keeping the hunting scenario in mind, recall the earlier point that, when faced with unpredictable positive responses, our internal instinct is to automatically try to make them predictable so we can control the outcome.

Can you think of how hunters do this, too? What do hunters do in order to make the variable ratio/variable interval much easier and more predictable (fixed ratio/fixed interval)? They use trail cams, they use doe urine, and they use bait. This is to try to maximize efficiency and minimize effort, which is the same thing dogs do, as well. However, as noted above (casinos, hunting, playing video games, etc.), once it becomes 100% predictable, motivation and interest usually decline.

If you have any questions about making your dog great, find an Off Leash K9 Trainer in your area:




