Use of Rewards – Beth Bradley Dog Training

In learning theory there are several schedules of reinforcement that we can use. A schedule of reinforcement is basically how often we deliver rewards when the dog performs a behavior. To simplify this, we will concentrate on 2 general schedules: a fixed schedule of reinforcement and a random or variable schedule of reinforcement.

A fixed schedule means that we always reward the dog after the same number of repetitions of the behavior. For example, on a fixed schedule of 1, we always reward the dog after he performs 1 repetition of the behavior, and on a fixed schedule of 3, we reward after 3 repetitions of a behavior (for example, we reward after the dog performs 3 sits in a row).
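For readers who like to see the pattern written out, here is a minimal sketch of a fixed schedule. The function name `fixed_schedule` and its parameters are made up for illustration and are not part of any training method; the sketch only marks which repetitions would earn a reward.

```python
def fixed_schedule(n, repetitions):
    """Return one True/False per repetition: True means the dog is rewarded.

    On a fixed schedule of n, every n-th repetition earns the reward.
    (Hypothetical helper, for illustration only.)
    """
    return [rep % n == 0 for rep in range(1, repetitions + 1)]


# Fixed schedule of 1: every sit is rewarded.
print(fixed_schedule(1, 6))
# Fixed schedule of 3: only the 3rd and 6th sits are rewarded.
print(fixed_schedule(3, 6))
```

The pattern is completely predictable, which is what makes it useful for a brand-new behavior.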

Step #1: A fixed schedule of 1 is great for the beginning stages of learning when we try to establish a new behavior.

Step #2: You can’t continue Step #1 for a long period of time, because it can actually work against you. Here is why: if we reward every behavior, then once we stop reinforcing, the behavior will extinguish very quickly. This resembles a soft drink machine, which is always on a fixed schedule of 1. Every time you put money in it, a soda can pops out. One day, you arrive at a broken machine. You put money in and nothing happens. You try again because you are really thirsty, and still nothing happens. You walk away: your behavior is extinct.

This is the reason that after establishing a behavior we need to change our schedule of reinforcement to a variable or random schedule. On this schedule, the dog never knows when the reward is coming. It can come after repeating the behavior twice, 4 times, 1 time, 6 times, 3 times, etc. We start transferring to this variable schedule very gradually or we risk the extinction of the behavior. We start by asking for 2 behaviors for 1 treat. For example, we ask the dog to sit, we don’t reward but rather we take a step back, get him/her to walk towards us and ask for a sit again. Now we reward. Right after that, we ask for another sit and treat immediately. Then, we might ask for 3 sits before we treat, and right after that only 1 sit, and then 2 sits, etc.
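The gradual move to a variable schedule can be sketched as a short script. This is only an illustration under stated assumptions: the names `variable_schedule` and `gradual_plan`, the idea of fixed-size "stages", and the stage sizes are invented here and are not prescribed anywhere in this handout. Each number in the output is how many sits to ask for before the next treat.

```python
import random


def variable_schedule(max_reps, rewards, rng):
    """Pick a random number of repetitions (1..max_reps) before each reward.

    Because each count is random, the dog cannot predict which
    repetition earns the treat. (Hypothetical helper.)
    """
    return [rng.randint(1, max_reps) for _ in range(rewards)]


def gradual_plan(final_max, rewards_per_stage, rng=None):
    """Raise the ceiling one step at a time: 2 reps, then 3, up to final_max.

    The stage structure is an assumption made for this sketch; the point
    is only that the schedule is stretched gradually, not all at once.
    """
    rng = rng or random.Random()
    plan = []
    for ceiling in range(2, final_max + 1):
        plan.extend(variable_schedule(ceiling, rewards_per_stage, rng))
    return plan


# Example: work up to as many as 4 sits per treat, 5 treats per stage.
print(gradual_plan(final_max=4, rewards_per_stage=5))
```

Early in the plan the dog is never asked for more than 2 sits per treat; single-sit rewards still show up at every stage, which matches the "1 sit, then 3 sits, then 2 sits" feel described above.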

This variable schedule is similar to a slot machine’s schedule. A slot machine works on a variable or random schedule of reinforcement. The gambler never knows when he/she will be rewarded but it can happen any time after he/she pulls the handle. The reinforcement varies in the amount of money given and in the frequency of the delivery of the money. Hence, one always wants to pull the handle again since there is always a chance that reinforcement will appear.

Your job is to become your dog’s slot machine and make him/her want to engage in a behavior time after time simply because there is always a chance for being rewarded.

Step #3: If your dog starts making mistakes or checking out once you go variable, you either went to a random reward schedule too soon or went too soon to a long time between rewards. Either one can cause your dog to check out while trying to figure out which rep gets the reward.

Quality of reward (special food, ball, tug, etc.):

Find out what will make the dog work harder and longer. With the right reward, you should notice your dog’s intensity go up considerably. Then you must build on the expectation of that reward.

Intensity of reward:

Make the reward more intense: tug harder and longer, and do misses to make the reward even more fun, and the dog should get more excited. If you are using food, do food jackpots and use your voice to praise.

Duration of reward event:

Playing or doing a jackpot can be short before you go back into obedience, or it can last longer. Vary the duration so the dog doesn’t know how long the reward will go on; this keeps the dog working harder and helps keep the dog from checking out after the reward is over. Doing outs on the toy and going back into play is one way to make the duration longer. When you increase the duration of the work, you must increase the duration of the reward.

You can give your reward word without giving the reward. This will make the dog work harder and longer as well, because it will be as if you gave your dog a reward. Example: say “Good” to mark a certain position, then say “Yes”, which is the release command, but don’t give a reward. The dog will look for the reward; go back into obedience mode and repeat. Then go back into obedience, release, reward, and give a good long play. The reward can be out of the picture, off the field.

Remote reward:

Place the reward on the ground away from you, or somewhere on the field, then run with the dog to get it.

Punishment with No marker:

Negative punishment: withholding the reward (use with a marker). Start with this first.

Positive punishment: correction. The dog must understand what he is being punished for, so timing is of the utmost importance. This includes a strong voice, leaning over, a change in demeanor, or hands if the dog is sensitive to the handler.

When adding punishment, you sometimes have to reward more often.

Biting, chewing, and barking are self-rewarding behaviors that need to be addressed with a physical correction.

Punishers like a prong collar can also be used as a guide, not necessarily as punishment; you can see the difference in the dog’s demeanor.

Concerns about punishment:

Corrections for fear (are they being disobedient or fearful?)

What does a dog know?

Sometimes dogs don’t understand small issues, such as 1 inch off vs. 3 inches off.

Slow sits

Do they know when they are right or wrong, and do they know it before you apply punishment?

Forging: teach a back command, either a verbal “back” or a head cue, and then you can correct for not backing up. If the dog doesn’t back up, you are correcting for not responding to the back cue, but the dog doesn’t really know the position. The dog then anticipates a left turn and so stays back.

Mistakes vs. disobedience:

Punish disobedience, not a mistake.

You can’t punish for speed. It leads to learned helplessness and a dog that relies on correction for speed.

Punishing for a classically conditioned response to an activity is wrong; it is beyond the dog’s control.

e.g., shaking, a slow sit due to tension while anticipating the reward, vocalization
