Is traditional training on the way out and positive reinforcement training the way of the future?

I read Val Bonney's article, 'It's time to make decisions', in the November 2011 issue of 'The Queensland Dog World' with great interest. Everybody would agree that we need to be asking questions, explaining the options and being open to discussion as to which training choices to make. This article is presented from the perspective of someone who has made the decision to switch from traditional to positive reinforcement training.

For any ongoing debate to be useful the basic terminology initially needs to be agreed. To assist in discussion I firstly offer the following explanations of some terms used in this article.

A behaviour, for the sake of the debate, is any action that a living animal is capable of doing. Animals of all species learn behaviour either through desirable or undesirable consequences, known as operant conditioning (Skinner), or through reflexes, known as classical conditioning – also sometimes called respondent or Pavlovian conditioning. Training primarily involves the use of operant conditioning. No matter what the species, every single trainer in the world, knowingly or not, utilises one or more of the four quadrants of operant conditioning (positive punishment, negative punishment, positive reinforcement and negative reinforcement) when training any behaviour.

[The technical function of reinforcement is to increase the occurrence of a behaviour. The function of punishment is to reduce the occurrence of a behaviour. Reinforcement is positive when something nice is added to the animal's environment (e.g. food) and it is negative when something nasty is removed (e.g. an electric current is switched off). Punishment too is either positive, when something unpleasant is added (e.g. hitting or turning on electricity), or negative, when something pleasant (e.g. food or attention) is not given or is removed.]
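For readers who find a table of cases easier to follow than prose, the four quadrants can be sketched as a two-question classification: was something added or removed, and did the behaviour become more or less frequent? The short Python sketch below is illustrative only; the names `Quadrant` and `classify` are my own inventions, not terms from the learning literature.

```python
from enum import Enum


class Quadrant(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify(stimulus_added: bool, behaviour_increases: bool) -> Quadrant:
    """Name the quadrant from two observations about a consequence.

    stimulus_added: True if something was added to the animal's
        environment, False if something was removed or withheld.
    behaviour_increases: True if the behaviour becomes more frequent
        (reinforcement), False if it becomes less frequent (punishment).
    """
    if behaviour_increases:
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# Food is given and the behaviour increases.
print(classify(stimulus_added=True, behaviour_increases=True).value)
# Leash pressure is released and the behaviour increases.
print(classify(stimulus_added=False, behaviour_increases=True).value)
# A leash jerk is applied and the behaviour decreases.
print(classify(stimulus_added=True, behaviour_increases=False).value)
# Attention is withheld and the behaviour decreases.
print(classify(stimulus_added=False, behaviour_increases=False).value)
```

Note that "positive" and "negative" here mean only added or removed, not good or bad, and that a consequence counts as reinforcement or punishment only by its observed effect on the behaviour.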

Traditional training involves correcting wrong behaviour (e.g. via a check chain) and praising or patting if the dog gets it right. Pressure is applied around the neck (positive punishment) and, when the dog is in the right position, the pressure is released (negative reinforcement). Positive reinforcement then follows via praise, a pat or, although frowned on for many years in some dog clubs, even food.

In positive reinforcement training, where behaviours are obtained either through the lure and reward method or by the newer capturing, targeting and shaping techniques of clicker training, no pressure or force is applied. The positive punishment quadrant of operant conditioning is, except in very rare circumstances, a 'no-go!' zone. If the dog gets it right then he is positively rewarded with something that most motivates him at that point in time. If he gets it wrong then negative punishment reduces the likelihood of the error being repeated.

 

Traditional Training

A traditional view of dog training sometimes describes the trainer as the alpha wolf of the pack, the boss, giving instructions in a firm to harsh voice and establishing dominance - sometimes physically through neck scruffs or alpha rolls - to obtain a dog's submission. The trainer demands the behaviour instantly and, if necessary, physically pressures the dog into position. Moving into the right location to avoid something aversive (e.g. a severe leash jerk or a mild leash 'pop') is rewarded with praise, patting or a scratch under the chin.

This approach originated from the 1947 paper, 'Expression Studies on Wolves', by zoologist Rudolf Schenkel. It was a short-term study of unrelated captive wolves in a Swiss zoo which concluded that wolves in a pack fight to gain dominance and that the winner is the alpha wolf. These observations were then extrapolated to wild wolf behaviour and subsequently to pet dogs, as dogs are distantly descended from wolves. According to this approach, often seen on some popular quick-fix American TV shows, humans need to dominate their dogs in order for them to behave.

Positive Reinforcement Training

Whether they are teaching a dog or a dolphin, a pig or a parrot, many animal trainers around the world have questioned whether the methods of their fathers and grandfathers are necessarily the methods that they should still be using today. Many of them have decided that there are other options.

The newer training methods involve no physical pressure, force or pain - however mild. Positive reinforcement is used frequently to give a puppy feedback on how he is doing, and negative punishment is used occasionally. The dog, whether he is a police dog, a guide dog or a family pet dog, has a choice. He is allowed to think for himself as to what decision to make. He is encouraged to use his initiative and is not afraid to experiment, or of being corrected should he make a mistake. He becomes a willing and involved trainee who, through trial and error, learns a large repertoire of behaviours quickly. He soon reliably, enthusiastically and joyously does what is required of him because, firstly, he fully understands and, secondly, because he wants to, not because he has to.

Clicker Training

This method is simply an extension of positive reinforcement training. An additional teaching aid in the form of a 'clicker' is used by some positive trainers to take their skills to a different level. The clicker makes a sound that, only after it has first been classically conditioned to be meaningful, identifies the desired behaviour the instant it happens. The click, like the dolphin trainer's whistle, separates the behaviour from the reward. It explains to a dog that what he is doing at that exact moment is right and that consequently something nice will follow.

Once the behaviour is established reliably then, and then only, is a cue word introduced as a signal, or a label, as to what the behaviour is called. The word is said quietly, firmly and clearly as information that lets the dog know immediately what action is required. Since the desired behaviour happens promptly, happily and willingly there is no need for a loud voice, repetitions or corrections.

Many dog owners - from owners of police dogs to service dogs, competition dogs to the faithful family pet - have found that using a conditioned marker signal after the behaviour, and immediately preceding the reinforcement, significantly reduces the amount of time needed for basic to advanced training and also for rehabilitation and behaviour modification. Teaching with successive approximations, small steps, towards the final goal and setting a dog up for 100% success at every step, ensures that learning is mostly error free and therefore, because failure is rare, corrections are unnecessary. On the occasions that a correction is required, it is invariably more effective if made via negative punishment rather than with a verbal correction, leash jerk or an electric shock.

Dogs that are free-shaped with a clicker (allowed to experiment and problem solve) without the fear of reprisals learn rapidly, trustingly and permanently and do not need repetitious practice drills. Their trainers therefore have more time available to help them to develop a large repertoire of instantly performed behaviours.

Here, from a decade ago, are a few training examples that considerably influenced me and other traditional trainers to consider the training options and to change the methods we had been using:

   

1. 'Spinner' was a Golden Retriever who never received a verbal or collar correction in his life and in 1999 obtained his obedience championship before he was three years of age. As described in the January 2011 issue of 'Dog World', his owner, Sue Hogben from Perth, crossed over from traditional to clicker training and has since achieved 18 scores of 200/200 in open obedience competitions with four different dogs.

2. 'Geordie' was a German Shepherd cross from the Police Dog Squad in Maroochydore on the Sunshine Coast. His handler, Glen Wilson, wrote on his website: "I learned a lot from the head marine mammal trainer at Sea World ... that there is no need for force or punishment in dog training. Once I had discovered Operant Conditioning and swapped the check chain for a clicker I really picked up control and polish". In 1999 Geordie beat every other conventionally trained police dog to comfortably win the biannual Australian/New Zealand Police Dog Championships held that year in Melbourne.

3. 'Peek', a 3.5-kilo Papillon, was an assistance dog for a double leg amputee in a wheelchair. He could put clothes into a washing machine, make the bed, fetch the phone and TV remote controls, open and shut cupboard doors, operate light switches, flush the lavatory, push buttons in a lift and help his owner undress. His owner, Debi Davis from Arizona, USA, said: "Peek was close to euthanasia when he came into my life at 3 months, with a host of behavioural problems". She joined a traditional training class where, "the more I pushed, pulled and corrected the more Peek postured, growled and challenged. Classes were a war of wills". They both found it "too stressful" and when she was told to use an ear pinch to teach holding the dumbbell Debi finally quit. Soon afterwards she discovered a clicker class that accepted small dogs for service work. A few years later, after thousands of "short, happy, proactive learning sessions", Peek became the first 'toy' breed to win the prestigious 'Service Dog of the Year' award at the 1999 Delta Society annual conference. He was also the first 'clicker-trained' dog ever to do so.

4. Many zoos, such as Taronga Zoo in Sydney, have changed exclusively to positive reinforcement techniques. As a result they no longer have to use physical force, such as roping and electric prodding, or sedate their elephants and other animals when filing toenails, giving injections or collecting blood samples.

Alexa Capri in Italy has been clicker training since 2003 with emotionally aroused and dangerous ex-fighting Pit Bulls rescued from illegal East European fighting rings and sent to her for rehabilitation. In 2006 I attended a workshop in the UK for clicker trainers where she said, "It took us one year and a half before one of the ex-fighting pits could trust us enough to lie down when he was on a leash. He would stand and wait for something bad to happen. We have re-homed 14 pits.... as a dog trainer I see many dogs and I am horrified by those who claim that you cannot rehabilitate a Pit Bull with methods (e.g. classical conditioning to change emotions) based on gentle and positive training and trusting relationship". Further details and videos can be seen at http://www.gentleteam.it

   

Over the years Schenkel's conclusions on wolf pack theory have been questioned. Professor L. David Mech, who spent 40 years living with and studying wild wolves in North America, showed that wolves actually live as small family groups. Unlike captive wolves they do not engage in dominance and submissive behaviours or fight within a family group to gain dominance with the winner becoming the alpha wolf. In the introduction to his study of wild wolves (2000) Mech says, "Attempting to apply information about the behavior of assemblages of unrelated captive wolves to the familial structure of natural packs has resulted in considerable confusion. Such an approach is analogous to trying to draw inferences about human family dynamics by studying humans in refugee camps. The concept of the alpha wolf as a 'top dog' ruling a group of similar-aged compatriots is particularly misleading". [Read more at http://www.mnforsustain.org/wolf_mech_dominance_alpha_status.htm.]

Even though alpha dominance theories have been challenged some people will continue to use the methods that have always worked well for them. 'Why change?' Just because traditional methods can undoubtedly produce results does it follow that there aren't different, newer or quicker ways of achieving the same goals?

Skilled positive reinforcement trainers know how to phase food out properly and not to use treats as a bribe (other than maybe once or twice to jump start a new behaviour) but to reward the behaviour only after it has occurred. They know that a reward is an effective reinforcement only if it strengthens the behaviour and that the reinforcement should be not only food but a variety of other motivators such as praise, play, tug, games or life rewards. The trainer sets an animal up for success and helps him to improve, step by step, on what he can do right. Many clicker trainers take notes and carefully chart the progress at training sessions of the particular behaviour being taught. They actively welcome errors - both as a learning opportunity for the dog and as a prompt for themselves to improve their ability in terms of timing, rate and quality of reinforcement and criteria selection. [That is why some keen dog trainers hone their skills learning to train other species such as chickens!]

The aims of all animal trainers are similar: to establish new desired behaviours, strengthen and refine existing desired behaviours, however simple or complex, or to modify or eliminate existing undesired or dangerous behaviours. The simple fact is that results can, without question, be achieved with all training methods.

It is the approach to education, the enjoyment or otherwise of the training process from the dogs' point of view and the philosophy and vocabulary that vary. Cues encourage voluntary and active participation in learning. With contemporary training 'command' and 'obedience' become 'cue' and 'good manners', 'correct' and 'dominate' change to 'show' and 'motivate'. 'No!' becomes 'Yes! (you have understood)', 'stop' becomes 'do something else instead' and 'pack leader' becomes 'team leader' and also the 'coach' who sets obligatory rules, boundaries and limitations.

In the 1950s-1970s there were no alternatives, so everybody used check chains and employed conventional dog training techniques. Owners who, like me, chose much later in life to use different equipment and methods have discovered the changed philosophy and strategies to be more effective than commands and compulsion. We have also found them easier, more efficient and more enjoyable at both ends of the lead. We have welcomed the accelerated learning curve when our current dogs have learned in ten minutes or less what our previous dogs learned in ten days or more, and have done so because they wanted to and not because they had to.

Open discussion, for example via further articles to 'Dog World', absolutely needs to continue. If people understand the theory of animal learning and the pros and cons of alternative training methods they can make informed decisions as to which techniques and equipment to use. Positive reinforcement training may very well never be everybody's cup of tea; but I sense, and certainly hope, that for the majority of animal trainers around the world it will increasingly and inevitably become their preferred choice.