Operant conditioning

Learning by consequences
Ratatouille
1. Ratatouille is hungry and performs various exploratory behaviours.
2. By chance he presses the lever.
3. A pellet of food appears!
4. "I’ll do that again."
Some definitions....
Reinforcement: Anything which has the effect of increasing the likelihood
of the behaviour being repeated.
Positive reinforcement: Anything which has the effect of increasing the
likelihood of the behaviour being repeated by using consequences that are
pleasant when they happen, e.g. food for Ratatouille.
Negative reinforcement: Anything which has the effect of increasing the
likelihood of the behaviour being repeated by removal of something
unpleasant.
Some definitions....
Punishment: Anything which has the effect of decreasing the likelihood
of the behaviour being repeated by using consequences.
Positive punishment: Anything which has the effect of decreasing the
likelihood of the behaviour being repeated by using consequences that are
unpleasant.
Negative punishment: Anything which has the effect of decreasing the
likelihood of the behaviour being repeated by taking something pleasant
away.
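The four definitions above can be read as the answers to two questions: is something added or taken away, and does the behaviour become more or less likely? The short Python sketch below illustrates that reading. The function name `classify_consequence` and its two parameters are made up for this illustration and are not standard terminology.

```python
def classify_consequence(stimulus_added: bool, behaviour_increases: bool) -> str:
    """Name the type of consequence (hypothetical helper for illustration).

    stimulus_added: True if something is given, False if something is taken away.
    behaviour_increases: True if the behaviour becomes more likely to be repeated.
    """
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behaviour_increases else "punishment"
    return f"{sign} {kind}"


# Ratatouille presses the lever, food is given, and he presses again.
print(classify_consequence(stimulus_added=True, behaviour_increases=True))
# Something pleasant is taken away and the behaviour becomes less likely.
print(classify_consequence(stimulus_added=False, behaviour_increases=False))
```

Note that "positive" and "negative" here mean giving or taking away, not good or bad.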
Operant Conditioning
Primary and secondary reinforcement
• Read about primary and secondary reinforcers, define both
terms and complete the table.
Primary reinforcers | Secondary reinforcers
Food when hungry    | Money
Real world application
• Describe how operant conditioning is used in the real world:
• 1) to train a dog to sit
• 2) through token economy in a prison
Shaping
• Shaping is used to improve and modify behaviours until a
satisfactory standard has been achieved.
• How do we learn language? Using the concept of shaping,
describe this process. (4 marks)
Schedules of reinforcement
• When and how often we reinforce a behaviour can have a significant
impact on the strength and rate of the response.
2 types of schedules
• Continuous reinforcement: the desired behaviour is reinforced every
single time it occurs.
• Partial reinforcement: the response is reinforced only part of the
time.
1. Fixed-ratio schedules: the response is reinforced only
after a specified number of responses.
2. Variable-ratio schedules: the response is reinforced
after an unpredictable number of responses.
3. Fixed-interval schedules: the first response is
reinforced only after a specified amount of time has
elapsed.
4. Variable-interval schedules: the response is
reinforced after an unpredictable amount of time has
passed.
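The schedules above can be sketched as a small Python simulation that marks which responses would be reinforced. This is only an illustration: the function names, the choice of random ranges, and the idea of passing in a list of response times are assumptions made for this sketch, not part of any standard procedure.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Reinforce every `ratio`-th response (ratio=1 is continuous reinforcement)."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses, mean_ratio, rng):
    """Reinforce after an unpredictable number of responses.

    Assumption: the required count is drawn evenly from 1 to 2*mean_ratio - 1.
    """
    rewards, count = [], 0
    needed = rng.randint(1, 2 * mean_ratio - 1)
    for _ in range(n_responses):
        count += 1
        if count >= needed:
            rewards.append(True)
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
        else:
            rewards.append(False)
    return rewards


def fixed_interval(response_times, interval):
    """Reinforce the first response made once `interval` time has elapsed."""
    rewards, last_reward = [], 0
    for t in response_times:
        if t - last_reward >= interval:
            rewards.append(True)
            last_reward = t
        else:
            rewards.append(False)
    return rewards


def variable_interval(response_times, mean_interval, rng):
    """Reinforce the first response after an unpredictable wait.

    Assumption: each wait is drawn evenly from 0 to 2*mean_interval.
    """
    rewards, last_reward = [], 0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        if t - last_reward >= wait:
            rewards.append(True)
            last_reward = t
            wait = rng.uniform(0, 2 * mean_interval)
        else:
            rewards.append(False)
    return rewards


rng = random.Random(0)  # fixed seed so the run is repeatable
times = list(range(1, 21))  # one response per time unit
print("Continuous:       ", fixed_ratio(20, 1).count(True), "rewards")
print("Fixed ratio (10): ", fixed_ratio(20, 10).count(True), "rewards")
print("Variable ratio:   ", variable_ratio(20, 5, rng).count(True), "rewards")
print("Fixed interval:   ", fixed_interval(times, 5).count(True), "rewards")
print("Variable interval:", variable_interval(times, 5, rng).count(True), "rewards")
```

With the ratio schedules the learner controls how soon the next reward comes by responding more often; with the interval schedules only the passing of time makes the next reward available.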
Which schedule of reinforcement produces
the fastest learning?
Match them to the schedules of
reinforcement:
• Due to varied time they don’t know
when the reward will come so the
schedule is successful and extinction is
slow
• Once the schedule is learnt they may
pause between rewards knowing that
nothing will happen, extinction is
quite quick.
• Due to uncertainty of rewards it is
successful and resistant to extinction.
• Not very successful – rats speed up
their response just before the next
reward is due. Extinction is quick too.
• Example: being thanked every time you wash a car.
• Example: factory work, e.g. a pound for every 10 toys made.
• Example: monthly pay.
• Example: a bell goes off at random times in the classroom. Tina is
rewarded if she is "on task".
• Example: the "pay out" of money on the slot/poker machines
("one-armed bandits") on which people gamble at casinos.
Applying theory
• Complete the Operant Conditioning quiz
Summary…
• The schedule of reinforcement affects the strength and rate of the response and
how resistant the acquired behaviour is to extinction.
• Shaping is used to improve and modify behaviours until a satisfactory standard
has been achieved.
• Primary reinforcers satisfy a need and secondary reinforcers either represent or
can be exchanged for a primary reinforcer.
Define the following key words (2 marks each):
• Primary and Secondary Reinforcement
• Positive and Negative Reinforcement
• Spontaneous recovery
• Extinction
• Conditioned response
• Stimulus Generalisation
• Positive and Negative Punishment
• Shaping
• Schedules of reinforcement