Last Updated: January 2, 2026

Give That POV a Sandwich

One of the cleanest, most reliable ways of encouraging an audience to identify with a character, and to project their own feelings onto them, is the Point of View (POV) sandwich. It’s simple, robust, and everywhere: the Kuleshov effect and the “car in the swamp” sequence in Psycho are both versions of it.

Use it well and you’ll pull viewers into a character’s experience without a single line of dialogue. Use it bluntly or too often and you’ll flatten the effect.

Knowing how to build it into your writing without stepping on toes is core to developing screenwriting skills.

But first, what is it?

The Skinny

The classic POV sandwich consists of three consecutive shots, which show:

  1. Who’s looking.
  2. What they see.
  3. How they feel about it and/or what they do.

Shots 1 and 3 are very often two halves of the same shot, hence the structure’s other common name, the Suture Edit: shot 1 is cut apart and shot 2 is sutured into it to create the three-shot sandwich.

The POV sandwich encourages us to understand and identify with a character by showing us what they experience and how they react. It’s the backbone of subjective storytelling across all genres.

The writer and the POV sandwich

In English-language cinema, a writer who specifies POV sandwiches, or any other form of cinematic language, is considered to be stepping over a line. They have to imply it, or have the character say something extraneous.

In a sensible system, it would be part of the detailed scenario the writer and director would construct together.

Part of the skill of being a screenwriter in the current system is implying cinematic language without stating it.

The POV sandwich in more detail

SHOT 1 – the look:

Generally, a mid-shot or tighter. As this shot can easily be mistaken for typical shot/reverse coverage, it’s usually emphasised by at least one of:

  • The camera tracks in (or the shot naturally tightens as the character moves towards the camera);
  • The eye-line of the character changes towards the lens (or even better, across the lens);
  • Or a piece of score starts or changes.

SHOT 2 – the seen:

Classically, this was a 50mm eye-level shot of what the character might see, with any movement confined to a tripod pan.

Over time, these ‘rules’ have stretched, and shot 2 is often ‘inflected’ to reflect the character’s state of mind: drunkenness as an out-of-focus hand-held shot, or losing consciousness as drifting in and out of focus followed by a fade to black. If the inflection becomes too overt, it distances the audience.

If the character is looking through or past something – window, door, distorted glass, curtain, etc – either this shot or shot 1 will include the foreground, though generally not both.

SHOT 3 – the effect:

This shot appears to be a continuation of SHOT 1, even if it isn’t. Most often it is, though it is sometimes cut from a different take.

This shot shows us one of two things – the ‘Ekman’ autonomic reaction (which prompts empathy), or a withheld non-response (which prompts projection – ‘What would I feel if I were her and saw/experienced that?’).

Variations on the Classic POV Sandwich

There are many variations on this classic piece of film grammar. The whole sequence can sometimes be reduced to a single shot with staging and camera movement. It is part of a director’s skill set to find non-obvious ways to achieve the effect of a POV sandwich.

Some of these fall entirely within the director’s wheelhouse; others can be implied in the script. For example, thrillers often use the technique of shifting from apparently objective coverage to subjective coverage.

We think we’re looking at the image through our own eyes, before realising that it’s also the point of view of a character.

Imagine a low-angle wide shot of two dangerous people talking. At first this might read as simply a slightly inflected objective shot. But suppose that, as they start to walk towards the camera, it tracks behind a piece of foreground until it is down at ground level, with their feet passing very close to the lens. We would not only ‘understand’ what’s happening; we would probably feel the observer’s anxiety too, especially if one of them throws a glance towards the camera at some point.

Or, instead of playing the whole sandwich on ‘what is seen’, you can play the whole scene on the observer’s face and tell the off-screen story with sound and the observer’s reactions.

The implied reaction shot

Or you could imply the reaction shot.

SHOT 1: Night. Lit by her night light, a little girl in bed, clutching her toy animal. Off-screen, the sounds of an argument. Raised voices and things smashing. She gets out of bed.

SHOT 2: A slightly open door leading to a dark corridor. The sounds of the argument are much closer. The little girl’s face appears in the gap between the door and the door jamb.

SHOT 3: Her POV. Under the kitchen table, adult legs and lower torsos of a man and a woman arguing fiercely. One of them slams the table, crockery jumps.

SHOT 4: Cut back to SHOT 2, the doorway. But the space is empty. The girl has gone. (If you wanted to be really corny, you could show her toy animal lying on the ground.)

SHOT 5: Looking out from under a bed. Darkness, and the soft glow of the night light. The door opens, and light fills the room. Her parents’ shadows stretch across the floor. They scout around, panic, exit.

Even though SHOT 4 doesn’t show the girl’s reaction, it does imply how she reacted, with a reinforcing implication in SHOT 5. This is still effectively a POV sandwich.

The variations are almost infinite. Once you become aware of the underlying structure, you will start to see POV sandwiches everywhere, and to find new ways of using the tool.

But – like all tools – save it for when it matters. Overuse leads to audience immunisation.

Whatever form it takes, there are three bits to this fundamental part of cinematic language (not necessarily in that order, as the famous quote goes):

  1. A character notices something.
  2. We experience it, as they experience it.
  3. We are asked to identify with how it affects them – by seeing their reaction (empathy); by having their reaction hidden from us (projection); or by having their reaction displaced into action (also projection).