Twenty years ago, I wrote a research report arguing that internet video wasn’t a threat to traditional TV. Besides being wrong, it suffered from a failure of imagination. It equated internet video with a new way to transport packages of cable networks. Among other things, I didn’t anticipate that the internet would birth new forms, notably including social video and live-streaming, which now represent ~25% of all video viewing in the U.S.
In Hollywood today, much of the discussion about GenAI centers around how you can use it to make movies and TV shows more efficiently:
Which jobs are the most vulnerable to being replaced?
When will it be possible, both technically and legally, to use it to replace some principal photography, as opposed to just in pre- and post-production?
How much will it really reduce costs?
Will consumers embrace it and, if so, in what genres?
These are important and valid questions, but they all implicitly suffer from the same failure of imagination. They assume that the primary application of GenAI video is making the same old stuff in a new way. Over time, all new media move beyond mere imitation of older forms. Creators come to understand the unique properties of the new medium and use it to make entirely new things. The same will happen with GenAI video.
It’s hard to predict precisely how the technology will evolve or what will take hold with consumers. It will take time to figure it out. But by exploring the unique properties of AI video models, we can make some educated guesses.
Tl;dr:
Viewing a new medium as merely a way to imitate an old form—also called skeuomorphism—is a common mistake in media.
Most of the discussions about GenAI in Hollywood today also narrowly regard it as a new way to make TV shows or movies for less.
But GenAI will birth new forms too. It’s impossible to predict these with precision, but by exploring the unique properties of GenAI video, we can make educated guesses.
Cheap: GenAI will be orders of magnitude cheaper than traditional production, which will enable far more risk and experimentation; broader representation; A/B testing at scale; and fan creation.
Dynamic: It can be rendered dynamically and, eventually, in real time. This will open up contextual, personalized, interactive and possibly emergent or infinite stories.
3D: It isn’t tethered to the fixed perspective of each still frame, meaning that it will be possible to experience video from an unlimited number of perspectives, including within the action itself. Every viewer can be the cinematographer.
Unconstrained: It also isn’t bound by physics or reality, meaning it will be capable of impossible shots, alternative realities, physics-defying environments and a lot of other stuff that’s hard to conceptualize.
The scarcest resources in media are consumers’ time and attention. These new forms will inevitably compete for both. That’s an opportunity for those who understand it and a risk for those who don’t.
A Failure of Imagination
In 2005, I covered U.S. cable, satellite and entertainment stocks at Banc of America Securities. Some investors were starting to worry that internet-delivered television, or what was then called IPTV or “Internet Bypass,” would create a new competitive threat for the cable and satellite TV providers that I covered, like Comcast and DirecTV. (The phrases “over-the-top” and “streaming video” weren’t in vogue yet.)
I wrote a report arguing IPTV wasn’t a threat to the TV industry. You can check out the flawed logic in Figure 1.
Figure 1. Wrong All Around
Source: Banc of America Securities.
Besides being wrong, this analysis also suffered from a failure of imagination. I equated “internet video” with a new way to transport packages of cable networks. It was a very narrow view of how the internet would affect the video business. I didn’t anticipate a couple of minor things, like:
The compression of the supply chain, as distributors like Netflix and Hulu vertically integrated upstream into (exclusive) content creation, and content creators like Disney, Time Warner, Paramount, et al. vertically integrated downstream into direct-to-consumer distribution.
The phasing out or diminishment of “networks” and the rise of general entertainment streaming brands, like Netflix, Prime Video, Peacock and Max.
Changes in consumer TV viewing behavior, such as the shift of roughly 60-70% of TV viewing to on-demand, binge viewing and a move away from ad-supported viewing (although this is now swinging back).
The emergence of the digital user experience as a basis of competition, including stream quality and reliability and features like search, discovery and recommendations.
The globalization of content, like the popularity of K-dramas, anime and other foreign language content in the U.S.
The democratization of video content creation tools, both hardware (the iPhone) and software (like CapCut), and the subsequent emergence of tens or hundreds of millions of amateur and semi-professional video creators.
The associated rise of social (or user generated) video as a new industry and a new form. As I recently discussed in Social Video is Eating the World, I estimate social video now accounts for ~25% of all video viewing in the U.S. It has birthed its own subculture and countless subgenres: dance videos, recipes, reaction videos and “stitches,” ASMR, video podcasts, lifestyle hacks, pranks, geo guessing, video gaming, esports, standup comedy clips, fashion (“fit checks”), workouts, a lot of stuff about dogs and on and on. It has also created celebrities who are as big, or bigger, than the biggest Hollywood stars: Mr. Beast, Charli D’Amelio, Khaby Lame, Addison Rae and more.
The emergence of live-streaming, like Twitch and Kick, as a new form, with its own unique subculture of emotes, virtual gifting, chatting, stream sniping, etc.
In other words, I was thinking of internet video in a skeuomorphic way.
Skeuomorphism in Media
Chris Dixon, a general partner at a16z, often talks about the concept of skeuomorphism in technology. He means that the initial applications of a new technology are often simple adaptations of the applications of prior technologies.
Skeuomorphism is very common in media. The first radio programs were broadcasts of vaudeville shows; the first TV broadcasts were televised stage plays; the first video games were analogs of sports or board games; the first web pages were static text, like newspapers or magazines. As mentioned, my conception of the internet as a new way to deliver packages of TV networks was also skeuomorphic.
In 1964, Marshall McLuhan famously wrote in Understanding Media: The Extensions of Man that “the medium is the message.” He meant that each medium has unique properties that shape both the content and its perception in ways that are unique to that medium. With new media, however, it takes a while for creators to figure that out. Eventually, they move past the skeuomorphic phase of merely imitating old forms and leverage those unique properties to make entirely new forms.
In TV and film, this entailed the entire vocabulary of tracking, dolly, handheld, close up, aerial and other shots, editing and special effects.
In online publishing, it meant the use of hypertext links and multimedia, real-time updating, infinite scrolling and interactive data visualizations.
In video games, it meant all sorts of new gameplay mechanics, open worlds, virtual economies, massively-multiplayer games and a whole lot else.
Most Discussion of GenAI Video is Skeuomorphic
Not surprisingly, so far most of the discussion about “AI video” has exhibited skeuomorphic thinking—namely whether, how and to what degree GenAI will be used to make movies and TV shows more efficiently.
I’m certainly guilty of this. For the past couple of years, I’ve been writing about the ways that Hollywood will (or won’t) use GenAI in its production workflows to reduce the cost of TV and film productions (like AI Use Cases in Hollywood) and why GenAI will lower the entry barriers for individual creators and small teams to create high quality content and prove a disruptive threat to Hollywood (like How Will the “Disruption” of Hollywood Play Out?). In Hollywood, people are debating how GenAI will or won’t affect jobs; when it will be technically and legally feasible to use in “final pixel;” how much it will really reduce production costs; whether (and which) talent will lean in; and whether audiences will really tolerate or embrace it.
These are important and valid subjects (I certainly hope so, since I write about them), but they suffer from the same failure of imagination I exhibited 20 years ago. They assume that GenAI’s primary application will be a new way to make the same old stuff. But, like all other new media, GenAI will also enable creatives to make new things in new ways.
Most discussions about GenAI video assume that it will create new ways to essentially make the same old stuff. What can it do that’s entirely new?
The “Neumorphic” Applications of GenAI Video
There is no established antonym of skeuomorphic, but I was recently discussing the concept of skeuomorphism with Mike Gioia, author of Intelligent Jello, and he suggested “neumorphic.” Sounds good to me.
So, what are the “neumorphic” applications of GenAI video?
A couple of caveats about trying to answer this question. First, it is obviously impossible to answer with any precision. Looking back at how the internet changed the video business—including the rise of entirely new forms, like social video and live-streaming—clearly one can’t predict what new use cases will emerge out of new technologies or which ones consumers will adopt. (When I wrote my IPTV report in 2005, for instance, there was no iPhone and therefore no real mobile internet to speak of; Netflix hadn’t launched streaming; and YouTube was three months old. Twitch didn’t launch until 2011.) I use the word “emerge” deliberately. These are complex systems with a lot of moving parts and, like all complex systems, they will produce emergent outcomes that are impossible to predict.
Second, predicting future applications of technology is a slippery slope. It’s easy to veer into the sci-fi realm of jetpacks, neural implants and being arrested for crimes that one hasn’t yet committed. These predictions can be so far off as to be practically irrelevant. Often they also exhibit a kind of naive technological determinism, namely that if something is technologically possible, then it will inevitably occur. Just because something is possible, doesn’t mean consumers will want it.
Predictions about the future often exhibit a kind of naive technological determinism, namely that if something is technologically possible, it will occur. Sometimes, consumers don’t actually want it.
So, before digging in, I’ll lay out a couple of assumptions:
People will always want stories. Humans have been telling each other stories since before we had written language. We will always have a need for the crucial elements of story, like character, conflict, tension and resolution.
The predominant use case of video will continue to be storytelling, but we shouldn’t dismiss the “weird.” Some of the potential applications of GenAI video I discuss below may seem a little out there—surreal, non-linear, abstract, etc.—and rightfully raise the question of whether anyone will care. I expect that the main use cases of video will continue to be storytelling and conveying information. But keep in mind that a lot of popular video genres don’t really tell stories, like music videos, ambient/relaxation videos, ASMR, vlogs, time lapse videos, and fashion videos. Stuff that may at first seem “too weird” may take root in surprising ways.
Good stories will require a human “in the loop.” A good story evokes emotion. My (albeit somewhat weakly held) view is that since computers don’t have emotions, we will always need a human to overlay his or her judgment as to whether a story is good or not.
Stories have an important social component. One of the key “jobs” performed by stories is to create shared experiences. I don’t believe we will all want to retreat into lonely theaters of one.
People have different need states when consuming stories. Sometimes they want low friction and low effort; sometimes they want to actively engage.
With these caveats and assumptions in mind, let’s explore the unique properties of GenAI tools and make a few educated guesses about what “neumorphic” applications might emerge.
Cheap
The most obvious difference between GenAI video and traditional production techniques is the dramatic difference in cost. As I described in AI Use Cases in Hollywood, the below-the-line and post-production costs for a blockbuster movie are now about $1-2 million per minute (i.e., all costs other than the above-the-line talent, like directors/showrunners and top actors). With continued improvement in GenAI, over time these costs could converge with the cost of compute, or dollars per minute—four or five orders of magnitude lower. This will enable a flood of content, the quality of which will no longer be constrained by access to resources, but only by the skill, imagination and dedication of the creator.
Over time, non-above the line production costs may converge with the cost of compute, four or five orders of magnitude lower than they are today.
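To make the scale of that claim concrete, here’s a rough back-of-envelope comparison. The traditional cost figure is the one cited above; the per-minute compute costs are purely illustrative assumptions, not measurements:

```python
# Back-of-envelope comparison behind the "four or five orders of magnitude" claim.
# The traditional figure is the $1-2M per finished minute cited above for
# below-the-line + post-production on a blockbuster; the GenAI compute figures
# are illustrative assumptions, not measured costs.
import math

traditional_cost_per_min = 1_500_000     # midpoint of the $1-2M/minute range
genai_cost_per_min_guesses = [10, 100]   # hypothetical compute cost ($/finished minute)

for c in genai_cost_per_min_guesses:
    ratio = traditional_cost_per_min / c
    print(f"${c}/min of compute -> {ratio:,.0f}x cheaper "
          f"(~{math.log10(ratio):.1f} orders of magnitude)")
```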
Of course, this has been the main focus of most discussion about the implications of GenAI, namely its ability to reduce costs and labor demands. It has also been the main thrust of my argument why GenAI may be so disruptive to Hollywood. But these dramatically lower costs have other implications too:
More experimentation and risk taking. Owing to the very high cost and risk of producing high quality Hollywood content, development executives tend to be risk averse. They gravitate to formats, genres and story structures that have worked before. (This tendency was famously parodied in the movie The Player, in which screenwriters pitch ideas like “it’s Out of Africa meets Pretty Woman” and “it’s The Graduate meets Psycho.”) By contrast, GenAI will give creators enormous latitude to experiment with formats, structures, length, you name it—in ways we can’t imagine.
Broader representation. Hollywood is the dominant force in filmmaking globally, largely because the size of the U.S. market supports budgets that aren’t feasible anywhere else. And within Hollywood, the prevalence of white men is well documented, as exemplified by the #OscarsSoWhite social media campaign. The advent of GenAI will make filmmaking accessible to more people, in more places.
A/B testing at scale. A couple of decades ago, I went to a live taping of Friends[1] and was surprised to learn that it could take six or seven hours to tape a 22-minute episode. Part of the reason was that Friends would A/B test jokes. (The actors would run through the same set-up a couple of times, but deliver different punchlines each time.) Testing is also common with big budget films, through screenings or by having audiences read screenplays. With GenAI, it will be possible to do this at scale, testing a much wider variety of storylines and essentially crowdsourcing the story; a toy sketch of what this might look like appears at the end of this section. (Which is not to say that creatives will want to do this, but that they’ll be able to.)
Fan creation. I’ve written a lot about fan creation (for instance, see IP as Platform). It is abundantly clear that fans have an urge to create using their favorite IP. Across media, the popularity of fan creation is directly correlated with the accessibility of the medium: because most people can write, literary fanfic is massive; there are also vast numbers of song covers, although this is a little less accessible because it requires some musical talent; game modding is even less common because it usually requires some technical fluency/coding ability; and video fan creation is relatively uncommon, because it’s so difficult, time consuming and expensive to do. As the cost of video creation plummets, it will become much more accessible. As I’ve written before, I think that progressive IP holders will not only enable this, but encourage it. One potential implication is that, over time, time spent consuming video may increasingly find itself competing with time spent creating video. The lines between creation and consumption may increasingly blur.
One potential implication of much more accessible video fan creation: time spent consuming video may increasingly compete with time spent creating video. The lines between the two may increasingly blur.
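As a toy illustration of the A/B testing idea above: generate several variants of a scene, measure engagement on each with a test audience, and keep the winner. Everything here is a hypothetical stub (the generate_variant and measure_engagement functions, the punchline options, the simulated engagement rates), intended only to make the workflow concrete:

```python
# Toy sketch of A/B testing story variants at scale.
# generate_variant() stands in for a GenAI video pipeline and
# measure_engagement() for whatever retention/completion metric a service
# actually tracks; both are hypothetical stubs for illustration only.
import random

def generate_variant(scene: str, punchline: str) -> str:
    """Pretend to render the scene with a given punchline (stub)."""
    return f"{scene} [punchline: {punchline}]"

def measure_engagement(variant: str) -> float:
    """Simulate an engagement rate from a test audience (random stub)."""
    return random.betavariate(2, 5)

scene = "Act 2, Scene 3"
punchlines = ["deadpan", "callback to the cold open", "physical gag", "meta joke"]

results = {p: measure_engagement(generate_variant(scene, p)) for p in punchlines}
winner = max(results, key=results.get)
print(f"Best-performing variant: '{winner}' ({results[winner]:.1%} engagement)")
```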
Dynamic
Another fundamental distinction between GenAI video and traditional production is the ability to dynamically change it. Traditionally, when a TV episode or movie is completed, it is “locked” and not subject to change. GenAI makes it possible to continually alter the video. Eventually, as models improve and compute increases, it will be possible to render video in real time.
The implications of the contrast between “mutable” and static video are hard to grasp today. But here are a few ideas, which are not mutually exclusive:
Contextual content. Video could be dynamically altered to be contextually relevant. This doesn’t just mean localized for different cultures, languages or geographies, but perhaps dynamically incorporating current events too.
I highly doubt people will want personalized content all the time. But some forms of personalization may be appealing sometimes.
Personalization. The strongest version of contextual content is personalized content. Many have speculated that one day viewers will be able to use GenAI to create their own movies. A couple of years ago, Sequoia published an essay stating that by 2030, “video games and movies will be personalized dreams.” I’m skeptical of this for the reason I wrote above: one of the key “jobs” performed by media is to create shared experiences. Still, some forms of personalization might be appealing, at least for some viewers, some of the time. Maybe a five-year-old is in the mood to watch Bluey play a baseball game? Maybe some will want to watch their favorite series or movie in a different visual style? Perhaps some would have preferred a different ending to Game of Thrones.[2] Maybe parents of tweens will want to watch an age-appropriate version of their favorite ’80s comedy with their kids? Maybe some will want a version of a movie or TV episode edited to the time they have available? An early look at this idea is a streaming service called Showrunner, from Fable Studio. The content currently leaves much to be desired, but the goal is to enable viewers to create their own versions of shows.
Interactive stories. A form of personalization that requires active engagement from the viewer is interactivity. Similar to the old “choose your own adventure” books or Netflix’s Black Mirror: Bandersnatch experiment, stories could proceed based on viewer input. In the case of Bandersnatch, the number of permutations was limited by the need to film all of them. (The creators have said that they might not have done it had they known how hard it would be.) With GenAI, in theory there would be no such limitation; a minimal sketch of such a branching loop appears after this list. (An early experiment here is a startup called Dreamflare, which lets viewers watch stories in which the outcomes are determined by their choices and AI.) In this way, the lines between what is a show or movie and what is a game may start to blur.
Infinite and emergent stories. Similar to the concept of open world games, there could be open world narrative content. Imagine storyworlds with clearly defined characters and mythology, but stories that evolve in probabilistic and unpredictable ways and, potentially, never end.
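To make the interactive-story idea a bit more concrete, here is a minimal sketch of a branching loop in which each viewer choice drives what gets generated next. The generate_scene and get_viewer_choice functions are hypothetical placeholders, not any real model or platform API; a production system would stream rendered video and manage state, pacing and safety far more carefully:

```python
# Minimal sketch of an interactive, GenAI-driven story loop.
# generate_scene() is a hypothetical placeholder for a generation call;
# get_viewer_choice() stands in for real viewer input.

def generate_scene(story_so_far: list[str], choice: str) -> str:
    """Pretend to generate the next scene given the history and the viewer's choice."""
    return f"Scene {len(story_so_far) + 1}: consequence of choosing '{choice}'"

def get_viewer_choice(options: list[str]) -> str:
    """Stand-in for viewer input (always picks the first option here)."""
    return options[0]

story: list[str] = ["Scene 1: a stranger arrives in town"]
for _ in range(3):  # three branching beats, rather than a pre-filmed decision tree
    options = ["confront the stranger", "follow them", "ignore them"]
    choice = get_viewer_choice(options)
    story.append(generate_scene(story, choice))

print("\n".join(story))
```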
3D
Traditional live-action video is a sequence of still frames, typically 24 per second. The perspective of each of those frames is determined by a cinematographer and fixed in space.
GenAI video is not bound by a fixed perspective. In theory, it can assume any perspective within the 3D space of the scene. Today, the most advanced models have some sense of 3D space, time and motion. As they evolve, this understanding will get more sophisticated. Runway, for instance, has a research project underway to create “general world models” that have a more comprehensive model of real world physics. Last month, “godmother of AI” Fei-Fei Li announced her new startup, World Labs, which aims to build models with greater spatial intelligence and understanding of how the world works.
With GenAI video, everyone can be the cinematographer.
Combined with the ability to render in real time, described above, that means that eventually viewers will be able to watch video from any angle they want, including from within the scene. Everyone will be the cinematographer. It also means that viewers could watch the same narrative from different perspectives, sort of like a version of Rashomon with infinitely many perspectives. And this will be even more relevant and important if spatial computing (i.e., AR/VR/MR) takes off.
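As a small, concrete illustration of what “any perspective” could mean, the sketch below samples camera poses on an orbit around a point of interest in a scene. It only computes the geometry of the viewpoints; the idea is that a sufficiently capable world model could, in principle, render the same moment from any of these poses (the rendering step itself is left out, since no such API is assumed here):

```python
# Sketch: sampling arbitrary camera poses around a fixed point in a scene.
# Only the viewpoint geometry (position + viewing direction) is computed;
# the hypothetical world model that would render from these poses is omitted.
import math

def orbit_poses(target=(0.0, 0.0, 0.0), radius=5.0, height=1.7, n=8):
    """Return n camera poses evenly spaced on a circle around the target."""
    poses = []
    for i in range(n):
        theta = 2 * math.pi * i / n
        pos = (target[0] + radius * math.cos(theta),
               target[1] + height,
               target[2] + radius * math.sin(theta))
        # Viewing direction points from the camera back at the target.
        direction = tuple(t - p for t, p in zip(target, pos))
        poses.append({"position": pos, "look_at": target, "direction": direction})
    return poses

for pose in orbit_poses(n=4):
    print(pose["position"], "->", pose["look_at"])
```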
Unconstrained
Today, movies and TV are unconstrained by physics. Scale models, green screens, physical special effects and VFX make it possible to depict impossible things. However, these manipulations are time consuming and expensive. Plus, they are bound by human imagination. Since we live in a world in which the laws of physics are ironclad—and we come to understand them intuitively even before we understand language—it can be hard to imagine environments that are radically different.
Movies are not bound by the laws of physics, but they are bound by our understanding and internalization of the laws of physics.
GenAI has no such limitations. Altering physics is no more expensive or time-consuming than any other prompt. GenAI could create alternative realities in which entirely new laws of physics are codified. It could represent higher-dimensional spaces or non-Euclidean geometries. (Whether we would understand this is a different question.) It could create physics-defying environments that are emergent, rather than explicitly designed. Combined with the interactivity described above, it could create “fractal storytelling”: stories that work at different scales, enabling viewers to zoom in or out on any element of the narrative. Stuff that’s hard to conceptualize today.
For one early (and straightforward) example, watch the video for “The Hardest Part” pasted below. Artist Paul Trillo used Sora to create an “infinite zoom” of a couple’s life over time. As he put it, the effect “couldn't quite be shot with a camera, nor could it be animated in 3D, it was something that could have only existed with this specific technology.”
Skeuomorphic Thinking is Limited and Self-Limiting
Some of this may seem a little out there. As I wrote, it’s impossible today to know how GenAI technology will evolve, what other adjacent technologies will emerge and what will (and won’t) capture consumers’ attention. It will take time to figure all that out and it will almost certainly develop in surprising ways. But it’s worth thinking about today.
Twenty years ago, I didn’t realize that the internet was not just a new distribution medium, but would birth new forms. Even a decade ago, no one would’ve guessed that a kid in Greenville, North Carolina, posting videos from his bedroom would eventually become one of the biggest celebrities in the world or that Mr. Beast’s videos would commonly attract 100 million viewers within a few days of release. No one would have guessed that people would watch other people play Minecraft more than 1 trillion times. No one would have guessed that the most popular ASMR YouTuber would have over 30 million subscribers. But here we are.
The scarcest resources in media are consumer time and attention. The new forms that emerge from GenAI video will inevitably compete for both.
The major studios aren’t likely to embrace GenAI anytime soon, as I discussed in Fear and Loathing (and Hype and Reality) in Los Angeles. The reasons are understandable. If I were still working at a big media company, I would be leery of the talent backlash and unresolved legal issues, too. Even so, it’s important to understand that GenAI is not likely just a way to cut costs, but is a new medium that will birth new forms. The scarcest resources in media are consumer time and attention. These new forms will inevitably compete for both. That’s an opportunity for those who understand it and a risk for those who don’t.
[1] All I can remember about the storyline is that Chandler tried to break up with Janice by telling her he was moving to Yemen.
[2] Like everyone who watched it.