Phoster

Research and Development

Computational Storyboarding

Introduction

Computational storyboarding builds upon traditional storyboarding techniques, combining elements from screenplays, storyboards, functions, diagrams, and animation.

Computational storyboards are intended to serve as input to generative artificial-intelligence systems for creating longer-form output video.

A motivating use case is simplifying the creation of educational videos, e.g., lecture videos. With computational storyboards, content creators could describe single-character stories in which the main character is a tutor instructing an audience on provided subject matter, using boards or screens that display synchronized multimedia content from textbooks, encyclopedia articles, or slideshow presentations.

Screenplays

A screenplay is a form of narration in which the movements, actions, expressions, and dialogue of characters are described in a particular format. Visual and cinematographic cues may also be given, as well as scene descriptions and scene changes.

Storyboards

A storyboard is an organizational technique consisting of illustrations or images (thumbnails) displayed in sequence. Storyboards have traditionally been used to pre-visualize motion pictures, animations, motion graphics, and other interactive media sequences.

Storyboards' thumbnails have traditionally provided information about content layering, audio and sound effects, camera shots, character shots, transitions between scenes, and more.

Web of Computational Storyboards

In theory, nodes in diagrammatic computational storyboards could refer to other diagrams by URLs, weaving webs of interconnected diagrams. End-users could click on these referring nodes to expand them, loading referenced content from URL-addressable resources into diagrams.
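The URL-referencing nodes described above can be sketched in Python. This is a minimal, hypothetical model: the `Diagram` and `ReferenceNode` names, and the in-memory "web" standing in for URL-addressable resources, are illustrative assumptions, not part of any defined format.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Diagram:
    title: str
    nodes: list = field(default_factory=list)

@dataclass
class ReferenceNode:
    # A node that refers to another diagram by URL.
    url: str
    expanded: Optional[Diagram] = None

    def expand(self, fetch):
        # `fetch` maps a URL to a Diagram; in practice this might issue an
        # HTTP request to load the referenced, URL-addressable resource.
        if self.expanded is None:
            self.expanded = fetch(self.url)
        return self.expanded

# An in-memory "web" of diagrams stands in for URL-addressable resources here.
web = {"https://example.org/scene-2": Diagram("Scene 2", ["thumbnail-a", "thumbnail-b"])}
node = ReferenceNode("https://example.org/scene-2")
scene = node.expand(web.get)
```

Caching the expansion on the node mirrors the envisioned end-user interaction: a referring node is resolved once when clicked, then displays its loaded content thereafter.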

Wiki of Computational Storyboards

Computational storyboard diagrams could be collaboratively editable, enabling wiki platforms.

Functions

Functions would enable modularity and the reuse of storyboard content. Beyond referring to other diagrams by URLs, function-calling nodes in computational storyboard diagrams could refer to function-like diagrams by URLs while invoking and passing arguments to them.

With computational storyboarding functions, scenes’ characters, settings, props, actions, dialogue, and their properties could all be parameterized.

Arguments provided to invoked functions could take the form of multimedia content, structured objects, or text. Arguments and variables in functions could be used to create the prompts provided to generative artificial-intelligence systems, including the prompts used to generate thumbnails' images.
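A parameterized storyboard function might look like the following Python sketch, in which arguments are composed into a text prompt for a generative system. The function name `tutor_scene` and its parameters are hypothetical, chosen to match the tutoring use case described earlier.

```python
def tutor_scene(character: str, subject: str, display_content: str) -> str:
    # Compose a text prompt for a generative video system from the
    # function's input arguments.
    return (
        f"A tutor named {character} stands beside a screen showing "
        f"{display_content}, explaining {subject} to the audience."
    )

prompt = tutor_scene("Ada", "fractions", "a textbook diagram")
```

Invoking the same function with different arguments would reuse the scene while varying its character, subject matter, and displayed content.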

Markers, resembling keykodes or timecodes, could be placed between thumbnails in computational storyboard diagrams. Alternatively, some or all of the thumbnails could be selected to serve as referenceable markers, keykodes, or timecodes in resultant video. With markers, content creators could refer to instants or intervals of video generated from invoked functions.
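The marker idea can be sketched as a mapping from names to thumbnail positions, so that intervals of resultant video can be referred to by their bounding markers. The function names and the use of thumbnail indices here are illustrative assumptions.

```python
markers = {}

def mark(name: str, thumbnail_index: int) -> None:
    # Designate a thumbnail as a named, referenceable marker.
    markers[name] = thumbnail_index

def interval(start: str, end: str):
    # Refer to an interval of resultant video by its bounding markers.
    return (markers[start], markers[end])

mark("lecture-start", 0)
mark("quiz", 7)
segment = interval("lecture-start", "quiz")
```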

Metadata

Components in computational storyboard diagrams could be annotated with metadata.

Functions, for instance, could be annotated with metadata describing one or more sample argument sequences, giving content creators options for generating thumbnails' images while designing.
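One simple way to model such annotations is to attach sample argument sequences directly to a function, then evaluate the function over each sample to preview prompts. The `tutor_scene` function and its `samples` attribute are hypothetical sketches, not a defined annotation scheme.

```python
def tutor_scene(character, subject, display_content):
    return f"{character} explains {subject} using {display_content}."

# Metadata: sample argument sequences attached to the function, so that
# designers can preview generated thumbnail prompts while editing.
tutor_scene.samples = [
    {"character": "Ada", "subject": "algebra", "display_content": "a slide"},
    {"character": "Alan", "subject": "logic", "display_content": "a diagram"},
]

previews = [tutor_scene(**args) for args in tutor_scene.samples]
```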

Control Flow

With respect to computational storyboarding functions and their diagrams, there are two varieties of control-flow constructs to consider.

A first variety of control-flow construct would route execution at runtime to paths of subsequent thumbnails. Such branching could occur based either on evaluating expressions over input arguments and variables or on posing questions to interoperating artificial-intelligence systems.

A second variety of control-flow construct would result in branching, or interactive, video output, with routes or paths selected by viewers during playback. Generated interactive video content could interface with playback environments, e.g., Web browsers, to offer viewers features such as options and navigational menus.
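A viewer-facing branch of this second variety might be represented as a navigational menu whose routes are resolved at playback time. The data layout and the `resolve_choice` helper are hypothetical sketches of what a playback environment might consume.

```python
menu = {
    "prompt": "What would you like to review?",
    "choices": [
        {"label": "Fractions", "jump_to": "segment-fractions"},
        {"label": "Decimals", "jump_to": "segment-decimals"},
    ],
}

def resolve_choice(menu: dict, label: str):
    # Map a viewer's selection during playback to a route in the video.
    for choice in menu["choices"]:
        if choice["label"] == label:
            return choice["jump_to"]
    return None
```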

Execution Contexts

While computational storyboards are executed, or run, to generate video, execution contexts, building on the concept of the call stack, could be utilized. Execution contexts would include nested frames, building on the concept of stack frames, each recording the active nodes in a function's diagram and the values of its input arguments and variables.
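Such an execution context can be sketched as a stack of frames, each frame recording one invocation of a storyboard function. The class and field names below are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    # One invocation of a storyboard function: its active node plus the
    # current values of its input arguments and variables.
    function_name: str
    active_node: str
    bindings: dict = field(default_factory=dict)

class ExecutionContext:
    def __init__(self):
        self.frames = []  # innermost frame last, as in a call stack

    def push(self, frame: Frame) -> None:
        self.frames.append(frame)

    def pop(self) -> Frame:
        return self.frames.pop()

    def current(self) -> Frame:
        return self.frames[-1]

ctx = ExecutionContext()
ctx.push(Frame("main_storyboard", "scene-1", {"topic": "fractions"}))
ctx.push(Frame("tutor_scene", "thumbnail-3", {"character": "Ada"}))
```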

Variation

In addition to providing their diagrammatic contents with input arguments and variables, computational storyboard functions could contain nodes for obtaining “random values” from specified numerical intervals or, perhaps, for randomly selecting from nodes in containers.

Content creators could, optionally, use random variation to vary resultant video.
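The two kinds of randomness nodes described above map directly onto standard random sampling; the following Python sketch, with illustrative function names, shows both.

```python
import random

def random_value(low: float, high: float, rng=random) -> float:
    # Obtain a scalar "random value" from a specified numerical interval.
    return rng.uniform(low, high)

def random_selection(container: list, rng=random):
    # Randomly select one node from a container of nodes.
    return rng.choice(container)

pause_seconds = random_value(0.5, 2.0)
greeting = random_selection(["Hello!", "Welcome back!", "Good morning!"])
```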

Optimization

In theory, beyond using “random values” simply to vary generated video content, diagram nodes providing “automatic values” could supply values, either scalars from intervals or selections from nodes in containers, intended to be optimized across multiple executions, or runs, as observations and data are collected.

As envisioned, developing and providing these components for computational storyboarding diagrams would simplify A/B testing and related techniques for content creators.
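One way an "automatic value" node could be realized is with an epsilon-greedy selection strategy, a simple choice among many possible optimization approaches. The `AutomaticValue` class, its scoring scheme, and the style names below are all hypothetical.

```python
import random

class AutomaticValue:
    # Selects among options and improves across runs as observations
    # are collected; epsilon-greedy is one simple strategy among many.
    def __init__(self, options, epsilon=0.1):
        self.options = list(options)
        self.epsilon = epsilon
        self.observations = {option: [] for option in self.options}

    def select(self, rng=random):
        unexplored = [o for o in self.options if not self.observations[o]]
        if unexplored or rng.random() < self.epsilon:
            return rng.choice(unexplored or self.options)
        # Exploit: the option with the best average observed score so far.
        return max(self.options,
                   key=lambda o: sum(self.observations[o]) / len(self.observations[o]))

    def observe(self, option, score) -> None:
        self.observations[option].append(score)

thumbnail_style = AutomaticValue(["style-a", "style-b"], epsilon=0.0)
thumbnail_style.observe("style-a", 0.4)  # e.g., viewer-engagement scores
thumbnail_style.observe("style-b", 0.9)
best = thumbnail_style.select()  # with epsilon=0.0, exploits the better style
```

This is the shape of A/B testing: each run selects a variant, observations accumulate, and selection shifts toward the better-performing option.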

Generating Thumbnail Images

As considered, at least some computational storyboards’ thumbnails would have their images created by generative artificial-intelligence systems. Multimodal prompts for these systems could be varied, including by using functions’ input arguments and variables.

Generating Video

A goal for computational storyboards is that generative artificial-intelligence systems could process them into longer-form video content.

Towards this goal, computational storyboards could provide generative artificial-intelligence systems with materials beyond extensible thumbnails. Notes about directing, cinematography, and characters or acting could be provided. Multimedia materials concerning characters, settings, props, and style could be provided. Content intended to be synchronized and placed onto one or more display surfaces in the generated video could also be provided.

Generated videos could utilize one or more tracks to enable features in playback environments. Transcripts or captions, for instance, alongside accompanying metadata track items, could be sent to viewers’ artificial-intelligence assistants, enabling these systems to answer questions about videos’ contents.

Debugging and Revision

With respect to generating video from computational storyboards, there could exist a “debugging” mode. Video generated in this mode would contain extra metadata tracks providing objects that content creators could use to jump from points of interest in the generated video back into computational storyboards, resumed to the appropriate execution contexts.
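A debug-track item of this kind might pair a timestamp in the generated video with a snapshot of the execution context's frames; looking up the context at a point of interest then becomes a search over the track. The data layout and `context_at` helper are hypothetical sketches.

```python
debug_track = [
    {
        "time": 12.5,  # seconds into the generated video
        "storyboard_url": "https://example.org/storyboard#scene-2",
        "frames": [
            {"function": "main_storyboard", "node": "scene-2"},
            {"function": "tutor_scene", "node": "thumbnail-3",
             "bindings": {"character": "Ada"}},
        ],
    },
]

def context_at(time_seconds: float, track: list):
    # Return the most recent debug item at or before the given time,
    # i.e., the execution context to resume for that point of interest.
    candidates = [item for item in track if item["time"] <= time_seconds]
    return max(candidates, key=lambda item: item["time"]) if candidates else None
```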

Processing Video

In theory, existing video content could be processed into computational storyboards.

Conclusion

Envisioned computational storyboards build on traditional storyboarding techniques and are intended to enable generative artificial-intelligence systems to create longer-form output video, e.g., educational video.