Research and Development

Alignment and Contextual Action Selection


The challenge addressed here is one of ensuring that relevant rules, laws, and regulations are loaded into agents' "working memories" when these systems encounter a wide variety of situations.

By being able to search for and to load contextually relevant resources from large libraries, agents could be aligned with and select actions in accordance with contextually relevant human values, rules, laws, and regulations.

With these technologies, in the near future, people will be able to interact with artificial-intelligence systems, narrating situations to them or engaging in dialogues with them, to establish contexts with which to search for and retrieve relevant rules, laws, and regulations.

States and Contexts

A simple graph-based analogy has long been employed throughout artificial intelligence. In this analogy, a node corresponds to a state and an edge to a state transition.

A Markov decision process (MDP) is a discrete-time stochastic control process in which the probabilities of subsequent events depend only on the current state and the action chosen in it. The probability that an MDP moves into a new state is given by a state transition function. Given the current state and action, the next state is conditionally independent of all previous states and actions.
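As a minimal sketch of the Markov property described above, a state transition function can be written as a lookup table from (state, action) pairs to distributions over next states. The two states, two actions, and all probabilities below are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical example: TRANSITIONS[state][action] maps each possible
# next state to its probability. All names and numbers are illustrative.
TRANSITIONS = {
    "idle": {
        "wait": {"idle": 0.9, "busy": 0.1},
        "start": {"idle": 0.2, "busy": 0.8},
    },
    "busy": {
        "wait": {"idle": 0.3, "busy": 0.7},
        "start": {"idle": 0.0, "busy": 1.0},
    },
}


def step(state, action, rng):
    """Sample the next state; only the current state and action are consulted."""
    dist = TRANSITIONS[state][action]
    next_states = list(dist)
    weights = [dist[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


# Roll the process forward for a few actions from a fixed seed.
rng = random.Random(0)
state = "idle"
for action in ["start", "wait", "wait"]:
    state = step(state, action, rng)
```

Note that `step` receives no history argument: everything needed to sample the next state is in its two inputs.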

A partially observable Markov decision process (POMDP) assumes that a system's dynamics are determined by an MDP, but that the agent cannot directly observe the underlying state. Instead, the agent can maintain a probability distribution over nodes, with the interpretation that exactly one of the nodes is the underlying system state.
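A distribution over nodes of this kind is usually maintained with the standard Bayesian belief update, b'(s') ∝ O(o | s') Σ_s T(s' | s, a) b(s). The sketch below assumes a hypothetical two-state system with one action and two observations; the tables `transition` and `observe` and their values are illustrative.

```python
def update_belief(belief, action, observation, transition, observe):
    """Return the posterior distribution over states after acting and observing."""
    states = list(belief)
    unnormalized = {}
    for s2 in states:
        # Prediction step: probability of reaching s2 under the current belief.
        predicted = sum(
            transition[s][action].get(s2, 0.0) * belief[s] for s in states
        )
        # Correction step: weight by the likelihood of the observation in s2.
        unnormalized[s2] = observe[s2].get(observation, 0.0) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical model: all names and numbers are illustrative.
transition = {
    "idle": {"wait": {"idle": 0.9, "busy": 0.1}},
    "busy": {"wait": {"idle": 0.3, "busy": 0.7}},
}
observe = {
    "idle": {"quiet": 0.8, "noisy": 0.2},
    "busy": {"quiet": 0.1, "noisy": 0.9},
}
prior = {"idle": 0.5, "busy": 0.5}
posterior = update_belief(prior, "wait", "noisy", transition, observe)
# posterior is approximately {"idle": 0.25, "busy": 0.75}
```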

Simple recurrent neural networks can map and reconstruct POMDPs well (Schäfer, 2008) and are equivalent to a subclass of probabilistic finite-state automata (Rabin, 1963; Svete & Cotterell, 2023). Accordingly, the hybrid artificial-intelligence systems considered here utilize recurrent neural networks to manage agents' complex states.

Weighted Sets of Nodes

Let us consider a different interpretation of probabilistically-weighted sets of nodes. Instead of an unknown one of the nodes being the underlying system state, system states could be constructed from probabilistically-weighted sets of nodes. That is, a probabilistically-weighted set of nodes could be a system state.

When designing architectures where probabilistically-weighted sets of nodes are to be interpreted as system states, means should be devised for combining or merging the properties of such nodes into those properties desired for system states.

For instance, some nodes might have sets of propositional formulas, or belief sets, connected to them. Both classical Boolean and fuzzy belief sets can be combined or merged from probabilistically-weighted sources. Noteworthy is that some resultant belief sets might contain logical contradictions.
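One way such merging might look is sketched below. It rests on assumptions not stated above: fuzzy belief sets are merged by weighted averaging of truth degrees, classical belief sets are merged by keeping literals whose total supporting weight reaches a threshold, and negation is written with a "~" prefix. The function names and the example propositions are hypothetical.

```python
def merge_fuzzy(weighted_sources):
    """Weighted average of truth degrees; sources are (weight, {prop: degree})."""
    totals, norms = {}, {}
    for weight, beliefs in weighted_sources:
        for prop, degree in beliefs.items():
            totals[prop] = totals.get(prop, 0.0) + weight * degree
            norms[prop] = norms.get(prop, 0.0) + weight
    return {p: totals[p] / norms[p] for p in totals if norms[p] > 0}


def merge_classical(weighted_sources, threshold=0.5):
    """Keep literals whose summed source weight reaches the threshold."""
    support = {}
    for weight, literals in weighted_sources:
        for literal in literals:
            support[literal] = support.get(literal, 0.0) + weight
    return {lit for lit, w in support.items() if w >= threshold}


def contradictions(literals):
    """Propositions that appear both plainly and negated ("~" prefix)."""
    return {
        lit for lit in literals
        if not lit.startswith("~") and "~" + lit in literals
    }


# Two equally weighted nodes that disagree about one proposition.
merged = merge_classical([
    (0.5, {"door_open"}),
    (0.5, {"~door_open", "alarm_on"}),
])
conflicts = contradictions(merged)  # {"door_open"}
warmth = merge_fuzzy([(0.75, {"warm": 1.0}), (0.25, {"warm": 0.0})])
# warmth == {"warm": 0.75}
```

The classical example ends up containing both a literal and its negation, which is the kind of contradiction noted above; how to resolve it is left open here.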

In addition to propositional formulas, or belief sets, other varieties of resources could be connected to nodes (e.g., rules, behavior trees, decision trees, flowcharts, or workflow diagrams) and nodes could have multiple such resources connected to them. Connections between probabilistically-weighted nodes and attached resources could be weighted with scalar values also ranging from zero to one.

Beyond considering weighted sums and other related logical combinations of those resources connected to probabilistically-weighted nodes, some varieties of connected resources could be loaded into agents' working memories where they could be either processed in parallel, combined or "blended" together, or selected from by means of decision-making processes or procedures.

As weights varied across sets of nodes, those weights could be summed from across weighted edges connecting nodes to resources, "activating" connected resources. Those resources activated beyond a (potentially dynamic) threshold could be loaded into an agent's "working memory" for further processing. Multiple such resources loaded into an agent's working memory, in parallel, could contribute to cognitive processes including action selection.
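The activation scheme just described can be sketched as follows. The sketch assumes that each node contributes the product of its own weight and the edge weight to a connected resource, and that the threshold is a fixed number; both are illustrative choices, as are the node and resource names.

```python
def activate_resources(node_weights, edges, threshold=0.5):
    """Sum node-weight * edge-weight per resource and apply a threshold.

    node_weights: {node: weight in [0, 1]}
    edges: {(node, resource): weight in [0, 1]}
    Returns (resources to load, most activated first; all activations).
    """
    activation = {}
    for (node, resource), edge_weight in edges.items():
        contribution = node_weights.get(node, 0.0) * edge_weight
        activation[resource] = activation.get(resource, 0.0) + contribution
    loaded = {r: a for r, a in activation.items() if a >= threshold}
    return sorted(loaded, key=loaded.get, reverse=True), activation


# Hypothetical state: the agent is mostly "driving" and partly "parking".
node_weights = {"driving": 0.75, "parking": 0.25}
edges = {
    ("driving", "traffic_code"): 1.0,
    ("parking", "traffic_code"): 0.5,
    ("parking", "parking_rules"): 1.0,
}
working_memory, activation = activate_resources(node_weights, edges)
# working_memory == ["traffic_code"]; parking_rules stays below threshold.
```

A dynamic threshold could be modeled by passing a different `threshold` value on each call, for example one that rises as working memory fills.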

The Affordance Competition Hypothesis

The affordance competition hypothesis suggests that behavior involves a constant competition between currently available demands and opportunities for action. This hypothesis is based on the idea that the brain's basic functional architecture evolved to mediate real-time interactions with the world, which requires animals to continuously specify potential actions and to select between them (Cisek, 2007).

In this hypothesis, demands and opportunities for actions are available to agents simultaneously and in parallel. This suggests artificial-intelligence architectures where agents' contextual states involve presenting them with multiple, relevant opportunities for actions.

Opportunities for actions could result in expressed behaviors, in further cognitive processing, in changes to systems' modes, e.g., heightening alertness, in the production of new internal cues or stimuli, or in raising events, e.g., for event-logging purposes.

Opportunities for actions may be refined or selected through cognitive processes with direct neural correlates in artificial neural networks, through fast and automatic decision-making processes, or through slower, deliberative decision-making processes upon the contents of a "working memory".

Agents' "working memory" systems should also provide them with contextually relevant rules, laws, and regulations to benefit their decision-making processes.

Contextual Sensing, Perception and Comprehension

More coming soon...

Contextual Instruction

Researchers and developers may soon desire to be able to input multiple rules, behavior trees, decision trees, flowcharts, or workflow diagrams into artificial-intelligence systems with the expectation that these systems would subsequently make decisions and exhibit behavior in accordance with them. These resources might be provided to artificial-intelligence systems during their training, fine-tuning, prompting, or subsequent multimodal dialogical interactions.

For multiple such resources to be input into artificial-intelligence systems, one of two schemes would be needed. Either the resources would be enqueued such that earlier-entered ones have priority over later-entered ones, the earliest being "constitutional", or the resources would have weights or priorities assigned to them. In this latter case, constitutional resources could still be provided by assigning them weights or priorities in a reserved range above the range allowed for users' subsequent resources.
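A minimal sketch of the second scheme, with a reserved priority range for constitutional resources, is given below. The numeric bounds `USER_MAX` and `CONSTITUTIONAL_MIN`, the `Resource` type, and the example resource names are all hypothetical.

```python
from dataclasses import dataclass

USER_MAX = 100             # hypothetical: user priorities lie in [0, 100]
CONSTITUTIONAL_MIN = 1000  # hypothetical: reserved range starts here


@dataclass(frozen=True)
class Resource:
    name: str
    priority: int
    constitutional: bool = False


def add_resource(registry, name, priority, constitutional=False):
    """Append a resource, rejecting priorities outside the permitted range."""
    if constitutional:
        if priority < CONSTITUTIONAL_MIN:
            raise ValueError("constitutional priority below reserved range")
    elif not 0 <= priority <= USER_MAX:
        raise ValueError("user priority outside allowed range")
    registry.append(Resource(name, priority, constitutional))


def ordered(registry):
    """Highest priority first; sorted() is stable, so ties keep entry order."""
    return sorted(registry, key=lambda r: -r.priority)


registry = []
add_resource(registry, "house_style", 50)
add_resource(registry, "do_no_harm", 1000, constitutional=True)
add_resource(registry, "team_rule", 50)
names = [r.name for r in ordered(registry)]
# names == ["do_no_harm", "house_style", "team_rule"]
```

Because ties fall back to entry order, the same structure also covers the first scheme: giving every resource equal priority reduces it to a queue in which earlier-entered resources come first.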

Beyond desiring to be able to describe acontextual behavior, behavior relevant in all contexts, researchers and developers may desire to be able to describe contextual behavior, behavior relevant in specified contexts. Certain rules, behavior trees, decision trees, flowcharts, or workflow diagrams may be relevant only in certain states and it might be desired for these resources to be loaded into a working memory only in those states.

Contextual Inspection

Task-related experiences and efficiencies have been all too often overlooked with respect to the development and operation of artificial-intelligence systems.

With unfolding advancements in multimodal dialogue systems, researchers and developers can already upload multimedia files to artificial-intelligence systems and subsequently discuss them. It may soon be possible for them to upload and discuss the contents of one or more rules, behavior trees, decision trees, flowcharts, or workflow diagrams in the same way.

Similarly, researchers and developers may soon be able to inspect, visualize, and use natural language to discuss with agents the contextual contents of their working memories.

Event logging can be envisioned with respect to kinds of events including, but not limited to, when resources enter and exit agents' working memories.
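As a sketch of this kind of event logging, the hypothetical `WorkingMemory` class below compares the resources active at each update against its previous contents and records an "enter" or "exit" event for each change. The class, its method names, and the event tuple layout are assumptions made for illustration.

```python
import time


class WorkingMemory:
    """Tracks loaded resources and logs when each one enters or exits."""

    def __init__(self, clock=time.monotonic):
        self._contents = set()
        self._clock = clock
        self.events = []  # list of (timestamp, kind, resource) tuples

    def update(self, active_resources):
        """Replace the contents and log the differences as events."""
        active = set(active_resources)
        for resource in sorted(active - self._contents):
            self.events.append((self._clock(), "enter", resource))
        for resource in sorted(self._contents - active):
            self.events.append((self._clock(), "exit", resource))
        self._contents = active


memory = WorkingMemory()
memory.update(["traffic_code", "parking_rules"])
memory.update(["traffic_code", "school_zone_rules"])
log = [(kind, resource) for _, kind, resource in memory.events]
```

After the two updates, `log` holds two "enter" events from the first call, followed by an "enter" for `school_zone_rules` and an "exit" for `parking_rules` from the second.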

Multi-track recordings of agents' broader cognitive dynamics could accompany recordings of stories, computer simulations, and real-world environments.

Contextual Learning

Researchers and developers, beyond desiring to have the capabilities to manually instruct and inspect agents with respect to rules, behavior trees, decision trees, flowcharts, or workflow diagrams, both acontextually and contextually, may desire for agents to be able to automatically learn behavior from experiences.

More coming soon...

Contextual Action Selection

More coming soon...

Contextual Decision-making

More coming soon...

Related Work

Other approaches considered include using embedding vectors to represent system states and vector databases to store and retrieve contextually relevant rules, laws, and regulations, ensuring that these were loaded into agents' working memories.
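The embedding-based alternative can be illustrated with a small in-memory stand-in for a vector database. The three-dimensional vectors below are hand-written placeholders rather than outputs of any embedding model, and ranking by cosine similarity is an assumed, commonly used choice.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(state_vector, library, k=2):
    """Return the names of the k resources most similar to the state vector."""
    scored = [(cosine(state_vector, vec), name) for name, vec in library.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Placeholder embeddings; a real system would compute these with a model.
library = {
    "traffic_code": [1.0, 0.0, 0.0],
    "maritime_law": [0.0, 1.0, 0.0],
    "workplace_safety": [0.6, 0.0, 0.8],
}
state_vector = [0.9, 0.1, 0.2]
top = retrieve(state_vector, library)
# top == ["traffic_code", "workplace_safety"]
```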

Uses of state-driven workflows to enable stateful large-language-model-based agents and multi-agent systems are being explored (Wu, Yue, Zhang, Wang, & Wu, 2024).

More coming soon...


Balsam, Peter D., and Arthur Tomie, eds. Context and learning. Lawrence Erlbaum Associates, 1985.

Cisek, Paul. "Cortical mechanisms of action selection: The affordance competition hypothesis." Philosophical Transactions of the Royal Society B: Biological Sciences 362, no. 1485 (2007): 1585-1599.

Rabin, Michael O. "Probabilistic automata." Information and Control 6, no. 3 (1963): 230-245.

Schäfer, Anton M. Reinforcement learning with recurrent neural networks. PhD diss., Osnabrück, Univ., 2008.

Svete, Anej, and Ryan Cotterell. "Recurrent neural language models as probabilistic finite-state automata." arXiv preprint arXiv:2310.05161 (2023).

Wu, Yiran, Tianwei Yue, Shaokun Zhang, Chi Wang, and Qingyun Wu. "StateFlow: Enhancing LLM task-solving through state-driven workflows." arXiv preprint arXiv:2403.11322 (2024).

Multi-agent Question-answering Systems


Agents representing ideological stances, positions, perspectives, or schools of thought can serve in multi-agent systems which generate encyclopedic answers to end-users' complex questions.

Agent Design, Reuse and Selection

Large language models can generate content while role-playing, or impersonating, characters and personas (Shanahan, McDonell, & Reynolds, 2023). They can be fine-tuned using the works of individual philosophers to subsequently generate virtually indistinguishable responses (Schwitzgebel, Schwitzgebel, & Strasser, 2023). They can generate content from specified stances, positions, perspectives, and schools of thought. They can also generate content aligned with the attitudes and opinions of described groups, sub-populations, or demographics of interest (Santurkar, Durmus, Ladhak, Lee, Liang, & Hashimoto, 2023).

When should agents be searched for, retrieved, reused, designed, created, or varied? Which agents should be consulted when generating encyclopedic answers to end-users’ complex questions? Which agents’ responses would prove most valuable to consolidate, summarize, or synthesize into resultant encyclopedic answers? Should it be anticipated that selected teams of agents will recur across questions?

Automatically and manually designed agents, beyond potentially differing in terms of their models, training, fine-tuning, and prompts, could be provided with differing libraries of documents and could weigh, rank, or prioritize these documents differently.

Should agents and their libraries of documents be logically consistent and ideologically coherent? How will agents synthesize multiple challenging, potentially conflicting documents on complex issues and the arguments in them? Will these capabilities, additionally or instead, be emergent capabilities of orchestrated multi-agent systems?

Multi-agent Orchestration

Processes and strategies from multiple-text comprehension, reading group discussions, the Socratic method, the dialectic method, consensus building, group decision-making, and synthesis writing are anticipated to be of use to manager, facilitator, or moderator agents orchestrating teams of other agents, some representing individuals, groups, stances, positions, perspectives, or schools of thought.

Multiple-text comprehension results from processes and strategies with which readers make sense of complex topics or issues based on information presented in multiple texts. These processes and strategies are necessary when readers encounter multiple challenging, conflicting documents on complex issues (Anmarkrud, Bråten, & Strømsø, 2014; List & Alexander, 2017).

Reading group discussion strategies can enhance multiple readers’ comprehension of texts. Transcripts from these multi-agent processes should prove valuable to consolidate, summarize, or synthesize (Goldenberg, 1992; Berne & Clark, 2008).

The principles and guidelines of the Socratic method include: the use of open-ended questions, clarifications of terms, providing examples and evidence, challenging arguments, summarization, drawing conclusions, and reflecting on the process. These key principles are realized through strategies such as: definition, generalization, induction, elenchus, hypothesis elimination, maieutics, dialectic, recollection, irony, and analogy (Chang, 2023).

The dialectic method involves dialogues between groups holding different points of view about subjects but wishing to arrive at truths through reasoned argumentation. With respect to multi-agent systems, formal, computational, and game-theoretic approaches have been and remain topics of ongoing research (Wells, 2007). The advancement of large-language-model-based agents has inspired a renewed interest in multi-agent argumentation and debate (Du, Li, Torralba, Tenenbaum, & Mordatch, 2023; Wang, Yue, & Sun, 2023; Wang, Du, Yu, Chen, Zhu, Chu, Yan, & Guan, 2023).

Processes for building rational consensus, and related decision-making procedures, may be brought to bear during the orchestration of multi-agent systems (Lehrer & Wagner, 2012).
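The Lehrer and Wagner model is commonly presented as iterated weighted averaging: each agent assigns weights of respect to all agents, and opinions are repeatedly replaced by weighted averages until they agree. The sketch below follows that presentation; the three-agent weight matrix, the initial opinions, and the stopping tolerance are hypothetical.

```python
def consensus(weights, opinions, tol=1e-9, max_rounds=1000):
    """Iterate opinion averaging until all agents (nearly) agree.

    weights[i][j] is the weight agent i gives to agent j; rows sum to 1.
    """
    n = len(opinions)
    for row in weights:
        if len(row) != n or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row needs n weights summing to 1")
    current = list(opinions)
    for _ in range(max_rounds):
        current = [
            sum(weights[i][j] * current[j] for j in range(n)) for i in range(n)
        ]
        if max(current) - min(current) < tol:
            break
    return current


# Hypothetical moderator view: three agents estimating one probability.
weights = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
opinions = [0.2, 0.5, 0.9]
agreed = consensus(weights, opinions)
```

With all weights positive, as here, the opinions converge to a single shared value that lies between the smallest and largest initial opinions. If some agents give each other zero weight, convergence is not guaranteed and `max_rounds` bounds the loop.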

Synthesis writing is a set of processes and strategies through which the contents of multiple texts, including agent-generated contents, can be integrated into resultant output texts (Van Ockenburg, van Weijen, & Rijlaarsdam, 2019; Van Steendam, Vandermeulen, De Maeyer, Lesterhuis, Van den Bergh, & Rijlaarsdam, 2022). Argumentative synthesis writing combines intratextual and intertextual integration processes and strategies to generate texts from diverse sources, perspectives, and arguments (Mateos, Martín, Cuevas, Villalón, Martínez, & González-Lamas, 2018).

Document Generation

Teams of agents could search for, retrieve, reuse, or generate new documents and subcomponents combining natural language, structured knowledge, source code, multimedia, charts, diagrams, and infographics.

Let us consider computational notebooks, e.g., Mathematica and Jupyter notebooks, and hypermedia encyclopedia articles. Relationships between these include that encyclopedia articles have layouts with respect to their subcomponents and that these subcomponents could be results of generative computations described in computational-notebook cells.

Building upon previous research into the automatic generation of open-domain encyclopedia articles while considering these relationships, a preliminary approach is presented here to utilize planners to orchestrate agentic systems to search for, retrieve, reuse, or generate new layouts and content-related macroplans (Bao, 2023; Qiao, Li, Zhang, He, Kang, Zhang, Yang, et al., 2023; Wu, Bansal, Zhang, Wu, Zhang, Zhu, Li, Jiang, Zhang, & Wang, 2023).

Content-related macroplans would be provided to subordinate agentic systems orchestrated by the aforementioned manager, facilitator, or moderator agents. Individual agents in these subordinate teams would either: (1) interact with one another via natural language with these interactions then consolidated, summarized, or synthesized, or (2) otherwise engage with one another on collaborative software platforms such that their structured interactions would automatically result in natural-language content.

The production, preservation, aggregation, analysis, and maintenance of citations to referenced materials through these processes would be subjects of continuing research (Gao, Yen, Yu, & Chen, 2023).

Specialized agents would be invoked to produce kinds of computational-notebook cells positioned in layouts from which document subcomponents, e.g., multimedia, charts, diagrams, and infographics, could be searched for, retrieved, reused, or generated (Dibia, 2023).

Documents and their individual subcomponents would be subsequently editable, each having one or more accompanying computational-notebook cells, a changelog or revision history, and a discussion forum or other collaborative space. This would enhance the subsequent reusability and revisability of generated documents and their subcomponents while enabling man-machine collaboration scenarios.

Man-machine Collaboration

Transcripts of multi-agent processes could be preserved and accompany resultant documents and their subcomponents. These transcripts could be forum-based, having multiple threads of structured discussions, or could be more intricate.

People and artificial-intelligence agents could interact in these multi-threaded, structured discussion forums or collaboration spaces. Man-machine interactions could potentially result in automatic updates to documents or their subcomponents.

Automatically-generated content could include hyperlinks, context menu items, or other means of navigating from portions of content to any relevant argumentation or procedures in the accompanying multi-threaded, structured discussion forums or collaboration spaces.

Changelogs or revision histories could accompany documents and their subcomponents. People and artificial-intelligence agents could provide rationale, explanations, or justifications in them for modifications made to reusable, revisable documents and their subcomponents.

People could be provided with opportunities to provide structured feedback or open-ended, natural-language comments about portions of documents, sections, paragraphs, sentences, or content selections, and about other document subcomponents. Such feedback, comments, and annotations could be displayed only to those who opt into viewing them, either in full or as quality-filtered subsets. When displayed, they could appear as expandable margin notes proximate to the relevant document content.

People desiring to provide feedback or comments about documents could additionally be provided with opportunities to interact with dialogue systems conducting contextual and adaptive surveys and opinion polls.


How should encyclopedic answers to end-users’ complex questions be evaluated? How should agents’ performances in coordinated dialogues, debates, and processes be evaluated? How should their contributions to collaborative document-generation processes be evaluated?

With evaluation frameworks and rubrics, components of automatically or manually designed and varied agents could be independently measured and compared. These kinds of scientific architectures could empower teams of humans to continuously improve multi-agent systems.

Large language models have been evaluated with respect to their exhibited moral beliefs (Scherrer, Shi, Feder, & Blei, 2023).

Algorithmic fidelity is defined to be the degree to which the complex patterns of relationships between ideas, attitudes, and socio-cultural contexts within a model accurately mirror those within a range of human sub-populations (Argyle, Busby, Fulda, Gubler, Rytting, & Wingate, 2023).

Value stability, the adherence to roles, characters, or personas during unfolding interactions, is argued to be another dimension of large language model comparison and evaluation alongside knowledge, model size, and speed (Kovač, Portelas, Sawayama, Dominey, & Oudeyer, 2024).

With respect to resultant encyclopedic answers, desired qualities include verifiability, accuracy, objectivity, and neutrality, as well as plurality, diversity, fairness, balance, and comprehensiveness with respect to relevant points of view (McGrady, 2020).

Related Work

Socratic assistants have been explored for both moral enhancement and educational purposes (Lara & Deckers, 2020). The manager, facilitator, or moderator agents discussed above could coordinate teams composed of artificial-intelligence agents, humans, or combinations of both.

Artificial intelligence systems for facilitating group meetings and discussions have previously been researched in the form of group support systems (Bostrom, Anson, & Clawson, 1993).

Educational applications of the technologies under discussion include intelligent tutoring systems for teams (Sottilare, Burke, Salas, Sinatra, Johnston, & Gilbert, 2018) and artificial-intelligence-enhanced pedagogical discussion forums (Butcher, Read, Jensen, Morel, Nagurney, & Smith, 2020).

Artificial intelligence systems capable of debating with humans are a subject of ongoing research (Slonim, Bilu, Alzate, Bar-Haim, Bogin, Bonin, & Choshen, 2021).

Modular systems containing multiple interlocutors, each with their own distinct points of view reflecting their training in a diversity of concrete wisdom traditions, have been previously considered (Volkman & Gabriels, 2023).


Anmarkrud, Øistein, Ivar Bråten, and Helge I. Strømsø. "Multiple-documents literacy: Strategic processing, source awareness, and argumentation when reading multiple conflicting documents." Learning and Individual Differences 30 (2014): 64-76.

Argyle, Lisa P., Ethan C. Busby, Nancy Fulda, Joshua R. Gubler, Christopher Rytting, and David Wingate. "Out of one, many: Using language models to simulate human samples." Political Analysis 31, no. 3 (2023): 337-351.

Bao, Yunqian. "Towards automated generation of open domain Wikipedia articles." Master's thesis, University of Illinois at Urbana-Champaign, 2023.

Berne, Jennifer I., and Kathleen F. Clark. "Focusing literature discussion groups on comprehension strategies." The Reading Teacher 62, no. 1 (2008): 74-79.

Bostrom, Robert P., Robert Anson, and Vikki K. Clawson. "Group facilitation and group support systems." Group support systems: New perspectives 8 (1993): 146-168.

Butcher, Tamarin, Michelle Fulks Read, Ann Evans Jensen, Gwendolyn M. Morel, Alexander Nagurney, and Patrick A. Smith. "Using an AI-supported online discussion forum to deepen learning." In Handbook of research on online discussion-based teaching methods, pp. 380-408. IGI Global, 2020.

Chang, Edward Y. "Prompting large language models with the Socratic method." In 2023 IEEE 13th Annual Computing and Communication Workshop and Conference (CCWC), pp. 0351-0360. IEEE, 2023.

Dibia, Victor. "Lida: A tool for automatic generation of grammar-agnostic visualizations and infographics using large language models." arXiv preprint arXiv:2303.02927 (2023).

Du, Yilun, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, and Igor Mordatch. "Improving factuality and reasoning in language models through multiagent debate." arXiv preprint arXiv:2305.14325 (2023).

Gao, Tianyu, Howard Yen, Jiatong Yu, and Danqi Chen. "Enabling large language models to generate text with citations." arXiv preprint arXiv:2305.14627 (2023).

Goldenberg, Claude. "Instructional conversations: Promoting comprehension through discussion." The Reading Teacher 46, no. 4 (1992): 316-326.

Kovač, Grgur, Rémy Portelas, Masataka Sawayama, Peter Ford Dominey, and Pierre-Yves Oudeyer. "Stick to your role! Stability of personal values expressed in large language models." arXiv preprint arXiv:2402.14846 (2024).

Lara, Francisco, and Jan Deckers. "Artificial intelligence as a Socratic assistant for moral enhancement." Neuroethics 13, no. 3 (2020): 275-287.

Lehrer, Keith, and Carl Wagner. Rational consensus in science and society: A philosophical and mathematical study. Vol. 24. Springer Science & Business Media, 2012.

List, Alexandra, and Patricia A. Alexander. "Analyzing and integrating models of multiple text comprehension." Educational Psychologist 52, no. 3 (2017): 143-147.

Mateos, Mar, Elena Martín, Isabel Cuevas, Ruth Villalón, Isabel Martínez, and Jara González-Lamas. "Improving written argumentative synthesis by teaching the integration of conflicting information from multiple sources." Cognition and Instruction 36, no. 2 (2018): 119-138.

McGrady, Ryan Douglas. Consensus-based encyclopedic virtue: Wikipedia and the production of authority in encyclopedias. North Carolina State University, 2020.

Qiao, Bo, Liqun Li, Xu Zhang, Shilin He, Yu Kang, Chaoyun Zhang, Fangkai Yang, et al. "TaskWeaver: A code-first agent framework." arXiv preprint arXiv:2311.17541 (2023).

Santurkar, Shibani, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, and Tatsunori Hashimoto. "Whose opinions do language models reflect?." arXiv preprint arXiv:2303.17548 (2023).

Scherrer, Nino, Claudia Shi, Amir Feder, and David Blei. "Evaluating the moral beliefs encoded in LLMs." Advances in Neural Information Processing Systems 36 (2023).

Schwitzgebel, Eric, David Schwitzgebel, and Anna Strasser. "Creating a large language model of a philosopher." arXiv preprint arXiv:2302.01339 (2023).

Shanahan, Murray, Kyle McDonell, and Laria Reynolds. "Role-play with large language models." Nature 623, no. 7987 (2023): 493-498.

Slonim, Noam, Yonatan Bilu, Carlos Alzate, Roy Bar-Haim, Ben Bogin, Francesca Bonin, Leshem Choshen, et al. "An autonomous debating system." Nature 591, no. 7850 (2021): 379-384.

Sottilare, Robert A., C. Shawn Burke, Eduardo Salas, Anne M. Sinatra, Joan H. Johnston, and Stephen B. Gilbert. "Designing adaptive instruction for teams: A meta-analysis." International Journal of Artificial Intelligence in Education 28 (2018): 225-264.

Van Ockenburg, Liselore, Daphne van Weijen, and Gert Rijlaarsdam. "Learning to write synthesis texts: A review of intervention studies." Journal of Writing Research 10, no. 3 (2019): 401-428.

Van Steendam, Elke, Nina Vandermeulen, Sven De Maeyer, Marije Lesterhuis, Huub Van den Bergh, and Gert Rijlaarsdam. "How students perform synthesis tasks: An empirical study into dynamic process configurations." Journal of Educational Psychology 114, no. 8 (2022): 1773.

Volkman, Richard, and Katleen Gabriels. "AI moral enhancement: Upgrading the socio-technical system of moral engagement." Science and Engineering Ethics 29, no. 2 (2023): 11.

Wang, Boshi, Xiang Yue, and Huan Sun. "Can ChatGPT defend its belief in truth? Evaluating LLM reasoning via debate." In Findings of the Association for Computational Linguistics: EMNLP 2023, pp. 11865-11881. 2023.

Wang, Haotian, Xiyuan Du, Weijiang Yu, Qianglong Chen, Kun Zhu, Zheng Chu, Lian Yan, and Yi Guan. "Apollo's oracle: Retrieval-augmented reasoning in multi-agent debates." arXiv preprint arXiv:2312.04854 (2023).

Wells, Simon. "Formal dialectical games in multiagent argumentation." PhD thesis, University of Dundee, 2007.

Wu, Qingyun, Gagan Bansal, Jieyu Zhang, Yiran Wu, Shaokun Zhang, Erkang Zhu, Beibin Li, Li Jiang, Xiaoyun Zhang, and Chi Wang. "AutoGen: Enabling next-gen LLM applications via multi-agent conversation framework." arXiv preprint arXiv:2308.08155 (2023).

The Contextual Recommendation of Advice and Wisdom


Wise people share advice and wisdom in the forms of allegories, anecdotes, aphorisms, apologues, fables, folklore, historical analogues, jokes, literature, lyrics, parables, poems, proverbs, quotations, stories, and witticisms. It is a multidisciplinary challenge to build artificial intelligence systems capable of these tasks.

Towards solving this challenge, a new approach is presented: story-based search and recommendation. In this approach, individuals provide stories to retrieve content that is to be useful for selected story characters. The stories they provide could be real-world stories and the characters they select, in these cases, could be themselves or other people. Interestingly, individuals’ social media posts and feeds could be of similar use for establishing contexts for search and recommendation.

While the comprehension of stories and social situations is key to the contextual search for and recommendation of advice and wisdom, overarching architectures, frameworks, and models are needed for artificial intelligence systems to do so at scale. Intelligent coaching systems are indicated to be of use in these regards.

Search and recommender system approaches are considered here, in addition to dialogue systems and chatbots, because intelligent coaching systems, at scale, are envisioned as extensively reusing items, e.g., messages of advice, rather than utilizing natural-language generation algorithms to contextually produce new such messages for individuals in an on-the-fly manner.

Applications of the technologies under discussion include social media, education, library and information science, knowledge management, and history.

Story-based Search and Recommendation

In story-based search and recommendation, individuals provide stories to retrieve content that is to be useful for selected story characters. These provided stories can be fictional or real-world stories.

Use case scenarios for fictional stories include their uses in training, testing, and evaluation. These datasets could utilize metadata for indicating stories’ reading levels and other developmental narratological factors.

Use case scenarios for real-world stories include those where individuals seek to retrieve content for themselves and those where individuals, e.g., peers, teachers, or guidance counselors, seek to retrieve content for other individuals or audiences.

Stories provide a natural means of establishing cognitive contexts. Viewing them in this way, active or conversational story comprehension can be considered. Narratees can ask questions of narrators during conversational processes of narration. Vague or partial cognitive story comprehension contexts can inform narratees’ processes of forming questions for narrators about unfolding narratives.

In incremental story-based search and recommendation, individuals engage in dialogues, narrating to artificial intelligence systems, and receive dynamically updating lists of recommended content for selected story characters. These recommended items could include question items, which individuals could select to have systems ask them in unfolding dialogues. As individuals narrate to and answer questions from artificial intelligence systems incrementally comprehending their stories, content recommendations for selected characters would be provided.

Story and Social Comprehension

In order for artificial intelligence systems to be able to search for and recommend content for selected characters in provided stories, these systems should be able to comprehend stories.

One thing which separates machine reading comprehension from text processing is inferencing. Taxonomies describing reading-related inferences make distinctions between: automatic and strategic; online and offline; text-connecting, knowledge-based, and extratextual; local and global; coherence and elaborative; unconscious and conscious; bridging; text-connecting and gap-filling; coherence, elaborative, knowledge-based, and evaluative; and anaphoric, text-to-text, and background-to-text (Kispal, 2008).

Types of reading-related inferences include: referential, case structure role assignment, antecedent causal, superordinate goal, thematic, character emotion, causal consequence, instantiation noun category, instrument, subordinate goal action, state, reader’s emotion, and author’s intent (Graesser, Singer, & Trabasso, 1994).

Situation models, types of mental models, were devised to understand comprehension. These models are applicable to both story and social comprehension (Morrow, Bower, & Greenspan, 1989; Zwaan, Magliano, & Graesser, 1995; Zwaan & Radvansky, 1998; Wyer Jr, 2003).

Early research into machine story comprehension produced artificial intelligence systems which applied scripts, plans, plot units, and thematic structures. Examples of such systems include: SAM, PAM, FRUMP, and BORIS (László, 2008).

More recently, character networks can be extracted from stories (Labatut & Bost, 2019). In these dynamic networks, nodes correspond to characters and edges to the interactions between them. These nodes and edges can be mapped to embedding vectors (Lee & Jung, 2020; Hoang, Jeon, You, Yoon, Jung, & Lee, 2023).

Similarly, individuals in dynamic social-media networks can be mapped to embedding vectors (Pan & Ding, 2019; Hoang, Jeon, You, Yoon, Jung, & Lee, 2023).

Human lives can be viewed as sequences of events and represented in a way which shares a structural similarity with language. In the “life2vec” approach, resultant embedding spaces were found to be robust and highly structured (Savcisens, Eliassi-Rad, Hansen, Mortensen, Lilleholt, Rogers, Zettler, & Lehmann, 2023).

Characters in fictional and real-world stories could be mapped to corresponding “life2vec” vectors. These vectors would be updated as pertinent events occurred. With computational representations of situational contexts which include embedding vectors for characters, story-based contextual recommendations could be made for selected characters.

Systems capable of predicting stories’ trajectories and inferring characters’ mental states would make better story-based contextual recommendations (Gordon, Bejan, & Sagae, 2011; Chaturvedi, Peng, & Roth, 2017). The “life2vec” approach has shown promise with respect to both its predictive capabilities and its modeling of individuals’ personality nuances.

Inferring the goals and objectives of story characters and individuals will prove critical for contextually providing the content most useful to them (Richards & Singer, 2001; Trabasso & Wiley, 2005). Computational approaches to these topics are explored in artificial intelligence with respect to robotic systems (Van-Horenbeke & Peer, 2021) and broader applications (Mao, Liu, Zhao, Ni, Lin, & He, 2023).

Beyond extracting character networks from stories, knowledge graphs could be extracted, mapped with embedding vectors, and subsequently utilized (Andrus, Nasiri, Cui, Cullen, & Fulda, 2022).

Narrative Psychology

Narrative psychology includes multiple parallel approaches: cognitive, psychometric, hermeneutic, scientific, and computational (László, 2008). Considered here are overlaps between artificial intelligence and those scientific and computational approaches of narrative psychology.

The contents and styles of the stories that individuals tell about their lives are of considerable importance. As story-based search and recommendation systems are constructed and continue to advance, opportunities for computer-aided and automated narrative coaching are expected to arise.

Narrative coaching works with coachees at three primary levels: (1) drawing on narrative psychology to understand and connect to the narrator, (2) drawing on narrative structure to understand and elicit the material in the narrated stories, and (3) drawing on narrative practices to understand and harvest the dynamics of the narrative field. The goal is to help coachees to forge new connections between their stories, their identity, and their behaviors in order to generate and embody new options in these three domains (Drake, 2010).

Mentoring and Coaching

Established theoretical models from mentoring and coaching will be of use for designing artificial intelligence systems which process stories or social media data to contextually search for and recommend items in a personalized manner for characters or individuals at scale.

Definitions of mentoring and coaching vary throughout the literature and have been the subjects of considerable debate (Passmore, Peterson, & Freire, 2016). For clarity, and for discussing artificial intelligence systems which can perform pertinent tasks, generic definitions of coaching and mentoring are offered here.

Mentoring is a relationship in which a mentor shares their knowledge, skills, and experience with a person, a mentee, to help them to progress.

Intelligent mentoring systems have been considered with respect to education, self-regulated learning, lifelong learning, career counseling, and beyond. With respect to mentoring beyond the scopes of educational courses or programs, challenges include the collection and integration of data from multiple sources to construct and maintain models of mentees (Kravčík, Schmid, & Igel, 2019).

Coaching is a form of human development in which a coach supports learners, clients, or coachees to achieve specific personal or professional goals by providing training and guidance. Coaching differs from mentoring by its focus on specific tasks or objectives, as opposed to a focus on more general goals or overall development. Applications of coaching include: business and executive, career, co-coaching, dating, education, financial, health and wellness, homework, life, relationship, religious, sports, vocal, and writing.

Individuals’ specific goals and objectives could be inferred by artificial intelligence systems and/or obtained through direct interactions using established theoretical models. Systems could interact with individuals using natural-language dialogues or by means of adaptive input forms. With detailed knowledge of individuals’ goals and objectives, intelligent coaching systems could better contextually recommend items for them.

The PRACTICE model details the following steps: problem identification, development of realistic goals, generation of alternative solutions, consideration of each solution’s consequences, targeting of the most feasible solution, implementation of the chosen solution, and evaluation (Palmer, 2007). When setting goals, SMART principles suggest that individuals’ goals should be specific, measurable, achievable, relevant, and time-bound (Doran, 1981).

To best obtain and maintain knowledge of individuals’ dynamic and unfolding goals and objectives, over time, frameworks for the design of intelligent coaching systems describe system attributes for developing strong and efficacious relationships: trust, empathy, transparency, predictability, reliability, ability, benevolence, and integrity (Terblanche, 2020).

Other models from positive psychological coaching can guide the design of intelligent coaching systems including: authentic happiness coaching, the flow-enhancing model, the co-active coaching model, positive organizational psychology, and the good work and good mentoring approach (Passmore, Peterson, & Freire, 2016).

Areas where intelligent coaching systems are expected to excel include evidence-based coaching and continual improvement. In these regards, multi-armed and contextual bandits address the primary difficulty of sequential decision-making under uncertainty, namely, the exploitation versus exploration dilemma. Exploitation involves choosing the best option based upon current knowledge of a system, while exploration involves trying out new options that may lead to better outcomes in the future at the expense of an exploitation opportunity. Applications of these techniques include: healthcare, e.g., clinical trials, recommender systems, information retrieval, and dialogue systems (Bouneffouf & Rish, 2019).
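
The exploitation versus exploration dilemma described above can be illustrated with a minimal epsilon-greedy sketch. This is only one of many bandit strategies and is not drawn from the cited survey; the class name, the `simulate` helper, and the reward rates are hypothetical and chosen for illustration. Each "arm" stands for a candidate item of recommended content, and a reward of 1 stands for positive feedback.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy multi-armed bandit over a fixed set of items (arms)."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of times each arm was chosen
        self.values = [0.0] * n_arms  # running mean reward for each arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore with probability epsilon; otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward observed for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Run the bandit against hypothetical Bernoulli reward rates."""
    bandit = EpsilonGreedyBandit(len(true_rates), epsilon, seed)
    environment = random.Random(seed + 1)
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if environment.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    result = simulate([0.2, 0.5, 0.8])
    print(result.counts, [round(v, 2) for v in result.values])
```

A contextual bandit would additionally condition the choice of arm on a representation of the situation, which this sketch omits.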


Research into advice can be organized into four paradigms: the message, discourse, psychological, and network paradigms. Each of these provides different insights about the characteristics, functions, and outcomes of advice (MacGeorge, Feng, & Guntzviller, 2016).

The message paradigm focuses on qualities of advice messages and on the effort to predict supportive outcomes for recipients, often between peers.

The discourse paradigm provides insights into the structure and interpretation of advice in interactions.

The psychological paradigm focuses on cognitive and emotional processes which predict the uses of advice in decision-making.

The network paradigm highlights the utility of advice, often in organizational settings, as well as emergent global outcomes which arise from exchanges of advice.

Social Media

In the future, users of social media could be provided with means of browsing content pertinent to the situations described in their recent or selected posts, content aligned with their preferences and aesthetic tastes, while having the capability to provide feedback on the contextual recommendations and on the content recommended.

Artificial intelligence systems could provide multiple personas, each having different values, styles, or configurations with respect to content recommendation. In this way, individuals could browse and select from values, styles, and configurations using anthropomorphized personas. Opting into and out of content recommendation services could be as easy for individuals as friending and unfriending artificial intelligence personas.

At least initially, individuals might receive paginated lists of recommended items. Eventually, more advanced systems might be able to more intelligently sort items, refine items, and even decide upon single items.

Personalization and user modeling can be of use for enhancing contextual content recommendations. With personalization, systems can select and prioritize items aligned with individuals’ preferences and aesthetic tastes. Individual users, their preferences, and their aesthetic tastes can be represented using embedding vectors (Pan & Ding, 2019; Rizkallah, Atiya, & Shaheen, 2021).
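
As a minimal sketch of how embedding vectors might be used to prioritize items, the following ranks candidate items by cosine similarity to a user vector. The function names and the toy two-dimensional vectors are hypothetical; practical systems would use learned, higher-dimensional embeddings and likely an approximate nearest-neighbor index.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_items(user_vector, item_vectors, top_k=3):
    """Return the ids of the top_k items most similar to the user vector."""
    scored = [
        (cosine_similarity(user_vector, vector), item_id)
        for item_id, vector in item_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:top_k]]


if __name__ == "__main__":
    items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(rank_items([1.0, 0.1], items, top_k=2))
```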

Individuals should be able to provide feedback about contextual recommendations and the content recommended by means of using “like” buttons, upvoting mechanisms, input forms, or follow-up dialogues. Artificial intelligence systems could learn from and continuously improve using these and other sources of feedback.

While personalized content from artificial intelligence personas might be sent to individuals’ direct message inboxes, individuals should be able to easily repost or share these contents alongside any of their positive or negative comments, reactions, opinions, or evaluations.

Towards determining the value provided by contextually recommended content, artificial intelligence systems could observe individuals’ trajectories in embedding spaces after their encounters with recommended content. Encounters with recommended content could accompany individuals’ other social media data.

Research into moderating large language models is applicable to moderating story-based search and recommendation systems (Rebedea, Dinu, Sreedhar, Parisien, & Cohen, 2023). With respect to input moderation, for example, system administrators could define regions in situation spaces for which it would be inappropriate for their systems to provide content, e.g., advice or items of wit and wisdom.
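
One simple way to picture administrator-defined regions is as spheres in an embedding space, each given by a center and a radius. The sketch below assumes that representation, which is an illustrative choice and not a method from the cited toolkit; the `BlockedRegion` type, the labels, and the coordinates are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockedRegion:
    """A hypothetical spherical region of situation space closed to recommendations."""
    label: str
    center: tuple
    radius: float


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def blocking_region(situation_vector, regions):
    """Return the first region containing the situation vector, or None."""
    for region in regions:
        if euclidean_distance(situation_vector, region.center) <= region.radius:
            return region
    return None


if __name__ == "__main__":
    regions = [BlockedRegion("medical emergencies", (0.0, 0.0), 1.0)]
    for vector in [(0.5, 0.5), (2.0, 2.0)]:
        hit = blocking_region(vector, regions)
        print(vector, "blocked: " + hit.label if hit else "allowed")
```

Regions with other shapes, or a trained classifier over situation vectors, could be substituted without changing the calling code.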

Other Applications

In addition to their commercial applications, the technologies under discussion have applications to education, library and information science, knowledge management, and history.

With respect to education, contextually recommended items of advice and wisdom can provide educational value to individuals. Educational recommender systems have been previously explored for recommending academic advice, courses, educational programs, exams, learning resources, online learning opportunities, papers, pedagogical resources, professions, programming problems, study sequences or syllabuses, teaching practice resources, and schools or universities (Urdaneta-Ponte, Mendez-Zorrilla, & Oleagordia-Ruiz, 2021).

With respect to social-emotional learning and character education, representing learners’ paths as trajectories through embedding spaces could provide a new and powerful tool for understanding when best to use which pedagogical strategy.

With respect to library and information science, contextually recommended content, e.g., excerpts and quotations from literary works, could include hyperlinks to relevant books and materials.

With respect to knowledge management, organizations could index, search for, and retrieve content utilizing story-based contexts.

With respect to history, historians could contextually retrieve content, e.g., historical events and analogues, pertinent to contemporary societal-scale narratives.

Related Work

Quotation recommendation for dialogue systems and writing tasks is being researched (Ahn, Lee, Jeon, Ha, & Lee, 2016; MacLaughlin, Chen, Ayan, & Roth, 2021).

Improving recommender systems by incorporating social contextual information is being explored (Ma, Zhou, Lyu, & King, 2011) and so too are context-aware recommender systems for social networks (Suhaim & Berri, 2021).

Research is underway into advice-related interactions between individuals and artificial intelligence systems (Liao, Oh, Feng, & Zhang, 2023).


Ahn, Yeonchan, Hanbit Lee, Heesik Jeon, Seungdo Ha, and Sang-goo Lee. "Quote recommendation for dialogs and writings." In CBRecSys@RecSys, pp. 39-42. 2016.

Andrus, Berkeley R., Yeganeh Nasiri, Shilong Cui, Benjamin Cullen, and Nancy Fulda. "Enhanced story comprehension for large language models through dynamic document-based knowledge graphs." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 10, pp. 10436-10444. 2022.

Bouneffouf, Djallel, and Irina Rish. "A survey on practical applications of multi-armed and contextual bandits." arXiv preprint arXiv:1904.10040 (2019).

Chaturvedi, Snigdha, Haoruo Peng, and Dan Roth. "Story comprehension for predicting what happens next." In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1603-1614. 2017.

Doran, George T. "There's a SMART way to write management’s goals and objectives." Management review 70, no. 11 (1981): 35-36.

Drake, David B. "Narrative coaching." In The complete handbook of coaching, edited by Elaine Cox, Tatiana Bachkirova, and David Clutterbuck, pp. 120-131. SAGE. 2010.

Gordon, Andrew, Cosmin Bejan, and Kenji Sagae. "Commonsense causal reasoning using millions of personal stories." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 25, no. 1, pp. 1180-1185. 2011.

Graesser, Arthur C., Murray Singer, and Tom Trabasso. "Constructing inferences during narrative text comprehension." Psychological review 101, no. 3 (1994): 371.

Hoang, Van Thuy, Hyeon-Ju Jeon, Eun-Soon You, Yoewon Yoon, Sungyeop Jung, and O-Joun Lee. "Graph representation learning and its applications: A survey." Sensors 23, no. 8 (2023): 4168.

Kispal, Anne. Effective teaching of inference skills for reading: Literature review. National Foundation for Educational Research. The Mere, Upton Park, Slough, Berkshire, SL1 2DQ, UK. 2008.

Kravčík, Milos, Katharina Schmid, and Christoph Igel. "Towards requirements for intelligent mentoring systems." In Proceedings of the 23rd International Workshop on Personalization and Recommendation on the Web and Beyond, pp. 19-21. 2019.

Labatut, Vincent, and Xavier Bost. "Extraction and analysis of fictional character networks: A survey." ACM Computing Surveys (CSUR) 52, no. 5 (2019): 1-40.

László, János. The science of stories: An introduction to narrative psychology. Routledge, 2008.

Lee, O-Joun, and Jason J. Jung. "Story embedding: Learning distributed representations of stories based on character networks." Artificial Intelligence 281 (2020): 103235.

Liao, Wang, Yoo Jung Oh, Bo Feng, and Jingwen Zhang. "Understanding the influence discrepancy between human and artificial agent in advice interactions: The role of stereotypical perception of agency." Communication Research (2023): 00936502221138427.

Ma, Hao, Tom Chao Zhou, Michael R. Lyu, and Irwin King. "Improving recommender systems by incorporating social contextual information." ACM Transactions on Information Systems (TOIS) 29, no. 2 (2011): 1-23.

MacGeorge, Erina L., Bo Feng, and Lisa M. Guntzviller. "Advice: Expanding the communication paradigm." Communication yearbook 40 (2016): 239-270.

MacLaughlin, Ansel, Tao Chen, Burcu Karagol Ayan, and Dan Roth. "Context-based quotation recommendation." In Proceedings of the International AAAI Conference on Web and Social Media, vol. 15, pp. 397-408. 2021.

Mao, Yuanyuan, Shuang Liu, Pengshuai Zhao, Qin Ni, Xin Lin, and Liang He. "A review on machine theory of mind." arXiv preprint arXiv:2303.11594 (2023).

Mieder, Wolfgang, ed. Wise words: Essays on the proverb. Routledge, 2015.

Morrow, Daniel G., Gordon H. Bower, and Steven L. Greenspan. "Updating situation models during narrative comprehension." Journal of memory and language 28, no. 3 (1989): 292-312.

Palmer, Stephen. "PRACTICE: A model suitable for coaching, counselling, psychotherapy and stress management." The Coaching Psychologist 3, no. 2 (2007): 71-77.

Pan, Shimei, and Tao Ding. "Social media-based user embedding: A literature review." arXiv preprint arXiv:1907.00725 (2019).

Passmore, Jonathan, David Peterson, and Teresa Freire, eds. The wiley blackwell handbook of the psychology of coaching and mentoring. Nashville, TN: John Wiley & Sons. 2016.

Rebedea, Traian, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, and Jonathan Cohen. "Nemo guardrails: A toolkit for controllable and safe llm applications with programmable rails." arXiv preprint arXiv:2310.10501 (2023).

Richards, Eric, and Murray Singer. "Representation of complex goal structures in narrative comprehension." Discourse Processes 31, no. 2 (2001): 111-135.

Rizkallah, Sandra, Amir F. Atiya, and Samir Shaheen. "New vector-space embeddings for recommender systems." Applied Sciences 11, no. 14 (2021): 6477.

Savcisens, Germans, Tina Eliassi-Rad, Lars K. Hansen, Laust H. Mortensen, Lau Lilleholt, Anna Rogers, Ingo Zettler, and Sune Lehmann. "Using sequences of life-events to predict human lives." Nature Computational Science (2023): 1-14.

Suhaim, Areej Bin, and Jawad Berri. "Context-aware recommender systems for social networks: review, challenges and opportunities." IEEE Access 9 (2021): 57440-57463.

Terblanche, Nicky. "A design framework to create artificial intelligence coaches." International Journal of Evidence Based Coaching & Mentoring 18, no. 2 (2020).

Trabasso, Tom, and Jennifer Wiley. "Goal plans of action and inferences during comprehension of narratives." Discourse processes 39, no. 2-3 (2005): 129-164.

Urdaneta-Ponte, María Cora, Amaia Mendez-Zorrilla, and Ibon Oleagordia-Ruiz. "Recommendation systems for education: Systematic review." Electronics 10, no. 14 (2021): 1611.

Van-Horenbeke, Franz A., and Angelika Peer. "Activity, plan, and goal recognition: A review." Frontiers in Robotics and AI 8 (2021): 643010.

Wyer Jr, Robert S. Social comprehension and judgment: The role of situation models, narratives, and implicit theories. Psychology Press, 2003.

Yankah, Kwesi. "Do proverbs contradict?." In Wise words: Essays on the proverb, pp. 127-142. Routledge, 2015.

Zwaan, Rolf A., Joseph P. Magliano, and Arthur C. Graesser. "Dimensions of situation model construction in narrative comprehension." Journal of experimental psychology: Learning, memory, and cognition 21, no. 2 (1995): 386.

Zwaan, Rolf A., and Gabriel A. Radvansky. "Situation models in language comprehension and memory." Psychological bulletin 123, no. 2 (1998): 162.

Adaptive Instructional Systems, Interactive Storytelling and Character Education


In response to calls for more rigorous approaches for character education programs and their evaluation (Person, Moiduddin, Hague-Angus, & Malone, 2009; Sojourner, 2012), interactive stories are indicated as being useful as exercises for both instructing and assessing learners.

Adaptive instructional systems can administer these exercises to learners. To provide personalized and optimized instruction and assessment, these systems model learners. Modeling learners is particularly useful for the domains of character education and social and emotional learning. This article discusses the adaptive game-based psychometric assessment of intrapersonal values, interpersonal values, and civic virtues.

By introducing exercises to character education and social and emotional learning programs, these educational programs can be more precisely evaluated and continuously improved.

Interactive Storytelling

Types of interactive stories include: guided play, role-playing games, the case method, decision games, simulations, literature and literary discussions, story-based items, digital gamebooks, interactive films, and serious games.

These latter four types of interactive stories – story-based items, digital gamebooks, interactive films, and serious games – can be administered by computer and, accordingly, it is straightforward to record and to analyze learners’ decisions and responses. For these reasons, these latter four types are focused on herein.

Opportunities for learners to interact with interactive stories can be provided before, during, or after story content, with learners’ responses potentially shaping the stories as they unfold. The instructional value of interactive stories greatly surpasses that of ordinary stories as, after being presented with choices, learners can be presented with consequences of their decisions.

In addition to presenting learners with choices resembling “what ought character X do next?”, interactive stories could present after-the-fact questions resembling “did character X do the right thing?” and present follow-up questions resembling “why or why not?”.

Not every choice or question presented to learners need have one simply correct answer. Some choices or questions may have more than one correct answer and others may have none.

Adaptive Instructional Systems

Adaptive instructional systems are educational technologies which select and sequence exercises to provide learners with individualized and optimized instruction and assessment. Types of adaptive instructional systems include: intelligent tutoring systems, educational recommender systems, and intelligent media.

Horace Mann, a pioneer of public schooling and modern education, felt that “one of the most important concepts for teachers to understand and implement pertaining to character education is the correct use of instructional timing, as well as the proper implementation strategy, when considering moral development in students” (Watz, 2011). Adaptive instructional systems can optimize the instructional timing of interactive story exercises’ instructional strategies.

Interactive story exercises should be administered to their intended audiences so as to be appropriate to both age and stage of development. Across stages of human development, changes occur in terms of moral schemas (Narvaez, 2002), social cognition, theory of mind, and imagination. With respect to the development of imagination, Gajdamaschko (2006) indicates that, according to Vygotsky, the imagination undergoes developmental shifts which profoundly impact learners’ cultural, intellectual, personality, behavioral, and sensemaking capabilities.

Adaptive instructional systems could have amongst their selection criteria that selected exercises be thematically relevant to the topics under discussion in character education courses. To enhance the thematic variety of exercises, however, systems could intersperse exercises from both previous and forthcoming course topics.

Adaptive instructional systems could intersperse interactive story exercises intended for use across multiple character education and social and emotional learning programs, e.g., standardized assessment items.

Adaptive instructional systems could make use of groups of exercises, referred to also as “testlets” or “panels”. These are discussed in Wainer, Dorans, Eignor, Flaugher, Green, Mislevy, Steinberg, and Thissen (2000) and, taken as units of activity construction and analysis, can mitigate context effects, item ordering, and content balancing difficulties.

Modeling Character

Peterson and Seligman (2004) provide a universal catalog of character strengths and virtues. This catalog is inspired by a collection of many dozens of historical and contemporary inventories and it can be of use when designing character education and social and emotional learning programs.

Harvard’s EASEL Laboratory’s Taxonomy Project indicates that the field of social and emotional learning is structured around “a large number of organizational systems or frameworks that often use different or even conflicting terminology to talk about a similar set of skills.” The Taxonomy Project seeks to “create greater precision and transparency in the field of social and emotional learning and to facilitate more effective translation between research and practice”, providing “information and tools that summarize and connect the major frameworks.”

Interactive story exercises and classroom discussions can also support moral literacy. “If we want our children to possess the traits of character which we most admire, we need to teach them what those traits are. They must learn to identify the forms and content of those traits” (Bennett, 1988). The abilities of recognizing character traits in oneself and others in situational contexts – and the related vocabulary skills – are essential components of cultural and moral literacy. Interactive story exercises can instruct and assess with respect to moral literacy and related vocabulary skills by asking learners about the particular traits exhibited by characters in depicted scenarios.

Modeling Learners

Modeling learners is how adaptive instructional systems can best select and sequence interactive stories for individualized and optimized instruction and assessment. Adaptive instructional systems can use the decisions made and responses provided by learners as they play to model their personality traits, intrapersonal values, interpersonal values, and civic virtues.

Existing psychometric instruments for measuring moral development, moral judgment, and moral reasoning include: the moral judgment interview (MJI; Colby, Abrahami, & Kohlberg, 1987), the sociomoral reflection measure (SRM; Gibbs & Widaman, 1982), the defining issues test (DIT, DIT-2; Rest, 1979), and the intermediate concepts measures (ICM; Bebeau & Thoma, 1999). Most of these instruments utilize story-based items, describing complex situations or moral dilemmas and subsequently presenting questions about what story characters ought to do next.

Paradeda, Ferreira, Martinho, and Paiva (2017) describe the use of an interactive storytelling scenario to identify personality traits according to the Myers-Briggs Type Indicator theory. De Lima, Feijó, and Furtado (2018) describe a system which models traits according to the Big Five model. By extracting the decisions made and responses provided during the play of interactive stories, personality traits can be modeled and predicted.

Cutler and Montgomery (2014) describe adaptive psychometric inventories, techniques for including large batteries on surveys while minimizing the number of questions that respondents must answer.

According to Mislevy, Oranje, Bauer, von Davier, Hao, Corrigan, Hoffman, DiCerbo, and John (2014), a challenge today is to extend the accomplishments of psychometrics methodologies, e.g., statistical inference and probability-based reasoning, “from applications to relatively sparse and encapsulated data, for inferences cast in trait and behaviorist psychology, to the richer data made possible in interconnected digital environments, for inferences cast in contemporary sociocognitive psychologies, as encountered for example in game-based assessment.”

Interactive stories can be utilized for modeling and predicting personality traits, intrapersonal values, interpersonal values, and civic virtues. Techniques from computerized adaptive psychometric testing can be of use for providing learners with personalized and optimized sequences of these stories as exercises.
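
A common technique in computerized adaptive testing is to administer next whichever remaining item is most informative at the current estimate of the learner's latent trait. The sketch below assumes a two-parameter logistic (2PL) item response model, which is a standard choice but an assumption here, as no particular model is specified above; the item names and parameters are hypothetical. For an item with discrimination a and difficulty b, the Fisher information at trait level theta is a² · P · (1 − P), where P is the modeled response probability.

```python
import math


def probability(theta, a, b):
    """2PL probability of a keyed response at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = probability(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, items, administered):
    """Pick the not-yet-administered item with maximal information at theta.

    items maps item ids to (a, b) parameter pairs; returns None when exhausted.
    """
    candidates = [item_id for item_id in items if item_id not in administered]
    if not candidates:
        return None
    return max(candidates, key=lambda item_id: information(theta, *items[item_id]))


if __name__ == "__main__":
    story_items = {"easy": (1.0, -2.0), "medium": (1.0, 0.0), "hard": (1.0, 2.0)}
    print(next_item(0.0, story_items, administered=set()))
    print(next_item(1.5, story_items, administered=set()))
```

In a full system, theta would be re-estimated after each response, and content-balancing or exposure constraints would typically be layered on top of this selection rule.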

Open learner modeling is encouraged for character education and social and emotional learning programs. Open learner modeling provides learners with access to systems’ models and assessments of their performance and this often has a positive effect on their progress. Open learner modeling can promote reflection, encourage self-assessment, support planning and monitoring, and allow learners to take greater control and responsibility over their learning.

Modeling Exercises and Progressions

Partlan, Carstensdottir, Snodgrass, Kleinman, Smith, Harteveld, and El-Nasr (2018) detail 24 distinct metrics for interactive stories with these metrics organized into three categories: narrative structural complexity, action space, and interactive affordances.

Carstensdottir (2020) discusses the modeling of interactive stories and learners’ progressions through them. She describes the history of interactive storytelling as comprising two broad approaches, system-centric and player-centric. This article builds on player-centric (learner-centric) approaches to educational interactive storytelling. Beyond branching as a result of learners’ responses or decisions, interactive stories may branch as a result of: models of learners, learners’ mental states (e.g., affect, mood, attention, motivation, engagement, or flow), learners’ response times, settings and configurations, data, variables, program logic, or random numbers.

She indicates that both system-centric and player-centric historical approaches to interactive storytelling have made use of graph-based formalisms. One benefit of graph-based formalisms is that, as learners play interactive-story-based exercises, they traverse paths across their graph-based representations. These paths could be utilized for purposes of mathematical modeling and analysis, e.g., measuring the internal consistency, or statistical interrelatedness, between paths from traversals of different exercises in sets of exercises.
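
The following sketch illustrates both ideas under simplifying assumptions: an interactive story is represented as a dictionary-based graph, a learner's choices determine a path through it, and internal consistency across a set of exercises is estimated with Cronbach's alpha over per-exercise scores. How a path is converted into a score is left open here; the graph, the choice labels, and the score matrix are hypothetical.

```python
from statistics import pvariance


def traverse(graph, start, choices):
    """Follow a sequence of choice labels through a story graph.

    graph maps each node to a dict of {choice_label: next_node}.
    Returns the list of nodes visited, beginning with start.
    """
    path = [start]
    node = start
    for choice in choices:
        node = graph[node][choice]
        path.append(node)
    return path


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of per-exercise scores, one row per learner."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two exercises are required")
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    story = {
        "start": {"help": "thanked", "ignore": "alone"},
        "thanked": {},
        "alone": {},
    }
    print(traverse(story, "start", ["help"]))
    # Hypothetical scores: four learners, two exercises each.
    print(cronbach_alpha([[1, 1], [0, 0], [1, 1], [0, 0]]))
```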

Event-stream processing is another approach to game analytics. In this approach, as learners play games, they produce streams of typed events which can be processed in real-time and/or logged for subsequent analysis.
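
As a minimal sketch of this approach, the following defines a typed event record and a generator that processes a stream of such events to derive response times for choices. The event type names ("choice_presented", "choice_made") and the payload key "choice_id" are hypothetical; a shared vocabulary across vendors would define its own.

```python
from dataclasses import dataclass, field


@dataclass
class GameEvent:
    """A typed event emitted while a learner plays an interactive story."""
    learner_id: str
    event_type: str   # e.g., "choice_presented" or "choice_made" (hypothetical)
    timestamp: float  # seconds since the start of the session
    payload: dict = field(default_factory=dict)


def response_times(events):
    """Yield (learner_id, choice_id, seconds) for each presented/made pair."""
    pending = {}  # (learner_id, choice_id) -> time the choice was presented
    for event in events:
        key = (event.learner_id, event.payload.get("choice_id"))
        if event.event_type == "choice_presented":
            pending[key] = event.timestamp
        elif event.event_type == "choice_made" and key in pending:
            yield event.learner_id, key[1], event.timestamp - pending.pop(key)


if __name__ == "__main__":
    stream = [
        GameEvent("learner-1", "choice_presented", 10.0, {"choice_id": "c1"}),
        GameEvent("learner-1", "choice_made", 12.5, {"choice_id": "c1", "option": "help"}),
    ]
    for record in response_times(stream):
        print(record)
```

Because `response_times` is a generator, it can consume events as they arrive or replay a stored log without modification.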

Mislevy, Corrigan, Oranje, DiCerbo, Bauer, von Davier, and John (2016) detail approaches to game analytics and psychometrics based on evidence and argumentation, providing a number of measurement models with which to synthesize “nuggets of evidence” from across observations. They state that using a measurement model is one way to “accumulate information across multiple sources of evidence, expressed as belief about characteristics of players whether transitory or persistent,” and that so doing “provides tools to sort out evidence in complicated circumstances, quantify its properties, and flexibly assemble evidence-gathering and evidence-accumulating components.”
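
To make the idea of accumulating evidence as belief concrete, the sketch below uses a Beta-Bernoulli model, in which each observation either supports or does not support a characteristic and the belief is a Beta distribution over the underlying propensity. This is a deliberately simple stand-in and is not one of the measurement models detailed by the cited authors; the class name and the example observations are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta-distributed belief about a learner's propensity to show a trait."""
    alpha: float = 1.0  # prior pseudo-count of supporting observations
    beta: float = 1.0   # prior pseudo-count of non-supporting observations

    def update(self, observed):
        """Fold one binary piece of evidence into the belief."""
        if observed:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self):
        n = self.alpha + self.beta
        return self.alpha * self.beta / (n * n * (n + 1.0))


if __name__ == "__main__":
    belief = BetaBelief()
    # Hypothetical evidence: whether each choice reflected the trait.
    for observed in [True, True, False, True]:
        belief.update(observed)
    print(round(belief.mean, 3), round(belief.variance, 4))
```

The variance shrinks as observations accumulate, which gives a simple way to quantify how much evidence supports the current belief.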

It will be important to create and utilize a shared, common vocabulary for data logs so that educational data can be readily accumulated and processed from interactive story exercises produced by multiple vendors.

Modeling Choices

Mawhorter, Mateas, Wardrip-Fruin, and Jhala (2014) describe choice models as consisting of the framings of, options presented for, and anticipated and subsequent outcomes associated with choices. With respect to their theory of choice poetics, they outline three main avenues of investigation: mode of engagement, choice idioms, and dimensions of player experience.

Aspects of mode of engagement include player perspective, motivation, and the particulars of play practice. There could be individual differences with respect to mode of engagement, different types of play and different types of players. A partial list of types of play includes: avatar play, role play, power play, exploratory play, analytical play, and critical play.

Choice idioms are generic structures or patterns for entireties or parts of choices which generally achieve specific effects. A partial list of types of choices includes: dead-end option, false choice, blind choice, dilemma, flavor choice, delayed effect, puzzle choice, and unchoice.

A partial list of dimensions of player experience includes: agency, influence, autonomy, identification, transportation, absorption, responsibility, and regret.

Modeling Situations and Contexts

Situation models are representations, e.g., cognitive or computational representations, of states of affairs such as those relayed through stories. Situation models are relevant for modeling the framings of choices presented to learners in interactive stories. Learners’ situation models can be described as being their mental representations of the states of affairs in stories, e.g., as they encounter choices or questions.

In addition to being relevant to understanding reading and film comprehension, the perception and mental representations of situations and contexts are matters of considerable importance to the behavioral sciences. Cantor (1981), for instance, proposes that there may be conceptual prototypes for situations, as well as categories and taxonomies of situations, and that situations may have traits, features, or attributes. She indicates that understanding the perception and mental representations of situations is useful for understanding the processes of generalizing over observations of behaviors as occurring in situational contexts.

Wyer (2003) indicates that situation models are relevant to both social cognition and moral judgment.

Modeling Decision-making Processes

Sanfey and Chang (2008) indicate that “within judgment and decision making, many multiple-processing theories have been proposed, all of which posit different fundamental modes of processing that alternately cooperate and compete in reaching a decision.” They note that while “there are nuances specific to each theoretical conception, for the most part, these dual-process models are all structurally very similar.” These models each include both automatic and volitional processing operating in parallel. Van den Bos (2018) indicates that intuitive and deliberative processes may operate in parallel with respect to moral judgment.

For these reasons, educational data could reasonably include response times, i.e., mental chronometry data, from learners’ interactions with the choices presented by interactive story exercises.

Adaptive instructional systems could hypothesize about how competing values or principles were ranked or weighed by learners when making decisions.
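
One way such hypotheses could be compared is sketched below, under the assumption that each option is annotated with scores on named values and that learners choose according to a softmax (multinomial logit) model. The value names, the scores, and the candidate weightings are all hypothetical; this illustrates the idea and is not a validated method.

```python
import math
from typing import Dict, List, Tuple

ValueScores = Dict[str, float]
# A decision is the list of options shown plus the index of the chosen one.
Decision = Tuple[List[ValueScores], int]


def utility(weights: ValueScores, option: ValueScores) -> float:
    """Weighted sum of an option's value scores."""
    return sum(weights.get(name, 0.0) * score for name, score in option.items())


def log_likelihood(weights: ValueScores, decisions: List[Decision]) -> float:
    """Log-probability of the observed choices under a softmax choice model."""
    total = 0.0
    for options, chosen_index in decisions:
        utilities = [utility(weights, option) for option in options]
        log_norm = math.log(sum(math.exp(u) for u in utilities))
        total += utilities[chosen_index] - log_norm
    return total


def best_hypothesis(hypotheses: Dict[str, ValueScores],
                    decisions: List[Decision]) -> str:
    """Name of the candidate weighting that best explains the decisions."""
    return max(hypotheses, key=lambda name: log_likelihood(hypotheses[name], decisions))
```

For example, if a learner repeatedly chooses an option scored on honesty over one scored on loyalty, a weighting that ranks honesty above loyalty receives the higher log-likelihood.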

Modeling Learning

The modeling, measurement, analysis, and visualization of learners’ course progressions as pertaining to multiple, simultaneous, interrelated course objectives are topics of acute interest.

Models of learners and their course progressions formed by adaptive instructional systems could be of use to educators with respect to course grading. Other potential components for course grading include: the study of the history, theory, and philosophy of character and virtue; classroom discussions and participation; individual and group projects; essays; and overall effort and progress.

Arthur, Kristjánsson, Harrison, Sanderse, and Wright (2016) discuss how educators should measure virtue and evaluate character education programs. They indicate that program evaluators should desire evidence of improvement in virtue literacy, evidence of improvement in moral behavior, evidence of their interrelation, and evidence that educational programs were causal to these improvements.

Interactive story exercises and adaptive game-based psychometric assessment could be important components of a patchwork of methods with which to triangulate upon and measure these factors.

Some exercises could serve as standardized assessment items and be interspersed across different programs by adaptive instructional systems.

A/B testing could be performed between groups of learners who have participated in character education and social and emotional learning programs and groups who have not.
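
A comparison between two such groups could be summarized with a test statistic. The sketch below computes Welch's t statistic, which does not assume equal variances, using only the Python standard library; the choice of Welch's test and the function name are assumptions made for illustration, and the source does not prescribe a particular test.

```python
import math
from statistics import mean, variance
from typing import Sequence


def welch_t(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Welch's t statistic for the difference between two group means.

    Each group needs at least two scores so that a sample variance exists.
    """
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error
```

Here `group_a` and `group_b` would hold assessment scores for participating and non-participating learners, respectively.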

While exercises could help to strengthen moral knowledge, moral reasoning, moral sensitivity, moral judgment, and moral literacy, learners must themselves resolve to apply these habitually in their real-world behavior.

Modeling Educational Contexts and Climates

Interactive story exercises will seldom be administered in isolation from educational settings. Overarching educational policies, plans, strategies, and school cultures will often contribute to the complex educational contexts and climates in which learners encounter exercises.

Multilevel modeling can be of use for simultaneously modeling learners, groups, classes, schools, districts, states, and countries. Ma, Ma, and Bradley (2008) discuss how multilevel modeling can be of use for investigating school effects.

They use the term “context variables” to refer to the “hardware” of schools, variables such as physical backgrounds (e.g., school location and resources), student bodies (e.g., school socioeconomic and racial-ethnic compositions), and teacher bodies (e.g., teacher education and experience).

They use the term “climate variables”, also often referred to as “evaluative variables”, to refer to the “software” of schools, with characteristics descriptive of learning environments including administrative policies, instructional organization, school operation, and the attitudes, values, and expectations of learners, parents, teachers, and administrators.

They emphasize the importance of understanding the distinction between educational contexts and climates, noting that school-effects research tends to focus on educational climate variables as these are under the direct control of learners, parents, teachers, and administrators.
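
For concreteness, a two-level random-intercept model of the kind used in school-effects research can be written as follows. This is a generic textbook form and not a model taken from Ma, Ma, and Bradley; the symbols are assumptions, with y_ij an outcome for learner i in school j, x_ij a learner-level predictor, and context_j and climate_j standing in for school-level context and climate variables.

```latex
% Level 1: learner i within school j
y_{ij} = \beta_{0j} + \beta_{1} x_{ij} + e_{ij},
\qquad e_{ij} \sim \mathcal{N}(0, \sigma^{2})

% Level 2: school j
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{context}_{j}
           + \gamma_{02}\,\mathrm{climate}_{j} + u_{j},
\qquad u_{j} \sim \mathcal{N}(0, \tau^{2})

% Share of outcome variance lying between schools
% (intraclass correlation, for the model without predictors)
\mathrm{ICC} = \frac{\tau^{2}}{\tau^{2} + \sigma^{2}}
```

Additional levels, such as classes, districts, states, and countries, would add further equations of the same form.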

Computer-aided and Automatic Item Generation

Artificial intelligence technologies can be of use for creating, producing, and evaluating interactive stories, screenplays, storyboards, and production schedules.

Stefnisson and Thue (2018) argue that manually creating interactive stories is inherently difficult and that there is a need for advanced authoring tools. This difficulty is expected to be more pronounced when the task extends beyond creative writing to the design and engineering of evidence-based, efficacious educational interactive stories.

Related Work

The first automatic story generation system was Automatic Novel Writer (Klein, Aeschlimann, Balsiger, Converse, Court, Foster, Lao, Oakley, & Smith, 1973), followed by TALE-SPIN (Meehan, 1977), Author (Dehn, 1981), Universe (Lebowitz, 1983), Minstrel (Turner, 1993), Mexica (Pérez y Pérez, 1999), Brutus (Bringsjord & Ferrucci, 1999), and Fabulist (Riedl, 2004).

Interactive drama systems include: Oz (Bates, 1992), DEFACTO (Sgouros, 1997), the Virtual Theater Project (Hayes-Roth, van Gent, & Huber, 1997), I-Storytelling (Cavazza, Charles, & Mead, 2002), Façade (Mateas & Stern, 2003), IDtension (Szilas, 2003), Mimesis (Young, Riedl, Branly, Jhala, Martin, & Saretto, 2004), NOLIST (Bangsø, Jensen, Jensen, Andersen, & Kocka, 2004), OPIATE (Fairclough, 2004), the Interactive Drama Architecture (IDA; Magerko, 2005), FAtiMA (Aylett, Dias, & Paiva, 2006), IN-TALE (Riedl & Stern, 2006), U-Director (Mott & Lester, 2006), SASCE (Nelson, Roberts, Isbell, & Mateas, 2006), Bards (Pizzi, Charles, Lugrin, & Cavazza, 2007), PaSSAGE (Thue, Bulitko, Spetch, & Wasylishen, 2007), DED (Arinbjarnar & Kudenko, 2008), GADIN (Barber & Kudenko, 2009), and Erasmatron (Crawford, 2013).

Riedl and Young (2006) describe techniques for automatically generating branching interactive stories in addition to linear ones.
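
A branching story can be represented as a directed graph in which nodes are scenes and labeled edges are the options leading from one scene to the next. The sketch below is a generic illustration of that data structure and not the technique of Riedl and Young; the type alias, the function name, and the scene names in the usage note are hypothetical, and the graph is assumed to be acyclic.

```python
from typing import Dict, List

# Scene name -> {option label -> next scene name}. Scenes without an
# entry, or with an empty mapping, are treated as endings.
StoryGraph = Dict[str, Dict[str, str]]


def enumerate_paths(graph: StoryGraph, start: str) -> List[List[str]]:
    """List every scene sequence from start to an ending (acyclic graphs only)."""
    paths: List[List[str]] = []

    def walk(scene: str, path: List[str]) -> None:
        successors = graph.get(scene, {})
        if not successors:
            paths.append(path)  # reached an ending
            return
        for next_scene in successors.values():
            walk(next_scene, path + [next_scene])

    walk(start, [start])
    return paths
```

For a story whose opening scene offers two options, one leading to a single ending and the other branching again, `enumerate_paths` returns three distinct playthroughs.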

Barber and Kudenko (2007) describe a system which adaptively models users to generate interesting dilemma-based stories, noting that such stories require “fundamentally difficult decisions within the course of the story.” In 2009, they presented the GADIN system (Barber & Kudenko, 2009).

Artificial intelligence systems for ethics education include: PETE (Goldin, Ashley, & Pinkus, 2001), AEINS (Hodhod, Kudenko, & Cairns, 2009), Conundrum (McKenzie & McCalla, 2009), and Umka (Sharipova, 2015).


Conclusion

Interactive stories can be utilized as classroom and homework exercises for character education and social and emotional learning programs. Adaptive instructional systems can select and sequence these exercises for learners at scale, providing learners with individualized and optimized instruction and assessment.

Discussed herein were interactive storytelling, adaptive instructional systems, and the modeling of character, learners, exercises and progressions, choices, situations and contexts, decision-making processes, learning, and the educational contexts and climates in which these occur.


References

Arinbjarnar, Maria, and Daniel Kudenko. "Schemas in directed emergent drama." In Joint International Conference on Interactive Digital Storytelling, pp. 180-185. Springer, Berlin, Heidelberg, 2008.

Arthur, James, Kristján Kristjánsson, Tom Harrison, Wouter Sanderse, and Daniel Wright. Teaching character and virtue in schools. Routledge, 2016.

Aylett, Ruth, Joao Dias, and Ana Paiva. "An affectively driven planner for synthetic characters." In 16th International Conference on Automated Planning and Scheduling, pp. 2-10. 2006.

Bangsø, Olav, Ole G. Jensen, Finn V. Jensen, Peter B. Andersen, and Tomas Kocka. "Non-linear interactive storytelling using object-oriented Bayesian networks." In Proceedings of the international conference on computer games: Artificial intelligence, design and education. 2004.

Barber, Heather, and Daniel Kudenko. "A user model for the generation of dilemma-based interactive narratives." In Workshop on Optimizing Player Satisfaction at AIIDE, vol. 7. 2007.

Barber, Heather, and Daniel Kudenko. "Generation of adaptive dilemma-based interactive narratives." IEEE transactions on computational intelligence and AI in games 1, no. 4 (2009): 309-326.

Bates, Joseph. "Virtual reality, art, and entertainment." Presence: Teleoperators & Virtual Environments 1, no. 1 (1992): 133-138.

Bebeau, Muriel J., and Stephen J. Thoma. "'Intermediate' concepts and the connection to moral education." Educational Psychology Review 11, no. 4 (1999): 343-360.

Bennett, William J. "Moral literacy and the formation of character." NASSP Bulletin 72, no. 512 (1988): 29-34.

Bringsjord, Selmer, and David Ferrucci. Artificial intelligence and literary creativity: Inside the mind of brutus, a storytelling machine. Psychology Press, 1999.

Cantor, Nancy. "Perceptions of situations: Situation prototypes and person-situation prototypes." In Toward a psychology of situations: An interactional perspective, pp. 229-244. Psychology Press, 1981.

Carstensdottir, Elin. Automated Structural Analysis of Interactive Narratives. Northeastern University, 2020.

Cavazza, Marc, Fred Charles, and Steven J. Mead. "Character-based interactive storytelling." IEEE Intelligent systems 17, no. 4 (2002): 17-24.

Colby, Anne, Anat Abrahami, and Lawrence Kohlberg. The measurement of moral judgment: Theoretical foundations and research validation. Cambridge University Press, 1987.

Crawford, Chris. "Interactive storytelling." In The video game theory reader, pp. 259-273. Routledge, 2013.

Cutler, Josh, and Jacob M. Montgomery. "The efficient measurement of personality: Adaptive personality inventories for survey research." (2014).

de Lima, Edirlei Soares, Bruno Feijó, and Antonio L. Furtado. "Player behavior and personality modeling for interactive storytelling in games." Entertainment Computing 28 (2018): 32-48.

Dehn, Natalie. "Story generation after TALE-SPIN." In Proceedings of the 7th international joint conference on Artificial intelligence Volume 1, pp. 16-18. 1981.

Fairclough, Chris R. "Story games and the OPIATE system." (2004).

Gajdamaschko, Natalia. "Theoretical Concerns: Vygotsky on Imagination Development." Educational Perspectives 39, no. 2 (2006): 34-40.

Gibbs, John C., and Keith F. Widaman. "Social intelligence: Measuring the development of sociomoral reflection." (1982).

Goldin, Ilya M., Kevin D. Ashley, and Rosa L. Pinkus. "Introducing PETE: computer support for teaching ethics." In Proceedings of the 8th international conference on Artificial intelligence and law, pp. 94-98. 2001.

Gray, Kurt, and Jesse Graham, eds. Atlas of moral psychology. Guilford Publications, 2019.

Hayes-Roth, Barbara, Robert van Gent, and Daniel Huber. "Acting in character." Creating personalities for synthetic actors (1997): 92-112.

Hodhod, Rania, Daniel Kudenko, and Paul Cairns. "AEINS: adaptive educational interactive narrative system to teach ethics." In AIED 2009: 14th International Conference on Artificial Intelligence in Education Workshops Proceedings, vol. 79. 2009.

Klein, Sheldon, John F. Aeschlimann, David F. Balsiger, Steven L. Converse, Claudine Court, Mark Foster, Robin Lao, John D. Oakley, and Joel Smith. "Automatic novel writing: A status report." Text processing (1979): 338-411.

Laurel, Brenda. Computers as theatre. Addison-Wesley, 2013.

Lebowitz, Michael. "Creating a story-telling universe." In Proceedings of the Eighth international joint conference on Artificial intelligence Volume 1, pp. 63-65. 1983.

Ma, Xin, Lingling Ma, and Kelly D. Bradley. "Using multilevel modeling to investigate school effects." Multilevel modeling of educational data (2008): 59-110.

Magerko, Brian. "Story representation and interactive drama." In Proceedings of the First AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, pp. 87-92. 2005.

Mateas, Michael, and Andrew Stern. "Façade: An experiment in building a fully-realized interactive drama." In Game developers conference, vol. 2, pp. 4-8. 2003.

Mawhorter, Peter, Michael Mateas, Noah Wardrip-Fruin, and Arnav Jhala. "Towards a theory of choice poetics." (2014).

McKenzie, Adam, and Gord McCalla. "Serious games for professional ethics: An architecture to support personalization." In AIED 2009: 14th International Conference on Artificial Intelligence in Education Workshops Proceedings, p. 69. 2009.

Meehan, James R. "Using planning structures to generate stories." American Journal of Computational Linguistics (1975): 78-94.

Meehan, James R. The Metanovel: Writing Stories by Computer. Yale University, 1976.

Meehan, James R. "TALE-SPIN, an interactive program that writes stories." In Proceedings of the 5th international joint conference on Artificial intelligence Volume 1, pp. 91-98. 1977.

Mislevy, Robert J., Seth Corrigan, Andreas Oranje, Kristen DiCerbo, Malcolm I. Bauer, Alina von Davier, and Michael John. "Psychometrics and game-based assessment." Technology and testing: Improving educational and psychological measurement (2016): 23-48.

Mislevy, Robert J., Andreas Oranje, Malcolm I. Bauer, Alina von Davier, Jiangang Hao, Seth Corrigan, Erin Hoffman, Kristen DiCerbo, and Michael John. Psychometric considerations in game-based assessment. GlassLab Games, 2014.

Mott, Bradford W., and James C. Lester. "U-Director: a decision-theoretic narrative planning architecture for storytelling environments." In Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, pp. 977-984. 2006.

Murray, Janet H. "Hamlet on the Holodeck: The Future of Narrative in Cyberspace." (1998).

Narvaez, Darcia. "Does reading moral stories build character?" Educational Psychology Review 14, no. 2 (2002): 155-171.

Nelson, Mark J., David L. Roberts, Charles L. Isbell Jr, and Michael Mateas. "Reinforcement learning for declarative optimization-based drama management." In Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, pp. 775-782. 2006.

Paradeda, Raul, Maria José Ferreira, Carlos Martinho, and Ana Paiva. "Using interactive storytelling to identify personality traits." In International Conference on Interactive Digital Storytelling, pp. 181-192. Springer, Cham, 2017.

Partlan, Nathan, Elin Carstensdottir, Sam Snodgrass, Erica Kleinman, Gillian Smith, Casper Harteveld, and Magy Seif El-Nasr. "Exploratory automated analysis of structural features of interactive narrative." In Fourteenth Artificial Intelligence and Interactive Digital Entertainment Conference. 2018.

Pérez y Pérez, Rafael. "MEXICA: a computer model of creativity in writing." PhD diss., University of Sussex, 1999.

Person, Ann E., Emily Moiduddin, Megan Hague-Angus, and Lizabeth M. Malone. "Survey of Outcomes Measurement in Research on Character Education Programs. NCEE 2009-006." National Center for Education Evaluation and Regional Assistance (2009).

Peterson, Christopher, and Martin EP Seligman. Character strengths and virtues: A handbook and classification. Vol. 1. Oxford University Press, 2004.

Pizzi, David, Fred Charles, Jean-Luc Lugrin, and Marc Cavazza. "Interactive storytelling with literary feelings." In International Conference on Affective Computing and Intelligent Interaction, pp. 630-641. Springer, Berlin, Heidelberg, 2007.

Rest, James R. Development in judging moral issues. U of Minnesota Press, 1992.

Riedl, Mark O. "Narrative Generation: Balancing Plot and Character." PhD diss., North Carolina State University, Raleigh, NC, 2004.

Riedl, Mark O., and Andrew Stern. "Believable agents and intelligent story adaptation for interactive storytelling." In International Conference on Technologies for Interactive Digital Storytelling and Entertainment, pp. 1-12. Springer, Berlin, Heidelberg, 2006.

Riedl, Mark O., and R. Michael Young. "From linear story generation to branching story graphs." IEEE Computer Graphics and Applications 26, no. 3 (2006): 23-31.

Ryan, Marie-Laure. "Narrative and the split condition of digital textuality." The Aesthetics of Net Literature: Writing, Reading and Playing in Programmable Media (2007): 257-281.

Sanfey, Alan G., and Luke J. Chang. "Multiple systems in decision making." Annals of the New York Academy of Sciences 1128, no. 1 (2008): 53-62.

Sgouros, Nikitas M. "Dynamic, user-centered resolution in interactive stories." In Proceedings of the Fifteenth international joint conference on Artificial intelligence Volume 2, pp. 990-995. 1997.

Sharipova, Mayya. "Supporting students in the analysis of case studies for professional ethics education." PhD diss., University of Saskatchewan, 2015.

Sojourner, Russell J. "The rebirth and retooling of character education in America." Character Education Partnership 19 (2012).

Stefnisson, Ingibergur, and David Thue. "Mimisbrunnur: AI-assisted authoring for interactive storytelling." In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital entertainment, vol. 14, no. 1, pp. 236-242. 2018.

Szilas, Nicolas. "IDtension: A narrative engine for interactive drama." In 1st International Conference on Technologies for Interactive Digital Storytelling and Entertainment, pp. 24-26. 2003.

Szilas, Nicolas. "A computational model of an intelligent narrator for interactive narratives." Applied Artificial Intelligence 21, no. 8 (2007): 753-801.

Thue, David, Vadim Bulitko, Marcia Spetch, and Eric Wasylishen. "Interactive storytelling: A player modelling approach." In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 3, no. 1, pp. 43-48. 2007.

Turner, Scott R. Minstrel: a computer model of creativity and storytelling. University of California, Los Angeles, 1993.

van den Bos, Kees. "On the possibility of intuitive and deliberative processes working in parallel in moral judgment." (2018): 31-39.

Wainer, Howard, Neil J. Dorans, Daniel Eignor, Ronald Flaugher, Bert F. Green, Robert J. Mislevy, Lynne Steinberg, and David Thissen. Computerized adaptive testing: A primer. Routledge, 2000.

Watz, Michael. "An historical analysis of character education." Journal of Inquiry and Action in Education 4, no. 2 (2011): 3.

Wyer Jr, Robert S. Social comprehension and judgment: The role of situation models, narratives, and implicit theories. Psychology Press, 2003.

Young, R. Michael, Mark O. Riedl, Mark Branly, Arnav Jhala, R. J. Martin, and C. J. Saretto. "An architecture for integrating plan-based behavior generation with interactive game environments." Journal of Game Development 1, no. 1 (2004): 1-29.