Phoster

Research and Development

Training and Testing Artificial Intelligence Systems in Simulations and Game Environments

Introduction

This document surveys a range of topics relevant to the training and testing of artificial intelligence (AI) systems in simulations and game environments.

Comparative Cognition

The study of animal behavior is multidisciplinary and draws on ethology, behavioral ecology, evolutionary psychology, comparative psychology and, more recently, comparative cognition. Comparative cognition is the study of cognitive processes across all species of animals, including humans.

Psychometrics

Psychometrics is a scientific field of study concerned with the objective measurement of skills, knowledge, abilities, mental capacities, mental processes, attitudes, personality traits and educational achievement. Topics in psychometrics include the assessment of cognitive development and intelligence. Psychometric measurements can be made of non-human animals, humans and AI systems.

Collective Intelligence

Collective intelligence is a shared or group intelligence that emerges from collaboration, collective efforts and competition. Collective intelligence can emerge from the interactions of multiple AI systems in simulations and game environments.

Simulations and Game Environments

The field of artificial intelligence makes use of simulations and game environments to train and to test AI systems. Examples of such simulations and game environments include the Arcade Learning Environment, OpenAI Gym, the Behaviour Suite for Reinforcement Learning, Obstacle Tower, and the Animal-AI Environment.
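The interaction between an AI system and such an environment is commonly organized as a loop of observations, actions and rewards. The following is a minimal sketch of that loop using a hypothetical toy environment, CorridorEnv, invented here for illustration; its reset and step methods loosely resemble, but are not, the interface of any of the environments named above.

```python
class CorridorEnv:
    """Hypothetical toy environment: an agent walks a corridor toward a goal cell."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right); return (observation, reward, done)."""
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        self.steps += 1
        reached_goal = self.position == self.length - 1
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done


def run_episode(env, policy):
    """Run one episode with the given policy; return (total reward, steps taken)."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        observation, reward, done = env.step(policy(observation))
        total_reward += reward
    return total_reward, env.steps
```

Here a policy is any function from an observation to an action, so the same loop serves both training, where the policy is being updated between episodes, and testing, where it is held fixed.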

Machine Teaching

Machine teaching is the control of machine learning. Machine learning algorithms define dynamical systems where states, or models, are driven by training data. Machine teaching designs optimal training data with which to drive learning algorithms to target models.
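As an illustration, consider a learner which fits a one-dimensional threshold classifier by placing the threshold midway between the largest negative example and the smallest positive example. A teacher which knows the target threshold needs only two well-chosen examples, whereas randomly drawn data may require many. The sketch below assumes this particular learner; the function names and the epsilon parameter are hypothetical.

```python
import random


def learn_threshold(examples):
    """Learner: threshold midway between largest negative and smallest positive example."""
    negatives = [x for x, label in examples if label == 0]
    positives = [x for x, label in examples if label == 1]
    return (max(negatives) + min(positives)) / 2


def teach(target, epsilon=1e-3):
    """Teacher: two examples placed symmetrically around the target threshold."""
    return [(target - epsilon, 0), (target + epsilon, 1)]


def sample_randomly(target, n, rng):
    """Baseline: n uniformly drawn examples on [0, 1), labeled by the target threshold."""
    examples = [(0.0, 0), (1.0, 1)]  # ensure both labels are present
    for _ in range(n):
        x = rng.random()
        examples.append((x, 1 if x >= target else 0))
    return examples


if __name__ == "__main__":
    target = 0.37
    taught = learn_threshold(teach(target))
    sampled = learn_threshold(sample_randomly(target, 10, random.Random(0)))
    print("error with 2 taught examples:  ", abs(taught - target))
    print("error with 12 sampled examples:", abs(sampled - target))
```

The teacher's training set drives this learner to the target model up to floating-point error, which is the sense in which machine teaching designs training data rather than collecting it.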

Intelligent Tutoring Systems and Interactive Narrative

AI systems can be envisioned which accelerate and optimize the training and testing of other AI systems.

Intelligent tutoring systems are AI systems which provide personalized instruction to learners. Traditionally, the learners are human students. The techniques of intelligent tutoring systems, however, generalize to the training and testing of AI systems.

Computerized adaptive testing is a form of examination that adapts to the exhibited capabilities of examinees. The items administered to an examinee depend upon the nature or correctness of that examinee's previous responses.
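The adaptive loop can be sketched with a simple staircase heuristic: administer the unused item whose difficulty is closest to the current ability estimate, then raise or lower the estimate by a shrinking step. This is a simplification for illustration; operational adaptive tests typically select items and update estimates using item response theory. The function adaptive_test and its parameters are hypothetical.

```python
def adaptive_test(item_difficulties, answer_fn, num_items=5, start=0.0, step=1.0):
    """Administer items adaptively; return (ability estimate, administered items).

    answer_fn(difficulty) returns True if the examinee answers the item correctly.
    """
    ability = start
    remaining = list(item_difficulties)
    administered = []
    for _ in range(min(num_items, len(remaining))):
        # Select the unused item best matched to the current estimate.
        item = min(remaining, key=lambda difficulty: abs(difficulty - ability))
        remaining.remove(item)
        correct = answer_fn(item)
        # Move the estimate toward harder or easier items, by a halving step.
        ability += step if correct else -step
        step /= 2
        administered.append((item, correct))
    return ability, administered


if __name__ == "__main__":
    bank = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
    # A deterministic stand-in examinee which solves any item of difficulty <= 1.2.
    estimate, history = adaptive_test(bank, lambda difficulty: difficulty <= 1.2)
    print(estimate, history)
```

The examinee here could equally be an AI system, with answer_fn running the system on a task of the given difficulty and reporting success or failure.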

Interactive storytelling is a form of digital entertainment in which storylines are not predetermined. While authors may create any settings, characters and situations which a narrative must address, readers or players experience unique stories based upon their interactions with storyworlds.

Intelligent narrative technologies can be envisioned which generate dynamic narratives in simulations and game environments for the training and testing of AI systems. Such narratives would unfold based upon the behaviors exhibited by and the decisions made by AI systems.

Automatic Item Generation and Procedural Content Generation

Automatic item generation uses computer algorithms to produce items, the basic building blocks of exams, tests, questionnaires and other instruments of psychometric measurement.

Procedural content generation uses computer algorithms to produce elements of simulations and game environments. Procedurally-generated content could be puzzles, tasks, tests or other varied content useful for the training or testing of AI systems.
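A minimal sketch of procedural content generation follows: a seeded generator produces grid navigation tasks with random obstacles and discards any task with no path from the start cell to the goal cell. The grid representation, the function names and the default parameters are assumptions made for this example.

```python
import random
from collections import deque


def generate_grid(width, height, obstacle_rate, rng):
    """Return a grid of booleans (True = obstacle) with start and goal cells kept free."""
    grid = [[rng.random() < obstacle_rate for _ in range(width)] for _ in range(height)]
    grid[0][0] = False
    grid[height - 1][width - 1] = False
    return grid


def is_solvable(grid):
    """Breadth-first search from the top-left cell to the bottom-right cell."""
    height, width = len(grid), len(grid[0])
    queue = deque([(0, 0)])
    seen = {(0, 0)}
    while queue:
        row, col = queue.popleft()
        if (row, col) == (height - 1, width - 1):
            return True
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + d_row, col + d_col
            if 0 <= r < height and 0 <= c < width and not grid[r][c] and (r, c) not in seen:
                seen.add((r, c))
                queue.append((r, c))
    return False


def generate_task(seed, width=6, height=6, obstacle_rate=0.3, max_attempts=1000):
    """Generate a solvable grid task; the same seed always yields the same task."""
    rng = random.Random(seed)
    for _ in range(max_attempts):
        grid = generate_grid(width, height, obstacle_rate, rng)
        if is_solvable(grid):
            return grid
    raise RuntimeError("no solvable grid found; try a lower obstacle_rate")
```

Seeding makes each generated task reproducible, so that a task which proves informative during testing can be regenerated exactly, while obstacle_rate offers a rough handle on difficulty.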

Item Response Theory and Content Evaluation

Item response theory is a paradigm for the design, scoring, analysis and evaluation of items, exams, tests, questionnaires and other instruments of psychometric measurement.
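One widely used item response model is the two-parameter logistic (2PL) model, in which the probability of a correct response depends upon the examinee's ability and upon the item's discrimination and difficulty. The sketch below computes that probability and the corresponding item information; the function and parameter names are chosen here for illustration.

```python
import math


def probability_correct(ability, discrimination, difficulty):
    """2PL model: P(correct) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


def item_information(ability, discrimination, difficulty):
    """Fisher information of a 2PL item at a given ability: a^2 * P * (1 - P)."""
    p = probability_correct(ability, discrimination, difficulty)
    return discrimination ** 2 * p * (1.0 - p)


if __name__ == "__main__":
    # An item is most informative for examinees whose ability is near its difficulty.
    for ability in (-2.0, 0.0, 2.0):
        print(ability,
              round(probability_correct(ability, 1.0, 0.0), 3),
              round(item_information(ability, 1.0, 0.0), 3))
```

Item information peaks where ability equals difficulty, which is why well-matched items are the most useful for measuring a given examinee, whether a human or an AI system.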

Content evaluation is the analysis and evaluation of procedurally-generated content and narrative elements, for instance in terms of their utility with respect to the training and testing of AI systems.

Inspecting and Modeling Artificial Intelligence Systems

Can the internals of AI systems be inspected and monitored during training and testing or are such systems effectively black boxes?

If one can inspect and monitor the internals of AI systems, then metrics specific to their underlying algorithms could be obtained. For a deep reinforcement learning system, for example, such metrics might include value estimates or training loss.

If, instead, AI systems are effectively black boxes, then such systems might be modeled as one models players or students. In this regard, models of AI systems undergoing training or testing would update based upon observations of their behaviors or decisions in simulations or game environments.
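One simple way to model a black-box system from its observed behavior is to maintain, for each type of task, a Beta-Bernoulli estimate of the system's probability of success and to update it after every observed outcome. This is a sketch of one possible approach, not a specific method from the player-modeling or student-modeling literature; the class BlackBoxSkillModel and the task type labels are hypothetical.

```python
from collections import defaultdict


class BlackBoxSkillModel:
    """Per-task-type success estimates updated only from observed outcomes."""

    def __init__(self, prior_successes=1.0, prior_failures=1.0):
        # Beta(1, 1) is a uniform prior over the unknown success probability.
        self.counts = defaultdict(lambda: [prior_successes, prior_failures])

    def observe(self, task_type, succeeded):
        """Update the model with one observed outcome for a task type."""
        self.counts[task_type][0 if succeeded else 1] += 1.0

    def estimate(self, task_type):
        """Posterior mean probability of success on a task type."""
        successes, failures = self.counts[task_type]
        return successes / (successes + failures)


if __name__ == "__main__":
    model = BlackBoxSkillModel()
    for outcome in (True, True, False, True):
        model.observe("navigation", outcome)
    print(model.estimate("navigation"))  # posterior mean after 3 successes, 1 failure
    print(model.estimate("memory"))      # no observations yet, so the prior mean
```

Such a model requires no access to the system's internals, and its estimates could in turn inform which tasks or items to present next.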

Event Processing

Event processing is the analysis of streams of events and the deriving of conclusions from them. This includes the processing of events which occur in simulations and game environments.

Psychometric measurements and other important metrics can be obtained from processing those event streams which originate in simulations and game environments during the training and testing of AI systems.
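As a sketch of how metrics might be derived from such a stream, the following processes events in a single pass and reports a success rate and a mean time to goal. The event schema (dictionaries with episode, type and time fields) and the event type names are assumptions made for this example.

```python
def summarize(events):
    """Compute success rate and mean time-to-goal from an iterable of events."""
    start_times = {}
    durations = []
    outcomes = []
    for event in events:
        episode = event["episode"]
        if event["type"] == "episode_start":
            start_times[episode] = event["time"]
        elif event["type"] in ("goal_reached", "timeout"):
            succeeded = event["type"] == "goal_reached"
            outcomes.append(succeeded)
            if succeeded and episode in start_times:
                durations.append(event["time"] - start_times[episode])
    return {
        "episodes": len(outcomes),
        "success_rate": sum(outcomes) / len(outcomes) if outcomes else None,
        "mean_time_to_goal": sum(durations) / len(durations) if durations else None,
    }


if __name__ == "__main__":
    stream = [
        {"episode": 1, "type": "episode_start", "time": 0.0},
        {"episode": 1, "type": "goal_reached", "time": 4.0},
        {"episode": 2, "type": "episode_start", "time": 5.0},
        {"episode": 2, "type": "timeout", "time": 25.0},
        {"episode": 3, "type": "episode_start", "time": 26.0},
        {"episode": 3, "type": "goal_reached", "time": 32.0},
    ]
    print(summarize(stream))
```

Because the function consumes any iterable, the same logic applies to a recorded log and to a live generator of events emitted during training or testing.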

Conclusion

This document surveyed a range of topics relevant to the training and testing of artificial intelligence systems in simulations and game environments.