Finalists in the 2004 Competition: December 14th 2004

Five finalists were selected in 2004. The winner was CogVis Demo: Integrating Perceptual Learning with Task Induction. A report is available in the form of the Electrolux Press Release. More details of the entrants can be found below.

Press: New Scientist Article: Machine learns games 'like a human'


CogVis Demo: Integrating Perceptual Learning with Task Induction

School of Computing, The University of Leeds, Dr Chris Needham and Dr Derek Magee ({chrisn,drm}@comp.leeds.ac.uk)

A cognitive vision system for autonomous learning from observation is presented. The system has been trained on a sequence of two players playing a card game. An attention mechanism is used to segment the continuous audio-visual input streams, from which features are extracted and clustered in an unsupervised manner. From this, a symbolic data stream is formed, representing the perceived state of the world (what's on the table) and the actions (the speech of the participants). Inductive Logic Programming is used to generalise a set of hypotheses from the examples, which in turn are used by an inference engine to drive a talking head that can interact with the game. It is this synthetic agent, with its learned perceptual classes and protocol rules, that will be demonstrated on live video input.
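
As a rough illustration of the perceptual-learning stage, the sketch below (in Python) clusters continuous feature vectors without supervision and emits one symbol per observation, which is the form of symbolic stream the rule-induction stage then generalises over. The feature dimensions, cluster count and naming scheme are assumptions made for the example, not details of the CogVis system.

    # Minimal sketch: unsupervised clustering of perceptual features into
    # symbolic classes. The synthetic 4-D features stand in for card
    # appearance descriptors (an assumption for illustration only).
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 4))   # one feature vector per observation

    # The number of perceptual classes is a free choice here; the real
    # system discovers its classes without supervision.
    clusterer = KMeans(n_clusters=4, n_init=10, random_state=0).fit(features)

    # The symbolic data stream: one symbol per observed event, ready to be
    # generalised over by Inductive Logic Programming.
    symbol_stream = [f"class_{label}" for label in clusterer.labels_]
    print(symbol_stream[:10])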


The New Model for Air Traffic Control

Private Entry, David Parkinson (mi.parkinson@sensus-dp.demon.co.uk)

Throughout the world, Air Traffic Control systems still rely on human beings to observe aircraft on radar and instruct the pilots by radio. The New Model proposes that the essential calculations for this task can now be performed by an AI computer making value judgements according to the same set of rules as the human controller.

The New Model Demonstrator simulates traffic in a part of the London Terminal Area airspace and controls this traffic by observing its progress and formulating the appropriate instructions. The instructions are voiced using existing Text-to-Speech software.

The system adapts continuously to the changing situation and can be used in a manual advisory mode if required. Safety is assured because the quality of the advice can be easily verified, even though the underlying processing is complex.
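
As an illustration of the kind of rule-based judgement involved, the toy sketch below checks simulated aircraft pairs against generic separation minima and formulates an instruction that a text-to-speech engine could voice. The 3 NM / 1000 ft figures, data fields and phraseology are assumptions for the example, not details of the New Model.

    # Toy separation check in the spirit of the rule-based control loop
    # described above; all minima and fields are generic assumptions.
    from dataclasses import dataclass
    import math

    @dataclass
    class Aircraft:
        callsign: str
        x_nm: float        # position, nautical miles
        y_nm: float
        altitude_ft: float

    def advisories(traffic, lateral_min_nm=3.0, vertical_min_ft=1000.0):
        """Yield a textual instruction for each pair violating the minima."""
        for i, a in enumerate(traffic):
            for b in traffic[i + 1:]:
                lateral = math.hypot(a.x_nm - b.x_nm, a.y_nm - b.y_nm)
                vertical = abs(a.altitude_ft - b.altitude_ft)
                if lateral < lateral_min_nm and vertical < vertical_min_ft:
                    yield (f"{a.callsign}, climb flight level "
                           f"{int(a.altitude_ft / 100) + 10}, "
                           f"traffic {b.callsign}.")

    traffic = [Aircraft("BAW123", 0.0, 0.0, 7000),
               Aircraft("EZY456", 2.0, 1.0, 7300)]
    for instruction in advisories(traffic):
        print(instruction)   # in the demonstrator this text would be voiced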


Interactive Video Based Agent

University of Exeter, José Lopes (J.E.Lopes@exeter.ac.uk)

The proposed agent is a multi-modal system which interacts with human users in a natural, sociable and affective way. It involves the creation of an extensible modular framework that supports the most basic input modes, extracting fundamental perceptual cues (such as locating the user and their focus of attention), and produces output feedback in the form of an animated graphical 3D character model with synchronized speech.

In its most basic form, the agent recognizes the presence of the user and reacts to events in the background. Additional modular components extend the agent's capability and responsiveness, such as bMotion (a script-based dialog engine) and a speech recognition/synthesis engine.
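
As a rough sketch of such a framework, the code below wires hypothetical input modules (perceptual cues in) to output modules (character animation out). The module names and the cue dictionary are illustrative assumptions, not components of the Exeter system.

    # Minimal sketch of an extensible modular agent framework.
    from typing import Protocol

    class InputModule(Protocol):
        def perceive(self) -> dict: ...

    class OutputModule(Protocol):
        def render(self, cues: dict) -> None: ...

    class FaceLocator:
        def perceive(self) -> dict:
            # A real module would run face detection on a camera frame.
            return {"user_present": True, "gaze": "towards_agent"}

    class TalkingHead:
        def render(self, cues: dict) -> None:
            if cues.get("user_present"):
                print("Animating greeting with synchronized speech")

    class Agent:
        def __init__(self, inputs, outputs):
            self.inputs, self.outputs = inputs, outputs

        def step(self):
            cues = {}
            for module in self.inputs:      # gather perceptual cues
                cues.update(module.perceive())
            for module in self.outputs:     # drive the character model
                module.render(cues)

    Agent([FaceLocator()], [TalkingHead()]).step()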


SOPHIE Conversational Agent

Human Computer Learning Foundation (HCLF), Donald Michie (profdmichie@aol.com),
Richard Wheeler (richard.wheeler@ed.ac.uk), Dave Mason (dave_mason@totalise.co.uk)

SOPHIE is a conversational agent able to converse freely in natural language. The agent's utterances are delivered by text-to-speech, concurrently with text displayed under a colour photo. The human user types his or her remarks and questions at the keyboard, hitting the return key to signal the end of each input utterance. Volunteers will be invited from the audience to conduct conversations projected on-screen in real time. SOPHIE addresses a task with analogies to the machine simulation of Master chess, but one that is more far-reaching. Not only is human chat relatively domain-independent; it also demands from both players a show of understanding of the exchange. Chess machines are not asked to show understanding even of their limited domain, e.g. by summarising or explaining their play.
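
For illustration, the loop below mimics that interaction protocol: typed input terminated by the return key, with each reply shown as text and spoken concurrently. The respond() function is a hypothetical placeholder for SOPHIE's dialogue engine, and the pyttsx3 library is assumed here only as a convenient text-to-speech stand-in.

    # Sketch of the typed-input, spoken-output interaction loop.
    import pyttsx3

    def respond(utterance: str) -> str:
        # Placeholder: the real agent converses freely in natural language.
        return f"You said: {utterance}"

    engine = pyttsx3.init()
    while True:
        utterance = input("> ")    # return key ends each input utterance
        if not utterance:
            break
        reply = respond(utterance)
        print(reply)               # text displayed under the photo
        engine.say(reply)          # concurrent text-to-speech delivery
        engine.runAndWait()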

From a Machine Intelligence standpoint two points stand out. (1) We humans arguably spend more hours of the waking day in conversation than in any other of the varied pursuits to which we turn our minds; as a result, we can all reasonably be said to be experienced Master chatters. (2) As in 'speed chess', the chatter's mental decision taking is almost wholly subliminal. But in contrast to chess, the goals of chat are co-operative: namely, to exchange personal information and to enhance mutual rapport. Machine simulation of this skill offers an opportunity to study an aspect of 'machine understanding' in the sometimes neglected context of social intelligence.


Yhaken: A Conversational Interface

Elzware Limited, Bristol, UK. Phil Hall (phil.hall@elzware.com)

Yhaken is a Chatterbot development that engages with users on the basis of natural language in all its vagaries. It processes text or voice input at two discrete levels: first, the underlying parsing, processing and validation against requirements of accuracy; second, an overarching management of information through the application of various discourse methods. These methods allow for an understanding of context, user state and goal orientation, amongst other developmental objectives.
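
A minimal sketch of such a two-level arrangement appears below, assuming invented function names and trivially simple rules; Yhaken's actual parsing, validation and discourse methods are of course far richer.

    # Level 1: normalise and validate the raw input.
    def parse_and_validate(raw: str) -> str:
        text = raw.strip().lower()
        if not text:
            raise ValueError("empty utterance")
        return text

    # Level 2: manage the discourse, tracking context across turns.
    class DiscourseManager:
        def __init__(self):
            self.context: list[str] = []

        def respond(self, utterance: str) -> str:
            self.context.append(utterance)
            if "hello" in utterance and len(self.context) == 1:
                return "Hello! What would you like to talk about?"
            return f"Tell me more about '{utterance}'."

    manager = DiscourseManager()
    for turn in ["Hello", "the weather"]:
        print(manager.respond(parse_and_validate(turn)))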