WINOGRAD'S SHRDLU

Reference Cognitive Psychology 3 No 1 1972 Academic Press 1972

Winograd's program for understanding natural language.

INTRODUCTION

It is not a simulation but it uses important ideas about human syntactic semantic and problem solving activities and about their interactions in understanding natural language discourse.

The system answers questions executes commands and accepts information in an interactive English dialogue. Understanding of English requires an integrated study of syntax semantics and inference. Winograd felt that the best way to experiment with complex models of language was to write a program which can actually understand language within some domain. In this case with a robot which has a hand and eye and the ability to manipulate toy blocks.

The program attempts

1 to be a useable language understanding system

2 to gain a better understanding of what language is and how to put it together

3 to understand what intelligence is and how it can be put into computers.

A comparatively recent commendation of SHRDLU.

SHRDLU is head and shoulders above contempoary systems when it comes to intelligent conversation. Although its domain of discourse is restricted to a tabletop world of coloured objects SHRDLU really understands this world in terms of the relation between semantics and the physical properties of the blocks and the tabletop. It consists of subsystems that parse interpret and construct sentences carry out dictionary searches and semantic analyses and makes logical deductions. Conversational systems such as SHRDLU undoubtedly herald the future the advantages of a computer that is able to discuss problems intelligently with humans rather than passively accepting programs to solve the problems are too obvious to miss. Computer scientists in artificial intelligence work in part to this goal.

The system uses Halliday's systemic grammar which emphasises the limited and highly structured choices made in producing syntactic structure abstracting the features that are important for conveying meaning. The parser is special and interprets the recognition grammars. Meaning is covered by the development of a formalism for concepts within a language user's model of the world representing objects events and relationships. Semantics is represented by a system which is developed to work in conjunction with the parser a dictionary and the problem solving programs. It considers not only meaning but context.

1 MONITOR is a small LISP program which calls the other parts of the system.

2 INPUT is a LISP program which accepts typed English input looks up words in the dictionary performs morphemic analysisand returns a string of words with their definitions. This is the string upon which the grammar works.

3 The GRAMMAR is the main coordinator of the understanding process it is written in PROGRAMMAR. The routines are then translated into LISP.

4. SEMANTICS is a collection of LISP programs which work witrh GRAMMAR to interpret sentences. Each basic unit of grammar has many programs asociated with it.

5 ANSWER is a collection of LISP programs that control the responses of the system and it also remembers conversation for future discussion. It uses heuristics to produce the best English in its answers.

6 PROGRAMMAR is a parsing system which interprets the grammars written in terms of programs. It is written in LISP and builds a parse tree and has mechanisms for manipulating and exploring the tree.

7 The dictionary consists of two parts the first part holds the syntactic features of each word. The semantic features of each word are held in the second part and are interpreted by the SEMANTICS program. There are facilities for irregular forms and the definitions are held in a property list.

8 SEMANTIC features are also held in a property list and used in semantic analysis.

9 BLOCKS is a collection of PLANNER theorems which handle the dynamics of the building blocks on the tabletop.

10 MOVER is a set of display routines which simulate robot's world on the DEC340 graphic display attached to the PDP-10

11 PLANNER is the deductive system used to direct the parsing process and to deduce facts in the BLOCKS world. STRUCTURE OF THE SYSTEM

Click here for Picture

SAMPLE DIALOGUE

start at figure 3

1 pick up a red block

OK

in order to pick up the red block it had to find a space to place the gren one on top.

2 grasp the pyramid

I DON'T UNDERSTAND WHICH PYRAMID YOU MEAN

there are three pyramids so nothing is done

3 find a block which is taller than the one you are holding and put it in the box

BY "IT" I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING OK

the system interprets one and it and puts down the red block and picks up the taller blue block reach figure 4

4 what does the box contain?

THE BLUE PYRAMID AND THE BLUE BLOCK

5 what is the pyramid supported by?

THE BOX

earlier the system did not know which pyramid now it assumes the one in the box

6 how many blocks are not in the box?

FOUR OF THEM

7 is at least one of them narrower than the one which I told you to pick up?

YES THE RED CUBE

the phrases "at least one of " and " the one which " refer back to previous questions and the form of noun groups ie the syntactic structure helps here.

8 is it supported?

YES BY THE TABLE

passive

9 can the table pick up blocks?

NO

the semantic rules make this silly thus false

10 can a pyramid be supported by a block?

YES

the deductive system can find an example of this occurring

11 can a pyramid support a pyramid?

I DON'T KNOW

the deductive system cannot find an example

12 stack up two pyramids

I CAN'T

the system tries and fails

13 the blue pyramid is mine

I UNDERSTAND

14 I own blocks which are not red but I don't own anything which supports a pyramid

I UNDERSTAND

15 do I own the box?

NO

the box supports a pyramid

16 do I own anything in the box?

YES TWO THINGS THE BLUE PYRAMID AND THE BLUE BLOCK

17 will you please stack up both of the red blocks and either a green cube or a pyramid?

OK

it places the green block on the big red block removes the green pyramid and then puts the red block on the top see figure 5

18 which cube is sitting on the table?

THE LARGE GREEN ONE WHICH SUPPORTS THE RED PYRAMID

this helps to distinguish the correct cube

19 is there a large block behind a pyramid?

YES THREE OF THEM A LARGE RED ONE A LARGE GREEN CUBE AND THE BLUE ONE

20 put a small one onto the green cube which supports a pyramid

OK

a small one means a small block

21 put the littlest pyramid on top of it

OK

littlest must be interpreted from its root

22 how many things are on top of green cubes

I'M NOT SURE WHAT YOU MEAN BY " ON TOP OF" IN THE PHRASE

" ON TOP OF GREEN CUBES"

DO YOU MEAN

1 DIRECTLY ON THE SURFACE

2 ANYWHERE ON TOP OF?

2

THREE OF THEM

where there is ambiguity questions involving paraphrases are developed using heuristics

23 had you touched any pyramid before you put the green one on the little cube?

YES THE GREEN ONE

the system needs to find the correct event which involves the tenses of verb groups

24 when did you pick it up?

WHILE I WAS STACKING UP THE RED CUBE A LARGE RED BLOCK AND A LARGE GREEN CUBE

the time consists of the goal

25 why?

TO GET RID OF IT

26 why did you do that?

TO CLEAN OFF THE RED CUBE

the answer involves considering the subgoals in the tree

27 why did you clear off that cube?

TO PUT IT ON A LARGE GREEN CUBE

28 why did you do that?

TO STACK UP THE RED CUBE A LARGE RED BLOCK AND A LARGE GREEN CUBE

29 why did you do that?

BECAUSE YOU ASKED ME TO DO IT

it cannot remember the command

30 how did you do it?

BY PUTTING A LARGE RED BLOCK ON THE TABLE THEN LETTING GO OF IT THEN PUTTING A LARGE GREEN CUBE ON IT THEN LETTING GO OF THAT CUBE THEN PUTTING THE RED CUBE ON THAT CUBE THEN LETTING GO OF THAT CUBE

this is a complete printout of all the subgoals achieved

31 how many objects did you touch while you were doing it?

FOUR OF THEM

32 what did the red cube support before you started to clean it off?

THE GREEN PYRAMID

it uses its memory of past events

33 there were five blocks to the left of the box then

NO ONLY FOUR OF THEM THE RED CUBE TWO LARGE GREEN CUBES AND A LARGE RED BLOCK

34 put the blue pyramid on the block in the box

OK

there is an ambiguity here but as there is no pyramid on a block in the box it understands that it has to move the pyramid on the block into the box

35 is there anything which is bigger than every pyramid but is not as wide as the thing that supports it

YES THE BLUE BLOCK

36 does a steeple

SORRY I DON'T KNOW THE WORD STEEPLE

37 a "steeple" is a stack which contains two green cubes and a pyramid

I UNDERSTAND

38 are there any steeples now?

NO

39 build one

OK

the process begins by clearing one of the green cubes removal of a red cube then the two green cubes and a red pyramid are stacked

40 call the biggest block superblock

OK

41 have you picked up superblock since we begun

YES

42 why did you drop it?

BY "IT" I ASSUME THAT YOU MEAN THE SUPERBLOCK

I CAN'T EXPLAIN A NONEXISTENT EVENT

there is no record of this

43 is there anything to the right of the red pyramid?

YES FOUR THINGS THE BOX THE RED CUBE THE BLUE PYRAMID AND THE BLOCK NAMED SUPERBLOCK

44 END OF SESSION

SYNTAX

SYNTAX and MEANING

The decision to consider syntax as a proper study devoid of semantics is a basic tenet of many recent linguistic theories. Language is viewed as a way of organising strings of abstract symbols and competence is explained in terms of symbol manipulating rules. Although these methods have been remarkably successful in one sense they have almost completely lost the semantics. The problem is not that current theories are finding wrong answers to the questions they ask it is that they are asking the wrong questions. What is needed is an approach which can deal meaningfully with the question

"how is language organised to convey meaning?" rather than

"how are syntactic structures viewed in isolation?"

Syntax helps the speaker convey meaning beyond the meaning of words. The sentence is a series of syntactic structures chosen by the speaker to encode meaning. The listener must detect these structures to interpret the meaning of the sentence. The system is based upon Halliday's grammar which places greater emphasis on the way language is structured. It deals with word groupings rather than deep structures. Systemic grammar pays more attention to the way language is organised into units each of which has a special role in cinveying meaning

.

In English we can distinguish three basic ranks of units

the CLAUSE,

the GROUP and

the WORD.

There are several types of GROUP

NOUN GROUPS NG

VERB GROUPS VG

PREPOSITION GROUPS PREPG

ADJECTIVE GROUPS ADJG

Consider the analysis of the sentence

the three big red dogs ate a raw juicy steak

which is parsed using systemic grammar as shown on the next page.

The WORD is the basic building block and any particular word has features so dogs is seen as plural, the word ate is eat in a different tense. Click here for Picture

The next unit above a WORD is a GROUP and there are four types. Each group has a particular function in conveying meaning NG describe objects VG can convey messages about time. Each group can have variable numbers of slots for the words. NG allows number determiners adjectives and nouns. Each group exhibits features such as definite -the- plural modal -could have been seen-. The CLAUSE can be a QUESTION or a DECLARATIVE or an IMPERATIVE. It can be active or passive.

The above parse tree has a three level structure where the CLAUSE is split up into three GROUPS and each of the GROUPS is split into individual WORDS. As not all sentences have this structure we encounter the concept of a rankshift where a group can occur at a different level , for example , the man who came to dinner , or ,by leaving the coutntry. The rankshift is one of the principles of the systemic grammar.

The next question to tackle is why does the rankshift provide an advantage over the deep structure and the answer lies in the features that each unit can have. We can sat that all sentences can be a QUESTION or a DECLARATIVE or an IMPERATIVE and that if they are questions then they can be subdivided into YES_NO or WH types. This last choice is meaningless if the sentence is not a question but if it is then the distinction must be made. If the CLAUSE is a sentence or a MAJOR clause then it must be partitioned into one of the permissable types but this distinction is not appropriate for a secondary clause SEC. We can depict these structures as follows on the next page. Click here for Picture

The work of relating the set of features of an actual sentence to the surface structure is done in systemic grammar by realisation rules. If we consider sentences in different forms

Sally saw the squirrel

did Sally see the squirrel?

the squirrel was seen by Sally

In transformational grammars these forms would be equivalent but the word orders would be achieved by means of transformations. In systemic grammar the features would be noted and these would have a good overlap except for active | passive declarative | question. The realisation rules would then signal the exact word order.

PARSING

The parsing for this program is achieved by using the language PROGRAMMAR and an interpreter. It is basically a top-down left-to right parser but it can modify this technique if it sees fit. Chomsky said that the grammar should associate a structural description to each permissable sentance in the language and this parsing program does that. We can set up special tools to handle complexities such as 'and, or' and these cause interrupts in the parsing process. Interrupts are also caused by idioms. The parsing process can be interrupted at any time and semantivc considerations can be introduced. When we see the sentence

he gave the boy plants to water.

we do not confuse it with

he gave the house plants to charity.

The phrase boy plants is not meaningful like boy scouts so we reject any structure based upon this concept. This ability to integrate semantics with syntax is particularly important.

AMBIGUITY

The program tries to find a possible parsing for a sentence and as soon as ot finds an answer it returns it this is what it is designed to do. The process is unified with the results of the semantic interpretation being used to guide the parsing. If in parsing a noun group a call to the semantic interpreter fails to make sense of the noun group the parsing is immediately redirected.

The way to treat ambiguity is not to list all possible solutions to the parsing problem by finding all possible interpretations of the sentance but in being clever in choosing the first possible parsing. The parser will consider a failure if one occurs analyse the reasons for its occurrence and then take appropriate action. Suppose we have the sentance

I rode down the street in a car

the parser may wish to consider the noun group

the street in a car

the semantic analyser will reject the phrase

in a car

as a qualifier of a street.

Since the semantic programs are part of a general deductive system with a definite worl model the semantic evaluation which guides parsing can include both general knowledge and specific knowledge. Few sentences seem ambiguous to humans when first read. They are guided by an understanding of what is said to pick a single parsing and a few very different meanings. By using this same knowledge to guide its parsing a computer understanding system can take advantage of the same technique to parse meaningful sentences quickly and efficiently.

A semantic theory must account for multiple meanings of wordsphrase and sentences. Words may have several senses producing multiple possible interpretations of phrases and sentences involving them. Sometimes a sentence may also be described with several different syntactic structures leading to ambiguities. The sentance

the woman sitting in the room saw the flash of lightning

will be ambiguous if each word has a specific meaning. It could mean

the woman who was sitting in the room

the woman who is sitting in the room

In parsing we do not carry forward two possible structures we try to find the best and only seek others if we run into trouble. In semantics we take the other approach and if a word has two possible meanings we proceed in parallel.

There is a danger here of combinatorial explosion. If each word in a group of four has three meanings there could be 81 alternatives.

Consider the phrase

the green ball

we know that green can mean the colour unripe or a rooky

ball can be a sherical object or a grand dance

but it would be naive to associate the meanings

a rooky dance

or an unripe dance

so we are left with a coloured spherical object.

PROCEDURAL DEDUCTIVE SYTEMS

The system requies programming techniques capable of using procedural information but at the same time expressing this information in ways which do not depend on the idiosyncracies of particular languages. PLANNER is a goal-oriented procedural language designed to specify the bill. It handles simple assertions efficiently and it can handle complex information expressable in predicate calculus. Complex information is expressed in procedural form and this can include ideas of how to go about a proof. Being goal-oriented it is not concerned about interaction between procedures. The language is concerned with achieving goals not in how they are achieved and any valid accelerator can be introduced. New theorems can be easily added. PLANNER is a uniform notation for expressing procedural knowledge just as predicate calculus is a notation for describing a limited range of information.

COMPARISON WITH OTHER PARSERS

The early machine-translator dseigners were forced to develop their own linguistics as they worked and they produced rough versions. The parsers were packages of routines that evolved as the grammars emerged and grew to handle more complex sentances. When the machine-translation failed it seemed clear that it had been premature to tackle the natural language problem without sufficient linguistic development. Computer programs for natural language took separate paths:

the first was to ignore syntax completely and to use a general pattern matching process to derive information for example SIR STUDENT and ELIZA. They limited the user to a small set of inputs or their understanding to things that ignored syntax.

the second was to take a simplified subset of English which could be handled by a well understood form of grammar. Bobrow has produced a summary of this work.

AUGMENTED TRANSITION NETWORKS

ATN's have the power of Turing machines since they have changeable registers and can transfer control depending on the state of these registers. They can handle any type of gramar that can be parsed by any machine. ATN's appear to operate more closely with the way humans work thus they give a natural and understandable representation for grammars.

GRAMMAR

As we have seen it uses the concepts of CLAUSE GROOUP and WORD

the structure of a CLAUSE is quite complex and a simplified form has been shown.

NOUN GROUP

The noun group consists of the components

determiner the

ordinals first

numbers six

adjectives intelligent

classifier wales

noun england

qualifier prepositional group ---in the moon

VERB GROUPS

ACTIVE

took--past

takes--present

will take--future

can take--modal

has taken--past in present

was taking--present in past

was goint to have taken--past in future in past

was going to have been taking--present in past in future in past

PASSIVE

is taken--present

could have been taken--past in modal

has been going to have been taken--past in future in past in present

Words have features

There is a sophisticated algorithm to analyse word endings

The grammar appears in LISP

for example

1 S ->NP VP

2 NP ->DETERMINER NOUN

3 VP ->VERB/INTRANSITIVE

4 VP ->VERB/TRANSITIVE

5 DETERMINER ->the

6 NOUN ->giraffe

7 NOUN ->apple

8 VERB/INTRANSITIVE ->dreams

9 VERB/TRANSITIVE ->eats

(DEFUN SENTENCE

(((PARSE NP) NIL FAIL)

((PARSE VP) FAIL FAIL RETURN)))

(DEFUN NP

(((PARSE DETERMINER ) NIL FAIL)

((PARSE NOUN ) RETURN FAIL))

(DEFUN VP

((PARSE VERB ) NIL FAIL)

(ISQ H TRANSITIVE) NIL INTRANS)

((PARSE NP) RETURN NIL)

INTRANS

((ISQ H INTRANSITIVE ) RETURN FAIL)))

(DEFPROP GIRAFFE (NOUN) WORD)

(DEFPROP DREAM (VERB INTRANSITIVE) WORD)

SEMANTICS

We need a transducer that can work with a syntax analyser and produce data which is acceptable to a logical deductive system. The semantic theory must describe relationships at three different levels.

It must define the meaning of words which involves its relationship to a vocabulary and a structure of concepts. The next level relates the emaning of groups of words in syntactic structures. We need an analysis of the ways in which English structures convey meaning and the roles that words play. Finally an English sentance is not viewed in isolation. A semantic theory must describe how the meaning of a sentance depends on its context.FUNNY SENTENCES WITH AMBIGUITY

I rode down the street in a car.

I carried regularly goldfish in a bowl.

The councillors refused the demonstrators a permit because they feared violence.

The councillors refused the demonstrators a permit because they advocated revolution.

He hit the car with a large rock.

He hit the car with a missing bumper.

The woman played the piano with a wooden leg.

Fruit flies like a banana

WOODS

The ATN through its use of flags allows for the merging of similar parts of the network by recording information in registers and interrogating it ... and to merge states whose transitions are similar except for conditions an the contents of the registers;

the networks capture the regularities of the language ... whenever ther are two essentially identical parts of the grammar which differ only in that the finite control part of the machine is remembering some piece of information ... it is sufficient to explicitly store the distinguishing piece of information in a register and use only a single copy of the subgraph .

The use of subroutines with an argument and a parameter.

Click here for Picture