Learning and Computing ©2004, Robert W. Lawler

Dual Purpose Learning Environments

Robert W. Lawler and Alan Garfinkel

1. Introduction

Feurzeig [3] described his view of "intelligent microworlds" as permitting the mode of interaction between the user and system to be switched (by the user) from exploratory to tutorial to evaluative. The view offered here derives from Feurzeig's suggestion but moves in a different direction, to focus more on the purposes of the parties involved -- instructor and student -- than on the performance mode of the system. Consider the following as an example of a system that will permit dual usage.

When explaining grammar, a teacher wants to offer his students a lucid and well-articulated description of the principles which govern the forms of a language in use, along with succinct examples illustrating those principles. If an intelligent system supports such use, one can have instructional use of the system. Students often prefer a more exploratory approach to learning. One might say, for example, "Let me try to do something novel to find out what I can do with this language" or "Let me probe what the system can do, beyond requiring me to generate a sentence it will accept. Can I determine what the limitations of the system are? Can I improve the system's grammar in such a way that it will be more nearly perfect?" Pursuing such questions puts a student in a very active mode, one in which some students will learn much better than in most other ways, regardless of the domain or language focus [1].

A system with such flexibility in use would be a dual purpose learning environment, one in which the instructor can have his way, provide his best guidance, but hopefully one in which the student can also act in a powerful and positive way to correct and augment the system itself and through doing so develop his own knowledge as much as he cares to.

2. Project Objectives

Observing that people in Europe have been more sensitive than Americans to the need for learning multiple languages and to second language instruction, we tried to take advantage of their expertise. A primary objective of this project was to port a European PC-based instructional system, LINGER [2], for use on Macintosh computers in the USA. LINGER is an application package directed to instruction in several foreign languages. It is a Prolog-based intelligent tutoring system. The original version of LINGER was a research vehicle. We tried to scale up that system for wider use. Adapting LINGER for the Macintosh had two dimensions. First was the issue of the Macintosh port. We chose LPA Prolog as the implementation language, expecting little trouble in converting the source code, and that proved to be the case. Some low-level routines, primarily those based on I/O calls, had to be rewritten. Interface improvements, especially those set up to take advantage of Macintosh features, required additional coding.

The second dimension of the project revolved around scaling up the size of the dictionaries and grammars used with LINGER. Although this sort of activity is commonly undervalued [3], it often reveals problems not apparent with small-scale prototypes. LINGER has three main components: a dictionary representing the words of the language, a data base of rules which amount to the grammar specified for that language, and an inference engine which uses the dictionary and the rules to test the grammaticality of strings of words submitted to the program.

3. LINGER as an ITS

LINGER was initially designed as a grammar checker for novices at a second language. Artificial Intelligence (AI) techniques were needed in LINGER because novice text production typically deviates considerably from the standard of the target language. We use ITS (for "Intelligent Tutoring System") here in a loose sense [4] to mean that LINGER uses AI programming technology for instructional purposes. In effect, LINGER was designed to guess the user's intended text from the input then judge the closeness of fit of the entered text to a correct expression of the inferred intended text. LINGER is in character more analytic than didactic.

When LINGER receives a string of text, it returns to the user a parse of the string with comments and suggestions about the string's validity. It also attempts to compose its own version of what the string should have been had the user produced a grammatically correct version. This may seem an audacious goal -- unless one considers the limitations of the original context of use. LINGER was created originally for a foreign language instructor who was tired of correcting novice students' obvious errors. He hoped that his students could improve their assigned essays by typing sentences into LINGER and receiving grammar criticism at a fairly low level of sophistication. LINGER was intended to be used as a kind of "language calculator" to catch obvious errors. It was necessary for the system to deal with unrecognized words because one could not count on students typing correctly.

4. Grammar and LINGER's Architecture

The prototype grammar of LINGER is a set of Prolog rules. The prototype grammars of the three ported LINGER systems (Spanish, English, and French) are largely similar within the limited domain of the grammars' coverage of the languages. The prototype grammars were intended as examples to be changed and developed by the final user. Even given such an intention, one needs to describe the starting point to see what progress is possible. The character of the grammar can be judged from its depth, size, breadth, and modifiability. Consider the English grammar as typical of the other LINGER prototypes. It is five levels deep. The English grammar has four levels of structural rules (a fifth set, "checks", verifies grammaticality after structure has been determined). There are sentence-to-clause rules (2 in number); clause-to-phrase rules (4); phrase-to-phrase rules (23); and word-lookup rules by part of speech (17). The number of checks is 28. The size of the grammar, in total number of rules, is then 74.
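
The rule counts above can be tallied directly. The following Python sketch only restates the figures given in the text; the dictionary keys are labels chosen here for illustration and are not identifiers from LINGER itself.

```python
# Rule counts for the prototype English grammar, as stated in the text.
# The key names are illustrative labels, not names used inside LINGER.
structural_rules = {
    "sentence_to_clause": 2,
    "clause_to_phrase": 4,
    "phrase_to_phrase": 23,
    "word_lookup": 17,
}
checks = 28

structural_total = sum(structural_rules.values())
grammar_size = structural_total + checks
levels = len(structural_rules) + 1  # four structural levels plus the checks

print(structural_total, grammar_size, levels)
```

The four structural levels account for 46 rules, and the 28 checks bring the total to the 74 quoted above.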

Simple dictionaries require complex processing rules, and vice versa. Decisions about grammar rules and their coding interact with the knowledge representation used in the dictionary. In LINGER, for example, non-standard plural inflections are coded directly into the dictionary. So also are bound comparative forms of adjectives. This representation decision has two primary consequences. First, multiple dictionary entries are needed for those words which can serve as different parts of speech. Second, one should expect a persistent trade-off in the implementation between extending the dictionary (and thus lengthening the parsing time of entered strings) and lengthening parsing times through increased complexity of processing [5].

The issue of breadth is complex, since it measures the extent to which the rule set covers the grammar of the language. The LINGER grammar prototypes are narrow in some ways which can be easily modified and in other ways which require a major redesign of the system. Consider the easily modifiable cases first. In English, the passive is formed with a past participle and an auxiliary verb. LINGER's prototype dictionary recognized only forms of the verb "be" as a passive-forming auxiliary. English permits the use of two other auxiliaries, "get" and "become", to form either action-oriented or developmentally focused passives; for example, "the thief got caught and in due time and with due process became imprisoned". Such omissions can be simply corrected at need by adding definitions of the two auxiliaries to the dictionary. The more complex limitations of LINGER derive from interactions of the dictionary, grammar, and the inference engine.

5. Macintosh LINGER Dictionaries

At the conclusion of this project, dictionaries available for use with Macintosh LINGER in three languages have been scaled up by a factor of twenty. Original LINGER prototype dictionaries were a mix of parts of speech totaling between fifty and seventy words. The new LINGER English dictionary is approximately 1200 words and is implemented as three separately loadable Prolog files, of approximately 200, 500, and 500 words each. The new LINGER Spanish dictionary is 1300 words long. It is implemented as eight separately loadable files. The French dictionary, of approximately 1000 words, is implemented as a single large file. It is a word collection typical of those used for vocabulary review in US high school French courses. It is not partitioned because the source vocabulary did not include any principles justifying grouping of the words in a usage-based, meaningful way.

The Spanish dictionary is based on Heywood Keniston's [7] collection of "Common Words in Spanish". The list derives from dialogue appearing within plays and novels. It was an early attempt at representing real conversational vocabulary and a reasonable one given the absence of recording equipment. The words are grouped in eight levels of use by frequency. The source frequency partition explains why one could break up that list into groups of a manageable size for study and/or instruction. One reason for choosing that list was that for more than fifty years the Keniston word list has been used by publishers of commercially available textbooks in Spanish; in this specific sense it is still "state of the art". Some more recent frequency counts have been based on interviewing people (this has a natural appeal as far as realism goes), but they are not vastly different from the Keniston list.

The English dictionary is derived from frequency counts of words appearing in a collection of stories told by children. It is based upon a study of children's story telling by Moe, Hopkins, and Rush [9]. They asked children to tell their favorite stories, then counted the frequency of occurrences of all the words. We took that study as a basic source, then deleted references to individuals (Spiderman, Superman, Goldilocks, etc.) to create our own list of 1200 frequently used words. Since the words were produced by young children, this list might be especially suitable as a list of very common words in English. The original collection of words has been somewhat compressed by removal of duplications based on inflections and contractions.

5.a. Creating Dictionaries as Compiled Structures

The LINGER dictionaries are Prolog data structures. We have been able to turn the development and extension of such dictionaries from a programming task into primarily an authoring task. Dictionary construction begins with a prototype core, containing Prolog data definitions and examples of the data structures for the various parts of speech (see Figure 1).

Using Macintosh cut-and-paste functions within a word processing program, a clerk (with adequate language knowledge) is able to key new words for the dictionary and specify variable information -- such as irregular plural formation, verb transitivity and reflexivity, or adjective position and inflection. Saving these newly created entries as ASCII files permits their subsequent loading and incremental compilation within the Prolog language application. This final step is necessary for execution; it is also a check which catches clerical typing errors.

Figure 1

/* DICTIONARY EXTENSION TEMPLATE */
/* STANDARD ENDINGS */
info(gp_end,[[gender(m),plurality(s)],[gender(m),plurality(p)],[gender(f),plurality(s)],[gender(f),plurality(p)]],[],[]).

ending_type(gp_end,["ms","mp","fs","fp"],[]).

/* NOUNS */
info(noun,[[plurality(s)],[plurality(p)]],[gender],[person(3)]).

word("",["root"],noun,["","s"],[m]). /* common plural forms */
word("",["root"],noun,["","s"],[f]). /* common plural forms */

/* SAMPLE ENCODED DICTIONARY ENTRIES */
/* NOUNS */
info(noun,[[plurality(s)],[plurality(p)]],[gender],[person(3)]).

word("",["amigo"],noun,["","s"],[m]).
word("",["chico"],noun,["","s"],[m]).
word("",["día"],noun,["","s"],[m]).
word("",["Dios"],noun,["","es"],[m]).
word("",["don"],noun,["",""],[m]).
word("",["esposo"],noun,["","s"],[m]).
word("",["agua"],noun,["","s"],[f]).
word("",["amiga"],noun,["","s"],[f]).
word("",["casa"],noun,["","s"],[f]).
word("",["doña"],noun,["","s"],[f]).
word("",["fin"],noun,["","es"],[m]).
word("",["hermano"],noun,["","s"],[m]).
word("",["hijo"],noun,["","s"],[m]).
word("",["mundo"],noun,["","s"],[m]).
word("",["niño"],noun,["","s"],[m]).
word("",["gente"],noun,["",""],[f]).
word("",["mujer"],noun,["","es"],[f]).
word("",["niña"],noun,["","s"],[f]).
word("",["padre"],noun,["","s"],[m]).
word("",["papel"],noun,["","es"],[m]).
word("",["rato"],noun,["","s"],[m]).
word("",["señor"],noun,["","es"],[m]).
word("",["tiempo"],noun,["","s"],[m]).
word("",["puerta"],noun,["","s"],[f]).
word("",["razón"],noun,["","es"],[f]).
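
One way to read the entries in Figure 1 is that each noun pairs a root with a list of endings, one per plurality slot declared in the info(noun,...) line. The Python sketch below makes that reading concrete. It is an interpretation of the figure, not LINGER's own code: the assumption that the first ending is the singular suffix and the second the plural suffix, and the function name surface_forms, are introduced here for illustration.

```python
# Sample entries transcribed from Figure 1: (root, endings, gender).
# Assumption: endings[0] is the singular suffix, endings[1] the plural suffix.
ENTRIES = [
    ("amigo", ["", "s"], "m"),
    ("Dios", ["", "es"], "m"),
    ("gente", ["", ""], "f"),
    ("mujer", ["", "es"], "f"),
    ("razón", ["", "es"], "f"),
]

def surface_forms(root, endings):
    """Map plurality ('s' or 'p') to root + suffix by plain concatenation."""
    singular, plural = endings
    return {"s": root + singular, "p": root + plural}

for root, endings, gender in ENTRIES:
    print(gender, surface_forms(root, endings))
```

Plain concatenation gives "razónes" for the last entry, whereas standard Spanish spelling drops the accent in the plural ("razones"). The figure does not say how LINGER treated such cases, so the sketch leaves the form as concatenated.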

The linguistic knowledge required for data structure authoring is essentially of surface features; for example, the irregularity of specific verbs, or the plural forms of adjectives, or the relation of inflection to pre- or post-positioning in Spanish, etc. One also needs skills with computer-based word processing. On the other hand, the author doesn't have to know the Prolog language. The practical point is that cut-and-paste text manipulation, along with global change capability, permits one to go directly from a kind of clerical work into production of compiled code, even if that is not sophisticated code.
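
The authoring workflow described here, in which a clerk supplies surface features and the entries are then loaded as Prolog facts, can be imitated with a small generator. The following Python sketch is a hypothetical stand-in for the cut-and-paste procedure, not a tool from the project: the function names and the validation rules are assumptions, and only the output format is taken from the noun entries of Figure 1.

```python
ALLOWED_GENDERS = {"m", "f"}

def noun_fact(root, plural_suffix, gender):
    """Format one noun entry in the shape of the Figure 1 word/5 facts."""
    if not root or any(ch in root for ch in '"[],'):
        raise ValueError(f"suspicious root: {root!r}")
    if gender not in ALLOWED_GENDERS:
        raise ValueError(f"unknown gender {gender!r} for {root!r}")
    return f'word("",["{root}"],noun,["","{plural_suffix}"],[{gender}]).'

def build_dictionary(rows):
    """Turn (root, plural_suffix, gender) rows into facts; collect bad rows."""
    facts, errors = [], []
    for row in rows:
        try:
            facts.append(noun_fact(*row))
        except (TypeError, ValueError) as exc:
            errors.append((row, str(exc)))
    return facts, errors

# The third row carries a deliberate clerical error (gender "x").
rows = [("amigo", "s", "m"), ("papel", "es", "m"), ("casa", "s", "x")]
facts, errors = build_dictionary(rows)
print("\n".join(facts))
print(errors)
```

Rejecting malformed rows before they reach the Prolog system plays a role similar to the compilation check mentioned in the text, which also catches clerical typing errors.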

6. Macintosh LINGER Function and Timings

The essential function of LINGER has the following character: a user types a sentence such as "I like the banjo." LINGER spends some time analyzing the text string, then responds either that it agrees that the typed sentence is valid in the language, or it offers diagnostic comments about the text with suggestions for its modification. In this specific case, "I like the banjo" is judged a simple and correct English sentence. The word "the" is considered a determiner. By way of contrast, if one types "I like banjo", LINGER judges that a determiner needs to be inserted in the sentence and reprints the sentence in its preferred form. This judgment (an error in fact for English) is a decision not based upon semantics at all. The decision is made by applying rules of grammar to the set of words and developing possible orderings of those words that would be grammatically correct. Obviously then LINGER's dictionary encodes some information about the grammatical type of the words found in the entered string.

6.a How LINGER works

Generally, the sequence followed by LINGER involves finding in its dictionary each word of the entered string, then constructing a syntax-driven parse. LINGER chooses the parse of the entered text that provides the best fit to the grammar in the rule base, then criticizes that sentence by comparison with the perfect form of the sentence, according to that same set of rules.
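
The look-up, check, and criticize sequence can be illustrated with a deliberately tiny Python sketch built around the banjo example of section 6. Everything in it is an assumption made for illustration: the lexicon, the single check (every noun must be directly preceded by a determiner), and the choice of "the" as the suggested insertion. It does no parsing and is far simpler than LINGER's rule base.

```python
# Toy lexicon; LINGER's real dictionaries are Prolog facts (see Figure 1).
LEXICON = {"i": "pronoun", "like": "verb", "the": "determiner",
           "a": "determiner", "banjo": "noun", "bear": "noun"}

def tag(sentence):
    """Look up each word; unknown words get the category None."""
    words = sentence.rstrip(".").split()
    return [(w, LEXICON.get(w.lower())) for w in words]

def criticize(sentence):
    """Return (accepted, preferred_form) under one toy check:
    every noun must be directly preceded by a determiner."""
    preferred, accepted = [], True
    previous = None
    for word, category in tag(sentence):
        if category == "noun" and previous != "determiner":
            preferred.append("the")   # suggested insertion
            accepted = False
        preferred.append(word)
        previous = category
    return accepted, " ".join(preferred) + "."

print(criticize("I like the banjo."))
print(criticize("I like banjo"))
```

Like the behavior described in section 6, this toy check is purely syntactic, so it also rejects "I like banjo" even though that judgment is questionable for English.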

6.b Timings

The discussion of performance focuses on parsing times as a function of the length of word strings submitted to LINGER and of the size of the loaded dictionary. The basic results of timing studies with the English language dictionary are presented in Table I:

Table I
Parsing times for word strings submitted to LINGER [6], by loaded dictionary size (words)
Word string	Words	200	700	1200
he is big	3	0:31	1:12	1:51
the bear is big	4	0:40	1:34	2:27
the big baby is little.	5	0:50	1:57	3:02
the big baby is a bear.	6	1:03	2:23	3:41
the big baby is a little bear.	7	1:13	2:46	4:17
the big baby is in the house.	7 [7]	1:20	2:53	4:24
the one big baby is in the house.	8	1:30	3:17	5:01
the one big baby is in the little house.	9	1:40	3:40	5:38
the one big baby is in the one little house.	10	1:52	4:06	6:15

The results obtained with the Spanish and French dictionaries are comparable to those for the English dictionary, within the limits of the differences of the dictionaries' structures. Finally, the timings above reflect processing of sentences that contain only words that exist in the dictionary. When unrecognized words are added to the entered string, the processing takes longer because LINGER must make speculations about what part of speech the unrecognized word might be and then decide which among possible speculations would be the best parse.
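
The pattern in Table I becomes easier to see when the times are reduced to a cost per submitted word. The Python sketch below does this under one assumption that the table itself does not state: that each entry is in minutes:seconds. The function names are chosen here for illustration.

```python
def seconds(mmss):
    """Convert 'm:ss' to seconds (assumes the entries are minutes:seconds)."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)

# (words in string, time with 200, 700, 1200 dictionary words), from Table I
TABLE_I = [
    (3, "0:31", "1:12", "1:51"),
    (4, "0:40", "1:34", "2:27"),
    (5, "0:50", "1:57", "3:02"),
    (6, "1:03", "2:23", "3:41"),
    (7, "1:13", "2:46", "4:17"),
    (7, "1:20", "2:53", "4:24"),
    (8, "1:30", "3:17", "5:01"),
    (9, "1:40", "3:40", "5:38"),
    (10, "1:52", "4:06", "6:15"),
]
SIZES = (200, 700, 1200)

def seconds_per_word(column):
    """Average seconds per submitted word for one dictionary-size column."""
    ratios = [seconds(row[column + 1]) / row[0] for row in TABLE_I]
    return sum(ratios) / len(ratios)

for i, size in enumerate(SIZES):
    print(size, round(seconds_per_word(i), 1))
```

Under that assumption the cost is nearly constant within each column, at roughly 11, 24, and 37 seconds per word for the 200, 700, and 1200 word dictionaries. Time therefore grows about linearly with both the length of the string and the size of the loaded dictionary.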

These timings are long, and the pattern of dependence on length of the dictionary is so obvious that some explanation is needed. After delving into the code of the LINGER shell, we found that the Prolog search routines had been coded at the primitive level to scan the entire dictionary for each word in the entered string. Why? The original LINGER parser recognizes that specific lexical items can function as different parts of speech. The word "bear", for example, may be a noun; it may be a verb. Thus the first match found in the dictionary might not be the correct part of speech. The parser assumed that any word might be used in various ways and might appear in multiple instances in the dictionary. Because the dictionary entries were encoded by part-of-speech type, the parser exhaustively searched the dictionary for each word appearing in the string. It then proceeded to build multiple parse trees and select the best from among them. This strategy of exhaustive searching of the dictionary for each word in the string accounts for the pattern of timings in Table I [8].
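
The cost of this strategy can be shown with a short Python sketch of an exhaustive lookup over a flat list of entries. It is an illustration of the search pattern described above, not a transcription of the LINGER shell; the toy dictionary and the function names are assumptions.

```python
# Toy dictionary as a flat list of (form, part_of_speech) entries,
# mirroring a fact base that is scanned from top to bottom.
DICTIONARY = [("bear", "noun"), ("the", "determiner"), ("is", "verb"),
              ("bear", "verb"), ("big", "adjective")]

def lookup_exhaustive(form, dictionary):
    """Return every category for form plus the number of entries examined.
    The scan never stops early, because a later entry may list the same
    form under another part of speech."""
    categories, examined = [], 0
    for entry_form, category in dictionary:
        examined += 1
        if entry_form == form:
            categories.append(category)
    return categories, examined

def entries_examined(sentence, dictionary):
    """Total entries examined for a whole sentence under exhaustive lookup."""
    return sum(lookup_exhaustive(w, dictionary)[1] for w in sentence.split())

print(lookup_exhaustive("bear", DICTIONARY))
print(entries_examined("the bear is big", DICTIONARY))
```

The number of entries examined is the product of the sentence length and the dictionary size, which matches the joint dependence on both quantities visible in Table I.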

7. Modifying Prototype LINGER Grammars

Modularity is key to modifiability. This is exhibited most clearly in attempts to extend the sentence-to-clause rules of the grammar. LINGER's two sentence-to-clause rules permit a sentence to be either a main clause or a main clause followed by a subordinate clause. In order to cover the imperative as well as the declarative mood, we added a rule to permit formation of a sentence from a verb phrase alone. To permit greater complexity in sentences, we also added rules for introductory subordinate clauses and compound sentences. These changes were made easily and worked well -- so long as test sentences used known words and were grammatically correct as judged by LINGER's rules. When, however, sentences were submitted which contained unknown words or did not fit LINGER's limited grammar, the sentence correction process went seriously awry. This argues that in redesigning LINGER-like systems for the future, a key goal must be a kind of modularity that will permit definition of new rules -- which educational applications absolutely require -- along with connected forms of erroneous variants of sentences permitted by those new rules.
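
The rule extensions described above can be pictured as additions to a table of sentence patterns. The Python sketch below is schematic: the constituent labels, and the form chosen for the compound sentence rule (two main clauses joined by a conjunction), are assumptions made for illustration, since the text does not give LINGER's Prolog rules.

```python
# Each rule rewrites "sentence" as a sequence of constituent labels.
ORIGINAL_RULES = [
    ("main_clause",),
    ("main_clause", "subordinate_clause"),
]
ADDED_RULES = [
    ("verb_phrase",),                               # imperative mood
    ("subordinate_clause", "main_clause"),          # introductory clause
    ("main_clause", "conjunction", "main_clause"),  # compound (assumed form)
]

def is_sentence(constituents, rules):
    """True if the constituent sequence matches one sentence-level rule."""
    return tuple(constituents) in set(rules)

extended = ORIGINAL_RULES + ADDED_RULES
print(is_sentence(["verb_phrase"], ORIGINAL_RULES))
print(is_sentence(["verb_phrase"], extended))
```

Adding acceptance patterns is the easy part, as the text notes. Nothing in such a table says what an erroneous variant of each new pattern looks like, and that is the information the correction process lacked.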

The primary conclusion one can infer from this exploration with LINGER is how different it is in function from language understanding systems and how different it is from "style checkers" [9]. It is also quite clear that performance needs to be improved significantly and that the user interface also needs to be enhanced. Both of these require system redesign; both should be possible, especially given the flexibility one may hope for with standardized interprocess communication promised as a part of Apple's Macintosh System 7.

8. The Use of Polylingual ITS with other media

How should we think of a long-term sequence of language learning activities in which such interactive teaching systems and tools could play a productive role? We need a practical view with at least one component that takes full advantage of the technology and yet also respects the limitations of cost that technology involves. How could future LINGER-like systems fit in with less expensive, more traditional educational technologies and practice?

Let's suppose a language training program involves an immersion experience at a hypermedia-capable language training center. Such a center could be a place for total immersion in the second language, where people would speak and listen to the second language as well as work with systems for second language instruction. Before people attend such a center, it would be important for them to be familiar with the kinds of systems they would work with, what such systems could do, and what such systems' goals were. It would be efficient if future center students could be introduced to the training systems through remote site viewing of videotapes about them. The optimal way to do so would be to strip from the interactive system introductory demonstrations made in the native language of the future student, so that when at the training center the student could concentrate on use of the system with the target language of instruction.

When people leave the training center and return to their normal positions, they might then find a LINGER-like facility useful primarily in the mode of a linguistic calculator. A LINGER-like system with a nearly complete grammar -- designed for maximum processing efficiency and NOT using any hypermedia training materials -- would be most cost effective. At this time, an interactive teaching system fits a niche within a larger language training and learning program.

9. LINGER Usability for Future Instruction

The LINGER system we have discussed is not usable in classrooms today, but it serves as an interesting and promising research tool. What sorts of utility might it have after a significant redesign and further development? The target audience for LINGER is people who are learning a second language [10]. One may think grammar is important as a crutch for learning second languages -- not necessarily because of the good fit of grammatical rules to what is in the mind [11] but primarily because such systems of rules have been of proven value to people in making judgments about how writing and talking should proceed. Grammar will continue to be a subject of instruction while second languages are taught [12].

One strength of LINGER is its ability to continue processing even when it encounters words not encoded in its dictionary, using the structural rules to guess at the type of unknown words. Such flexibility is essential for educational applications. A major new objective for future LINGER systems should be extension of this capability to permit the addition of user-defined rules to the grammar. Adding such a capability will not be easy, but it should be possible through fusion of techniques based on programming by example and through definition of erroneous variations of new structures as "near misses" at the time of grammar rule extension. There are three broad categories of application we can foresee now for such systems as LINGER or its descendants:

9.a Traditional Instruction

One can imagine LINGER systems as linguistic calculators, with which a student might verify that his composed sentences are correct before committing them to paper in an essay. Such use could increase the feasibility of writing assignments in second languages (and thus their enjoyability for both student and instructor). By itself, this would be a significant enhancement for many language instruction programs.

9.b Student Guided Discovery

One of the key questions in education is the extent to which students are actually active in learning what they are studying. We believe this approach to the use of educational technology holds the most promise for individual students through engaging them actively in their studies, but making this approach effective will require redesign of LINGER systems. In self-guided discovery mode, the student would use a LINGER-like system as a modifiable grammar, one whose operations and performance he could explore and change.

9.c Instructor Experimentation

Experienced instructors try to diagnose the errors their students make so that they may offer them effective advice on how to improve their grasp of the language. One approach in current use is to develop lists of common errors and offer them to students as warnings of what they should avoid. Lists such as that of Tables II and III might serve future ITS as bug catalogs. LINGER could enhance an instructor's ability at diagnosis in a way that goes beyond lists of errors or bug catalogs by providing a developing language modelling capability. This could enhance the development of teachers' intuition and their explicit knowledge of the roots of errors manifest in speech production.

Table II
Categories of Common Structural Novice Errors in Spanish [13]
Concordance [14]:	Passive
- gender	- true
- subject-verb	- Se passive
- noun-adjective	Negatives
Word Order	Apocopation
Redundancy	Nominalization
- subject pronoun	- adjectives
- indirect object	- possessives
Pronouns	- infinitives
- direct	Substitutions & Confusions
- indirect	- ser / estar / haber
- reflexive	- por / para
- demonstrative	Personal a
- relative	Prepositions
Subjunctive	- bound
- present; past	- simple
- impersonal	- complex
- noun clause	Noun Markers
- adverbials	- definite
- adjectives	- indefinite
- conjecture	- otro
Imperatives	Possessives
- participles	Numerals
- present	Adverbials
- past	Conjunction
Tense/Morphology	Augmentatives/Diminutives
- present	Hace + time
- preterit/imperfect	Comparatives
- perfect	Más que
- future	Tanto como
- conditional	Infinitives
- progressive	- verbal phrases
Interrogatives	- prepositional
- Qué/Cuál

Table III
Expansion of A Single Concordance Novice Error Description (Spanish) [15]
Nouns listed in the knowledge domain as masculine or feminine require an accompanying determiner and adjective of the same gender.

la mañana = the morning. This is the way to say "the morning." Changing the determiner to el usually makes no change in meaning. It only creates something that doesn't exist in standard Spanish, so we call it an error. (Here, the change would be meaningful. El mañana can be used to say "the future" in a general sense.)

* Es una mañana hermos(o)s vs. Es una mañana hermosa.

A common error is illustrated above. Using the wrong ending on the adjective does not provide for concordance of gender.

Here, the example shows the addition of tres (three)

* ...tres mañanas hermos(a) vs. tres mañanas hermosas

A common error is illustrated above. Leaving the letter "s" off the end does not provide for concordance of number.

* ...tres mañanas hermos[os] vs. tres mañanas hermosas

This example shows a less common error which does not provide for either concordance of number or gender.
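
A concordance description such as that of Table III can be turned into a mechanical check. The Python sketch below is a minimal illustration covering only the noun and adjective used in the table. The lexicon layout and the function name are assumptions and do not reflect how LINGER's checks are coded.

```python
# Hypothetical mini-lexicon covering only the words of Table III.
NOUN_FEATURES = {"mañana": ("f", "s"), "mañanas": ("f", "p")}
ADJ_ENDINGS = {("m", "s"): "o", ("m", "p"): "os",
               ("f", "s"): "a", ("f", "p"): "as"}

def check_concordance(noun, adjective, stem="hermos"):
    """Return (ok, expected_adjective, problems) for a noun-adjective pair."""
    gender, number = NOUN_FEATURES[noun]
    expected = stem + ADJ_ENDINGS[(gender, number)]
    ending = adjective[len(stem):]
    found = [key for key, value in ADJ_ENDINGS.items() if value == ending]
    if not adjective.startswith(stem) or not found:
        return False, expected, ["unrecognized adjective form"]
    adj_gender, adj_number = found[0]
    problems = []
    if adj_gender != gender:
        problems.append("gender")
    if adj_number != number:
        problems.append("number")
    return not problems, expected, problems

print(check_concordance("mañana", "hermosa"))
print(check_concordance("mañana", "hermoso"))
print(check_concordance("mañanas", "hermosa"))
print(check_concordance("mañanas", "hermoso"))
```

The check reports which feature fails to agree, and that is the information an instructor would want from a bug catalog entry of this kind.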

Even now, in LINGER's early state of development, one can load with the shell the grammar of one language with the dictionary of another, since the dictionaries and grammars are modules. One may ask then what sort of performance would come out of a system with a "native" grammar of one language and the beginnings of a vocabulary in another language? What would be the results, in terms of the performance of the system, if one began to add to the native language a rule representing a specific grammatical construct of the grammar in the second language?

But how does that relate to what students and teachers actually do and learn? "*Me llamo es," says one of the thousands of beginning Spanish students that pass through one's class -- and it appears that there are two approaches to understanding this common pattern (with an eye towards its easy eradication). We can use a list of commonly found incorrect patterns, such as that of Table II, to locate an explanation of precisely what the error is, then correct the student with a lecture on the need to use a reflexive verb construction to identify oneself in Spanish. Any experience as a language teacher will convince you that this is almost useless. Clearly, some other approach is needed, and perhaps a better understanding of the nature of the error would help.

One road to that understanding is to be seen in the three kinds of analyses identified by James [6] as "learner language." These are contrastive analyses (comparisons of native language to target language), error analyses (comparisons of what Selinker [13] called interlanguage to a specific target language), and transfer analyses (comparisons of interlanguage to the learner's native language made in an effort to find evidence of inappropriate transfers from the native language to the interlanguage). Certain of these analyses have had important effects on language teaching practices. For example, contrastive analyses have, in the past thirty years, had their effects on the ways some language teachers generalize about such items as aspect of verb tense in Romance Languages; and error analyses have revolutionized the ways some language teachers respond to error. There may be the possibility of using LINGER-like systems to effect another analytical approach which may have a similarly beneficial effect. LINGER now accommodates more kinds of grammatical structures in more languages than it used to. Further development will lead to improvement and, perhaps, a new outlook on the analyses mentioned above.

The capabilities of LINGER-like systems include developing parallel implementations in different languages, even mixing and intermingling different grammars and vocabularies, to create a kind of exploratory learning environment for foreign languages. The new outlook could be effected by these capabilities deriving from LINGER's language independence. Contrastive analysis emphasizes the ways languages differ from one another. But LINGER-like systems can be developed multi-lingually and function in a language-independent manner within the context of the Romance Languages. Thus the architecture of LINGER-like systems might one day enable applied linguists to examine the similarities among languages and thus explore new ways of thinking about error. Given the flexibility and polylingual commitments of LINGER, future LINGER-like systems may be the first kind of ITS that are naturally congenial to such a view of language and language learning.

If the instructor has a facility with which he can model the language learning process, it should enhance his ability to diagnose student errors and to refine his suggestions of how the student could avoid them. LINGER-like systems could become an AI-based workbench for the diagnosis of error as a language independent phenomenon -- or as a tool to help in accounting for the differences in particular errors made while learning a target language given a specific native language as the student's starting point. For the teachers, working with language-learning modelling systems would provide an experience which would improve their diagnostic capability. This specific area of application, teacher skill-enhancement through student cognitive modelling, could be a significant new research area for future language instruction.

In conclusion, three answers are available to the question of how one might use future LINGER-like systems: as a linguistic calculator; as an environment for student discovery of new grammatical knowledge; and as a kind of an experimental workbench for teachers to explore the nature of language and the nature of language learning.

10. Technical Suggestions for Future LINGER-like systems

10.a Efficiency of Hybrid Vigor

If one's commitment is less to an intellectual program than to designing a system which will work well on a substantial scale, performance must be a concern from the beginning. We should certainly consider developing a hybrid system, one where each part would be locally optimal, in the hope of developing a system with hybrid vigor.

The original LINGER system was created entirely in Prolog. A better strategy would be to design carefully the modularization of the system and -- where performance considerations are paramount -- encode each module in that language best suited for its processing. There may be some parts of the system where Prolog provides the language of choice for representing some of the knowledge, but Prolog should not be involved in the foreground of any user interface, especially in the USA where it is not well known.

Since graphical user interfaces have proven their appeal, and since language involves various senses simultaneously, it is natural to look for a user interface that would be a hypermedia system, such as Apple's HyperCard, with its embedded language HyperTalk. Inasmuch as a LINGER-like system would involve shell development, typically done in languages other than HyperTalk, it is clear that the next design should seek an intermarriage of AI and hypermedia facilities in such a fashion as to profit from hybrid vigor [16]. The implication here is that the user should benefit from the advantages of various systems or languages without even knowing with which one he is interacting. Each part of the system should be created in a locally optimal language, with a well defined and stable interface between the various modules. The feasibility of such an approach is clearly established. XCMDs in HyperTalk exhibit such hybrid capabilities. Cross-application interfaces exist for Lisp and for LPA Prolog. Ultimately, the development of such systems will depend on the quality of the Macintosh Interprocess control interface to be introduced with Macintosh System 7.

Performance again requires that database sizes be kept under control and that their contents and organization be reconsidered. The development of morphological analysis modules in Enhanced LINGER is an attempt to do precisely the first. That is not, however, a solution to the problem of the indeterminacy of word category when semantic disambiguation is not employed. When a word has a common form for different grammatical categories, how can a system determine which part of speech is intended ? A first technique to use for that problem would be a flag in the knowledge representation to indicate for every entry whether or not that particular word has multiple entries in the dictionary. A better technique would be to complicate the knowledge representation sufficiently to permit the alphabetical organization of the dictionary across word categories. This would permit locally exhaustive word matching without requiring an exhaustive search through the entire dictionary [17]. Even if one could efficiently find all possible grammatical categories which have the word as an instance, there are bound to be cases which are in fact syntactically indeterminate but which a user will be able to judge based on his intentions. If the system is modest, it can afford to ask the user which of the possible interpretations is preferred -- and then go on to criticize the details of the entered string.
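The dictionary organization suggested above can be illustrated with a minimal sketch in Python. It is not taken from LINGER or from Enhanced LINGER; the sample entries, the function names (build_index, categories_of, resolve_category) and the ask_user callback are all hypothetical. The sketch groups every grammatical category under its word form, so that a single lookup is locally exhaustive, and it defers to the user only when a form is genuinely ambiguous.

```python
from collections import defaultdict

# Hypothetical sample entries: (word form, grammatical category).
SAMPLE_ENTRIES = [
    ("book", "noun"),
    ("book", "verb"),
    ("the", "determiner"),
    ("walk", "noun"),
    ("walk", "verb"),
    ("quickly", "adverb"),
]


def build_index(entries):
    """Group all categories under each word form, across word categories."""
    index = defaultdict(list)
    for form, category in entries:
        if category not in index[form]:
            index[form].append(category)
    return dict(index)


def categories_of(index, word):
    """Return every category recorded for the word; no full-dictionary scan."""
    return index.get(word.lower(), [])


def resolve_category(index, word, ask_user):
    """Pick the category of a word, asking the user only when it is ambiguous.

    ask_user is an assumed callback taking (word, candidates) and returning
    one of the candidates. Returns None for a word not in the dictionary.
    """
    candidates = categories_of(index, word)
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    choice = ask_user(word, candidates)
    if choice not in candidates:
        raise ValueError(f"{choice!r} is not a recorded category of {word!r}")
    return choice
```

In an interactive setting ask_user might present a small menu of the candidate categories; in a test it can simply be a function that returns a fixed choice. Whether a word is ambiguous falls out of the length of its candidate list, so the separate "multiple entries" flag of the first technique is no longer needed.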

10.b Intelligent Enough to Learn from User

What is special technically about LINGER is the attempt to construct well formed sentences based on free input. In our experiments, this attempt worked tolerably well for sentences based on the prototypical grammar and the sample dictionary. When, however, the grammar was modified, the routines for reconstructing new well formed sentences no longer worked properly -- and they were embedded in the shell of the original LINGER. In future LINGER systems, they will surely be segregated as a second grammar-relevant database. Most important for educational applications is to provide a system capability for the progressive definition of additional grammar rules and of the related errors that could be anticipated from attempts to use the forms those rules represent. What LINGER-like systems must do to learn a new rule of a grammar is, first, to infer the rule from one or more concrete examples. The second task, generating rules for the reconstruction of new sentences based on deviance from known grammar rules, will be more difficult, but it will be essential if LINGER is to fulfill its promise.
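The first task, inferring a rule from a concrete example, can be made tangible with a deliberately simple sketch in Python. It assumes, purely for illustration, that a "rule" is nothing more than the sequence of word categories of a sentence the user asserts to be correct; real grammar rules are of course richer. The lexicon and the names category_pattern, learn_rule and accepts are hypothetical and are not part of LINGER.

```python
# Hypothetical word-to-category lexicon, for illustration only.
LEXICON = {
    "the": "det", "a": "det",
    "dog": "noun", "cat": "noun",
    "sees": "verb", "chases": "verb",
}


def category_pattern(sentence, lexicon):
    """Return the tuple of word categories, or None if any word is unknown."""
    pattern = []
    for word in sentence.lower().split():
        category = lexicon.get(word)
        if category is None:
            return None
        pattern.append(category)
    return tuple(pattern)


def learn_rule(grammar, sentence, lexicon):
    """Record the pattern of a user-asserted good sentence in the grammar.

    grammar is a set of category patterns. Returns the learned pattern,
    or None when the sentence contains a word missing from the lexicon.
    """
    pattern = category_pattern(sentence, lexicon)
    if pattern is not None:
        grammar.add(pattern)
    return pattern


def accepts(grammar, sentence, lexicon):
    """True when the sentence's category pattern is a known rule."""
    return category_pattern(sentence, lexicon) in grammar
```

After a single call such as learn_rule(grammar, "the dog sees a cat", LEXICON), the sketch also accepts "a cat chases the dog", because both sentences share one category pattern. The generalization comes entirely from the lexicon; nothing here addresses the harder second task of reconstruction.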

How might rules for correct sentence reconstruction be created ? Consider this idea. Let's suppose a user types into LINGER a sentence which he knows is a good sentence, but the system judges it to be flawed. The person asserts the sentence is correct and a common form (as opposed to an idiom). Intelligence will be required in a LINGER-like system to enable it to elicit from a user his inarticulate grammatical knowledge. Example based learning is an area of artificial intelligence research that could be of help with this problem, but the problem is more difficult and richer in the case of a LINGER-like system. The positive-example cases are relatively straightforward. The harder challenge will be in developing rules for reconstructing well formed sentences from errors. One way to do that would be to introduce examples of sentences that are "near misses" [18]. The LINGER-like system would first determine the structure of the newly exemplified well-formed sentence and then inquire of the user what would be possible variations from the sentence that would count as ungrammatical. These "near misses" would then be used to establish patterns which, when matched, it would be legitimate and reasonable to then consider as deviations from an intended correct form. This would argue that the grammar related modules of future LINGER-like systems should contain components such as these:

Such a structure would require a significant redesign of LINGER. A system with such capabilities, however, would be capable of functioning as a dual purpose learning environment. Constructing such a system will be a major challenge, but the promised new modes of language learning and instructor skill-enhancement argue that the effort will be justified by the result.
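The "near miss" proposal of this section can be sketched in a few lines of Python. The sketch is an illustration under strong assumptions, not a design for LINGER: it handles only word-order errors, it represents sentence structure as a bare sequence of word categories, and the lexicon, the class name NearMissStore and its methods are all hypothetical. The user supplies a correct sentence together with an ungrammatical variation; the system records the deviant pattern and the form it should be repaired to, and later reorders the words of any input that matches the deviant pattern.

```python
from collections import defaultdict, deque

# Hypothetical Spanish lexicon, for illustration only.
LEXICON = {
    "la": "det", "el": "det",
    "casa": "noun", "perro": "noun",
    "blanca": "adj", "negro": "adj",
}


def tag(sentence, lexicon):
    """Return [(word, category), ...]; raises KeyError on an unknown word."""
    return [(word, lexicon[word]) for word in sentence.lower().split()]


class NearMissStore:
    """Maps deviant category patterns to the correct patterns they miss."""

    def __init__(self, lexicon):
        self.lexicon = lexicon
        self.repairs = {}  # deviant pattern -> intended correct pattern

    def add_near_miss(self, correct_sentence, near_miss_sentence):
        """Record a user-supplied ungrammatical variation of a good sentence."""
        correct = tuple(c for _, c in tag(correct_sentence, self.lexicon))
        deviant = tuple(c for _, c in tag(near_miss_sentence, self.lexicon))
        self.repairs[deviant] = correct

    def reconstruct(self, sentence):
        """Return a reordered sentence, or None if no repair is known."""
        tagged = tag(sentence, self.lexicon)
        deviant = tuple(c for _, c in tagged)
        target = self.repairs.get(deviant)
        if target is None or sorted(target) != sorted(deviant):
            return None  # this sketch repairs word-order errors only
        pools = defaultdict(deque)
        for word, category in tagged:
            pools[category].append(word)
        # Fill each slot of the correct pattern with the next unused word
        # of the matching category.
        return " ".join(pools[category].popleft() for category in target)
```

For example, once "la blanca casa" has been recorded as a near miss of "la casa blanca", the input "el negro perro" matches the same deviant pattern and is reconstructed as "el perro negro". Errors of agreement, omission or insertion would need a richer structural description than the flat category sequence assumed here.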

11. REFERENCES

Brown, H. D.: Principles of Language Learning and Teaching. Englewood Cliffs, NJ: Prentice-Hall, 1980. Second edition 1987
Chomsky, N.: Reflections on Language. New York: Pantheon 1975.
Feurzeig, W.: Algebra Slaves and Agents in a Logo Based Mathematics Curriculum. In: Artificial Intelligence and Education (Lawler and Yazdani, eds.) Norwood, NJ: Ablex Publishing 1987
Hamburger, H.: Foreign Language Tutoring and Learning Environment. (This volume.)
Handke, J.: WIZDOM: A Multiple-Purpose Language Tutoring System Based on AI-Techniques.
James, C.: Learner Language. Language Teaching 23, iv, (October, 1990): 205 - 13
Keniston, H.: Common Words in Spanish. Hispania 3, 85-96: (1920)
Keniston, H.: Spanish Idiom List. New York: Macmillan, 1929
Moe, A, C.; Hopkins, C.J.; & Rush, R.T.: The Vocabulary of First Grade Children. Springfield, IL: Charles Thomas 1982
Minsky, M; & Papert, S.: Perceptrons. Cambridge, MA: MIT Press 1988 (Second edition.)
Ramsey, M. M.: A Textbook of Modern Spanish. Revised by R. K. Spaulding. New York: Holt, Rinehart, Winston, 1894 / 1934 / 1956
Schank, R. & Colby, K.M.: Computer Models of Thought and Language. San Francisco: W. H. Freeman, 1973.
Selinker, L.: Interlanguage. IRAL, 10:209-231.
Winston, P.: Learning Structural Descriptions from Examples. In: The Psychology of Computer Vision, P. Winston (ed.). New York: McGraw-Hill 1975
Wolff, D.: Computers and Foreign Language Learning: Results of the Dusseldorf CALL Project.
Yazdani, M.: Multi-lingual Multimedia, Yazdani (Ed.) Intellect, 1995.
Yazdani, M.: An Artificial Intelligence Approach to Second Language Teaching.

Publication notes:

Written in 1989. Unpublished in this form.
A short form of this paper was published with the same title, under the single authorship of Lawler in the International Journal of Computer Aided Language Learning, 4, No. 1, pp. 46-52.
The short form of this paper was re-published in the AISBQ (summer 1991, no. 77), the Quarterly newsletter of AISB (Society for the Study of Artificial Intelligence and the Simulation of Behavior).
The short form of this paper was re-published in M. Yazdani (ed., 1993), Multilingual MultiMedia. Oxford, UK; Intellect Press. pp. 73-84.

Text notes:

One might ask whether a student can help a machine extend its command over a grammar which the student doesn't yet know. We believe so, and will turn to this issue near the end of this paper.
The LINGER system was developed by Yazdani and others at Exeter University in England. We are grateful to these colleagues for making available their system, including prototype dictionaries and grammars, and for providing help and guidance throughout this project.
For example, Minsky [10] argues inattention to the question of scaling is a serious weakness in the currently popular AI connectionist movement, one which undercuts the claims and vitiates some components of that research program.
LINGER does not undertake student modelling. Neither has it internal lessons nor teaching strategies.
The desire to limit the size of the dictionary to a single entry per word subsequently led Yazdani and his research group to develop an enhanced version of LINGER (referred to as "eL" for "Enhanced LINGER") wherein a morphological analyzer is added to the parsing routines of the shell [17]. One should not be surprised if this additional computing increases the processing time for entered strings.
Table entry format is M:SS (minutes:seconds). Timings were done on an 8 MB SE/30.
Two sentences having the same length but slightly different structure.
When we modified the LINGER shell code so that the first found entry in the dictionary terminated the search, the parser's output was not sensible. We decided not to pursue this line of system repair because, with the redesign of Enhanced LINGER to exploit morphological analysis, the original parser has been replaced by one based on different techniques [17].
The program "Correct Grammar" is a good example of what we consider a "style checker". That well-evaluated program, fast as it is, apparently checks for the existence of well-known errors but regularly fails to detect seriously malformed sentences, such as a novice would be expected to generate.
Certainly they will be learning that second language at a different age from that at which they learned their native language. Almost certainly as well, they will be learning language in a practical context quite different from that in which they learned their first language. Further, they will need to learn language in ways that would be quite different from those which served in their first language learning.
Theoreticians of various perspectives suggest that grammar instruction is not very important. Chomsky has argued that significant linguistic knowledge -- by which he means functional knowledge of the legitimate grammatical structures of language -- is innate [2]. Schank [12] and colleagues argue that their semantic decomposition of language permits language understanding without consideration of syntax, and that syntax is therefore unimportant for machine based language understanding and overvalued generally.
Regardless of the issue of cognitive reality of the grammar and regardless of any judgment that grammar is superfluous for machines in comprehension, a focus on grammar is not necessarily superfluous for instruction or exploration in language learning.
Unpublished. Developed by Professor Wm. Flint Smith, University of Syracuse.
See the expansion of this entry in Table III, immediately following.
Such expansions of the meaning of error list items are merely a summary and exemplification of the essential grammatical information more fully described in the classic grammars of the specific language. For Spanish, [11] is a good source of such information.
See Yazdani [16].
The Lisp-based techniques of Handke's WIZDOM system ([5]) provide an excellent example of how to achieve such an organization.
This phrase comes from Winston's well-known study "Learning Structural Descriptions from Examples", in Winston [14].