MA Thesis: Chapter 1: Introduction

Demystifying Computer Science: An Approach Using Interactive Multimedia
New York University, Gallatin School, MA Thesis

Chapter 2: Theory and Foundations

Computers and Cognition
Educational Theories
Children and Computers
Learning About Computers
Naive Models

Computers and Cognition

Cognitive science has provided a foundation for those experimenting with new educational media; it is up to these educational researchers to find practical applications for the theories of learning developed by cognitive scientists. The field of cognitive science explores the way information is received, organized, retained and used by the brain. The research in this area often overlaps research in psychology and education, but one thing that distinguishes Cognitive Science from these other disciplines is its close relationship to the field of Artificial Intelligence. Cognitive Scientists have traditionally used AI as the experimental proving ground for theories of cognition - they conjecture that the programming structures used in AI are analogous to ways that the human brain encodes information.

In publications in 1959 and 1972 Allen Newell and Herbert A. Simon presented theory, methodology and experimental results that explained human problem-solving behavior as the product of laws involving symbol manipulation. According to the the model of thinking used by Newell, Simon, and others, symbols are not only used for external human communication in our written language; they are the internal tokens that govern how our minds work. Newell and Simon believed that problem solving occurs in a problem space; this consists of a set of states for any given moment and a set of operators which describe how the situation can be changed. In order for the thinker to discover which sequence of steps will ultimately lead to a solution, the problem solver must have a set of rules which will steer it toward the most promising paths. These condition-action rules are known as "productions;" the idea in a production system is to encode each bit of knowledge as a condition-action rule (or an if-then case). "General Problem Solver," an AI program produced by Newell and Simon, embodied these ideas; their influential theories on the architecture of thinking set a context for debate in the 1970s.

The description of "General Problem Solver" bears striking parallels to the Turing machine described in the previous chapter but ten years later a different computing model - parallel distributed processing - was inspiring cognitive scientists. In the 1980s connectionist models challenged the dominance of symbolic models. Connectionist models resemble the nervous system. Such models are computational systems made from many (millions or more) simple, interconnected computational devices that exchange signals. Such interconnected devices are often referred to as neural networks. Connectionists use a model of cognition based on activities of nodes, and strengths of connections between nodes. Learning in a connectionist model occurs associatively; that is, connection strengths tend to increase when connecting nodes are associated by activity between these two nodes. Information in memory is distributed over a network consisting of a large number of units. A given idea in memory, such as "cat," corresponds to a particular pattern of activity or state of the units in the network. Other ideas in memory may be represented by the same network and many of the very same units, but the pattern of activation will be different. Unlike conventional computers, neural networks are not programmed - they learn by example. Connectionist architecture has been most successfully used to solve problems that cannot be described by a set of rules. Experimental applications using neural networks have been concentrated in the very areas that serial computing found impractical and unsuitable such as sensory signal processing, image recognition, machine vision, robotics and sensor-motor control, speech recognition and synthesis, and natural language.

Quite early on in the history of computing a surprising discovery was made regarding the kinds of things computers can be made to do easily as compared with the kinds of things people do easily. It turned out that it was very difficult to program a computer to do things a child could do easily such as identify a half-hidden face or walk up steps. Paradoxically, it was easy to program a computer to solve complex mathematical problems and to store and retrieve vast quantities of precise data. One explanation for this paradox is that the knowledge children use proficiently exists below the level of consciousness and as such is inaccessible to the programmer (Minsky 1985). Mathematical problem solving, on the other hand, is learned at a stage of development when we are able to comprehend and articulate what is required to accomplish these tasks. Because of this paradox, cognitive scientists have traditionally had more success mimicking the behavior of experts (with expert systems) than the behavior of children (such as image recognition and other basic skills). As a direct consequence of these limitations scientists have spent a great deal of their time trying to understand and model human decision making.

One of the mechanisms identified in the acquisition of new knowledge is reconciliation of cognitive dissonances. In a review of Newell's work on, Roger Schank, another cognitive scientist, says that he shares Newell's belief that the phenomenon of cognitive dissonance can be of primary importance in prompting learning. Or as he puts it "nearly all learning is prompted by failures in processing" (Schank, 132). When expectations are violated we are curious about why there is discordance between these expectations and reality - we try to construct an explanation. This is an efficient system because it allows us to focus cognitive effort only those areas that caused the failure and, hence, are in need of modification.

Psychologists and cognitive scientists have studied and theorized about the cognitive process of creating knowledge structures. Developmental psychologist Jerome Bruner explains the significance of structuring knowledge in this way:

Knowledge is a model we construct to give meaning and structure to regularities in experience. The organizing ideas of any body of knowledge are inventions for rendering experience economical and connected... The power of great organizing concepts is in large part that they permit us to understand and sometimes to predict or change the world in which we live (Bruner 1962, 120).

Marvin Minsky, a Cognitive Scientist at M.I.T., describes the primary cognitive structure as a "frame." These frames are built up as a result of natural experiences or formal education; deciding which frame to apply to a particular problem solving situation involves what cognitive psychologists term "metacognition" or thinking about thinking. Cognitive scientists have determined that one of the most effective tools in learning is a knowledge structure they describe as a mental image or mental model. Author and Professor of Education David Perkins describes a mental model as "a holistic highly integrated kind of knowledge" - it may be any unified mental representation that helps us work with a topic or subject (Perkins 1992, 80). A recent study by educational psychologist Richard Mayer indicated that the use of mental models promotes active use of knowledge. For this study a series of experiments were conducted in which science concepts were taught both conventionally and accompanied by some sort of conceptual model. Typically the model was a visual representation that illustrated, in some simple fashion, the meaning of the concept and how it worked. Mayer found that the verbatim recall of the lesson did not vary but that students exposed to the model had a better recall of the gist of the lesson and performed better when using the concept to solve problems (Perkins 1992, 90).

At the the University of California, Berkeley, researcher Barbara White has developed instructional computing approaches for the teaching of science and engineering. She notes that science and engineering are two of the most feared academic subjects: "Learning these subjects is perceived as an abstract and difficult task," and as such "there is a need to create instructional approaches that make these disciplines interesting and accessible to a wide range of students" (White 1993, 177). Two of her recent projects, BQUEST (about circuits and transistors) and ThinkerTools (about Newtonian mechanics in physics), used "intermediate causal models" to link abstract to more concrete information. "The goal in both projects was to present domain knowledge in a form that enables students to construct powerful mental models by building incrementally on their prior knowledge" (White 1993, 247). White conjectured, based on these two courses, that several properties are crucial to the learnability and cognitive utility of causal models. A brief summary of the most relevant of these properties follows:

They model only the concepts and processes that play a crucial role in understanding domain phenomena.
They represent domain phenomena in simplified form.
They enable students to parse the behavior of a physical system into a sequence of discrete causal events.
They incorporate visual representations that help students to see the causal relationships within the domain.
They are modifiable so that they can be transformed into more sophisticated models.

In White's study of the results of the ThinkerTools curriculum implementation she saw significant increases in the ability to transfer ThinkerTools learned knowledge to new contexts. Model-based reasoning also produced better results for learning circuits and transistors than conventional training methods when learning was measured by transfer performance on related, but not-trained, tasks.

Educational theories

Teachers who introduce innovative methods of instruction are, of course, affected by the dominant educational theories of their day. Educational theory between 1960 and 1980 was dominated by Jean Piaget's developmental perspective. Piaget theorized that youngsters pass though several develop mental stages culminating in adolescence; once that stage is reached thinking may be formal, logical and can cross diverse disciplines. From ages seven through ten physical or mental actions remain restricted to physical objects which can at least potentially be manipulated; Piaget terms these mental actions "concrete" operations. After early adolescence the normal child becomes capable of formal mental operations. From that point on he can operate not only upon objects themselves, and not only upon mental images or models of these objects, but also upon words, symbols, or strings of symbols (e.g., equations) that stand for objects, and for actions upon objects. Although research in the 60s, 70s, 80s confirmed much of Piaget's theory it also suggested that stage advance could be accelerated by using a variety of instructional methods.

The constructivist approach, which grew from Piaget's work, is often invoked in discussing educational applications for computing. Construc tivism treats the learner as an active agent who "constructs meanings" in response to instructional situations. The effort centered rather than ability centered model of learning presented by Constructivism asks the student to struggle with understanding, formulate tentative theories, and then to test those theories out on further instances; the student must discover and uncover the material - the teacher is only a coach.

A number of other learning strategies deserve at least a brief mention as they seem to be popping up in discussions of educational computing; these are multiple intelligences, cooperative learning, peer collaboration, situated learning, and story-telling. The theory of multiple intelligences is advanced by developmental psychologist Howard Gardner in his book Frames of Mind. The principal assumption of this study is that individuals are not all alike in their cognitive potentials and intellectual styles and that education can be more effective if it is tailored to the abilities and the needs of the particular individuals involved. According to Gardner there are educational implications to the theory of multiple intelligences; "it should be possible" he suggests, "to identify an individual's intellectual profile (or proclivities) at an early age and then draw upon this knowledge to enhance that person's educational opportunities and options" (Gardner 1993, 10). Of the seven intelligences identified by Gardner he notes two, the logical-mathematical and linguistic intelligences, as specifically necessary for working with computers and programming.

Cooperative learning requires that different people in a group be given different tasks and/or the task of teaching others in the group their particular specialty - knowledge is distributed within the group. Peer collaboration, on the other hand, calls for pairs or small groups of students to work on the same task simultaneously so that they can work through puzzles together. Peer collaboration is recommended in instances where novel or complex concepts are being presented (Perkins 1992, 64). Cognitive psychologists including Allan Collins, John Seely Brown, James Greeno, and Lauren Resnick have attacked the decontextualized character of the classroom and suggested "situated learning" as a possible corrective. They argue that what happens in school mathematics, writing, or the study of history, bears little resemblance to what mathematicians, authors, or historians do - nor does it reflect the non-professional uses for these disciplines (e.g., filling out tax forms, understanding current events). They point out that in authentic contexts learning gets supported in ways that are absent from the classroom; in these authentic contexts, support may come from apprenticeship-like relationships or from a social network of colleagues (Perkins 1992, 67). Situated learning is often mentioned in educational computing articles because computer simulations have been suggested as a method for approximating authentic contexts for learning. The practice of embedding instruction in stories or narratives has its history in the "Sufi tales" of the Islamic world. Information communicated through story telling seems to be more easily retained; the story provides its own logical structure to assist the learner in retention. 2ⁿ: The Power of Numbers, a short cartoon film by Charles and Ray Eames about binary numbers, is an example of how such a technique may be used.

Children and Computers

By the early 1980s several researchers in the fields of psychology and education had shown an interest in the particular issues that arose when children used computers. This interest generated studies which evaluated how children interacted with computers.

One such study was published in book form by Sherry Turkle, an M.I.T. sociologist. The Second Self (1985) describes Turkle's ethnographic study of children and their relationships with computers and computer toys. The data for this study consisted of numerous tape-recorded encounters with children as they interacted with these relatively new devices. Turkle noted that children would talk to computers, attribute psychological motives to them, and generally treat them as if they were alive. She suggested that these psychological interpretations of the computer were encouraged by its physical opacity:

The impact of the computer is constrained by its physical realities. One such reality is the machine's physical opacity. If you open a computer or a computer toy, you see no gears that turn, no levers that move, no tubes that glow. Most often, you see some wires and one black chip. Children faced with wires and a chip, and driven by their need to ask how things work, can find no simple physical explanation. Even with considerable sophistication, the workings of the computer present no easy analogies with objects or processes that came before, except for analogies with people and their mental processes. In the world of children and adults, the physical opacity of the machine encourages it to be talked about and thought about in psychological terms (Turkle 1984, 22).

With traditional objects like bicycles and wind up toys, physics becomes the framework for understanding how the pedals, gears and springs work together. When a child begins to explain the world in mechanical terms it is recognized as a developmental step forward. Children at this stage try to use this same kind of reasoning with computer toys and computers - they try to understand how these work in physical terms but there is a difference:

Computer toys are for the most part sealed, but even if one takes off the plastic back and breaks inside, all that the most persistently curious child finds is a chip or two, some batteries, and some wire. Physically these objects are opaque. They are frustrating (Turkle 1984, 60).

In a study conducted at Bank Street College in 1984, published as Structured Interviews on Children's Conceptions of Computers, researchers sampled reactions from two classrooms using computers; one classroom of eight and nine year olds and one of eleven and twelve year olds. Children were interviewed individually at the beginning and end of a year of computer exposure. Researchers hoped to identify misconceptions regarding computers that might interfere with learning so that these could be addressed by instruction. These researchers predicted that understanding would be drawn by analogy from familiar devices. The research group defined general computer literacy as knowledge of the distinction between hardware and software and recognition of information capacity (differences between RAM, ROM and data storage). They sought to discover whether the students would understand the difference between hardware and software or whether the computer would be viewed as a fused whole. In response to the question "what is a computer?" students in the Bank Street study classified the computer by physical attributes (e.g., number of buttons, presence of a screen) and by function (used for games or used for work), but most did not separate the computer from its software in defining it. Researchers gathered that "These children overestimated computer power because they did not understand it to be conditional on programming" (Mawby et al. 1984, 33). Understanding, not surprisingly, varied by age; when asked to describe the functioning of a computer, younger children dwelled on the external parts and peripherals - they did not identify that any processing took place between the time input was entered through the keyboard and the moment it was displayed on the screen; older children, on the other hand, were aware of an intervening process between input and output. It almost seemed as though younger children viewed the computer as a natural object - when asked how it produces an answer they responded: "it just knows." Students were also asked questions about information storage and retrieval processes; that is, how "save" and "read" operations are carried out, and what happens inside the computer when these things are done. The precise location of the information inside the computers was an issue that concerned many of the children. The study's authors suggested that children are right to be concerned about this; they wrote, "Understanding `where the information is' by forming an adequate functional model of the filing system is important in guiding operations with peripheral devices." They reported that "Our limited results suggest that this information is not spontaneously induced by children who are just beginning to work with computers" (Mawby et al. 1984, 18). The study concludes:

The computational core of the computer is not visible, and children have few good analogues with which to grasp it. The keyboard, screen and disk drive are more salient and familiar parts of the computer, and children initially focus on the perceptual and user-action features of these components... The consequences of children's inadequate concepts of program and central processor are manifold. Some children fundamentally misconceive the problem-solving power and limits of the computer. Some treat the computer as a display typewriter with disk save which cannot help people solve problems because the user must type in all the answers. With a better grasp of program and processor, children would have a better idea of how computers can be used in problem solving (Mawby 1984, 40).

Both Sherry Turkle's book and this Bank Street study suggest that the opacity of computer technology frustrates the attempts of children to understand how computers work. The Bank Street study cites specific ways that this lack of understanding may affect a child's attempts to use the machine productively.

Learning About Computers

In The Second Self, Sherry Turkle suggests that computers are often the focus of anthropomorphism because the computer's opacity leads curious users to a mysterious dead end. As I noted in the previous chapter, historical trends in the development of computer science have contributed to this opacity. Due to miniaturization, visual examination of a microchip yields no useful information about how it operates. Furthermore, the wide variety of user friendly, shrinkwrapped software packages available has eliminated the need for most users to understand how computers are programmed. The opacity of the computer produces cognitive dissonance in curious users; told that computers only follow directions, they find that their own direct observations contradict this statement since most instructions are invisible to the user. Unfortunately, the user's curiosity can be frustrated even further if they try to go around the immediate dead end by seeking additional information. The way the CPU works is often glossed over in books and programs about computers; when it is described too much detailed information is often given. Donald Norman, a cognitive scientist and interface specialist, suggests why this occurs in his description of how we represent the world in our artifacts.[2] He notes that we concentrate our attention only upon the information that is represented within these artifacts. Norman concludes that "anything not present in the representation can conveniently be ignored. In actuality, things left out are mostly things we do not know how to represent, which is not the same as things of little importance" (Norman 1993, 52). I would speculate that the lack of attention given to this area of computer literacy does not reflect its lack of importance, but rather the difficulties of presenting it - and the difficulty instructors have in understanding it themselves.

Why do many people, including instructors, find the subject matter so difficult? The methods used to describe microprocessor functioning fall short because they fail to connect the computer's processes to existing knowledge structures. For example, my description of computer processing basics in chapter one may have confused those readers without mathematical preparation or a background in computer science. Cognitive Scientists (Bruner 1962; Schank 1984; Minsky 1986) have suggested that learning can best occur if the basic knowledge structures necessary for a discipline are already in place; or, as Minsky simply put it: "The secret of what anything means to us depends on how we've connected it to all the other things we know" (Minsky 1985, 64). This being the case, instruction must be based on a student's existing state of mental organization, or schema. The way knowledge is internally structured or organized has considerable impact on whether new learning will occur. In most situations, the laws of the physical world provides this framework for further exploration. Unfortunately, few lessons learned in the physical world can be applied to a computer environment; the rules governing computer behavior vary from machine to machine, program to program.

I've explained why it is impossible for computer users to spontaneously form an accurate model of computer processing through direct observation. I've also noticed one particular trap that seems to ensnare most instructors who try to describe or illustrate information processing. Many descriptions and representa tions of how computers work omit one important step in the processing of information. Schematic drawings of computer processes generally use 0s and 1s to trace the path of information moving through the gates and buses of the computer. When I studied those diagrams something made me uneasy and although it took me a while to isolate the problem, the cause seems obvious to me now; it never made sense to me that 0s and 1s - numbers which are intangible mathematical constructs - could pass through the physical reality of wires and circuits. A step had been left out of these descriptions: the encoding of numbers as electrical signals. For a computer to process information, that information must be encoded as electrical pulses; it is these sequences of pulses which symbolically represent all kinds of data. This is just one example of the problems that plague the modelling of computer processes. Describing such a complex system without the benefit of direct observation proves very difficult. The difficulty in constructing an accurate model of computer processing can have several side effects: younger children have a tendency to treat the unpredictability of the computer as an indication that it is alive while older children and adults tend to build inaccurate "naive" models of computer processes. In considering the assimilability of new models, one needs to take into account the structure of students' initial knowledge of the domain. There are two competing views of this knowledge in the cognitive science community. One view is that novices possess fragmentary bits and pieces of knowledge, the other is that they possess coherent models. In either case, major conceptual change is required. If their knowledge is fragmentary, then acquiring knowledge in the form of coherent models requires a major change in the structure of novices' knowledge. If novice knowledge is already in the form of incorrect but coherent models, then misconceptions embodied in these models will be difficult to overcome (White 1993).

Naive Models

Computers hide their actual processes; consequently, cause and effect testing that children use to create mental models cannot be relied upon. Direct experience with the computer reveals only input and output - the programmed instructions between these are obscured. It is likely that without knowledge of how computers really work, computer users will develop "naive models" of how computers work. Naive models are models which reflect cognitive biases. Cognitive scientists who study decision-making under situations of uncertainty have discovered that although people usually make remarkably good decisions, there are some areas in which they consistently and predictably err. These are called cognitive biases. A recent journal article, Cognitive Bias in Software Engineering notes that part of the problem is selective representation in memory: we remember the things that cause difficulty or cognitive dissonance. An example is the statement: "the phone always rings while I'm in the shower." The other side of this event (the phone doesn't ring while I'm in the shower) is not nearly as interesting, which means that it is not remembered with the same frequency. This is how we accumulate confirmatory evidence for salient but non-representative events (Stacy & Macmillan 1995, 61).

Naive models are mental models which evolve through personal observation; they develop due to the mind's tendency to try to identify patterns in novel situations and use them to predict behavior that may seem chaotic otherwise (Riley 1986). The problem with these models is that they are often incorrect and yet they tend to be surprisingly persistent; students will even misinterpret or distort information to fit their naive models (Clement, 1983 and diSessa 1983, in Norman 1986). So, without an accurate model, the computer user is saddled with a naive model that is likely to be flawed and which may affect their ability to use the computer. When examining the behavior of a computer system, an understanding of the basic rules of the system is desirable but an accurate mental model cannot be spontaneously generated by direct observation alone. A flawed model of computing may be sufficient to navigate a single program but the flaws will be revealed when the user tries to apply the model in new and unfamiliar situations. For example, novice computer users are often shocked to discover that they have lost hours of work when their computer crashes - this can happen if the document they've been working with has not been saved to a disk. Some programs, like Hypercard, automatically save all changes to a document without any special prompting; however, the majority of programs require that the user select "Save" to save changes.

Metaphor is another cognitive device that can be used to integrate new material. Metaphors allow us to take our knowledge of familiar concrete objects and experiences, and use it to give structure to more abstract concepts. Many metaphors are used (and mixed) to describe computer processing, but ironically, computers are most often compared to the human brain - probably the least understood piece of machinery on the planet. How do the brain and computer actually compare? Cognitive and computer scientists have invested considerable time studying this relationship in an effort either to understand the brain or to improve the computer. John von Neumann's 1958 book The Computer and the Brain is such an attempt. von Neumann applied rules based on how the artificial brain works in an effort to understand how the human brain might work. Do the brain and computer really possess a uniquely close relationship? It is human nature to use metaphors and models to compare the invisible and intangible to familiar things. This process has often been applied to explain how the mind works. Thousands of years ago, Plato's metaphor for memory was an aviary; for him, capturing a memory was like trying to grasp a bird in flight. Many years later the mind was compared to the workings of a clock. The comparisons made were dependent on available technology - today's technology is the computer. But the comparison of computer to brain is one that is embedded in the very language we use to describe the computer and as such it is difficult to ignore. We refer to a computer's "memory," its "language" - and we sometimes even talk about how "smart" it is.

The metaphor of computer as brain can be helpful in explaining some aspects of how computers work. One example of this is David Pogue's explanation of the difference between RAM and disk storage in Mac For Dummies:

Human beings, for the most part, store information in one of two places. Either we retain something in our memory - or, if it's too much to remember, we write it down on a legal pad... Computers work pretty much the same way... They can either store what they know in their relatively pea-brained memory... or they can write it down. A computer writes stuff down on computer disks (Pogue 1993, 27).

However, this analogy can be tricky because, metaphorically speaking, if hardware is the brain, software is the mind. Predictably, this metaphor breaks down at points where we do not understand brain function; for example, neuropsychologists do not know how mind and brain interact. Or, as Decker and Hirshfield write in their introductory computer science textbook, "A computer is, of course a demonstrably physical object... How can such an object run a program, which after all, is really nothing but a collection of ideas?" The authors explain that "This is a simple version of a very old question: For centuries, philosophers have pondered the related problem of how brains can have thoughts" (Decker 1990, 221). In the human realm we usually dip into metaphysics to make this connection; however, in the computer world we cannot explain hardware/software interaction by ascribing a soul to the machine[3] and so the metaphor collapses. There appears to be no reliable metaphor that can be engaged to explain this crucial aspect of computing.

Next Chapter

Demystifying Computer Science: An Approach Using Interactive Multimedia New York University, Gallatin School, MA Thesis