This paper describes the process of automatically tracking user navigation through CALL hypermedia. Beginning with a discussion of the advantages and disadvantages of this research method compared to verbal protocols, personal observation and video recording, it then goes on to describe how logging is achieved, some of the practical problems associated with it, and the use of graphics to analyse and interpret the data obtained.
As computer assisted language learning (CALL) materials have become more popular, there has been a corresponding increase in research into the efficacy and use of these materials. At the same time, researchers in this area have realised that software can automatically track how users are interacting with it. For the researcher, software can be two things at once: it can be the material used by the language learner and also perform data collection without the student becoming aware of this. The ability of software to track what students are doing at the interface can be a powerful research tool, and it is the intention of this article to address the needs of researchers and others who may use this facility to interpret and visualise what raw data obtained in this way is telling them about students' interaction with hypermedia CALL materials.
Other authors (e.g. Liou 1995; Liou 2000; Hegelheimer and Chapelle 2000; Chun 2001) have described the use to which automatic tracking (logging) can be put and how it is done. This paper, based on experience gained from using logging for research (Moran, 1995; Moran, 2003), differs from these as it focuses on the practical software-related problems associated with using this research methodology and on how logged data on navigation through hypertext can be interpreted using graphic representations of user interaction.
Throughout this paper, the terms hypertext and hypermedia are used extensively. Hypertext is defined as screens (nodes) containing learning materials which are linked to one another so that the hypertext user can go from one node to another by clicking on a button or “hot-word” (text which the user can click on). Hypertext materials structured in this way are often described as “frame-based” as authoring programs often describe the nodes that materials writers create as frames. Hypermedia is defined as hypertext with multimedia functionality where activities are built around still pictures, animations, audio or video. In theory, every node can be linked to every other, but in reality this rarely happens. Learning materials usually have structure and a key feature of authored hypertext is the freedom students have or do not have to move between nodes.
Section 2 of the article will give a short background to the history and use of log file data in CALL together with a comparison with other research methods. Section 3 will briefly describe how logging is done and the major problems a researcher might encounter. Section 4 will then discuss how graphics can help data interpretation and provide examples of different types of diagrams.
Log files, also known as audit trails or dribble files (Gay & Mazur, 1993), can record every decision (as represented by a click on a button or hyper-link) made by a user and the exact time of the decision. These log files therefore consist of what Preece et al. (1994) term time-stamped keypresses.
The literature describing the use of log files has quite a long history (see Curtin, Avner & Provenzano, 1981 for an early use of log files in language learning software) and can be categorised according to whether it discusses how to do it (for example, see Gay & Mazur, 1993; Hegelheimer & Chapelle, 2000), what can be gained from it (Liou, 1995; Liou, 2000), or whether it describes research results based on it (for example, see Hulstijn, 1993; Hulstijn, 2000; Lomicka, 1998; Manning, 1996; Liou, 1997; Chun, 2001). The purpose of the research may differ, but whether the aim is to log how the learner proceeds (navigates) through learning materials (Manning, 1996; Desmarais, Laurier, and Renie, 1998) or how the learner uses tools such as sound files (Harben, 1999; Chun, 2001) or on-line reference (Hulstijn, 1993; Lomicka, 1998; Liou, 1997; Chun, 2001), the raw data still consists of a record of the buttons that have been clicked and the exact times they were clicked; in other words, the data will simply be a list of time-stamped keypresses.
Personal observation and video recording can be used to obtain the same information as logging, but there are disadvantages to both of these methods. With personal observation, it is easy for the observer to be distracted or to see what she or he wants to see (Preece et al. 1994) while note-taking is limited by writing speed (DuFon 2002). Video recording is a permanent record and is as accurate as logging in addition to providing information about student behaviour such as intensity of attention, comfort and involvement of interlocutors (DuFon 2002). However, input and analysis of video data can be very time consuming and the presence of video equipment can make students self-conscious (Preece et al. 1994). If these methods are used, therefore, they are more usefully focused on recording the aspects of the interaction that they are most suited to, leaving the computer to do what it does best: record and store numbers and text.
Research in CALL: A Focus on Process Rather than Product
In recent years in CALL research, there has been a movement away from a focus on the results of using CALL materials, the product, to a focus on the learning process (Liou, 1997). Rather than simply ask if using software gets better learning results, the researcher's concern may be with the qualitative aspects of the learning experience (Chapelle, 1994; Oxford, Rivera-Castillo, Feyten & Nutta, 1998; Chun, 2001). For example, we might ask how students make use of the flexibility provided by software. This raises the question of how quantitative data can be used in research that focuses on process - research that may be qualitative in nature.
Research methodology in hypermedia CALL is informed by second language acquisition (SLA) and Human-Computer Interaction (HCI) research methods. Methods such as interviews and verbal protocols (see note 1), which are often central to process-focused research in SLA, are also common in CALL research. These methods have inherent weaknesses, however, stemming from worries that the data collection method has interfered with the process being investigated, or that the subjects are not doing what they are saying or vice versa (McDonough 1995). When the research is focused on SLA in a hypermedia CALL context, however, logged data can provide an important balance to the weaknesses of these techniques by matching what subjects say with what they actually do.
Like any other research method, logging has both strengths and weaknesses. Software that tracks user-interaction looks no different from any other software; the programmer has introduced code that records user actions, but has not changed the interface. Since logging is completely transparent to the user (i.e. the user is unaware of it), it does not interfere with the subject and has the further advantages of accuracy, immediacy and reliability (Liou, 1995; Liou, 2000). For these reasons, it might be assumed that it is a foolproof method of data collection, but this is not the case. Log file data is context-bound; it records nothing more nor less than user actions within a given program (Liou, 1995; Liou, 2000). For example, clicking on a button does not necessarily indicate the reason behind the decision; does a repetitive pattern of choices represent the establishment of a preferred method of learning or simply apathy (Lawless & Brown, 1997; Lawless & Kulikowich, 1996; Lawless & Kulikowich, 1998)? Conversely, as Hegelheimer and Chapelle (2000) point out, if a learner is not interested in accessing a resource, perhaps because the research is taking place in a laboratory situation, data will not be collected. If the researcher's aim is to investigate the cognitive processes that underlie student decisions, the context-bound nature of the data not only means results based on it may not be generalisable to other software, but also that logged data cannot directly tell us about these cognitive processes (Hammond et al. 1980). Therefore, as Liou (2000) argues, despite the reliability of logged data, the researcher may still have to infer the underlying reasons for learner behaviour.
The consequence of these weaknesses is that to make log files meaningful, they may have to be triangulated with other data collection techniques (Liou, 2000). At the same time, however, it must be said that this can be a powerful combination. When the researcher triangulates logging with other data collection methods such as verbal protocols or video, the resulting "thick descriptions" form a very detailed record of user behaviour (Gay & Mazur, 1993). This not only contributes to the reliability and validity of the research but also provides both quantitative and qualitative insight into the learning processes taking place (Preece et al. 1994; see note 2).
Figure 1 below illustrates the basic logging process. The key points are that to move to another screen in a hypertext, access a resource such as a video clip, or input an answer (either text input or clicking on a multiple choice answer), the student has to click on a button. This button must have an ID or name which is then recorded by the programming script which carries out the button's function (e.g. play a video clip). At any time before or during exit from the program, recorded data can be sent to a text file. After the student has finished, the researcher can copy and paste the data from the text file to a spreadsheet or statistics application. The following discussion assumes that a spreadsheet is used as it may be necessary to do further calculations before transferring data to a statistics package. Calculations can be done within the program before the data is sent to the text file, but whether or not this is done depends on the situation.
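The logging process described above could be sketched roughly as follows. This is a minimal illustration in Python, not the scripting language of the authoring program actually used; the names `ClickLogger`, `record`, `write_to` and `handle_click`, as well as the button IDs, are hypothetical and chosen only for the example.

```python
import io
import time
from datetime import datetime


class ClickLogger:
    """Keeps a list of time-stamped button clicks (hypothetical helper)."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable clock, useful when piloting
        self.events = []         # (timestamp in seconds, button ID)

    def record(self, button_id):
        """Store the ID of the clicked button with the current time."""
        self.events.append((self._clock(), button_id))

    def write_to(self, stream):
        """Send the recorded data to a text stream, one click per line."""
        for timestamp, button_id in self.events:
            stamp = datetime.fromtimestamp(timestamp).strftime("%Y-%m-%d %H:%M:%S")
            stream.write(f"{stamp}\t{button_id}\n")


def handle_click(logger, button_id, action):
    """Record the click, then carry out the button's own function."""
    logger.record(button_id)
    return action()


# Illustrative session: two clicks, then the data is sent to a text stream.
logger = ClickLogger()
handle_click(logger, "play_video", lambda: None)
handle_click(logger, "next_screen", lambda: None)

log_text = io.StringIO()          # stands in for the text file
logger.write_to(log_text)
logged_lines = log_text.getvalue().splitlines()
```

In this sketch the interface code is untouched apart from the call to `record`, which mirrors the point that logging need not change what the student sees.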
Firstly, apart from having to do the programming for logging, the researcher has to decide how to save the data in a format that can be conveniently analysed, or at least converted to something analysable. This involves deciding on the form in which the information is saved in the text file and formatting the spreadsheet. Figures 2 and 3 show two examples of logged data from a text file. The data shown in Figure 2 is easy to understand, but it would have to be coded somehow for a spreadsheet to be able to do much with it, entailing extra work for the researcher. On the other hand, the data shown in Figure 3 is meaningless until it is pasted into a spreadsheet and the researcher can read the column or row headings. Note also that both examples show tab characters between data items. These are inserted by the program when it sends the data to the text file so that when pasted into the spreadsheet, each item goes into a separate cell in a row. This requires that the spreadsheet is carefully prepared with formulae ready to do any necessary calculations and column headings that match the data. If data is intended to go into a column rather than a row, line breaks have to be inserted between each item.
This shows that the user is doing this on February 25 at 15:33:21. S/he is studying the word “tabloid” and has stated that s/he does not know this word. S/he then goes through a series of three screens before continuing to another word (Go To Nx). Other numbers indicate the number of times a sound file of the word is accessed. Arrows represent tab characters.
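The difference between tab-separated rows and line-break-separated columns can be illustrated with a short Python sketch. The function names and the record below are hypothetical: the date, time and word echo the example just described, while the rating and the sound-file count are invented for illustration.

```python
import csv
import io


def as_row(items):
    """Join items with tabs so each lands in its own cell of one row."""
    return "\t".join(str(item) for item in items) + "\n"


def as_column(items):
    """Put a line break after each item so they fill one column."""
    return "".join(f"{item}\n" for item in items)


# Hypothetical record: date, time, word, self-rated knowledge, sound-file plays.
record = ["25/02", "15:33:21", "tabloid", 1, 2]

row_text = as_row(record)
column_text = as_column(record)

# Reading the row back with a tab delimiter mimics what a spreadsheet does
# when the line is pasted: one cell per item.
cells = next(csv.reader(io.StringIO(row_text), delimiter="\t"))
```

Whichever layout is chosen, the order of the items has to match the headings already prepared in the spreadsheet.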
Secondly, debugging and piloting are of key importance as it may not be immediately obvious that an item of data is either not being recorded or not being passed to the text file. Each item has to be systematically tested under every possible condition.
Thirdly, the researcher has to consider whether stand-alone computers or a network will be used, and where the text file will be saved. The simplest situation is the stand-alone computer. With this, the text file is saved on the hard drive; a floppy disk could be used, but these are slow and unreliable. With a network, there is a choice between saving on the hard drive or saving to a public network drive that allows all users to save to it. Saving on the hard drive may not be permitted as many networks do not allow users to do this. Where this is possible, the researcher has to go to each computer to get the data. Saving to a public drive saves time, but the network manager should be consulted as the drive might not be as public as it seems; the subjects may not have permission to write to the drive and, if so, the process will fail at the point where the text file is saved - often when the subjects exit the program.
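One way to guard against a failed save is to try more than one location in turn. The following Python sketch assumes a hypothetical function `save_log` and an invented file name; it is not part of the program described here, and the "network drive" in the illustration is simply a folder that does not exist.

```python
import os
import tempfile


def save_log(text, candidate_dirs, filename="session_log.txt"):
    """Append text to the first directory that accepts the write."""
    failures = []
    for directory in candidate_dirs:
        path = os.path.join(directory, filename)
        try:
            with open(path, "a", encoding="utf-8") as log_file:
                log_file.write(text)
            return path
        except OSError as error:      # e.g. no permission, drive missing
            failures.append(f"{path}: {error}")
    raise OSError("log could not be saved: " + "; ".join(failures))


# Illustration: the "network drive" is unavailable, so the local folder is used.
with tempfile.TemporaryDirectory() as local_dir:
    network_dir = os.path.join(local_dir, "unavailable_share")
    saved_path = save_log("15:33:21\tplay_video\n", [network_dir, local_dir])
    used_fallback = os.path.dirname(saved_path) == local_dir
    with open(saved_path, encoding="utf-8") as saved_file:
        saved_text = saved_file.read()
```

Running a check of this kind when the program starts, rather than only at exit, would reveal a permissions problem before any data has been collected.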
Something that has not been mentioned so far is the possibility of sending data directly to the application in which the data will be analysed. This most commonly uses a Windows function called dynamic data exchange (DDE). With large groups of subjects or when software is accessed by individuals when the researcher is not present, this is useful, but there are disadvantages to using it. Firstly, with small samples, it may be quicker to copy and paste manually from the text file to the spreadsheet rather than write, pilot and debug the programming for it. Secondly, there is extra preparation for the spreadsheet as target cells have to be named and macros may have to be written to handle the incoming data. Thirdly, the researcher becomes very dependent on the reliability and speed of the network. Finally, saving in a text file is still necessary as a safety measure. If the data is lost in transmission, the researcher can still retrieve it.
Research data from student behaviour in CALL hypertext lends itself to graphic representation for two reasons. Firstly, hypertext structure and the learning process can both be visualised in similar ways. Secondly, good use of graphics can help with practical problems encountered in interpreting research data from CALL hypermedia.
Hypertext and the Learning Process
Hypertext structure lends itself to being graphically represented with maps. A spatial metaphor is often used to represent hypertext structure as a physical space that users move around in. This view of hypertext complements Wittgenstein's landscape metaphor (Wittgenstein 1968, Jacobson et al. 1996) in which the acquisition of knowledge is compared to continually criss-crossing a landscape and gaining deeper understandings of the ground beneath.
From both an educational and a research perspective, hypermedia differs from traditional materials in that it can be seen as both the object and the process of learning (Norman 1994). The content of the nodes is the object of learning while the route the student navigates from node to node and the time spent at nodes reflect a series of decisions that the student makes to achieve the learning goal. In hypertext, therefore, learning can be seen as having direction, duration and, in the case of multiple use (i.e. an individual navigating through more than once and/or individuals navigating through the same software individually), volume.
There are two ways in which graphics can help interpretation of log file data. Firstly, if the researcher's focus is on how cognitive processes or strategies can be inferred from the nature, frequency, order, and timing of the choices made, then summarisation in graphic form may aid in making these inferences. Secondly, simply collecting a large amount of data from a few subjects does not necessarily mean that the researcher can do valid statistical analysis as this often depends on sample size and the number and type of variables being measured. However, graphic representation of navigational behaviour is a simple way to visualise what raw data is telling the researcher (Orey & Nelson, 1994) and is a practical solution to showing results of research where it has not been possible to acquire an adequate number of subjects to do valid statistical analyses. Careful diagramming of the pathways taken can indicate strong trends and highlight useful directions for future research.
This section will begin by introducing the software used and the example hypertext program. It will then show examples of diagrams illustrating navigation through frame-based hypertext, comparisons of navigation under varying experimental conditions, time spent at nodes, and access to resources on a single screen.
The hypertext application used to author the materials was Guide (1994) with a tracking facility programmed by the author. The diagrams were created using a flow chart program called Inspiration (Helfgott, Helfgott & Hoof, 2004) which is now up to version 7.5 and which allows great flexibility in creating and moving symbols and links around. When the diagrams are finished, it is a simple matter to insert them into a word processed document or a web page. The drawing facility in Microsoft Word can also be used, but it is not as easy to use as specialised flow chart programs.
The example program (see Appendix 1) was used for an investigation (Moran 2003) of vocabulary learning preferences in a CALL hypertext program. Subjects were presented with an individual word and, before practicing it, they were asked to rate their knowledge of that word on the following scale:
In looking for any trends in the data, the researcher was primarily concerned with:
The main problem for graphic presentation of subject behaviour was how to show navigation patterns by level of prior knowledge. The following sections illustrate how this problem was solved and focus on showing subject choices from one node, navigation patterns by level of prior knowledge, and summaries for quick comparison. Each of these illustrates how graphical display of logged data can aid in interpreting aspects of user interaction.
To show users' navigational choices from one node, a simple directional graph such as Figure 5 below can be used (Orey & Nelson, 1994). This illustrates the percentage of decisions made for each of three choices (Figure it Out, See the Definition, or Go to the Next Word). Thicker lines indicate higher percentages. The percentages are calculated using the number of decisions made at this level of prior knowledge. As long as choices are limited to two or three, a graph like this can clearly present language learning preferences or strategies.
Figure 6, here called a decision flow diagram, shows how many students chose to go from node to node at level 2 of prior knowledge and serves as an example of how to show language learning pathways taken through a hypertext structure. As with the User Choice diagrams, thicker lines indicate heavier traffic (higher percentages). These are also based on suggestions made by Orey & Nelson (1994).
The problem with this graphic was to decide on a method of calculating percentages of choices made. The method chosen was to calculate a percentage of the total number of navigational choices made at this level of prior knowledge. For example, if we look at choices made from the decision node above, the total number of navigation choices made at this level of prior knowledge (i.e. between all the nodes and to the next word) was 103. Fourteen of the students chose Figure It Out and 11 chose See the Definition, so the percentages are 13.59% (14/103) for the decision to the inductive node and 10.68% (11/103) for the decision to the deductive node. The actual number of decisions made is given together with the percentage because it is important to show what these calculations are based on.
The advantage of this calculation is that as the users work their way through the program, branching out through the hypertext structure, there is a trend for the “traffic” to become more and more diluted. If a certain pathway is more popular, it will stand out more in relation to the other possible pathways.
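The percentage calculation described above can be reproduced with a few lines of Python. The function name `choice_percentages` is hypothetical; the counts (14 and 11 out of 103) are the ones quoted in the text.

```python
def choice_percentages(choice_counts, total_choices):
    """Express each count as a percentage of all navigational choices."""
    return {
        label: round(100 * count / total_choices, 2)
        for label, count in choice_counts.items()
    }


# Figures quoted in the text: 103 navigational choices at this level in total.
from_decision_node = {"Figure It Out": 14, "See the Definition": 11}
percentages = choice_percentages(from_decision_node, total_choices=103)
# percentages holds 13.59 and 10.68, matching the values given above
```

Because the denominator is the total for the whole level rather than for one node, percentages from different nodes can be compared directly on the same diagram.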
Comparisons of navigation patterns under different experimental conditions can be made with the following type of graphic (see Figure 7). In this case, the comparison is between navigation patterns at different levels of prior knowledge of the vocabulary being practiced. This is produced by tracing the most common pathways shown by the decision flow diagrams. If there are equally popular pathways, they are both shown. Pathways that are within a 10% range of the most popular pathway at the same learning level are shown with a broken line. It was decided to do this because the number of subjects was so low that a difference of one subject choice could represent a large percentage.
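The rule for choosing line styles might be sketched as follows. The sketch rests on two assumptions that the text does not settle: that "within a 10% range" means within ten percentage points of the most popular pathway, and that the rule is applied to the pathways leaving one node at a time. The function name and the example percentages are hypothetical.

```python
def classify_pathways(percentages, margin=10.0):
    """Label each pathway from one node as 'solid', 'broken' or 'omitted'."""
    most_popular = max(percentages.values())
    styles = {}
    for pathway, share in percentages.items():
        if share == most_popular:
            styles[pathway] = "solid"       # most popular; ties are all shown
        elif most_popular - share <= margin:
            styles[pathway] = "broken"      # within the margin of the leader
        else:
            styles[pathway] = "omitted"     # left out of the summary diagram
    return styles


# Hypothetical percentages for the pathways leaving one node.
example = {
    "Pathway A": 40.0,
    "Pathway B": 40.0,
    "Pathway C": 32.5,
    "Pathway D": 12.0,
}
line_styles = classify_pathways(example)
```

With these invented figures, pathways A and B would be drawn with solid lines, C with a broken line, and D would be left out.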
Time Spent at Nodes
The example software used by the students was not designed to log time spent at specific nodes. However, in cases where the researcher does this, time spent can be shown using circles that vary in size according to the amount of time spent at nodes as in Figure 8 below.
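Where clicks are time-stamped, time at a node can be estimated as the interval between arriving at that node and the next click, and then converted to a circle size. The Python sketch below is illustrative only: the function names and event data are hypothetical, and scaling the radius so that circle area is proportional to time is an assumption of this sketch, not a method stated in the text.

```python
import math
from collections import defaultdict


def dwell_times(events):
    """Sum the seconds spent at each node.

    events is a chronological list of (seconds, node); the final event
    marks leaving the last node and contributes no time of its own.
    """
    totals = defaultdict(float)
    for (arrived, node), (left, _next_node) in zip(events, events[1:]):
        totals[node] += left - arrived
    return dict(totals)


def circle_radii(times, max_radius=50.0):
    """Scale radii so that circle area is proportional to time spent."""
    longest = max(times.values())
    return {
        node: max_radius * math.sqrt(seconds / longest)
        for node, seconds in times.items()
    }


# Hypothetical log: 10 seconds at one node, 40 at the next, then exit.
events = [(0, "Decision"), (10, "Definition"), (50, "Exit")]
times = dwell_times(events)
radii = circle_radii(times)
```

Scaling by area rather than radius avoids exaggerating long visits, since a circle with twice the radius has four times the area.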
To compare groups or individual accesses, concentric circles are useful for quick visual comparisons. Chavero et al. (1998) used this method to good effect in showing the time spent at nodes by individuals. In this case, they utilised concentric circles to demonstrate that students spent less time on successive visits to a node. Each return required less and less time and the circles became progressively smaller. Another use for concentric circles would be to show maximum and minimum times spent at nodes as well as average times.
Showing Access to Resources on a Single Screen
The examples given so far have illustrated navigation through a simple frame-based hypertext in which students move from one content node to another. However, multimedia hypertext is very likely to consist of activities and resources based around a single frame as in the following example (see Figure 9). This screen shot is taken from Listening 123, a program written for an MA portfolio (Yang, 2001) which logged user interaction. The screen contains a short video extract, a multiple choice question, and resources such as a hint or subtitles for the video extract. Students watch the video and try to answer the multiple choice questions.
Loosely based on the screen layout, the following diagram (see Figure 10) categorizes user interaction with Listening 123 according to whether the answer to the multiple choice question was correct or incorrect. The diagram shows how, after viewing the video, subjects who answered the multiple choice question correctly might go straight to the next screen or, alternatively, watch the video with subtitles. Those who answered incorrectly might access a hint before going on to look at the video again with subtitles and then going to the next screen.
Although data obtained through logging user interactions has its limitations, log files, especially in combination with other data collection methods, can provide a great deal of data on students' learning behaviours. Although the diagrams described here may not be directly applicable to other materials or research designs, the principle that a student's progress through hypermedia materials is a dynamic process that is amenable to graphic representation remains. Most importantly, however, the facility to unobtrusively observe and record student behaviour is a potent research tool that may point us towards new understandings of the language learning process.
1. Verbal protocols involve the subject "thinking aloud" about what they are doing and provide data that is concurrent with their own actions.
2. Preece et al. (1994) point out that attempting to synchronise video with logged data has the disadvantages of high cost of buying or creating the synchronising software and hardware and the production of a “daunting amount of data”.
Eddy Moran has a PhD focusing on the relationship between learners' beliefs about language learning and the choices they make in a CALL environment. He teaches CALL at post-graduate level at Durham University in the United Kingdom.