The Khan Academy platform enables powerful online courses in which students can watch videos, solve exercises, or earn badges. The platform provides an advanced learning analytics module with useful visualizations; nevertheless, it can be improved. In this paper, we describe ALAS-KA, an extension of the learning analytics support in the Khan Academy platform. We present an overview of the ALAS-KA architecture and report the different types of visualizations and information it provides, which were not previously available in the Khan Academy platform. ALAS-KA includes new visualizations both for the entire class and for individual students. The individual visualizations can be used to check students' learning styles based on all the available indicators. ALAS-KA visualizations help teachers and students make decisions in the learning process. The paper presents guidelines and examples to help teachers make these decisions, based on data from undergraduate courses where ALAS-KA was installed. These freshman courses (physics, chemistry, and mathematics) were developed at Universidad Carlos III de Madrid (UC3M) and were taken by more than 300 students.
Massive Open Online Courses (MOOCs) have grown to the point of becoming a new learning scenario that supports large numbers of students. Among current research efforts related to MOOCs, some study the application of well-known characteristics and technologies. One such characteristic is adaptation, which personalizes the MOOC experience to each learner's skills, objectives, and profile. Several adaptive educational systems have emphasized the advantages of including affective information in the learner profile. Our hypothesis, based on theoretical models for the appraisal of emotions, is that we can infer learners' emotions by analysing their actions with tools in the MOOC platform. We propose four models, each detecting an emotion known to correlate with learning gains, and we have implemented them in the Khan Academy platform. This article presents the four proposed models, the pedagogical theories supporting them, their implementation, and the results of a first user study.
Self-regulated learning (SRL) environments provide students with activities to improve their learning (e.g., by solving exercises), but they may also provide optional activities (e.g., changing an avatar image or setting goals) that students can freely decide whether and how to use. Few works have dealt with the use of optional activities in SRL environments. This paper therefore analyzes the use of optional activities in two case studies with an SRL approach. We found that the level of use of optional activities was low, with only 23.1 percent of students making use of some functionality, while the level of use of learning activities was higher. Optional activities not related to learning were used more. We also explored the behavior of students using some of the optional activities in the courses, such as setting goals and voting on comments, finding that students finished the goals they set more than 50 percent of the time and that they voted on their peers' comments positively. We also found that gender and the type of course can influence which optional activities are used. Moreover, the relationship between the use of optional activities and proficient exercises or learning gains is weak once third variables are controlled for, but we believe that optional activities might motivate students and indirectly produce better learning.
Current MOOC and SPOC platforms do not provide teachers with precise metrics that represent students' effectiveness with educational resources and activities. This work proposes and illustrates the application of the Precise Effectiveness Strategy (PES), a generic methodology for defining precise metrics that enable calculating the effectiveness of students when interacting with educational resources and activities in MOOCs and SPOCs, taking the particular aspects of the learning context into account. PES has been applied in a case study, calculating students' effectiveness when watching video lectures and solving parametric exercises in four SPOCs deployed on the Khan Academy platform. Different visualizations within and between courses are presented, combining the metrics defined following PES. We show how these visualizations can help teachers make quick and informed decisions in our case study, enabling comparison of a large number of students at a glance and a quick comparison of the four SPOCs divided by videos and exercises. The metrics can also help teachers understand the relationship between effectiveness and different behavioral patterns. Results from using PES in the case study revealed that the proposed effectiveness metrics had a moderate negative correlation with some behavioral patterns, such as recommendation listener or video avoider.
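PES defines its metrics per learning context, so the concrete formulas are specific to each course. As a hypothetical illustration of the kind of metric involved, the sketch below scores video effectiveness as the fraction of distinct video time a student actually covered; the function name, the weighting, and the data are assumptions, not the metric definitions from the paper.

```python
def video_effectiveness(watched_segments, video_length):
    """Fraction of a video's distinct seconds covered by a student.
    watched_segments: list of (start, end) second intervals, which may
    overlap (e.g., rewatched parts). Returns a value in [0, 1]."""
    covered = set()
    for start, end in watched_segments:
        # Count each second at most once, so rewatching does not inflate it.
        covered.update(range(int(start), int(end)))
    return len(covered) / video_length

# A student who watched 0-60 s twice and 90-120 s of a 120 s video:
print(video_effectiveness([(0, 60), (0, 60), (90, 120)], 120))  # -> 0.75
```

Counting distinct seconds rather than raw watch time is one way such a metric can separate genuine coverage from repeated replaying of the same fragment.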
The emergence of massive open online courses (MOOCs) has had a major impact on online education. However, learning analytics support for MOOCs still needs to improve to fulfill the requirements of instructors and students. In addition, the large number of learners in MOOCs poses challenges for learning analytics tools, such as scalability in terms of computing time and visualizations. In this work, we present different visualizations of our “Add-on of the learNing AnaLYtics Support for open Edx” (ANALYSE), a learning analytics tool that we have designed and implemented for Open edX based on MOOC features, teacher feedback, and pedagogical foundations. In addition, we provide a technical solution that addresses scalability at two levels: first, performance scalability, for which we propose an architecture for handling massive amounts of data within educational settings; and second, the representation of visualizations under massiveness conditions, together with advice on color usage and plot types. Finally, we provide some examples of how to use these visualizations to evaluate student performance and detect problems in resources.
The use of Massive Open Online Courses (MOOCs) is increasing worldwide and is bringing about a revolution in education. The application of MOOCs has technological but also pedagogical implications. MOOCs are usually driven by short video lessons and automatically corrected exercises, and the technological platforms can implement gamification or learning analytics techniques. However, much more analysis of the success or failure of these initiatives is required in order to know whether this new MOOC paradigm is appropriate for different learning situations. This work analyzes and reports whether the introduction of MOOC technology was beneficial in a case study with the Khan Academy platform at our university, with students in a remedial Physics course in engineering education. Results show that students improved their grades significantly when using MOOC technology, that student satisfaction was high regarding the experience and most of the provided features, that there were good levels of interaction with the platform (e.g., number of completed videos or proficient exercises), and that the activity distribution across the different topics and types of activities was appropriate.
Massive open online courses (MOOCs) have recently emerged as a revolution in education. Because of the huge number of users, it is difficult for teachers to provide personalized instruction, and learning analytics applications have emerged as a solution. At present, MOOC platforms provide little support for learning analytics visualizations, and a challenge is to provide useful and effective visualization applications about the learning process. In this paper we review the learning analytics functionality of Open edX and give an overview of our learning analytics application, ANALYSE. We present a usability and effectiveness evaluation of the ANALYSE tool with 40 students taking a Design of Telematics Applications course. The evaluation obtained very positive results: a System Usability Scale (SUS) score of 78.44/100, a perceived usefulness of the visualizations of 3.68/5, and an effectiveness ratio of 92/100 for the actions required of respondents. We therefore conclude that the implemented learning analytics application is usable and effective.
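The SUS score reported above is computed with Brooke's standard formula: odd-numbered (positively worded) items contribute their response minus one, even-numbered (negatively worded) items contribute five minus their response, and the sum is scaled by 2.5 to a 0-100 range. A minimal sketch, with illustrative responses rather than the actual survey data:

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0-100) from the ten
    Likert responses (each 1-5) of a single respondent, using Brooke's
    standard scoring: odd items add (r - 1), even items add (5 - r),
    and the total is multiplied by 2.5."""
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Illustrative respondent: fully positive on odd items, fully
# negative on even items, i.e., the best possible usability rating.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # -> 100.0
```

A study-level score such as 78.44/100 is the mean of these per-respondent scores.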
This paper presents a detailed study of a form of academic dishonesty that involves the use of multiple accounts for harvesting solutions in a Massive Open Online Course (MOOC). It is termed CAMEO – Copying Answers using Multiple Existences Online. A person using CAMEO sets up one or more harvesting accounts for collecting correct answers; these are then submitted in the user's master account for credit. The study has three main goals: determining the prevalence of CAMEO, studying its detailed characteristics, and inferring the motivation(s) for using it. For the physics course that we studied, about 10% of the certificate earners used this method to obtain more than 1% of their correct answers, and more than 3% of the certificate earners used it to obtain the majority (>50%) of their correct answers. We discuss two of the likely consequences of CAMEO: jeopardizing the value of MOOC certificates as academic credentials, and generating misleading conclusions in educational research. Based on our study, we suggest methods for reducing CAMEO. Although this study was conducted on a MOOC, CAMEO can be used in any learning environment that enables students to have multiple accounts.
Engineering degrees are often regarded as complex, and one common issue is that students struggle and feel discouraged during the learning process. Gamification is starting to play an important role in education, with the objective of engaging students and improving their motivation. One specific example is the use of badges. The analysis of users' interactions and behaviors with the badge system can be used to improve the learning process, e.g., by adapting the learning materials and giving game-based activities to students depending on their interest in badges. In this work we propose metrics that provide information regarding students' behavior with badges, including whether they are intentionally earning them, how concentrated their badge-earning is, and their time efficiency. We validate these metrics through an extensive analysis of 291 different students interacting with a local instance of Khan Academy in our freshman courses at Universidad Carlos III de Madrid. This analysis includes relationship mining between badge indicators and other indicators related to the learning process, the analysis of specific archetypal student profiles that represent a broader population, and clustering of students by their badge indicators with the objective of customizing learning experiences. We conclude by discussing the implications of the results for engineering education, providing guidelines on how instructors can take advantage of the findings and how researchers can replicate similar experiments in other contexts.
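Clustering students by badge indicators, as described above, can be sketched with k-means; the minimal pure-Python version below and the three indicator columns (intentionality, concentration, time efficiency) with their synthetic values are illustrative assumptions, not the study's algorithm or data.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal k-means over feature vectors (lists of floats).
    Returns the final centroids and the clustered points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared distance).
            i = min(range(k), key=lambda c: sum((a - b) ** 2
                                                for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Recompute each centroid as the mean of its cluster.
        centroids = [
            [sum(dim) / len(c) for dim in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Each student: [intentionality, concentration, time_efficiency] (invented).
students = [[0.9, 0.8, 0.7], [0.85, 0.9, 0.75], [0.1, 0.2, 0.3], [0.15, 0.1, 0.25]]
centroids, clusters = kmeans(students, k=2)
print([len(c) for c in clusters])  # two groups: badge-oriented vs. indifferent
```

In practice a library implementation with proper initialization and feature scaling would replace this sketch; the point is that students with similar badge-indicator profiles end up in the same group, which can then be targeted with customized learning experiences.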
The Universidad Carlos III de Madrid has been offering several face-to-face remedial courses for new students to review or learn concepts and practical skills that they should know before starting their degree programs. During 2012 and 2013, our university adopted MOOC-like technologies to support some of these courses so that a blended learning methodology could be applied in a particular educational context, i.e., by using SPOCs (Small Private Online Courses). This paper gathers a list of issues, challenges, and solutions encountered when implementing these SPOCs, and based on them proposes a design process for SPOC implementation. In addition, an evaluation of how the courses were used is presented, based on indicators such as the number of videos accessed, the number of exercises accessed, the number of videos completed, the number of exercises correctly solved, and the time spent on the platform.
One of the methods of cheating in online environments reported in the literature is CAMEO (Copying Answers using Multiple Existences Online), where harvesting accounts are used to obtain correct answers that are later submitted in the master account, which earns the student credit towards a certificate. In previous research we developed an algorithm to identify and label submissions cheated using the CAMEO method; that algorithm relied on the IP addresses of the submissions. In this study we use this tagged sample of submissions to i) compare the influence of student and problem characteristics on CAMEO and ii) build a random forest classifier that detects CAMEO submissions without relying on IP addresses, achieving sensitivity and specificity of 0.966 and 0.996, respectively. Finally, we analyze the importance of the different features of the model, finding that student features are the most important variables for the correct classification of CAMEO submissions and have more influence on CAMEO than problem features.
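The sensitivity and specificity figures reported above are standard confusion-matrix metrics; a minimal sketch of how they are computed from a classifier's predictions, with invented labels rather than the study's data:

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN): fraction of actual CAMEO
    submissions the classifier catches. Specificity = TN / (TN + FP):
    fraction of legitimate submissions it leaves alone.
    Labels: 1 = CAMEO submission, 0 = legitimate submission."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Invented example: 3 CAMEO and 5 legitimate submissions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(sensitivity_specificity(y_true, y_pred))  # -> (0.666..., 0.8)
```

Reporting both metrics matters here because CAMEO submissions are rare: a classifier that flags nothing would have perfect specificity but zero sensitivity.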
One of the most investigated questions in education is which factors or variables affect learning. The prediction of learning outcomes can be used to intervene with students in order to improve their learning process. Several studies have addressed the prediction of learning outcomes in intelligent tutoring systems with intensive use of exercises, but few have addressed this prediction in other web-based environments with intensive use not only of exercises but also, for example, of videos. In addition, most works on the prediction of learning outcomes are based on low-level indicators such as the number of accesses or the time spent on resources. In this paper, we approach the prediction of learning gains in an educational experience using a local instance of the Khan Academy platform with intensive use of exercises, taking into account not only low-level indicators but also higher-level indicators such as students' behaviours. Our proposed regression model is able to predict 68% of the variability in learning gains using six variables related to the learning process. We discuss these results, explaining the influence of each variable in the model and comparing the results with prediction models from other works.
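The "68% of the variability" figure is the coefficient of determination R² of the fitted regression. A one-variable ordinary-least-squares sketch with synthetic data (the paper's actual model uses six process variables, and the variable name here is an invented example):

```python
def r_squared(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return R²,
    the fraction of the variance of y explained by the fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope and intercept from the normal equations.
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic data: learning gain roughly linear in proficient exercises.
xs = [1, 2, 3, 4, 5, 6]
ys = [0.2, 0.5, 0.4, 0.7, 0.9, 0.8]
print(round(r_squared(xs, ys), 2))  # -> 0.83
```

With several predictors the same quantity is computed from the multivariate fit, which is how a statement like "six variables explain 68% of learning-gain variability" is obtained.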
When massive open online courses (MOOCs) first captured global attention in 2012, advocates imagined a disruptive transformation in postsecondary education. Video lectures from the world's best professors could be broadcast to the farthest reaches of the networked world, and students could demonstrate proficiency using innovative computer-graded assessments, even in places with limited access to traditional education. But after promising a reordering of higher education, we see the field instead coalescing around a different, much older business model: helping universities outsource their online master's degrees for professionals. To better understand the reasons for this shift, we highlight three patterns emerging from data on MOOCs provided by Harvard University and Massachusetts Institute of Technology (MIT) via the edX platform: The vast majority of MOOC learners never return after their first year, the growth in MOOC participation has been concentrated almost entirely in the world's most affluent countries, and the bane of MOOCs—low completion rates—has not improved over 6 years.
Processing low-level educational data in the form of user events and interactions, and converting them into information about the learning process that is both meaningful and interesting, presents a challenge. In this paper, we propose a set of high-level learning parameters relating to total use, efficient use, activity time distribution, gamification habits, and exercise-making habits, and provide the measures to calculate them from low-level data. We apply these parameters and measures in a real physics course with more than 100 students using the Khan Academy platform at Universidad Carlos III de Madrid, and show how they can be meaningful and useful for the learning process based on the results of this experience.
The Khan Academy platform enables powerful online courses in which students can watch videos, solve exercises, or earn badges, and it provides an advanced learning analytics module with useful visualizations for teachers and students. Nevertheless, this learning analytics support can be improved with recommendations and new, higher-level visualizations that help improve the learning process. In this paper, we describe our architecture for processing data from the Khan Academy platform in order to show new higher-level learning visualizations and recommendations. The elements of the architecture are presented and the design decisions justified. In addition, we explain some initial examples of new visualizations and recommendations for teachers and students, part of our extension of the learning analytics module for the Khan Academy platform. These examples use data from an undergraduate Physics course developed at Universidad Carlos III de Madrid with more than 100 students using the Khan Academy system.
Instructors and students have problems monitoring the learning process from low-level interactions in online courses because it is hard to make sense of raw data. In this paper we present a demonstration of the Add-on of the Learning Analytics Support in the Khan Academy platform (ALAS-KA). Our tool processes the raw data and transforms it into useful information that students and instructors can access through visualizations. ALAS-KA is an interactive tool that allows teachers and students to select the provided information by course and by type of information. The demonstration is illustrated with different examples based on real experimental data.
The emergence of platforms to support MOOCs (Massive Open Online Courses) strengthens the need for powerful learning analytics support, since teachers cannot keep track of so many students. However, learning analytics support in MOOC platforms is currently at an early stage. The edX platform, one of the most important MOOC platforms, has few learning analytics functionalities at present. In this paper, we analyze the learning analytics support given by the edX platform and the main initiatives to implement learning analytics in edX. We also present our initial steps to implement a learning analytics extension in edX, reviewing technical aspects, difficulties, solutions, the architecture, and the different elements involved. Finally, we present some new visualizations in the edX platform that help teachers and students understand the learning process.
The appearance of MOOCs has boosted the use of educational technology in all possible contexts. Universities are trying to understand this new phenomenon, while carrying out the first trials. Best practices are still scarce and will be developed in the coming months. In this paper, we present first experiences carried out at Universidad Carlos III de Madrid, both with MOOCs (Massive Open Online Courses) and with SPOCs (Small Private Online Courses), which are MOOC counterparts for internal use.
Virtual Learning Environments (VLEs) provide students with activities to improve their learning (e.g., reading texts, watching videos, or solving exercises). But VLEs usually also provide optional activities (e.g., changing an avatar profile or setting goals). Some of these have a connection with the learning process but are not directly devoted to learning concepts (e.g., setting goals). Few works have dealt with the use of optional activities and the relationships between these activities and other metrics in VLEs. This paper analyzes the use of optional activities at different levels in a specific case study with 291 students from three courses (physics, chemistry, and mathematics) using the Khan Academy platform. The level of use of the different types of optional activities is analyzed and compared to that of learning activities. In addition, the relationship between the usage of optional activities and different student behaviors and learning metrics is presented.
This work approaches the prediction of learning gains in an environment with intensive use of exercises and videos, specifically the Khan Academy platform. We propose a linear regression model that explains 57.4% of the variability in learning gains using four variables obtained from the low-level data generated by the students. Two of these variables are related to exercises (the number of proficient exercises and the average number of attempts per exercise), one is related to both videos and exercises (the total time spent on both), and only one is related to videos.
The emergence of Massive Open Online Courses (MOOCs) has had a highly disruptive effect on online education. One of the most widespread MOOC platforms is Open edX. Instructors and students of these courses need timely analytics tools that help them understand the learning process at any moment. In this direction we have developed the Add-on of learNing AnaLYtics Support for open Edx (ANALYSE), our learning analytics contribution for Open edX. In this demonstration paper we provide guidelines on how to use some of the ANALYSE video visualizations to detect problems in video resources, so that the learning process can be improved.
The use of badges in educational contexts is starting to gain popularity. However, many studies do not offer an extensive analysis of the results regarding the use of badges after the educational experiment is finished. In this work we evaluate the results of three courses (physics, chemistry, and mathematics) that we conducted using Khan Academy, with an extensive badge system and 291 different students. We analyze the distribution of badges per student, the different badge types, and which of them were awarded most often. We also explore the influence of factors such as problem difficulty and video length on the number of badges triggered by exercises and videos, respectively. We compare the results among the three courses, looking for possible explanations for the differences. Finally, we put the lessons learned into context and give recommendations so that our findings can be used by instructional designers and other researchers.
Education is being powered by technology in many ways. One of the main advantages is making use of data to improve the learning process. The massive open online course (MOOC) phenomenon went viral some years ago, and with it many different platforms emerged. However, most of them are proprietary solutions (e.g., Coursera, Udacity) and cannot be used by interested stakeholders. Open edX is currently the primary open-source application to support MOOCs, and the community using it is growing at a fast pace, with many interested institutions. Nevertheless, the learning analytics support of Open edX is still in its early stages. In this paper we present an overview and demonstration of ANALYSE, an open-source learning analytics tool for Open edX. ANALYSE currently includes 12 new visualizations that can be used by both instructors and students.
One of the most common gamification techniques in education is the use of badges as a reward for specific student actions. We propose two indicators to gain insight into students' intentionality towards earning badges and apply them to data from 291 students interacting with Khan Academy courses. The intentionality to earn badges was greater for repetitive badges, which may be related to the fact that these are easier to achieve. We provide the general distribution of students across these badge indicators, obtaining different student profiles that can be used for adaptation purposes.
The study presented in this paper deals with copying answers in MOOCs. Our findings show that a significant fraction of the certificate earners in the course that we studied used what we call harvesting accounts to find correct answers that they later submitted in their main account, the account for which they earned a certificate. In total, around 2.5% of the users who earned a certificate in the course obtained the majority of their points this way, and around 10% used it to some extent. This paper has two main goals: first, to define the phenomenon and demonstrate its severity; second, to characterize key factors within the course that affect it and to suggest possible remedies that are likely to decrease the amount of cheating. The immediate implication of this study is for MOOCs. However, we believe that the results generalize beyond MOOCs, since this strategy can be used in any learning environment that does not identify all registrants.
Online learning has become very popular over the last decade. However, many details remain unknown about the strategies that students follow while studying online. In this study, we focus on detecting 'invisible' collaboration ties between students in online learning environments. Specifically, the paper presents a method developed to detect student ties based on the temporal proximity of their assignment submissions. The paper reports on the findings of a study that used the proposed method to investigate the presence of close submitters in two different massive open online courses. The results show that most of the students (i.e., student user accounts) were grouped as couples, though some bigger communities were also detected. The study also compared the population detected by the algorithm with the rest of the user accounts and found that close submitters needed significantly less activity on the platform to achieve a certificate of completion in a MOOC. These results suggest that the detected close submitters were collaborating, or even engaging in unethical behaviors, which facilitated their path to a certificate. However, more work is required to specify the various strategies adopted by close submitters and the possible associations between the user accounts.
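The core idea of flagging account pairs by submission-time proximity can be sketched as follows; the time threshold, the minimum number of matches, and the data are illustrative assumptions, not the parameters of the study's actual method.

```python
from itertools import combinations

def close_submitters(submissions, max_gap=60.0, min_matches=3):
    """Flag account pairs whose submissions to the same assignments
    repeatedly fall within max_gap seconds of each other.
    submissions: dict mapping account -> {assignment: timestamp}."""
    pairs = []
    for a, b in combinations(submissions, 2):
        shared = submissions[a].keys() & submissions[b].keys()
        # Count assignments where the two accounts submitted close in time.
        matches = sum(
            1 for q in shared
            if abs(submissions[a][q] - submissions[b][q]) <= max_gap
        )
        if matches >= min_matches:
            pairs.append((a, b))
    return pairs

subs = {
    "u1": {"q1": 100.0, "q2": 500.0, "q3": 900.0},
    "u2": {"q1": 130.0, "q2": 520.0, "q3": 910.0},  # always close to u1
    "u3": {"q1": 5000.0, "q2": 9000.0, "q3": 12000.0},
}
print(close_submitters(subs))  # -> [('u1', 'u2')]
```

Requiring several near-simultaneous submissions, rather than one, is what separates systematic closeness from a coincidental overlap between two independent students.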
The emergence of MOOCs (Massive Open Online Courses) makes available large amounts of data about students' interaction with online educational platforms, enabling predictions of students' future learning outcomes based on these interactions. The prediction of certificate accomplishment can enable the early detection of students at risk, so that interventions can be performed before it is too late. This study applies different machine learning techniques to predict which students are going to earn a certificate during different timeframes, with the purpose of analyzing how the quality metrics change as the models have more data available. Of the four machine learning techniques applied, we finally choose a boosted trees model, which provides stable predictions over the weeks with good quality metrics. We determine the variables that are most important for the prediction and how they change during the weeks of the course.
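The per-timeframe scheme described above, training one model per week on the features accumulated up to that week, can be sketched with a toy stand-in model; here a single decision stump replaces the boosted trees, and the features and labels are invented, not the study's data.

```python
def best_stump(X, y):
    """Fit a one-feature threshold classifier (decision stump) by
    exhaustive search; a toy stand-in for the boosted-trees model.
    Returns (accuracy, feature_index, threshold)."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            # Predict 1 (certificate) when the feature reaches the threshold.
            preds = [1 if row[f] >= thr else 0 for row in X]
            acc = sum(p == t for p, t in zip(preds, y)) / len(y)
            if best is None or acc > best[0]:
                best = (acc, f, thr)
    return best

# One model per timeframe: features are cumulative counts up to week w
# (invented: [videos_watched, exercises_solved]; label = earned certificate).
weeks = {
    1: ([[2, 1], [0, 0], [3, 2], [1, 0]], [1, 0, 1, 0]),
    2: ([[5, 4], [1, 0], [7, 6], [2, 1]], [1, 0, 1, 0]),
}
for w, (X, y) in weeks.items():
    acc, feat, thr = best_stump(X, y)
    print(f"week {w}: split on feature {feat} at {thr}, accuracy {acc:.2f}")
```

Retraining per timeframe, as in this loop, is what lets one track how prediction quality and feature importance evolve as the course progresses and more interaction data accumulates.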
Massive Open Online Courses (MOOCs) collect large amounts of rich data. A primary objective of Learning Analytics (LA) research is studying these data in order to improve the pedagogy of interactive learning environments. Most studies make the underlying assumption that the data represent truthful and honest learning activity. However, previous studies showed that MOOCs can have large cohorts of users that break this assumption and achieve high performance through behaviors such as cheating using multiple accounts or unauthorized collaboration, and we therefore denote them fake learners. Because of their aberrant behavior, fake learners can bias the results of LA models. The goal of this study is to evaluate the robustness of LA results when the data contain a considerable number of fake learners. Our methodology follows the rationale of ‘replication research’. We challenge the results reported in a well-known, and one of the first, LA/Pedagogic-Efficacy MOOC papers, by replicating its results with and without the fake learners (identified using machine learning algorithms). The results show that fake learners exhibit very different behavior compared to true learners. However, even though they are a significant portion of the student population (~15%), their effect on the results is not dramatic (it does not change trends). We conclude that the LA study that we challenged was robust against fake learners. While these results carry an optimistic message on the trustworthiness of LA research, they rely on data from one MOOC. We believe that this issue should receive more attention within the LA research community, and it can explain some ‘surprising’ research results in MOOCs.
One of the original purposes of MOOCs is to democratize education worldwide in order to advance towards a fairer society and universal human development. However, initial findings suggest that MOOCs face a number of challenges in achieving their maximum potential in developing countries and regions with complex issues of access to high-quality education. The majority of research studies on MOOCs focus on one or a small number of courses, or give an overview of an entire platform or system, such as edX or FutureLearn. However, these kinds of investigations can mask important regional variation in different parts of the world. In this study we conduct a longitudinal analysis using six years of course data from MITx and HarvardX, focusing on the Arab world subpopulation, comparing it to the rest of the world and segmenting by human development index. A close investigation of this subpopulation will help us better understand what kinds of course registration and course-taking patterns are influenced by regional cultural factors, and what dimensions of MOOC learning are more universal. We present initial results from an exploratory analysis of 452 MOOCs (~4.5M unique learners) from MITx and HarvardX, which show that despite the important cultural and geographical contrasts, the general trends are quite similar. Still, we observe some significant differences, such as lower completion metrics for Arab countries within each human development category, and some differences in the percentage of enrolments per course category.
To fully leverage data-driven approaches for measuring learning in complex and interactive game environments, the field needs methods that coherently integrate learning analytics (LA) throughout the design, development, and evaluation processes, overcoming the downfalls of a purely data-driven approach. In this paper, we introduce a process that weaves together three distinct disciplines (assessment science, game design, and learning analytics) for the purpose of creating digital games for educational assessment.
While global massive open online course (MOOC) providers such as edX, Coursera, and FutureLearn have garnered the bulk of attention from researchers and the popular press, MOOCs are also provisioned by a series of regional providers, who are often using the Open edX platform. We leverage the data infrastructure shared by the main edX instance and one regional Open edX provider, Edraak in Jordan, to compare the experience of learners from Arab countries on both platforms. Comparing learners from Arab countries on edX to those on Edraak, the Edraak population has a more even gender balance, more learners with lower education levels, greater participation from more developing countries, higher levels of persistence and completion, and a larger total population of learners. This "apples to apples" comparison of MOOC learners is facilitated by an approach to multiplatform MOOC analytics, which employs parallel research processes to create joint aggregate datasets without sharing identifiable data across institutions. Our findings suggest that greater research attention should be paid towards regional MOOC providers, and regional providers may have an important role to play in expanding access to higher education.
This chapter analyzes the implications of the MOOC paradigm for assessment activities, emphasizing the differences with respect to other, non-MOOC educational technology environments and offering insight into the redesign of assessment activities for MOOCs. The chapter also compares the assessment activities available in some of the most widely used MOOC platforms at present. In addition, the process of designing MOOC assessment activities is analyzed, with specific examples of how to design and create different types of assessment activities. The Genghis authoring tool is presented as a solution for creating some types of exercises in the Khan Academy platform. Finally, the learning analytics features related to assessment activities that are present in MOOCs are analyzed, and some guidelines are provided on how to interpret and take advantage of this information.
Most e-learning platforms are able to collect large datasets of students' interactions as events; however, these data are difficult for learning stakeholders to interpret directly. In this work we unify and connect several of our previous research studies, providing a general context for our learning analytics research on Khan Academy. We propose a set of indicators designed to reveal more about the learning process. Furthermore, we have designed and implemented a learning analytics module called ALAS-KA, which displays individual and class visualizations for these indicators. Finally, we use ALAS-KA and the indicators to evaluate learning experiences.
The current relevance of Massive Open Online Courses (MOOCs) has prompted researchers in educational technology to work towards improving their pedagogical outcomes. Adaptive MOOCs are one example within this context. Given the importance of affective information in adaptive systems, we propose a set of models to detect four emotions known to correlate with learning gains. The implementation of the models and the initial results from their application to a case study dataset are also provided.
This paper describes the configuration, setup, and initial analysis examples of an experience introducing learning analytics in a maths MOOC for adult high school education, using the flipped classroom methodology. An overview of the maths MOOC is provided, as well as an overview of the ANALYSE learning analytics tool, which is used to analyze the learning process. We describe how the learning analytics tool can be useful in this setup and methodology, and we illustrate specific examples of conclusions drawn in the maths MOOC.
In current online courses, most learning analytics techniques collect, analyze, and display low-level data about the interactions of students with educational activities and resources. These data are used to detect students with difficulties in the course, as well as educational activities and resources that might be problematic. This paper presents the Precise Effectiveness Strategy (PES), which makes it possible to calculate, in a quantitative way, the effectiveness of students with educational activities and resources in online courses from low-level events, taking into account different aspects of the learning context. PES is then particularized for two specific types of educational resources: video resources and parametric exercises. Finally, we propose some visualizations related to video and exercise effectiveness in a real physics course.