Datasets are generated nightly by the platform from the course start until two weeks after the course ends, covering activity up until the end of the day. The ‘enrolments’ dataset is generated nightly as soon as the course opens for enrolments, and continues to be generated indefinitely. These datasets are available securely on the course dashboard for download.
Datasets take the form of CSV (comma-separated values) files named according to the following convention: course-slug_dataset-type.csv, where:
- dataset-type identifies the nature of the data contained
- course-slug identifies the course and matches the URL path component of the course (e.g. https://www.futurelearn.com/courses/course-slug/)
The use of CSV files is a practical compromise between some formality in the format of the data and ease of use with common tools such as Excel. There is no commonly agreed standard for CSV files, so while we make our best efforts to ensure that a field in a CSV file will not be removed or have its meaning changed without notice, we reserve the right to add new columns to the file. If you are writing software to work with these datasets, we advise relying on the column names to identify columns, rather than on their position in the file or on the width of the file itself.
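As a minimal sketch of reading by column name, the Python below uses the standard library's csv.DictReader so that added or reordered columns do not break the code. The sample rows and the some_future_column name are made up for illustration; a real file would be opened with open(path, newline="", encoding="utf-8").

```python
import csv
import io

# Made-up sample standing in for a downloaded dataset file.
SAMPLE = (
    "learner_id,role,some_future_column\n"
    "abc123,learner,x\n"
    "def456,educator,y\n"
)

def read_columns(csv_file, wanted):
    """Return one dict per row containing only the wanted columns, looked up by name."""
    reader = csv.DictReader(csv_file)
    missing = [name for name in wanted if name not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    return [{name: row[name] for name in wanted} for row in reader]

rows = read_columns(io.StringIO(SAMPLE), ["learner_id", "role"])
print(rows)
```

Because columns are selected by name, the extra column in the sample is simply ignored.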
Archetype survey responses
- learner_id [string] – the unique, anonymous id assigned to the user (this is unique per user so you can see activity on more than one course).
- responded_at [timestamp] – when the learner responded to the archetype survey.
- archetype [string] – the archetype group corresponding to the motivation provided by the learner at enrolment to the course. This can be Advancers, Explorers, Fixers, Flourishers, Hobbyists, Preparers, Vitalisers, or Other.
Campaigns
This dataset has a single row for each referral or campaign that a learner used to enrol on your course run. A “campaign link” has the form: https://futurelearn.com/courses/course-slug?utm_source=source&utm_medium=medium&utm_campaign=name
where the source, medium and name values can take any form that helps you to identify your campaign. For example, if you were sending out an email newsletter to your alumni to promote a new language course, you might use:
https://futurelearn.com/courses/learn-spanish?utm_source=alumni-newsletter&utm_medium=email&utm_campaign=learn-spanish
You can find more examples and a handy link builder in Google’s documentation.
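If you prefer to build campaign links in code, a short sketch using Python's urllib.parse.urlencode is shown here. The campaign_link function name is an assumption made for this example, not something the platform provides.

```python
from urllib.parse import urlencode

def campaign_link(course_slug, source, medium, name):
    """Build a campaign link of the form described above (hypothetical helper)."""
    query = urlencode({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": name,
    })
    return f"https://futurelearn.com/courses/{course_slug}?{query}"

link = campaign_link("learn-spanish", "alumni-newsletter", "email", "learn-spanish")
print(link)
```

urlencode also escapes characters such as spaces, so values that are not already URL-safe can still be used.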
When a learner follows the link and enrols on the course we associate the enrolment with the campaign.
- utm_source [string] – the value of the utm_source parameter used in the URL.
- utm_campaign [string] – the value of the utm_campaign parameter used in the URL.
- utm_medium [string] – the value of the utm_medium parameter used in the URL.
- domain [string] – the domain of the referring website where the link was used, when available. For example google.co.uk.
- enrolments [integer] – the number of enrolments for this run associated with the referrer.
- active_learners [integer] – the number of enrolments that had associated step-completion activity.
If the utm_* fields have the value None, this means no utm_ parameters were used in the link the learner followed. In this case we may still have the referring domain of the site that had the link to FutureLearn. If the domain field is direct/no referrer, this means either that no link was followed by the learner (they may have typed futurelearn.com directly into the URL bar of their browser, for example), or that a link was followed from an application that does not send the referring domain, for example an email or instant messaging application.
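The rules above can be sketched as a small classifier. It assumes the values appear in the file exactly as the strings None and direct/no referrer; the category labels and the example rows are made up for illustration.

```python
UTM_FIELDS = ("utm_source", "utm_medium", "utm_campaign")

def classify_campaign_row(row):
    """Label a Campaigns row according to the rules described above."""
    if any(row[field] != "None" for field in UTM_FIELDS):
        return "campaign link"
    if row["domain"] == "direct/no referrer":
        return "direct or referrer not sent"
    return "referral without utm parameters"

# Made-up rows in the shape csv.DictReader would produce.
example_rows = [
    {"utm_source": "alumni-newsletter", "utm_medium": "email",
     "utm_campaign": "learn-spanish", "domain": "direct/no referrer"},
    {"utm_source": "None", "utm_medium": "None",
     "utm_campaign": "None", "domain": "google.co.uk"},
    {"utm_source": "None", "utm_medium": "None",
     "utm_campaign": "None", "domain": "direct/no referrer"},
]
labels = [classify_campaign_row(row) for row in example_rows]
print(labels)
```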
Comments
As stated in the FutureLearn Terms (6.11), comments published on FutureLearn courses are held under a Creative Commons Licence (Attribution-NonCommercial-NoDerivs; BY-NC-ND). Accordingly, any further publication of such data must be attributed to the original author.
- id [integer] – a unique id assigned to each comment. You can use this to jump straight to the comment by going to https://www.futurelearn.com/courses/<your course>/comments/<this id>
- author_id [string] – the unique, anonymous id assigned to the author user.
- parent_id [integer] – the unique id of the parent comment (i.e. the comment this comment replies to)
- step [string] – the human readable step number (e.g. 1.13)
- week_number [integer] – the week component of the step number
- step_number [integer] – the step component of the step number
- text [string] – the comment text
- timestamp [timestamp] – when the comment was posted
- likes [integer] – the number of likes attributed to the comment
- first_reported_at [timestamp] – the first time and date at which a comment was reported to the moderators for review, if at all
- first_reported_reason [string] – the reason selected from the moderation drop-down by the user who reported the comment, if at all
- moderation_state [string] – the current moderation state (approved, hidden, or deleted) of the comment, if it has been reported
- moderated [timestamp] – the time and date on which the moderation action was taken
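As an illustration of how id and parent_id relate, the sketch below builds the comment URL described for the id field and groups replies under their parent. It assumes that parent_id is empty for comments that are not replies; the function names and sample rows are made up.

```python
from collections import defaultdict

def comment_url(course_slug, comment_id):
    """Link straight to a comment, following the URL pattern given for the id field."""
    return f"https://www.futurelearn.com/courses/{course_slug}/comments/{comment_id}"

def replies_by_parent(rows):
    """Map each parent comment id to the ids of its replies."""
    replies = defaultdict(list)
    for row in rows:
        if row["parent_id"]:  # assumed empty for top-level comments
            replies[int(row["parent_id"])].append(int(row["id"]))
    return dict(replies)

# Made-up rows in the shape csv.DictReader would produce.
example_comments = [
    {"id": "100", "parent_id": ""},
    {"id": "101", "parent_id": "100"},
    {"id": "102", "parent_id": "100"},
    {"id": "103", "parent_id": ""},
]
threads = replies_by_parent(example_comments)
print(threads)
print(comment_url("course-slug", 101))
```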
Enrolments
- learner_id [string] – the unique, anonymous id assigned to the user (this is unique per user so you can see activity on more than one course).
- enrolled_at [timestamp] – when the learner enrolled.
- unenrolled_at [timestamp] – when the learner un-enrolled (if they have).
- role [string] – the user role of the learner. This can be learner, educator, admin, marketeer or moderator. The team members dataset provides more detail about the course team.
- fully_participated_at [timestamp] – when the learner achieved full participation.
- statement_purchased_at [timestamp] – when the learner purchased a statement. Note: this is not intended for financial records, as it does not include refunds, chargebacks, etc.; it records only that a purchase was made.
- gender [string] – the learner’s response to the question ‘What is your gender?’. One of Unknown, declined_to_answer, female, male, nonbinary. Before 16/03/2016 we had the option “Other”; this has been removed, but may appear in older datasets.
- country [string] – the learner’s response to the question ‘Which country do you live in?’. One of Unknown, declined_to_answer, or an ISO 3166 alpha-2 country code. In addition we denote Kosovo with the code XK.
- age_range [string] – the learner’s response to the question ‘What year were you born?’. One of Unknown, declined_to_answer, <18, 18-25, 26-35, 36-45, 46-55, 56-65, >65. The age range is calculated by taking the year that the dataset was generated in and subtracting the year of birth given by the learner.
- highest_education_level [string] – the learner’s response to the question ‘What is the highest level of education you’ve completed?’. One of Unknown, declined_to_answer, less_than_secondary, secondary, tertiary, university_degree, university_masters, university_doctorate, apprenticeship, professional.
- employment_status [string] – the learner’s response to the question ‘Which of the following categories best describes your employment status?’. One of Unknown, declined_to_answer, working_full_time, working_part_time, self_employed, looking_for_work, full_time_student, unemployed, retired, not_working.
- employment_area [string] – the learner’s response to the question ‘What is your current area of employment?’. One of Unknown, declined_to_answer, accountancy_banking_and_finance, armed_forces_and_emergency_services, business_consulting_and_management, charities_and_voluntary_work, creative_arts_and_culture, energy_and_utilities, engineering_and_manufacturing, environment_and_agriculture, health_and_social_care, hospitality_tourism_and_sport, it_and_information_services, law, marketing_advertising_and_pr, media_and_publishing, property_and_construction, public_sector, recruitment_and_pr, retail_and_sales, science_and_pharmaceuticals, teaching_and_education, transport_and_logistics
- unlimited [boolean] – TRUE or FALSE to indicate if a learner is accessing the course via their Unlimited subscription.
The fields gender, country, age_range, highest_education_level, employment_status and employment_area are derived from the learner’s responses to the survey at https://futurelearn.com/user/more-about-you. The learner is prompted once by email to complete this survey, either when they first join FutureLearn in the case of new learners, or when they first enrol on a new course in the case of existing learners. They may also complete the survey by visiting the URL directly. Only users with the role ‘learner’ are allowed to complete the survey. ‘Unknown’ means the learner has not completed the survey. ‘declined_to_answer’ means that they completed the survey but left the question response blank.
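The age_range calculation described above can be sketched as follows. This is one reading of the description (year of generation minus year of birth, mapped onto the listed bands), not the platform's actual code, and the function name is made up.

```python
AGE_BANDS = ((18, 25), (26, 35), (36, 45), (46, 55), (56, 65))

def age_range(year_generated, year_of_birth):
    """Map a year of birth onto the age_range bands listed for the Enrolments dataset."""
    age = year_generated - year_of_birth
    if age < 18:
        return "<18"
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return ">65"

print(age_range(2017, 1990))  # age 27
print(age_range(2017, 2000))  # age 17
print(age_range(2017, 1950))  # age 67
```

Under this reading, the same learner could fall into a different band in a dataset generated in a later year.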
Leaving survey responses
- learner_id [string] – the unique, anonymous id assigned to the user (this is unique per user so you can see activity on more than one course).
- left_at [timestamp] – when the learner un-enrolled from the course
- leaving_reason [string] – the reason for un-enrolling from the course provided by the learner when leaving the course.
- last_completed_step_at [timestamp] – when a step was last marked as complete by the user
- last_completed_step [string] – the human readable step number (e.g. 1.13) for the last completed step
- last_completed_week_number [integer] – the week component of the last completed step
- last_completed_step_number [integer] – the step component of the last completed step
Peer review assignments
- id [integer] – a unique id assigned to each assignment submission (referenced by reviews)
- step [string] – the assignment’s human readable step number (e.g. 1.13)
- week_number [integer] – the week component of the step number
- step_number [integer] – the step component of the step number
- author_id [string] – the unique, anonymous id assigned to the author user.
- text [string] – the assignment text
- first_viewed_at [timestamp] – when the assignment step was first viewed
- submitted_at [timestamp] – when the assignment was submitted
- moderated [timestamp] – the datetime at which an assignment was moderated, if at all
- review_count [integer] – how many reviews are associated with the assignment
Peer review reviews
- id [integer] – a unique id assigned to each assignment review
- step [string] – the review’s human readable step number (e.g. 1.13)
- week_number [integer] – the week component of the step number
- step_number [integer] – the step component of the step number
- reviewer_id [string] – the unique, anonymous id assigned to the reviewing user
- assignment_id [integer] – the id identifying the assignment reviewed
- guideline_one_feedback [string] – text submitted for the first guideline
- guideline_two_feedback [string] – text submitted for the second guideline
- guideline_three_feedback [string] – text submitted for the third guideline
- created_at [timestamp] – when the review was submitted
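Because each review carries the assignment_id of the submission it refers to, the two peer review datasets can be linked. The sketch below groups review ids by assignment; the function name and sample rows are made up, and values are kept as the strings csv.DictReader would return.

```python
from collections import defaultdict

def reviews_by_assignment(review_rows):
    """Map each assignment id to the ids of the reviews written about it."""
    grouped = defaultdict(list)
    for row in review_rows:
        grouped[row["assignment_id"]].append(row["id"])
    return dict(grouped)

# Made-up rows from the 'Peer review reviews' dataset.
example_reviews = [
    {"id": "9001", "assignment_id": "501"},
    {"id": "9002", "assignment_id": "502"},
    {"id": "9003", "assignment_id": "501"},
]
grouped_reviews = reviews_by_assignment(example_reviews)
print(grouped_reviews)
```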
Question response
- learner_id [string] – the unique, anonymous id assigned to the user
- quiz_question [string] – a human readable concatenation of the step and question number. e.g. 1.3.2 would represent the second question, from the third step in week one.
- week_number [integer] – the week component of the step number
- step_number [integer] – the step component of the step number
- question_number [integer] – the question component of the step number
- response [string] – the answer number selected, reflecting its ordered position. e.g. 1 would be the first answer; 1,3,4 would be the first, third and fourth answers in a multiple answer question. These are always presented separated by commas with no spaces and in ascending order.
- correct [boolean] – TRUE or FALSE for the response’s correctness.
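The quiz_question and response formats described above can be unpacked with a couple of small helpers. This is only a sketch; the function names are made up for the example.

```python
def parse_quiz_question(quiz_question):
    """Split e.g. '1.3.2' into (week_number, step_number, question_number)."""
    week, step, question = (int(part) for part in quiz_question.split("."))
    return week, step, question

def parse_response(response):
    """Split e.g. '1,3,4' into a list of selected answer positions."""
    return [int(part) for part in response.split(",")]

print(parse_quiz_question("1.3.2"))
print(parse_response("1,3,4"))
```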
Step activity
- learner_id [string] – the unique, anonymous id assigned to the user.
- step [string] – the human readable step number (e.g. 1.13)
- week_number [integer] – the week component of the step number
- step_number [integer] – the step component of the step number
- first_visited_at [timestamp] – when the step was first viewed by the user
- last_completed_at [timestamp] – when the step was last marked as complete by the user
There may be identical first_visited_at timestamps for different steps in the same enrolment. This can occur if a learner opens multiple steps in new browser tabs in the same second. The last_completed_at timestamp for a step may also be the same as the first_visited_at timestamp of the next step if a learner marks a step as complete and then immediately views the next step.
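Since timestamps can coincide, it is safer to identify a row by learner_id and step than by its timestamps. The sketch below collects completed steps per learner on that basis. It assumes last_completed_at is empty for steps that were visited but not completed; the function name, sample rows and timestamp text are made up.

```python
from collections import defaultdict

def completed_steps(rows):
    """Map each learner_id to the set of steps they have marked as complete."""
    done = defaultdict(set)
    for row in rows:
        if row["last_completed_at"]:  # assumed empty when never completed
            done[row["learner_id"]].add(row["step"])
    return dict(done)

# Made-up rows; the first two share a first_visited_at value, which is allowed.
example_activity = [
    {"learner_id": "abc", "step": "1.1", "first_visited_at": "t1", "last_completed_at": "t2"},
    {"learner_id": "abc", "step": "1.2", "first_visited_at": "t1", "last_completed_at": ""},
    {"learner_id": "def", "step": "1.1", "first_visited_at": "t3", "last_completed_at": "t4"},
]
completed = completed_steps(example_activity)
print(completed)
```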
Team members
- id [string] – the unique, anonymous id assigned to the user
- first_name [string] – the first name of the user as provided when they created their account
- last_name [string] – the last name of the user as provided when they created their account
- team_role [string] – the role or permission a user has on the “Manage course team” page in the course creator. Currently one of {author, lead_educator, host, mentor, reviewer, educator}
- user_role [string] – the role assigned to the user at the user level. Currently one of {marketer, moderator, organisation_admin, learner, admin}
An individual user may take several team roles on a given course. In this case, they appear multiple times in the dataset, once per team_role.
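If you need one entry per person rather than one row per role, the repeated rows can be collapsed as in this sketch. The function name and sample rows are made up for illustration.

```python
from collections import defaultdict

def team_roles_by_user(rows):
    """Collect every team_role held by each user id."""
    roles = defaultdict(list)
    for row in rows:
        roles[row["id"]].append(row["team_role"])
    return dict(roles)

# Made-up rows: the first user appears twice, once per team role.
example_team = [
    {"id": "u1", "team_role": "lead_educator"},
    {"id": "u1", "team_role": "author"},
    {"id": "u2", "team_role": "mentor"},
]
team_roles = team_roles_by_user(example_team)
print(team_roles)
```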
Video stats
- step_position [string] – the position of the step on the course
- title [string] – the name of the video step
- video_duration [integer] – the duration of the video in seconds
- total_views [integer] – the total number of unique session views (i.e. a learner viewing the video twice in the same session will count as one view)
- total_downloads [integer] – the total number of downloads for the video
- total_caption_views [integer] – the number of views that featured a learner viewing the closed captions
- total_transcript_views [integer] – the number of views that featured a learner viewing the HTML transcript
- viewed_hd [integer] – the number of views that featured a learner viewing the video in HD
- viewed_five_percent [float] – the percentage of video views that viewed at least five percent of the video
- viewed_ten_percent [float] – the percentage of video views that viewed at least ten percent of the video
- viewed_twentyfive_percent [float] – the percentage of video views that viewed at least twenty five percent of the video
- viewed_fifty_percent [float] – the percentage of video views that viewed at least fifty percent of the video
- viewed_seventyfive_percent [float] – the percentage of video views that viewed at least seventy five percent of the video
- viewed_ninetyfive_percent [float] – the percentage of video views that viewed at least ninety five percent of the video
- viewed_onehundred_percent [float] – the percentage of video views that viewed at least one hundred percent of the video
- console_device_percentage [float] – the percentage of video views that were viewed from a console
- desktop_device_percentage [float] – the percentage of video views that were viewed from a desktop
- mobile_device_percentage [float] – the percentage of video views that were viewed from a mobile
- tv_device_percentage [float] – the percentage of video views that were viewed from a tv
- tablet_device_percentage [float] – the percentage of video views that were viewed from a tablet
- unknown_device_percentage [float] – the percentage of video views that were viewed from an unknown device
- europe_views_percentage [float] – the percentage of video views that were viewed from Europe
- oceania_views_percentage [float] – the percentage of video views that were viewed from Oceania
- asia_views_percentage [float] – the percentage of video views that were viewed from Asia
- north_america_views_percentage [float] – the percentage of video views that were viewed from North America
- south_america_views_percentage [float] – the percentage of video views that were viewed from South America
- africa_views_percentage [float] – the percentage of video views that were viewed from Africa
- antarctica_views_percentage [float] – the percentage of video views that were viewed from Antarctica
This dataset is only available for runs that started on or after the 9th of May 2017. A video that has had no views will not appear in this dataset.
Weekly sentiment surveys
- id [integer] – a unique id assigned to each survey response
- responded_at [timestamp] – when the learner responded to the weekly sentiment survey
- week_number [integer] – the week component of the step number
- experience_rating [integer] – the rating provided by the learner (1 for unhappy, 2 for neutral, 3 for happy)
- reason [string] – the reason provided by the learner when rating the course
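The numeric experience_rating can be translated back to its meaning and tallied per week, as in this sketch. The function name and sample rows are made up for illustration.

```python
from collections import defaultdict

RATING_LABELS = {1: "unhappy", 2: "neutral", 3: "happy"}

def rating_counts_by_week(rows):
    """Count unhappy, neutral and happy responses for each week_number."""
    counts = defaultdict(lambda: {label: 0 for label in RATING_LABELS.values()})
    for row in rows:
        label = RATING_LABELS[int(row["experience_rating"])]
        counts[int(row["week_number"])][label] += 1
    return dict(counts)

# Made-up rows in the shape csv.DictReader would produce.
example_sentiment = [
    {"week_number": "1", "experience_rating": "3"},
    {"week_number": "1", "experience_rating": "3"},
    {"week_number": "1", "experience_rating": "1"},
    {"week_number": "2", "experience_rating": "2"},
]
weekly_counts = rating_counts_by_week(example_sentiment)
print(weekly_counts)
```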
Working with these Datasets
- Microsoft Excel, Apple Numbers or other spreadsheet applications may become unstable or unresponsive when importing data with a large number of rows. If your course has a large number of learners you may find it difficult to analyse the data using spreadsheets.
- When importing CSV datasets, you should explore any options presented to you in your analysis application. For instance, many packages such as Excel will by default treat step numbers as numeric values, which will result in, for example, step 1.10 being turned into step 1.1. You should accordingly import this column as a text string, factor or other non-numeric equivalent.
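The step number problem is easy to reproduce in code. The Python sketch below shows why a numeric conversion loses information, and one way to sort steps correctly while keeping them as text. The step_sort_key name is made up for the example.

```python
# Converting to a number makes step 1.10 indistinguishable from step 1.1.
assert float("1.10") == float("1.1")

def step_sort_key(step):
    """Sort key that orders '1.2' before '1.10' by comparing week and step as integers."""
    week, number = step.split(".")
    return int(week), int(number)

steps = ["1.10", "1.2", "2.1", "1.1"]
ordered = sorted(steps, key=step_sort_key)
print(ordered)
```

Python's csv module returns every value as a string, so the step column is preserved unless you convert it yourself. The week_number and step_number columns offer the same ordering without any parsing.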
What data can I share with co-creators?
Occasionally your co-creating partners may need access to course data in order to do research, or to analyse course performance. Co-creators cannot access course datasets directly, as these contain personally identifying information which must be restricted under the General Data Protection Regulation (GDPR).
If you need to share course data with your co-creating partners, you can do so by removing certain columns from each dataset, and then sharing these edited datasets with your co-creating partners.
The columns that need removing from each dataset are listed below:
Dataset | Remove for co-creator access |
---|---|
Weekly sentiment survey responses | |
Video stats | Nothing – already anonymous |
Team members | Cannot be shared |
Step activity | |
Question response | |
Post course survey free text | Cannot be shared |
Post course survey data | |
Peer review reviews | |
Peer review assignments | |
Leaving survey responses | |
Enrolments | |
Comments | |
Campaigns | Nothing – already anonymous |
Archetype survey responses | |
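Removing columns before sharing can be done in a spreadsheet, or with a short script such as the sketch below. The column passed in here (learner_id) is only a placeholder for illustration; use the columns listed for each dataset in the table above. The function name and sample data are also made up.

```python
import csv
import io

def drop_columns(src, dst, columns_to_remove):
    """Copy a CSV from src to dst, leaving out the named columns."""
    reader = csv.DictReader(src)
    kept = [name for name in reader.fieldnames if name not in columns_to_remove]
    writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # values for removed columns are ignored

# Made-up sample; real files would be opened with open(path, newline="").
source = io.StringIO("learner_id,step,week_number\nabc123,1.1,1\n")
shared = io.StringIO()
drop_columns(source, shared, {"learner_id"})  # placeholder column name
print(shared.getvalue())
```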