PSLC DataShop provides two main services to the learning science community:
- a central repository to secure and store research data
- a set of analysis and reporting tools
Researchers can rapidly access standard reports such as learning curves, as well as browse data using the interactive web application. To support other analyses, DataShop can export data to a tab-delimited format compatible with statistical software and other analysis packages.
Friday, 4 September 2015
DataShop 9.0 released
With the latest release of DataShop, our focus was on fixing bugs and enhancing a few existing features.
- Users can now quickly navigate from problem-specific information in a Learning Curve or Performance Profiler report directly to that problem in the Error Report; an "Error Report" button has been added to the tooltips. The Error Report includes information on the actual values students entered and the feedback received when working on the problem.
- In the Performance Profiler, if a secondary KC model is selected, the skills from the secondary model that are present in the problem are included in the problem info tooltip.
- If the Additive Factors Model (AFM) or Cross Validation (CV) algorithms fail or cannot be run, the reason is now available to the user as a tooltip. The tooltip appears when hovering over the status in the KC Models table. If you have follow-up questions, remember that you can always send email to datashop-help.
- Users can now sort the skills in a particular KC model to indicate learning difficulty. By sorting the KC model skills by intercept and then tagging those for which the slope is below some threshold, users can easily identify skills that may be misspecified and should be split into multiple skills. See the DataShop Tutorial videos on how to change the skills and test the result of that change. This sorting feature is available on the "Model Values" tab of the Learning Curve page.
- The Cross Validation calculation was modified to provide more statistically valid results. The new calculation computes an average over 20 runs in determining the root mean squared error (RMSE).
- The Student-Step Export was updated to print only a single predicted-error-rate value for steps with multiple skills, as the values are always the same.
- The Help pages for Additive Factors Modeling (AFM) have been updated to indicate that DataShop implements a compensatory sum across all Knowledge Components when there are multiple KCs assigned to a single step.
- The KC Model Import was fixed to ensure that invalid characters cannot be used in the model name not only during initial model import, but also in the dialog box that comes up when a duplicate name is detected.
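To illustrate the compensatory sum and the single predicted error rate mentioned above, here is a minimal sketch based on the standard AFM formulation from the literature: the log-odds of a correct response is the student's proficiency plus, for every KC on the step, that KC's intercept plus its slope times the student's prior opportunities. The function name, argument layout, and example values are assumptions for illustration and are not taken from DataShop's code.

```python
import math

def afm_predicted_error_rate(student_proficiency, kcs):
    """Predicted error rate for one step under a standard AFM formulation.

    kcs: list of (intercept, slope, prior_opportunities) tuples, one per KC
    assigned to the step. All names here are illustrative assumptions.
    """
    # Compensatory sum: every KC on the step adds its own contribution,
    # so a strong KC can offset a weak one.
    logit = student_proficiency + sum(
        intercept + slope * opportunities
        for intercept, slope, opportunities in kcs
    )
    p_correct = 1.0 / (1.0 + math.exp(-logit))
    return 1.0 - p_correct

# A step with one neutral KC and a step tagged with two KCs.
# Each step yields exactly one predicted error rate, however many KCs it has.
single_kc = afm_predicted_error_rate(0.0, [(0.0, 0.0, 0)])
two_kcs = afm_predicted_error_rate(0.2, [(-0.5, 0.1, 3), (0.4, 0.05, 2)])
```

Because the KC contributions are summed before the logistic transform, a multi-skill step has one prediction rather than one per skill, which is consistent with the Student-Step Export change described above.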
Wednesday, 2 September 2015
Attention! DataShop downtime for release of v9.0
DataShop is going to be down for 2 hours beginning at 6:00am EST on Friday, September 4, 2015 while our servers are being updated for the new release.
Friday, 29 May 2015
DataShop 8.2 released - several enhancements and bug fixes
With the latest release of DataShop, our focus was on fixing bugs and enhancing several existing features.
- In order to easily see which skills are associated with a problem or step, the Performance Profiler tooltips now include the relevant KCs.
Similarly, the KC information is now included in the Problem point info tooltips on the Learning Curve page.
- Web Services was extended to allow users to import new KC models. This functionality, already available via the UI, lets web service users add new KC models to a dataset, mapping steps to skills.
- To make it easier for researchers to identify a student's last attempt at a step, we have added an 'Is Last Attempt' column to the transaction export. This boolean value is true (1) for the transaction with the maximum 'Attempt At Step' and 'Problem View' for a student and step. This can be useful for grading purposes.
- In order to be more flexible, Custom Fields no longer specify a data type; the values in a Custom Field can be mixed, supporting date, number and string formats in a single Custom Field.
Custom Field string values of up to 65,000 characters are now supported.
- KC Models are now grouped by the number of observations with skills (KCs). Comparisons of models only make sense for those with the same number of observations; the models are now grouped by 'Observations with KCs' before being sorted by the user-specified column.
- Previously, uploaded datasets could not be deleted if they had been accessed by any other user. With this release, a Project Admin who wishes to delete a dataset that they uploaded will be shown a list of users who have accessed the dataset and asked to confirm the deletion.
- We relaxed an import restriction requiring all transactions to have a value for all dataset levels defined in the import. This is not required for logged datasets so we have removed the restriction for uploaded datasets.
- A performance bottleneck in the dataset upload path was removed.
- The Additive Factors Modeling (AFM) code now correctly interprets the Outcome column when the value is "Unknown".
- Fixed a bug where exporting transactions for a sample with a long string in the Custom Field value caused an error.
- Sample to Dataset now limits permission to create a dataset from a sample to DataShop and Project Admins.
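The 'Is Last Attempt' rule described above can be sketched as follows. This is one possible reading, not DataShop's implementation: it assumes the "maximum" is taken over (Problem View, Attempt At Step) pairs compared in that order, and that a student and step are identified by columns named 'Anon Student Id' and 'Step Name'. The helper name is hypothetical.

```python
def mark_last_attempts(transactions):
    """Return copies of the rows with an 'Is Last Attempt' value of 1 or 0."""
    def key_and_rank(row):
        key = (row["Anon Student Id"], row["Step Name"])
        # Assumption: later problem views outrank earlier ones, then attempts.
        rank = (int(row["Problem View"]), int(row["Attempt At Step"]))
        return key, rank

    # First pass: find the highest rank seen for each student/step pair.
    latest = {}
    for row in transactions:
        key, rank = key_and_rank(row)
        if key not in latest or rank > latest[key]:
            latest[key] = rank

    # Second pass: flag the rows that carry that highest rank.
    marked = []
    for row in transactions:
        key, rank = key_and_rank(row)
        marked.append({**row, "Is Last Attempt": 1 if rank == latest[key] else 0})
    return marked

rows = [
    {"Anon Student Id": "s1", "Step Name": "step-a", "Problem View": "1", "Attempt At Step": "1"},
    {"Anon Student Id": "s1", "Step Name": "step-a", "Problem View": "1", "Attempt At Step": "2"},
    {"Anon Student Id": "s1", "Step Name": "step-a", "Problem View": "2", "Attempt At Step": "1"},
    {"Anon Student Id": "s2", "Step Name": "step-a", "Problem View": "1", "Attempt At Step": "1"},
]
flags = [row["Is Last Attempt"] for row in mark_last_attempts(rows)]
```

For grading, filtering an export to rows where 'Is Last Attempt' is 1 leaves one transaction per student and step.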
Tuesday, 26 May 2015
Attention! DataShop downtime for release of v8.2
DataShop is going to be down for 16 hours beginning at 6:00pm EST on Thursday, May 28, 2015 while our servers are being updated for the new release.
Tuesday, 16 December 2014
DataShop 8.1 released - Sample to Dataset and several bug fixes
The latest release of DataShop introduces the new "Sample to Dataset" feature, which allows users to create a new, smaller dataset easily from a sample in an existing dataset. This is especially useful for large datasets for which AFM cannot be run due to size restrictions. Creating a smaller dataset from a subset of data can eliminate the restriction and allow KC modeling and AFM to run on the subset. Users can create datasets from their own samples, as well as from those that are public.
- A new "Samples" subtab has been added to the "Dataset Info" tab. From this subtab users can view the list of samples defined for the dataset. The samples can be modified, deleted or saved as a separate dataset. To create a new dataset click on the "Save as Dataset" icon (pictured below).
Creating a dataset from an existing sample will place the new dataset into the same project as the source dataset, giving it the same access permissions.
For each sample, the history and any filters used to create the sample are available. If a dataset was created from the sample, that information will also be in the history.
In addition, we've added a few enhancements and fixed several bugs:
- Web services was extended to allow users to generate a Learning Curve categorization report. This report can be generated for a specific KC Model in a dataset or, by default, include all KC Models in the dataset. This information is already available in the user interface but is now also available in a tab-delimited report format via web services. In addition to the category, the report includes the KC intercept, KC slope, number of unique steps, number of observations and the number of step instances.
- For primary investigators and project administrators, the Current Permissions report for the project can now be exported from the Access Requests page.
- A web services bug was fixed, now allowing users to add Custom Fields to public datasets.
- The size limit for file uploads was increased from 100MB to 200MB. Dataset uploads are still restricted to 100MB.
Thursday, 11 December 2014
Attention! DataShop downtime for release of v8.1
DataShop is going to be down for 2–4 hours beginning at 6:30am EST on Tuesday, December 16, 2014 while our servers are being updated for the new release.
Thursday, 3 July 2014
DataShop 8.0 released - view the problem content that students saw
This release introduces a useful (and often-requested) feature for making sense of data in DataShop. For certain datasets, you can now jump from DataShop reports to the content that students interacted with—what we call "problem content".
Look for View Problem buttons throughout the interface (often in tooltips on problem or step name):
Clicking the View Problem button takes you to a new page with more information about the problem. (Note that the word "problem" is used in the sense of any activity the user did that was named in the problem column of the data.)
You can also access problem content from a new subpage of Dataset Info called Problem List:
What can you do with problem content?
- Learn more about the system that students used
- Inspect the interface and problem to explain student difficulties suggested by data
- Use machine learning on the HTML export of problem content
Which datasets have problem content?
Look for the following icon on the list of datasets in DataShop.
For this initial release, datasets having problem content will mostly be either Open Learning Initiative (OLI) datasets, or those containing data from Mathtutor or TutorShop. One public dataset you can explore now is FractionStudy2012 (part of Fractions Lab Experiment 2012).
How do I add problem content to my dataset?
Please contact us, and we will consult with you on the format DataShop expects for problem content. If you are short on time, consider attaching files documenting your system on the Files tab of your dataset.
Other changes in this release:
- Problem Breakdown has been renamed to Step List
- The number of files attached to a dataset is shown in parentheses on the Files tab
- For learning curve categorization, display more information in tooltips about how the parameters determined which KCs fall within each category
- Many bug fixes and small improvements.
Monday, 27 June 2014
Attention! DataShop downtime for release of v8.0
DataShop is going to be down for 2–4 hours beginning at 6:30am EST on Monday, June 30, 2014 while our servers are being updated for the new release.
Friday, 25 April 2014
Attention! DataShop downtime for hardware upgrade
DataShop is going to be down for 1–4 hours beginning at 7am EST on Wednesday, April 30, 2014 while our servers are being upgraded.
Friday, 14 March 2014
DataShop 7.2 released - many enhancements and bug fixes
In this release, we focused on fixing bugs and enhancing existing functionality. Here's a list of what changed:
- On the Performance Profiler, you can now view the performance for each domain (Problem, Step, Student, Knowledge Component, Problem hierarchy level) by step duration, error step duration, and correct step duration.
- Removed sample selection from Learning Curve KC Models sub-page, as the report is not based on sample information.
- Automated the role request process for requesting permission to access web services, create datasets/projects, and add external tools.
- When a project is created, its creator is now automatically set as the project's PI; a project admin can change this later.
- Relaxed import restrictions for new datasets. DataShop will now:
- Warn if the Problem Start Time is blank (instead of an error)
- Display an error if Step Name, Selection, and Action are all blank for a given row—one of these is required (instead of requiring both Selection and Action to have values)
- On the Learning Curve page, the Other learning curve category has been renamed to Good: these knowledge components did not fall into any of the "bad" or "at risk" categories. Thus, these are "good" learning curves in the sense that they appear to indicate substantial student learning.
- New notifications about IRB and shareability on the project permissions pages, as well as clearer documentation on shareability and sharing data.
- New edit icons on Dataset Info overview table now show which rows in the table are editable by project administrators.
- Dataset upload now prompts the uploader for whether the dataset contains "study data".
- On the project page, a new Data Last Modified column is shown for all users.
- Improved display for files-only datasets by disabling reports that require transaction data.
- Prevent web services from returning HTML on error.
- Fixed a bug where exporting two transaction samples resulted in a .zip file that couldn't be opened.
- Error Report now correctly calculates and displays the number of observations per step, taking into account the "problem view". The report now distinguishes between number of students and number of observations, as a single student can do a step more than once over multiple instances of the problem (problem views).
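The relaxed import rules listed above (a blank Problem Start Time is a warning; a row is an error only when Step Name, Selection, and Action are all blank) can be expressed as a small row check. This is a sketch under assumptions: the function name and message wording are hypothetical, and a row is represented as a dictionary keyed by column name.

```python
def check_row(row):
    """Return (errors, warnings) for one tab-delimited transaction row."""
    errors, warnings = [], []

    def blank(column):
        # Treat a missing column, None, or whitespace-only text as blank.
        return not (row.get(column) or "").strip()

    if blank("Problem Start Time"):
        warnings.append("Problem Start Time is blank")
    # Only an error when all three are blank; any one of them is enough.
    if all(blank(c) for c in ("Step Name", "Selection", "Action")):
        errors.append("One of Step Name, Selection, or Action is required")
    return errors, warnings

empty_result = check_row(
    {"Step Name": "", "Selection": " ", "Action": "", "Problem Start Time": ""}
)
ok_result = check_row(
    {"Selection": "box1", "Problem Start Time": "2014-03-14 07:00:00"}
)
```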
Monday, 9 March 2014
Attention! DataShop downtime for release of v7.2
DataShop is going to be down for 2–4 hours beginning at 7am EST on Friday, March 14, 2014 while our servers are being updated for the new release.
Friday, 25 October 2013
DataShop 7.1 released - automatic dataset import, learning curve categorization, and more
We've just released a new version of DataShop with a number of great new features! Here's a description of what has changed.
Automatic verification and import of new dataset files
We're very excited to announce that you can now upload a dataset to DataShop and have it verified and imported by the web application. This update is the third and final step in enabling automatic import of new datasets. To get started importing your own dataset, log into DataShop and click "Upload a dataset". If prompted to request permission, click through the dialog and we will grant you access shortly. More information about the import process and formats can be found on our help page.
Learning curve categorization
A new feature, enabled by default, categorizes learning curves (graphs of error rate over time for different KCs, or skills) into one of four categories, which can help you to identify areas for improvement in the KC model or student instruction. Learn more about the categories here, or give it a try with a public dataset or one of your own.
Home page content
On our homepage, you'll see new content under the heading "What can I do with DataShop?", organized by researcher type and research goals. (If you don't see this section, click "What can I do?" from the navigation on the left.) Click a researcher type to see a list of potentially relevant topics. Follow a link to a topic and you'll see a description of how this goal has been achieved with DataShop data. Links to relevant datasets and papers appear at the bottom of each topic.
Support for adding custom fields to transaction data
Another exciting improvement is the ability to add custom fields to data already in DataShop using web services. A custom field is a new column you define for annotating transaction data. Although the feature is new to web services, some datasets in DataShop already have custom fields. This is because some tutors have been instrumented to record custom fields while logging.
Some examples of custom fields include a field that captures the time of each tutor response to the millisecond; a field noting the agent that took the action in a multi-agent system; and a field recording a categorization of the problem the student is working on.
You can define a custom field using the web application (see Dataset Info > Custom Fields), but to set the data in that field, you need to use web services, a way to programmatically interact with DataShop data.
Other important changes
- the student-step rollup is now cached for faster exporting, and shares the same format as the web services version. The full list of format changes for this release is documented here.
- all papers in DataShop are now publicly accessible to website visitors without logging in
- on the Files tab and subtabs, a project admin can now delete or edit any file added to the dataset
- multiple KC model imports at the same time on the same dataset no longer cause problems
- clicking a point in a learning curve and hovering over a row in the table of steps beneath shows the full list of KCs associated with that step, which can help you identify which KCs are contributing to changes in the curve
Tuesday, 22 October 2013
Attention! DataShop downtime for release of v7.1
DataShop is going to be down for 10-15 hours beginning at 9pm EST on Thursday, October 24, 2013 while our servers are being updated for the new release.
Tuesday, 21 October 2013
DataShop Release Event - Friday, November 1st
Late October brings an update to LearnLab DataShop, the world's largest open educational data repository, so we thought it would be worthwhile to meet up with current and potential users of DataShop. Come and see what's new with DataShop, enjoy light refreshments, and chat with us about your work.
DataShop Release Event
Friday, November 1, 2013
4-6pm (before the LearnLab Corporate Partners reception)
CMU Gates Building, room 6115 (or attend virtually)
Your RSVP is appreciated by Friday, October 25th. Also, please bring your laptop so that you can use DataShop during the event. For virtual attendees, we will send instructions for how to join closer to the day of the event.
Highlighted New Features
- Learning curve categorization (highlights issues with the skill model and potentially student learning)
- Import your own datasets directly through the web application
- New homepage content organized by research goal and researcher type, with links to relevant datasets and papers
Monday, 1 July 2013
DataShop 7.0 released - automatic verification of uploaded datasets, access request enhancements, and more
Automatic verification of uploaded tab-delimited files
When uploading new tab-delimited files for import, the verify step now occurs in two phases, both automatic, which means immediate feedback and a simpler import process. You'll get feedback on a verification of the first 100 lines of your file (or first file, if there is more than one). If this verification succeeds, then your file(s) will be verified in full. The results of this process will be emailed to you and will be visible in your Import Queue.
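The two-phase flow described above (check the first 100 lines, then the whole file only if that succeeds) might look roughly like the sketch below. The per-line rule used here, that every row has as many tab-separated fields as the header, is a stand-in assumption; DataShop's actual verification rules are not described in this post, and the function names are hypothetical.

```python
def verify_lines(lines):
    """Hypothetical check: each row has as many tab-separated fields as the header."""
    if not lines:
        return ["file is empty"]
    width = len(lines[0].rstrip("\n").split("\t"))
    problems = []
    for number, line in enumerate(lines[1:], start=2):
        if len(line.rstrip("\n").split("\t")) != width:
            problems.append(f"line {number}: expected {width} columns")
    return problems

def two_phase_verify(lines, preview_size=100):
    """Verify a preview first; run the full verification only if it passes."""
    lines = list(lines)
    preview_problems = verify_lines(lines[:preview_size])
    if preview_problems:
        return "preview failed", preview_problems
    full_problems = verify_lines(lines)
    if full_problems:
        return "full verification failed", full_problems
    return "verified", []

# A file whose only bad row is past the first 100 lines.
sample = ["a\tb\n"] + ["1\t2\n"] * 150 + ["bad\n"]
late_failure = two_phase_verify(sample)
early_failure = two_phase_verify(["a\tb\n", "x\n"])
```

The quick preview gives immediate feedback on structural mistakes without waiting for a large file to be checked in full.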
- The Condition column has been added to the student-problem export
- The student-problem export is now cached for faster downloading, and the report is much faster overall. You can choose to export the cached version of the selected samples or use the options you've selected on the page.
- We removed the ".0" that appeared at the end of many timestamps even though none of the timestamps had millisecond information. This change affected the transaction, student-step, and student-problem formats.
- Exported files now have a file naming scheme that consistently identifies the type of export (changes in bold):
Changes to supported time formats for importing transaction data
- We updated the list of supported time formats
- Time zones in the 'Time' and 'Problem Start Time' columns are (still) ignored, but they now generate a warning.
- Support for the HH:mm.0 format has been removed, as it usually indicates an Excel error (Excel auto-formatted the timestamp—see our tip on how to avoid this).
Access Request / Access Report changes
- When requesting access to a project, the "Reason" field is now required. In the same dialog, the default access level is now Edit.
- When exporting the Access Report from the Access Requests page, the exported file now takes into account any filtering and sorting that you may have done. In addition, users who only viewed the Dataset Info page of a dataset are now properly excluded from the report, as viewing public dataset metadata does not count as a dataset view.
- Automatic expiry of unanswered requests for access. For unanswered requests for access, DataShop will now remind the PI/data provider after one week and again after the second week. Then after the third week, the requester will be notified (the PI/data provider BCC'ed), telling them that we haven't heard from the PI/data provider and that we will expire the request in one week. After four weeks, if the PI/data provider has not acted on the request, the request will be denied.
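The expiry schedule for unanswered access requests described above can be summarized as a mapping from elapsed weeks to an action. The function name and the action strings are hypothetical; only the timing comes from the description.

```python
def access_request_action(weeks_unanswered):
    """Map whole weeks since an unanswered request to the action described above."""
    if weeks_unanswered >= 4:
        return "deny request"
    if weeks_unanswered == 3:
        return "notify requester of pending expiry (BCC PI/data provider)"
    if weeks_unanswered in (1, 2):
        return "remind PI/data provider"
    return "no action"

schedule = [access_request_action(week) for week in range(5)]
```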
AFM Changes (affecting AIC, BIC, and Cross Validation statistics)
- The AFM code has been optimized to be more efficient and run on larger datasets (those with more students and steps).
- AFM now distinguishes between two instances of a student working on the same step consecutively (via the Problem View column). This change will result in different AIC/BIC/Cross Validation values for some datasets.
- For unstratified cross validation, AFM now applies this rule more strictly: "the system requires that each student and each KC in the dataset appear in at least two observations. If a student or KC does not, all data points for that student or KC are excluded from cross validation".
- Cross validation values will now appear if at least one of the three types of cross validation (student stratified, step stratified, and unstratified) runs successfully. The types that did not run will be listed as "unable to run".
- Clarified the data requirements for running cross validation: at least 4 students and 4 KCs must be present for unstratified cross validation to run; at least 4 students must be present for student stratification cross validation to run; and at least 4 steps must be present for the step stratified cross validation to run.
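The exclusion rule and the minimum data requirements above can be sketched as two small helpers. Several details are assumptions rather than documented behavior: observations are represented as (student, step, KC) tuples, the exclusion filter is applied in a single pass, and the minimum counts are checked on whatever list is passed in. The function names are hypothetical.

```python
from collections import Counter

def filter_for_unstratified_cv(observations):
    """Drop observations whose student or KC appears in fewer than two observations."""
    student_counts = Counter(student for student, _, _ in observations)
    kc_counts = Counter(kc for _, _, kc in observations)
    return [
        (student, step, kc)
        for student, step, kc in observations
        if student_counts[student] >= 2 and kc_counts[kc] >= 2
    ]

def runnable_cv_types(observations):
    """Report which cross validation types meet the stated minimums."""
    students = {student for student, _, _ in observations}
    steps = {step for _, step, _ in observations}
    kcs = {kc for _, _, kc in observations}
    return {
        "student stratified": len(students) >= 4,
        "step stratified": len(steps) >= 4,
        "unstratified": len(students) >= 4 and len(kcs) >= 4,
    }

# Four students on four steps/KCs, plus one student seen only once.
data = [(f"s{i}", f"step{j}", f"kc{j}") for i in range(4) for j in range(4)]
data.append(("lonely", "step0", "kc0"))
kept = filter_for_unstratified_cv(data)
status = runnable_cv_types(kept)
```

Types that come back False correspond to the ones that would be listed as "unable to run".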
Tuesday, 25 June 2013
DataShop downtime for release of v7.0
DataShop is going to be down for 1-5 hours beginning at 7am EST on Monday, July 1, 2013 while our servers are being updated for the new release.
Friday, 5 April 2013
DataShop 6.2 released - upload datasets, project access enhancements, and more
With today's update to DataShop, we've made another big step toward allowing you to import datasets directly. You can now upload a file to be imported into DataShop, as well as create and manage projects and files-only datasets. All progress on the import of your datasets will be shown in the Import Queue at the top of My Datasets.
Two new items in the main navigation under My Data—Upload a dataset and Create a project—allow you to get started adding new data to DataShop. You can create a dataset with or without transaction data. Transaction data is data that is in either of the two formats DataShop accepts (XML and tab-delimited). More info about these formats can be found in our help.
After you upload a dataset with transaction data, you'll see it in the new Import Queue on the My Datasets page. Information about the file format verification and import status (such as estimated import date) will be shown in the queue and emailed to you.
Manage project access directly
On each project page in DataShop you'll see an updated Permissions tab. If you are a project admin for that project, you can now see the list of people who have access to your project, modify that access, and grant access to new users directly by entering their username (in addition to responding to requests for access). An access report for that project is also available.
New "condition" column in the student-problem export
The "condition" column is now also included in the student-problem export, in addition to the transaction and student-step exports.
Access Report optimizations
The Access Report, which shows who has accessed your projects and what their permissions are, has been optimized to be much faster. You can view the main Access Report on the Access Requests page.
Thursday, 4 April 2013
DataShop downtime for release of v6.2
DataShop is going to be down for 1-5 hours beginning at 6am EST on Friday, April 5, 2013 while our servers are being updated for the new release.
Thursday, 24 January 2013
DataShop 6.1 released - new navigation, error bars, improved project pages, and more
Revised home page and navigation
The latest version of DataShop has a new navigation section along the left-hand side of the application. We've grouped together things that are specific to your account—your datasets, access requests, and profile—under the heading My Data. My Datasets now appears under this heading, while Public Datasets and Private Datasets (renamed from Other Datasets) appear under the heading Explore. We have also removed the login box in favor of the login page. (To log in, just click the "Log in" button.)
Error Bars in Learning Curves
Turn on error bars on a learning curve by clicking the "Error Bars" checkbox in the navigation. You can choose between error bars that represent one standard deviation or one standard error.
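The two error bar options differ by a factor of the square root of the number of observations. As a minimal sketch, assuming each point on the curve summarizes a list of 0/1 error values and that the sample (n - 1) standard deviation is used (this post does not say which variant DataShop computes):

```python
import math
import statistics

def error_bar_half_widths(values):
    """Return (standard deviation, standard error) for one point on a curve."""
    sd = statistics.stdev(values)  # sample standard deviation; an assumption
    se = sd / math.sqrt(len(values))
    return sd, se

# Four observations at one opportunity: two errors, two correct responses.
sd, se = error_bar_half_widths([0, 1, 1, 0])
```

Standard error bars shrink as more observations contribute to a point, while standard deviation bars reflect the spread of the observations themselves.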
New project pages and subtabs
With this new project page, more information can be captured and indexed, making pages more intelligible to both researchers and Google search.
In addition to the existing project access levels of "edit" and "view", we've added a third: "admin". A project admin has full control over a project and its datasets. This role will be even more useful when we've added the ability to upload datasets. We've created a table to show the difference between the three roles. As of this release, if you were a PI for a project, you are now also its admin.
Another addition to the project page is a subtab called "IRB" (visible if you are the project admin for a project). When you add a dataset to DataShop, you must complete a few steps on the IRB subtab of your project page. These requirements, specified in the latest IRB for DataShop and on our help page, apply to all datasets added to DataShop after April 2012. Included in these are requirements for what you must do before being allowed to use DataShop to share data outside of your immediate research team. More information about this process is available on our help page.
Change to Performance Profiler controls
We've added controls for changing the X and Y axes to the navigation area. The existing controls, which can be accessed by positioning your cursor over the X and Y axis labels, are still available.
Tweaks to access requests and the access report
We made the following changes related to access requests and the access report:
- We fixed a bug that was prompting users to agree to project terms for projects they didn't have access to but were only browsing.
- New columns in the exportable access report show more information about the last action of the data provider, PI, or DataShop admin.
- If a project has both a PI and data provider and one of the two approves access, the PI or data provider that responded will not be notified again if the user re-requests access, nor will she be asked to approve access again.
- A PI or data provider responding to a request for access can now choose to share the reason they enter with the requester. This is the default, as most people were using this feature as if the text they entered was being sent to the requester.
Monday, 21 January 2013
DataShop Downtime for Scheduled Maintenance
DataShop is going to be down for 4 hours beginning at 8am EST on Thursday, January 24, 2013 while our servers are being updated for the new release.