
Dataset Info

The Dataset Info report provides an overview of the currently selected dataset and the context in which it was collected. It can answer questions such as: How and when were these data collected? What is the scope of the dataset? Any other information that the dataset owner chose to provide is displayed here as well.

The main components of the Dataset Info report are:

Dataset Overview and Statistics

The dataset overview table is presented at the top of the Dataset Info report. This report loads by default when you select a dataset to browse. If you don't see the overview, click the main tab titled Dataset Info, then click the Overview link below it.

In the Overview table, a number of fields describe the dataset's characteristics. If you are a project admin for this project, you can edit some of these fields: click a field to edit it.

The overview fields are:

Category	Description
Project	A collection of datasets with a principal investigator and a data provider (who is often the same as the principal investigator). We consider a project to be a title for your research (e.g., Perceptual Fluency in Geometry Achievement). It might be similar to the title of a grant proposal, or some other phrase that identifies your work. To change the project name, contact us.
Principal Investigator	Defined at the project level, this is the person who, along with the data provider, determines who has access to the project. To change the principal investigator, contact us.
Data Provider	Defined at the project level, the data provider is the person responsible for providing a dataset to DataShop. He or she, with the agreement of CMU legal, may specify whether project-specific terms of use should apply to a project. The data provider, along with the principal investigator, determines who has access to the project. Most datasets in DataShop use the same person for both the data provider and principal investigator fields; in this case, the data provider is not shown. To change the data provider, contact us.
Curriculum	Used to describe the curriculum in which these data were collected (e.g., Algebra I).
Dates	The date range(s) over which these data were collected. This can be determined from the log data by pressing the auto-set button.
Area/Subject	The Area/Subject group to which this dataset belongs (e.g., Language/Chinese or Math/Algebra).
Tutor	The title of the tutor software used to collect data (e.g., Algebra 1 2005 or CTAT 2.7).
Description	A description of the dataset. This can include links to outside resources. It can be helpful to enter as much contextual information here as possible so that other researchers can attempt to make sense of the dataset. This is especially true if the dataset is part of a public project.
Has Study Data	Whether or not the dataset contains data that are the result of a research study or experiment.
Hypothesis	The hypothesis that was tested. Only displayed if "Has Study Data" is "yes".
Status	The status of the dataset (one of on-going, complete, files-only, or other).
School(s)	The school(s) where these data were collected.
Acknowledgment for Secondary Analysis	Acknowledgment that a researcher should include in a publication if they use this dataset for their research. The acknowledgment, if entered, is shown on the Citation page and in a text file included with each export.
Preferred Citation for Secondary Analysis	Citation that a researcher should include in a publication if they use this dataset for their research. The citation, if entered, is shown on the Citation page and in a text file included with each export. A citation must be for a paper attached to the dataset.
Additional Notes	Any additional information about the dataset.

The statistics table, described below, is generated from the data and is therefore not editable.

Category	Description
Number of Students	The total number of students for whom there are data.
Number of Unique Steps	The number of unique steps in the dataset, where uniqueness is defined as a step within a specific problem hierarchy (the curriculum location where the problem appears). The same step attempted by two students equals only one unique step.
Total Number of Steps	The number of steps in the dataset, where each student-step counts as one step. The same step attempted by two students equals two steps in the total number of steps. For example, if problem A has steps S1, S2, and S3, and student A does S1 and S2 while student B does S2 and S3, and there is just that problem in the dataset, then there are 3 unique steps and 4 total steps.
Total Number of Transactions	The total number of transactions in the dataset.
Total Student Hours	The number of hours of student activity in the dataset, represented by the sum of the duration of all student transactions in the dataset.
Knowledge Component Model(s)	The knowledge component models for this dataset (e.g., Default, Manual-Model). The number of unique knowledge components in the model is displayed following each model listed.
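
The difference between unique steps, total steps, and transactions can be easier to see with a small calculation. The following Python sketch reproduces the example above (problem A with steps S1, S2, and S3) and computes the statistics as they are defined in the table. It is an illustration only, not DataShop's implementation: the Transaction record, its field names, the "Unit 1" hierarchy label, and the durations are all hypothetical. It also assumes that repeated transactions by one student on the same step count as a single student-step.

```python
from typing import NamedTuple


class Transaction(NamedTuple):
    """Hypothetical, simplified transaction record (not DataShop's schema)."""
    student: str
    problem_hierarchy: str  # curriculum location where the problem appears
    problem: str
    step: str
    duration_seconds: float


# Student A does S1 and S2 (two transactions on S2); student B does S2 and S3.
transactions = [
    Transaction("Student A", "Unit 1", "Problem A", "S1", 30.0),
    Transaction("Student A", "Unit 1", "Problem A", "S2", 45.0),
    Transaction("Student A", "Unit 1", "Problem A", "S2", 15.0),
    Transaction("Student B", "Unit 1", "Problem A", "S2", 60.0),
    Transaction("Student B", "Unit 1", "Problem A", "S3", 30.0),
]


def dataset_statistics(rows):
    """Compute the statistics described in the table from transaction rows."""
    students = {r.student for r in rows}
    # A unique step is a step within a specific problem hierarchy and problem.
    unique_steps = {(r.problem_hierarchy, r.problem, r.step) for r in rows}
    # Assumption: one student-step per (student, unique step) pair.
    student_steps = {
        (r.student, r.problem_hierarchy, r.problem, r.step) for r in rows
    }
    total_seconds = sum(r.duration_seconds for r in rows)
    return {
        "number_of_students": len(students),
        "number_of_unique_steps": len(unique_steps),
        "total_number_of_steps": len(student_steps),
        "total_number_of_transactions": len(rows),
        "total_student_hours": total_seconds / 3600,
    }


stats = dataset_statistics(transactions)
for name, value in stats.items():
    print(f"{name}: {value}")
```

With these sample rows, the sketch reports 2 students, 3 unique steps, 4 total steps, and 5 transactions, which matches the unique and total step counts in the example above.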

