Sample Selector

Sample Selector is a tool for creating and editing samples, which are groups of data you compare across. They are not "samples" in the statistical sense; they work more like filters.

By default, a single sample exists: "All Data". With the Sample Selector, you can create new samples to organize your data.

You can use samples to:

A sample is composed of one or more filters, which are specific conditions that narrow down the data included in the sample.

Creating a sample

The general process for creating a sample is to:

The effect of multiple filters

DataShop interprets each filter after the first as an additional restriction on the data included in the sample, so data is included only if it satisfies every filter. This is also known as a logical "AND". You can see the results of multiple filters in the sample preview as soon as all filters are "saved".
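The logical "AND" behavior can be illustrated with a minimal Python sketch. This is not DataShop's implementation: the `apply_sample` helper, the row layout, and the column names (`student`, `condition`, `attempt`) are all hypothetical, chosen only to show how each added filter further restricts the sample.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Filter = Callable[[Row], bool]


def apply_sample(rows: List[Row], filters: List[Filter]) -> List[Row]:
    """Keep only the rows that satisfy every filter (logical AND)."""
    return [row for row in rows if all(f(row) for f in filters)]


# Hypothetical data; the column names are illustrative assumptions,
# not DataShop's actual schema.
rows = [
    {"student": "s1", "condition": "control", "attempt": 1},
    {"student": "s2", "condition": "treatment", "attempt": 1},
    {"student": "s3", "condition": "treatment", "attempt": 2},
]

# Two filters: a row must pass both to be included in the sample.
filters = [
    lambda r: r["condition"] == "treatment",
    lambda r: r["attempt"] == 1,
]

sample = apply_sample(rows, filters)
print([r["student"] for r in sample])  # prints ['s2']
```

With only the first filter, the sketch keeps two rows (s2 and s3); adding the second filter narrows the result to s2. With an empty filter list every row is kept, which loosely mirrors the default "All Data" sample.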


Citing DataShop and Datasets

You can find dataset-specific citation guidance on the Citation page (Dataset Info > Citation) in DataShop or in the text file included with each export of data. This information is taken from the Dataset Info fields "Acknowledgement for Secondary Analysis" and "Preferred Citation for Secondary Analysis", which are settable by researchers who have edit access to the dataset. General citation guidance is given below.

To cite the DataShop web application and repository:

Please include the following reference in your publication:

Koedinger, K.R., Baker, R.S.J.d., Cunningham, K., Skogsholm, A., Leber, B., Stamper, J. (2010). A Data Repository for the EDM community: The PSLC DataShop. In Romero, C., Ventura, S., Pechenizkiy, M., Baker, R.S.J.d. (Eds.) Handbook of Educational Data Mining. Boca Raton, FL: CRC Press.

You might also cite our URL in the text of your paper:

For exploratory analysis, I used DataShop, available at (Koedinger et al., 2010).

To cite a dataset for secondary analysis:

First determine if a citation or acknowledgement is given for the dataset you are using. You can find dataset-specific citation guidance on the Citation page (Dataset Info > Citation) in DataShop.

If no citation or acknowledgement is shown there, you will need to determine a primary paper for the dataset to reference. A primary paper is an article, published by the owner of the dataset, that contains their analysis of it. Many datasets will have such a paper attached to the dataset under the Files tab. If you're not sure which paper the primary researcher(s) would like to have cited (or if no paper is listed), contact us and we will put you in touch with them or determine an appropriate paper to cite.

To cite a dataset for secondary analysis that does not have an owner, preferred citation, or acknowledgement (for example, a "course" dataset collected over the duration of a school year), determine an appropriate "global" paper for the domain. For example, to reference a mathematics course dataset, you could cite a recent book chapter on Cognitive Tutors.