DataShop - Import File Verification Tool

Table of Contents

  1. Introduction
  2. Package Organization
  3. Requirements
  4. How to import data
  5. Specifications for import file validity
     5.1 Column Heading Descriptions
     5.2 Order of columns
     5.3 Multiple similarly named columns

1. Introduction

DataShop can now import a tab-delimited text file of transaction data similar to that generated by a DataShop export. If you have data you would like to import, see section 4 (How to import data) below.

2. Package Organization

The DataShop Import Verification Tool package is organized in the following manner:

dist/: Contains the datashop-verify.jar file.
example/: Contains an example import file.
/: Contains readme.html and manifest.txt.

3. Requirements

In order to successfully run the Import Verification Tool, your local system must meet the following criteria:

  1. Java (version 1.6) is installed (and on the Windows path)

4. How to import data

Currently, only DataShop developers can perform the import, as it has not yet been built into the web application. However, you can send us valid data in a tab-delimited text file and we will import it for you.

Before sending us a file to import, use this tool to verify that the file is valid.

To verify that your import file is valid:

  1. At a command prompt, enter
    java -jar path/to/datashop-verify.jar -filename example-import.txt
    where path/to/datashop-verify.jar is the path to the import file verification JAR, and example-import.txt is the file you'd like to verify.

    The import file verification tool will run and provide information about the validity of your import file.

Results from the verification are printed to the console and to an output text file called datashop-verify.log.

5. Specifications for import file validity

The import tool accepts tab-delimited text files as input. The format of the file mimics that of the files produced through a DataShop export, with a few differences.

Note: If you're importing a file originally obtained from a DataShop export, delete the first line (row) that lists the sample and dataset titles (e.g., Sample: All Data, Dataset: Geometry). The first line should contain column headings only.
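
For instance, if the exported file begins like the abbreviated sketch below (the exact content of the first line depends on your sample and dataset), delete the first line so that only the column-heading line remains:

    Sample: All Data, Dataset: Geometry                     <-- delete this line
    Anon Student Id [tab] Session Id [tab] Time [tab] ...   <-- keep this line (column headings)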

The import text file must contain a set of required column headings; other column headings are optional. The complete set of recognized column headings, in the order the import tool expects them, is listed in section 5.2 below; section 5.1 describes the headings that need additional explanation.

5.1 Column Heading Descriptions

While most of the column headings listed above are self-explanatory, some require additional description.

Time must be given in one of the following formats:

Format                        Example
yyyy-MM-dd HH:mm:ss           2001-07-04 12:08:56
yyyy-MM-dd HH:mm:ss.SSS z     2001-07-04 12:08:56.322 PST
yyyy-MM-dd HH:mm:ss z         2001-07-04 12:08:56 Pacific Standard Time
yyyy-MM-dd HH:mm z            2001-07-04 12:08 PST
MMMMM dd, yyyy hh:mm:ss a z   July 04, 2001 12:08:56 AM PST
MM/dd/yy HH:mm:ss:SSS z       07/04/01 12:08:56:322 PST
MM/dd/yy HH:mm:ss z           07/04/01 12:08:56 GMT-08:00
mm:ss.0 z                     08:56.0 PST
long                          1239939193
double                        01239939193.31

These formats use the date and time pattern letters of the SimpleDateFormat class in Java 1.5. For more information about these patterns, see Sun's API documentation for the SimpleDateFormat class. For more information on the Java primitive types long and double, see Sun's Java tutorial page on Primitive Data Types.
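
If you want to check a timestamp yourself before running the tool, you can try parsing it with SimpleDateFormat using one of the patterns above. The snippet below is a minimal sketch, not part of the verification tool; the pattern and sample value are taken from the table above:

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class TimeFormatCheck {
        public static void main(String[] args) {
            // One of the accepted patterns from the table above
            SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS z");
            try {
                // parse() throws a ParseException if the value does not match the pattern
                Date parsed = format.parse("2001-07-04 12:08:56.322 PST");
                System.out.println("Valid time value: " + parsed);
            } catch (ParseException e) {
                System.out.println("Invalid time value: " + e.getMessage());
            }
        }
    }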

Level represents a Dataset Level. An example of the correct use of this column heading is Level(Unit), where 'Unit' is the dataset level name. The Level column should always be of the format Level(dataset_level_name). If a dataset level name is not included, the name will default to "Default".
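
For example, a dataset with more than one dataset level can include one Level() column per level, as described in section 5.3. The level names below are hypothetical:

    Level(Unit) [tab] Level(Section)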

Condition Name and Condition Type should always be paired with each other, even if a condition does not have a condition type. The import tool will produce an error if these two columns are not paired together.

KC represents a Knowledge Component. An example of the correct use of this column heading could be KC(Area), where 'Area' is the skill model name for that knowledge component. The KC column should always be of the format KC(skill_model_name). If a skill model name is not included, the name will default to "Default".

KC Category represents a Knowledge Component Category. An example of the correct use of this column heading could be KC Category(Area), where 'Area' is the skill model name for that knowledge component. The KC Category column should always be of the format KC Category(skill_model_name). If a skill model name is not included, the name will default to "Default".

Additionally, KC and KC Category must always be paired with each other, in the same way that Condition Name and Type must be paired together.
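
For example, a file with a single skill model named 'Area' (the name used in the examples above) would pair the two headings like this:

    KC(Area) [tab] KC Category(Area)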

CF represents a Custom Field. An example of the correct use of this column heading could be CF(Factor or add-m), where 'Factor or add-m' is the name for that custom field. The CF column should always be of the format CF(custom_field_name). If a custom field name is not included, the name will default to "Default".

5.2 Order of columns

The import tool expects the column headings to be in a particular order. Placing columns in unexpected locations can cause the import tool to fail during processing. The expected order of column headings is as follows:

  1. Anon Student Id
  2. Session Id
  3. Time
  4. Time Zone
  5. Student Response Type
  6. Student Response Subtype
  7. Tutor Response Type
  8. Tutor Response Subtype
  9. Level()
  10. Problem Name
  11. Step Name
  12. Attempt At Step
  13. Outcome
  14. Selection
  15. Action
  16. Input
  17. Feedback Text
  18. Feedback Classification
  19. Help Level
  20. Total # Hints
  21. Condition Name
  22. Condition Type
  23. KC()
  24. KC Category()
  25. School
  26. Class
  27. CF()
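
Put together, a header row that follows this order might look like the example below. It is shown wrapped for readability, but in the file it is a single tab-delimited line; the names inside the parentheses (Unit, Area, Factor or add-m) are the illustrative names used in section 5.1:

    Anon Student Id [tab] Session Id [tab] Time [tab] Time Zone [tab]
    Student Response Type [tab] Student Response Subtype [tab] Tutor Response Type [tab]
    Tutor Response Subtype [tab] Level(Unit) [tab] Problem Name [tab] Step Name [tab]
    Attempt At Step [tab] Outcome [tab] Selection [tab] Action [tab] Input [tab]
    Feedback Text [tab] Feedback Classification [tab] Help Level [tab] Total # Hints [tab]
    Condition Name [tab] Condition Type [tab] KC(Area) [tab] KC Category(Area) [tab]
    School [tab] Class [tab] CF(Factor or add-m)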

5.3 Multiple similarly named columns

The import tool allows multiple columns for the following column headings: Anon Student Id, Level(), Selection, Action, Input, and CF(), as well as the paired headings Condition Name / Condition Type and KC() / KC Category().

For example, if a dataset file has multiple student-id columns, the correct column format is:

Anon Student Id [tab] Anon Student Id

For columns that are required as pairs (Condition Name and Condition Type, or KC() and KC Category()), the columns must be listed in the order in which they are paired. For example, if a dataset file has two condition columns, the column format is:

Condition Name [tab] Condition Type [tab] Condition Name [tab] Condition Type

If you have any questions please contact datashop-help@lists.andrew.cmu.edu.