CrySis protein crystallography diffraction pattern analysis dataset. Created and copyrighted by the X6A beamline at Brookhaven National Laboratory. For academic use ONLY.

The links below currently contain 5500 binary-classified images with 1341 continuous attributes, divided into 5 mutually exclusive datasets (1100 points each). Please do not divide the datasets, as they contain data from duplicate images at different orientations (if you divide a dataset, the pieces will 'know something' about each other, which gives the learner an unfair advantage).
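One way to respect this constraint is to treat each of the 5 datasets as an indivisible fold and hold out one whole dataset at a time. The sketch below only generates the fold assignments; the function name and the choice of leave-one-dataset-out evaluation are illustrative assumptions, not a procedure prescribed by the dataset authors.

```python
from typing import Iterator, List, Tuple

N_DATASETS = 5  # from the description above: 5 mutually exclusive datasets


def leave_one_dataset_out(
    n_datasets: int = N_DATASETS,
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training dataset indices, held-out dataset index) pairs.

    Hypothetical helper: each dataset is used exactly once as the test set
    and is never split, so duplicate images at different orientations stay
    together on one side of the split.
    """
    for held_out in range(n_datasets):
        train = [i for i in range(n_datasets) if i != held_out]
        yield train, held_out


folds = list(leave_one_dataset_out())
for train, test in folds:
    print(f"train on datasets {train}, test on dataset {test}")
```

Each round trains on 4400 points and tests on the remaining 1100, without ever cutting inside a dataset.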

For more statistics about the unscaled data, check out this (diffraction_desc) file.

Downloads

Format   Description
Sparse   Unscaled
         Min-Max Scaled
         Z-Score Scaled

CSV      Unscaled
         Min-Max Scaled
         Z-Score Scaled
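
For readers unfamiliar with the two scalings named above, here is a minimal sketch of the usual textbook definitions applied to a single attribute column. The exact conventions used to produce the downloadable files (target range, population versus sample standard deviation, handling of constant columns) are not stated here, so the choices below are assumptions.

```python
from statistics import mean, pstdev
from typing import List


def min_max_scale(column: List[float]) -> List[float]:
    """Rescale values linearly to [0, 1] (assumed target range)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: no spread to rescale, so map everything to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def z_score_scale(column: List[float]) -> List[float]:
    """Center to zero mean and unit population standard deviation (assumed)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # Constant column: every value equals the mean.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


example = [2.0, 4.0, 6.0]  # made-up values, not taken from the dataset
scaled_min_max = min_max_scale(example)
scaled_z = z_score_scale(example)
print(scaled_min_max)
print(scaled_z)
```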

At least as far as Naive Bayes is concerned, ~5000 images seem to be sufficient for this dataset.