For the time being, you can only use numerical data.
Non-numerical data falls into two categories: ordered and
non-ordered.
If you have non-ordered data, you have to transform it into "dummy variables":
one column per possible value, holding 1 if the case has that value and 0 otherwise.
Example:
We have 3 possible attributes for each case: Green, Blue or Red
Case name   Characteristic
A           Red
B           Red
C           Green
D           Blue
E           Blue
Should be changed to:
Case name   Red   Green   Blue
A           1     0       0
B           1     0       0
C           0     1       0
D           0     0       1
E           0     0       1
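If you prepare your data with a script, the transformation above could be sketched roughly as follows. This is only an illustration in plain Python: the function name to_dummy_variables and the dictionary layout are assumptions made for this example, not part of any particular tool.

```python
# Hypothetical helper (not from the original text): turns one non-ordered
# attribute into 0/1 dummy variables, one per possible category.
def to_dummy_variables(cases, categories):
    """Return {case name: [0/1 flags]} with flags in the order of `categories`."""
    return {
        name: [1 if value == category else 0 for category in categories]
        for name, value in cases.items()
    }


# The five cases from the example above.
cases = {"A": "Red", "B": "Red", "C": "Green", "D": "Blue", "E": "Blue"}
categories = ["Red", "Green", "Blue"]

dummies = to_dummy_variables(cases, categories)

# Print one row per case: name followed by the Red, Green, Blue flags.
for name, flags in dummies.items():
    print(name, *flags)
```

Each case ends up with exactly one flag set to 1, which is what makes the new columns usable as numerical data.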
If you have ordered non-numerical data (the values define a scale,
for example Cold, Warm, Hot), you must change it to numerical data while
maintaining the relative order of the categories. Example:
Case name   Characteristic
A           Hot
B           Hot
C           Warm
D           Cold
E           Cold
Should be changed to:
Case name   Temperature
A           1
B           1
C           2
D           3
E           3
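The same mapping can be written as a small lookup table. The sketch below assumes the scale used in the example (Hot = 1, Warm = 2, Cold = 3); the names SCALE and to_ordinal are hypothetical and chosen only for illustration.

```python
# Assumed scale, copied from the example above. What matters is that the
# numbers keep the order of the categories (Warm stays between Hot and Cold).
SCALE = {"Hot": 1, "Warm": 2, "Cold": 3}


# Hypothetical helper: replace each ordered label with its position on the scale.
def to_ordinal(cases, scale):
    """Return {case name: number}; raises KeyError for a label not on the scale."""
    return {name: scale[value] for name, value in cases.items()}


cases = {"A": "Hot", "B": "Hot", "C": "Warm", "D": "Cold", "E": "Cold"}
temperature = to_ordinal(cases, SCALE)

for name, value in temperature.items():
    print(name, value)
```

A label that is missing from the scale raises an error instead of being silently given a number, so an unexpected category is noticed before the data is submitted.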
Another important point is that you cannot submit missing data. If
you have missing data in your database, you must assign a value to it (for
instance, the average or the median value of the other cases) before
submitting it. But beware: replacing missing data is notorious for
biasing the data analysis.
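As a rough sketch, missing values could be filled in like this before submitting. The example assumes that a missing value is stored as None; the function name fill_missing and the sample numbers are made up for illustration.

```python
from statistics import mean, median


# Hypothetical helper: replace every missing value (None) with a single
# value computed from the known ones, by default their average.
def fill_missing(values, strategy=mean):
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("cannot fill missing data: no known values")
    fill = strategy(known)
    return [fill if v is None else v for v in values]


# Made-up measurements for four cases; the third one is missing.
measurements = [2.0, 4.0, None, 9.0]

filled_with_mean = fill_missing(measurements)            # uses the average
filled_with_median = fill_missing(measurements, median)  # uses the median

print(filled_with_mean)
print(filled_with_median)
```

The two strategies give different replacement values for the same data, which is one way to see how the choice can bias the analysis.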