First, let's generate a dataset from scratch:
/* create a sample data set named test */
data test;
input china russia usa; /* specifies variable names */
datalines;
1 2 3
4 5 6
7 8 9
;
run;
You should have this table.
But we aren't done with data generation yet. Let's continue creating more variables.
proc means noprint; *generates means, does not print output;
var _numeric_; /*specifies all numeric variables in the current DATA step, which means, all variables*/
output out=new(drop=_type_ _freq_) mean= median= std= q1= q3= / autoname;
/* drop = _type_ _freq_ means you drop the two variables so named*/
/* additional variables generated are mean, median, std, q1, q3*/
run;
/* Transposes the dataset */
proc transpose data=new out=out;
run;
Now we have artificially created all data.
However, there are two issues:
(1) it is in "long" format. We want it in "wide" format.
(2) ideally, we don't want a single variable to take on values such as "china_Mean". We'd prefer to have two variables, the first taking on "China", "Russia", "USA", and the second taking "mean", "median", etc.
Let's deal with issue 2 first. We use the scan function and indicate (with '_') that _ is the separator. _name_ means we'll take any word, and 1 means the first instance, 2 means the second instance.
/* Separates the variables */
data separated;
set out;
varname=scan(_name_,1,'_');
stat=scan(_name_,2,'_');
drop _name_;
run;
proc sort data=separated;
by varname;
run;
The last three lines sort the data.
Next, we need to transpose the data.
proc transpose data=separated out=transposed(drop=_name_);
by varname;
id stat;
var col1;
run;
proc print data=transposed;
title 'Transposed Data!';
run;
And you can admire the finished product (for a little while, before going to learn more SAS).
Click here to download all the code of today's post.