SAS: Visualization
Single-Cell Plotting
Syntax:
PROC SGPLOT DATA = data-set <options>;
PLOTNAME-1 <plot statements> < / OPTIONS>;
PLOTNAME-2 <plot statements> < / OPTIONS>;
...
RUN;
1. One Variable
Continuous
Plot | Syntax |
---|---|
Histogram | HISTOGRAM var </ OPTIONS>; |
Density | DENSITY var </ OPTIONS>; |
Box Plot | HBOX var </ OPTIONS>; VBOX var </ OPTIONS>; |
Discrete
Plot | Syntax |
---|---|
Bar Chart | HBAR var </ OPTIONS>; VBAR var </ OPTIONS>; |
2. Two Variables
Continuous x, Continuous y
Plot | Syntax |
---|---|
Scatter Plot | SCATTER X=x-var Y=y-var </ OPTIONS>; |
High-Low | HIGHLOW <X=x-var|Y=y-var> HIGH=numeric-var LOW=numeric-var </ OPTIONS>; |
Series | SERIES X=x-var Y=y-var </ OPTIONS>; |
Line Chart | VLINE x-var / RESPONSE=y-var <OPTIONS>; HLINE x-var / RESPONSE=y-var <OPTIONS>; |
HEATMAP | HEATMAP X=x-var Y=y-var </ OPTIONS>; |
HEATMAPPARM | HEATMAPPARM X=x-var Y=y-var <COLORRESPONSE=numeric-var|COLORGROUP=group-var> </ OPTIONS>; |
Bubble | BUBBLE X=x-var Y=y-var SIZE=numeric-var </ OPTIONS>; |
Box Plot | HBOX var / CATEGORY=category-var <OPTIONS>; VBOX var / CATEGORY=category-var <OPTIONS>; |
Needle | NEEDLE X=x-var Y=y-var </ OPTIONS>; |
Discrete x, Continuous y
Plot | Syntax |
---|---|
Bar Chart | HBAR x-var / RESPONSE=y-var <OPTIONS>; VBAR x-var / RESPONSE=y-var <OPTIONS>; |
Dot Plot | DOT x-var / RESPONSE=y-var <OPTIONS>; |
Waterfall | WATERFALL CATEGORY=x-var RESPONSE=y-var </ OPTIONS>; |
Pie Chart | PROC SGPIE DATA=data-set; PIE x-var / RESPONSE=y-var <OPTIONS>; RUN; |
Donut | PROC SGPIE DATA=data-set; DONUT x-var / RESPONSE=y-var <OPTIONS>; RUN; |
Discrete x , Discrete y
Plot | Syntax |
---|---|
Bubble | BUBBLE X=x-var Y=y-var SIZE=numeric-var </ OPTIONS>; |
Text | TEXT X=x-var Y=y-var TEXT=text-var </ OPTIONS>; |
3. Axis table
Starting with SAS 9.4, the axis table statements (XAXISTABLE and YAXISTABLE) draw a table of values alongside an axis, aligned with the plot. For an example, see "Box plot with axistable".
In SAS 9.3, which has no axis table statements, the workaround is to build the table yourself (for example as an annotation data set) before plotting. For an example, see "Risk tables, annotated or not".
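As a rough sketch of the SAS 9.4 approach, the step below overlays a box plot with an axis table. The data set, variables, and choice of statistic are picked here only for illustration and are not taken from the linked example.

```sas
/* Illustrative sketch: XAXISTABLE requires SAS 9.4 or later */
PROC SGPLOT DATA=sashelp.cars;
  WHERE type IN ('Sedan' 'SUV' 'Wagon');
  /* One box per vehicle type */
  VBOX mpg_city / CATEGORY=type;
  /* Row of values along the X axis: mean of mpg_city for each type */
  XAXISTABLE mpg_city / X=type STAT=mean LABEL="Mean MPG (City)";
RUN;
```

The X= variable of the axis table should match the CATEGORY= variable of the box plot so that each table value lines up under its box.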
4. Maps
To be continued …
Examples (Single-Cell Plotting)
Histogram:
PROC SGPLOT DATA=sashelp.pricedata;
TITLE "This is the title";
WHERE REGIONNAME = 'Region1';
HISTOGRAM price / BINWIDTH=10 DATALABEL=count FILLATTRS=(COLOR='#CDCDCD');
DENSITY price / TRANSPARENCY=0.7 LINEATTRS=(COLOR='#3F3059'); /* normal curve by default; add TYPE=KERNEL for a kernel density estimate */
RUN;
Bar Chart:
data test;
set sashelp.burrows;
if status = -1 or missing(status) then status = 1;
if z < 531 or missing(z) then a = 'A';
if z >= 531 then a = 'B';
run;
* frequency table;
proc freq data=test;
tables a*status/ out=test1 outpct;
run;
proc sgplot data=test1;
vbar status / response=pct_col group=a barwidth=0.4;
xaxistable pct_col/ x = status ;
keylegend/ location=outside position=topright;
run;
High-Low:
PROC SGPLOT DATA=sashelp.stocks;
WHERE stock='IBM' AND YEAR(date)=2005;
HIGHLOW X=date HIGH=high LOW=low / CLOSE=close LINEATTRS=(COLOR=maroon);
run;
Bubble:
PROC SGPLOT DATA = sashelp.class;
BUBBLE X = height Y = weight SIZE =age / GROUP = sex;
RUN;
Scatter:
%LET color1 = cx3F3059;
%LET color2 = cxA23A2E;
TITLE "Origin in {Europe, USA}";
PROC SGPLOT DATA=sashelp.cars;
WHERE origin NE 'Asia' AND type NE 'Hybrid';
STYLEATTRS DATACONTRASTCOLORS = (&color1 &color2);
SCATTER X=weight Y=mpg_city / GROUP=Origin MARKERATTRS=(SYMBOL=CircleFilled);
KEYLEGEND / LOCATION=inside POSITION=TopRight ACROSS=1;
RUN;
Dot Plot:
PROC SGPLOT DATA=sashelp.cars;
DOT TYPE / RESPONSE=horsepower LIMITS=both STAT=mean
MARKERATTRS=(SYMBOL=circlefilled SIZE=9);
XAXIS grid;
YAXIS DISPLAY=(nolabel) OFFSETMIN=0.1;
KEYLEGEND / LOCATION=inside POSITION=topright ACROSS=1;
RUN;
Series:
PROC SGPLOT DATA=sashelp.stocks;
TITLE "Stock Prices 1986-2005";
SERIES X=date Y=close / GROUP=stock GROUPLP=stock;
RUN;
Line Chart:
PROC SGPLOT DATA=sashelp.class;
TITLE "Height by Age and Sex";
VLINE AGE / RESPONSE=height STAT=mean MARKERS GROUP=sex;
RUN;
Regression:
TITLE "Regression plot";
PROC SGPLOT DATA=sashelp.cars;
REG X=horsepower Y=enginesize / GROUP=cylinders;
WHERE cylinders IN (4 6 8);
RUN;
Box Plot:
PROC SGPLOT DATA=sashelp.cars;
TITLE "Price by Car Type";
HBOX msrp / GROUP=type MEANATTRS=(SYMBOL=star) CATEGORY=type;
RUN;
Heat Map:
PROC SGPLOT DATA=sashelp.heart;
HEATMAP X=weight Y=cholesterol / NXBINS=80 NYBINS=80;
RUN;
Multiple-Cell Plotting
SGPANEL
The SGPANEL procedure creates a panel of graph cells for the values of one or more classification variables. For example, if a data set contains three variables (A, B and C) and you want to compare the scatter plots of B*C for each value of A, then you can use the SGPANEL procedure to create this panel. The SGPANEL procedure creates a layout for you automatically and splits the panel into multiple graphs if necessary.
Syntax:
PROC SGPANEL DATA=data-set <options>;
PANELBY var-1 <var-2 ...> </ options>;
<plot statements>;
<COLAXIS | ROWAXIS | KEYLEGEND statements>;
RUN;
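The A, B, C scenario described above can be sketched as follows, assuming sashelp.class as stand-in data with sex playing the role of A and height and weight playing the roles of B and C:

```sas
/* One scatter plot of weight vs. height per value of sex */
PROC SGPANEL DATA=sashelp.class;
  PANELBY sex / COLUMNS=2;        /* classification variable (A) */
  SCATTER X=height Y=weight;      /* plotted in every cell (B*C) */
RUN;
```

COLUMNS=2 is optional; without it, SGPANEL picks the cell layout on its own.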
SGSCATTER
The SGSCATTER procedure creates a paneled graph of scatter plots for multiple combinations of variables, depending on the plot statement that you use.
Syntax:
PROC SGSCATTER DATA=data-set <options>;
COMPARE | MATRIX | PLOT <arguments> </ options>;
RUN;
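For instance, the PLOT statement takes a list of y*x requests and draws each one in its own cell, with independent axes. A minimal sketch, with the data set and variables chosen only for illustration:

```sas
/* Two independent scatter plots, side by side, colored by origin */
PROC SGSCATTER DATA=sashelp.cars;
  PLOT mpg_city*weight mpg_highway*weight / GROUP=origin;
RUN;
```

COMPARE is the better fit when the cells should share a common axis, and MATRIX when every pairwise combination of the variables is wanted.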
Examples (Multiple-Cell Plotting)
SGPanel:
TITLE1 "PRODUCT SALES";
PROC SGPANEL DATA=SASHELP.PRDSALE;
PANELBY quarter;
ROWAXIS LABEL="sales";
VBAR product / RESPONSE=predict STAT=mean
TRANSPARENCY=0.3;
VBAR product / RESPONSE=actual STAT=mean
BARWIDTH=0.5 TRANSPARENCY=0.3;
RUN;
TITLE1;
title1 "Distribution of Cholesterol Levels";
PROC SGPANEL DATA=sashelp.heart;
PANELBY weight_status sex / LAYOUT=lattice
NOVARNAME;
HBOX cholesterol;
RUN;
title1;
SGScatter:
PROC SGSCATTER DATA=sashelp.cars;
COMPARE Y=mpg_highway
X=(weight enginesize horsepower)
/ GROUP=type;
RUN;
PROC SGSCATTER DATA=sashelp.iris
(WHERE=(species EQ "Virginica"));
MATRIX PETALLENGTH PETALWIDTH SEPALLENGTH
/ ELLIPSE=(TYPE=mean)
DIAGONAL=(HISTOGRAM KERNEL);
RUN;
Reporting
ODS LAYOUT
ODS LAYOUT ABSOLUTE:
Absolute layout is perfectly suited for static types of output that can be printed on a single page where you want output placed in a specific location.
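ODS LAYOUT ABSOLUTE <options>;
ODS REGION <X=x-position Y=y-position WIDTH=width HEIGHT=height>;
...;
ODS REGION <X=x-position Y=y-position WIDTH=width HEIGHT=height>;
...;
ODS LAYOUT END;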
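A minimal sketch of an absolute layout, assuming a PDF destination: the output file name, region positions, and sizes below are placeholders, not recommended values.

```sas
ODS PDF FILE='absolute.pdf';              /* hypothetical output file name */
ODS GRAPHICS / WIDTH=5.5in HEIGHT=3.5in;  /* keep the graph inside its region */
ODS LAYOUT ABSOLUTE;
/* Table anchored at the top-left corner of the page */
ODS REGION X=0in Y=0in WIDTH=3.5in HEIGHT=4in;
PROC PRINT DATA=sashelp.class(OBS=10);
RUN;
/* Graph placed below the table */
ODS REGION X=0in Y=4.5in WIDTH=6in HEIGHT=4in;
PROC SGPLOT DATA=sashelp.class;
  SCATTER X=height Y=weight;
RUN;
ODS LAYOUT END;
ODS GRAPHICS / RESET;                     /* restore default graph size */
ODS PDF CLOSE;
```

Because the regions are fixed, output that grows beyond its region does not flow onto another page, which is why this layout suits single-page, static reports.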
ODS LAYOUT GRIDDED:
Gridded layout arranges output in a grid of dynamically sized regions. It accommodates data of varying size, can span more than one page, and makes alignment easier. Programs written with gridded layout are also easier to maintain than those written with absolute layout.
ODS LAYOUT GRIDDED ROWS=n-row COLUMNS=n-col <options>;
ODS REGION ROW=row-index COLUMN=col-index;
...;
ODS REGION ROW=row-index COLUMN=col-index;
...;
ODS LAYOUT END;
Examples
ODS LAYOUT GRIDDED:
OPTIONS NONUMBER NODATE;
ODS PDF FILE='test.pdf';
TITLE 'THIS IS TITLE1';
FOOTNOTE 'THIS IS FOOTNOTE1';
/* Page 1 in the PDF*/
ODS LAYOUT GRIDDED ROWS=2 COLUMNS=1;
ODS REGION ROW=1 COLUMN=1;
TITLE 'THIS IS THE REGION TITLE';
FOOTNOTE 'THIS IS THE REGION FOOTNOTE';
PROC PRINT DATA=sashelp.class(OBS=10);
RUN;
ODS REGION ROW=2 COLUMN=1 WIDTH=3in;
PROC SGPLOT DATA=sashelp.stocks;
TITLE "Stock Prices 1986-2005";
SERIES X=date Y=close / GROUP=stock GROUPLP=stock;
RUN;
ODS LAYOUT END;
/* End of Page 1 */
ODS PDF CLOSE;
Appendix
Reference
Single-Cell
- SAS Programming for R Users
- GitHub::SAS Programming for R Users (Course Materials)
- SAS-doc::SGPLOT Procedure
- Gallery of Plots
- Getting Started with the SGPLOT Procedure
- Using PROC SGPLOT for Quick High Quality Graphs
- Fifty Ways to Change your Colors (in ODS Graphics)
- What colors does PROC SGPLOT use for markers?
- Bar Charts with SGPLOT
- SAS-blogs::SGPLOT
Multiple-Cell
Interactive Dashboard
JMP
SAS® Visual Analytics
Excel