SAS: Visualization


Overview

Single-Cell Plotting

Multiple-Cell Plotting

Reporting


Single-Cell Plotting

Syntax:

PROC SGPLOT DATA=data-set <options>;
	plot-statement-1 <arguments> </ options>;
	plot-statement-2 <arguments> </ options>;
	...
RUN;

1. One Variable

Continuous

Plot        Syntax
Histogram   HISTOGRAM var </ options>;
Density     DENSITY var </ options>;
Box Plot    HBOX var </ options>;
            VBOX var </ options>;

Discrete

Plot        Syntax
Bar Chart   HBAR var </ options>;
            VBAR var </ options>;

2. Two Variables

Continuous x, Continuous y

Plot           Syntax
Scatter Plot   SCATTER X=x-var Y=y-var </ options>;
High-Low       HIGHLOW X=x-var | Y=y-var HIGH=numeric-var LOW=numeric-var </ options>;
Series         SERIES X=x-var Y=y-var </ options>;
Line Chart     VLINE x-var / RESPONSE=y-var <options>;
               HLINE x-var / RESPONSE=y-var <options>;
Heat Map       HEATMAP X=x-var Y=y-var </ options>;
               HEATMAPPARM X=x-var Y=y-var COLORRESPONSE=numeric-var | COLORGROUP=group-var </ options>;
Bubble         BUBBLE X=x-var Y=y-var SIZE=numeric-var </ options>;
Box Plot       HBOX y-var / CATEGORY=x-var <options>;
               VBOX y-var / CATEGORY=x-var <options>;
Needle         NEEDLE X=x-var Y=y-var </ options>;

Discrete x, Continuous y

Plot        Syntax
Bar Chart   HBAR x-var / RESPONSE=y-var <options>;
            VBAR x-var / RESPONSE=y-var <options>;
Dot Plot    DOT x-var / RESPONSE=y-var <options>;
Waterfall   WATERFALL CATEGORY=x-var RESPONSE=y-var </ options>;
Pie Chart   PROC SGPIE: PIE x-var / RESPONSE=y-var <options>;
Donut       PROC SGPIE: DONUT x-var / RESPONSE=y-var <options>;

Discrete x , Discrete y

Plot     Syntax
Bubble   BUBBLE X=x-var Y=y-var SIZE=numeric-var </ options>;
Text     TEXT X=x-var Y=y-var TEXT=text-var </ options>;

3. Axis table

Starting with SAS 9.4, the axis table statements (XAXISTABLE and YAXISTABLE) generate an annotation table alongside an axis. A typical use is a box plot with an axis table.

In SAS 9.3, the workaround is to build the annotation table yourself before plotting. A typical use is a risk table, annotated or not.
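For the SAS 9.4 case, a minimal sketch of a box plot with an axis table might look like the following. The choice of sashelp.class, the variables, and the label text are illustrative assumptions rather than part of the original post, and the behavior may differ between SAS 9.4 maintenance releases.

```sas
/* Sketch only: assumes SAS 9.4 or later, where XAXISTABLE is available. */
/* Data set, variables, and label are illustrative choices.              */
PROC SGPLOT DATA=sashelp.class;
	/* One box of height per level of sex */
	VBOX height / CATEGORY=sex;
	/* Row of mean heights aligned under each category on the x-axis */
	XAXISTABLE height / X=sex STAT=mean LABEL="Mean height";
RUN;
```

The axis table shares the category axis with the box plot, so each value lines up under its box without any manual annotation.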

4. Maps

To be continued …

Examples (Single-Cell Plotting)

Histogram:

PROC SGPLOT DATA=sashelp.pricedata;
	TITLE "This is the title";
	WHERE REGIONNAME = 'Region1';
	HISTOGRAM PRICE /BINWIDTH=10 DATALABEL=count FILLATTRS=(COLOR='#CDCDCD');
	DENSITY price / TRANSPARENCY=0.7 LINEATTRS=(COLOR='#3F3059'); /* normal curve by default; add TYPE=kernel for a kernel density estimate */
RUN;

Bar Chart:

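/* Recode status and derive a two-level group variable from z */
data test;
	set sashelp.burrows;
	if status = -1 or missing(status) then status = 1;
	if z >= 531 then a = 'B';
	else a = 'A'; /* z < 531 or missing */
run;
/* Frequency table; OUTPCT adds row and column percentages to test1 */
proc freq data=test;
	tables a*status / out=test1 outpct;
run;
/* Grouped bar chart of column percentages, with the values listed below the axis */
proc sgplot data=test1;
	vbar status / response=pct_col group=a barwidth=0.4;
	xaxistable pct_col / x=status;
	keylegend / location=outside position=topright;
run;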

High-Low:

PROC SGPLOT DATA=sashelp.stocks;
	WHERE stock='IBM' AND YEAR(date)=2005;
	HIGHLOW X=date HIGH=high LOW=low / CLOSE=close LINEATTRS=(color='maroon');
run;

Bubble:

PROC SGPLOT DATA = sashelp.class;
	BUBBLE X = height Y = weight SIZE =age / GROUP = sex;
RUN;

Scatter:

%LET color1 = '#3F3059';        
%LET color2 = cxA23A2E;
TITLE "Origin in {Europe, USA}";
PROC SGPLOT DATA=sashelp.cars;
   WHERE origin NE 'Asia' AND type NE "Hybrid";
   STYLEATTRS DATACONTRASTCOLORS = (&color1 &color2);
   SCATTER X=weight Y=mpg_city / GROUP=Origin MARKERATTRS=(SYMBOL=CircleFilled);
   KEYLEGEND / LOCATION=inside POSITION=TopRight ACROSS=1;
RUN;

Dot Plot:

PROC SGPLOT DATA=sashelp.cars;
	DOT type / RESPONSE=horsepower LIMITS=both STAT=mean
		MARKERATTRS=(SYMBOL=circlefilled SIZE=9);
	XAXIS GRID;
	YAXIS DISPLAY=(nolabel) OFFSETMIN=0.1;
	KEYLEGEND / LOCATION=inside POSITION=topright ACROSS=1;
RUN;

Series:

PROC SGPLOT DATA=sashelp.stocks;
	TITLE "Stock Prices 1986-2005"; 
	SERIES X=date Y=close / GROUP=stock GROUPLP=stock;
RUN;

Line Chart:

PROC SGPLOT DATA=sashelp.class;
	TITLE "Height by Age and Sex";
	VLINE AGE / RESPONSE=height STAT=mean MARKERS GROUP=sex;
RUN;

Regression:

TITLE "Regression plot";
PROC SGPLOT DATA=sashelp.cars;
	REG X=horsepower Y=enginesize / GROUP=cylinders;
	WHERE cylinders IN (4 6 8);
RUN;

Box Plot:

PROC SGPLOT DATA=sashelp.cars; 
	TITLE "Price by Car Type"; 
	HBOX msrp / GROUP=type MEANATTRS=(SYMBOL=star) CATEGORY=type;
RUN;

Heat Map:

PROC SGPLOT DATA=sashelp.heart; 
	HEATMAP X=weight Y=cholesterol / NXBINS=80 NYBINS=80;
RUN;


Multiple-Cell Plotting

SGPANEL

The SGPANEL procedure creates a panel of graph cells for the values of one or more classification variables. For example, if a data set contains three variables (A, B and C) and you want to compare the scatter plots of B*C for each value of A, then you can use the SGPANEL procedure to create this panel. The SGPANEL procedure creates a layout for you automatically and splits the panel into multiple graphs if necessary.

Syntax:

PROC SGPANEL DATA=data <options>;
	PANELBY var-1 <var-2 ...> </ options>;
	plot-statement <arguments> </ options>;
	...
RUN;
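The A, B, C scenario described above can be sketched as follows. Here sashelp.iris stands in for the hypothetical data set, with species playing the role of A and the two sepal measurements playing the roles of B and C; these choices are assumptions made for illustration.

```sas
/* Sketch only: one scatter plot of B*C per value of A.           */
/* species = A, sepallength = B, sepalwidth = C (illustrative).   */
PROC SGPANEL DATA=sashelp.iris;
	/* One cell per species, laid out in a single row of three */
	PANELBY species / COLUMNS=3;
	SCATTER X=sepallength Y=sepalwidth;
RUN;
```

Without COLUMNS=, the procedure picks the layout itself, which is usually fine for a quick look.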

SGSCATTER

The SGSCATTER procedure creates a paneled graph of scatter plots for multiple combinations of variables, depending on the plot statement that you use.

Syntax:

PROC SGSCATTER DATA=data <options>;
	COMPARE | MATRIX | PLOT <arguments> </ options>;
RUN;
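Besides COMPARE and MATRIX, the PLOT statement draws an independent cell for each y*x request. A minimal sketch, assuming sashelp.cars and an arbitrary pair of plot requests:

```sas
/* Sketch only: two independent scatter plots side by side.       */
/* The variable pairs and grouping variable are illustrative.     */
PROC SGSCATTER DATA=sashelp.cars;
	PLOT mpg_city*weight mpg_highway*horsepower
	     / GROUP=origin COLUMNS=2;
RUN;
```

Unlike COMPARE, the cells produced by PLOT do not share axes, so each request keeps its own scale.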

Examples (Multiple-Cell Plotting)

SGPanel:

TITLE1 "PRODUCT SALES";
PROC SGPANEL DATA=SASHELP.PRDSALE;
  PANELBY quarter;
  ROWAXIS LABEL="sales";
  VBAR product / RESPONSE=predict STAT=mean
                 TRANSPARENCY=0.3;
  VBAR product / RESPONSE=actual STAT=mean
                 BARWIDTH=0.5 TRANSPARENCY=0.3;
RUN; 
TITLE1;

title1 "Distribution of Cholesterol Levels";
PROC SGPANEL DATA=sashelp.heart;
  PANELBY weight_status sex / LAYOUT=lattice
                              NOVARNAME;
  HBOX cholesterol;
RUN; 
title1;

SGScatter:

PROC SGSCATTER DATA=sashelp.cars;
  COMPARE Y=mpg_highway
          X=(weight enginesize horsepower)
          / GROUP=type;
RUN;

PROC SGSCATTER DATA=sashelp.iris
               (WHERE=(species EQ "Virginica"));
MATRIX PETALLENGTH PETALWIDTH SEPALLENGTH
       / ELLIPSE=(TYPE=mean)
         DIAGONAL=(HISTOGRAM KERNEL);
RUN;


Reporting

ODS LAYOUT

ODS LAYOUT ABSOLUTE:
Absolute layout suits static output that fits on a single page and must be placed at specific locations on that page.

ODS LAYOUT ABSOLUTE <options>;
   ODS REGION;
      ...;
      
   ODS REGION;
      ...;
ODS LAYOUT END;
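As a rough sketch of absolute layout, the following places a table and a plot side by side on one PDF page. The file name, region coordinates, and sizes are assumptions chosen for a US Letter page and would need adjusting for other page sizes or margins.

```sas
/* Sketch only: file name and region geometry are illustrative. */
OPTIONS NONUMBER NODATE;
ODS PDF FILE='absolute.pdf';
ODS LAYOUT ABSOLUTE;

/* Left region: a small listing */
ODS REGION X=0.5in Y=0.5in WIDTH=3.5in HEIGHT=4in;
PROC PRINT DATA=sashelp.class(OBS=5);
RUN;

/* Right region: a plot sized to fit inside the region */
ODS REGION X=4.25in Y=0.5in WIDTH=3.5in HEIGHT=4in;
ODS GRAPHICS / WIDTH=3.5in HEIGHT=3in;
PROC SGPLOT DATA=sashelp.class;
	SCATTER X=height Y=weight;
RUN;

ODS LAYOUT END;
ODS GRAPHICS / RESET;
ODS PDF CLOSE;
```

Because every region is positioned explicitly, output that grows beyond its region is not reflowed, which is why this layout fits fixed-size content best.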

ODS LAYOUT GRIDDED:
Gridded layout for dynamically sized regions can accommodate dynamic data, can span more than one page, and allows for easier alignment. Programs created using gridded layout are easier to maintain than those created using absolute layout.

ODS LAYOUT GRIDDED ROWS=n-row COLUMNS=n-col <options>;
   ODS REGION ROW=row-index COLUMN=col-index;
      ...;
      
   ODS REGION ROW=row-index COLUMN=col-index;
      ...;
ODS LAYOUT END;

Examples

ODS LAYOUT GRIDDED:

OPTIONS NONUMBER NODATE;
ODS PDF FILE='test.pdf';
TITLE 'THIS IS TITLE1';
FOOTNOTE 'THIS IS FOOTNOTE1';
/* Page 1 in the PDF*/
ODS LAYOUT GRIDDED ROWS=2 COLUMNS=1;

ODS REGION ROW=1 COLUMN=1;
TITLE 'THIS IS THE REGION TITLE';
FOOTNOTE 'THIS IS THE REGION FOOTNOTE';
PROC PRINT DATA=sashelp.class(OBS=10);
RUN;

ODS REGION ROW=2 COLUMN=1 WIDTH=3in;
PROC SGPLOT DATA=sashelp.stocks;
    TITLE "Stock Prices 1986-2005"; 
    SERIES X=date Y=close / GROUP=stock GROUPLP=stock;
RUN; 

ODS LAYOUT END;
/* End of Page 1 */
ODS PDF CLOSE; 


Appendix

Interactive Dashboard

JMP
SAS® Visual Analytics
Excel

Zhijian Liu
