Coding With Fun

SAS subset dataset


May 27, 2021 SAS




Subsetting a SAS dataset means extracting part of it by selecting fewer variables, fewer observations, or both. Variables are subset with the KEEP and DROP statements, and observations are subset with the DELETE statement. The result of the subsetting operation is saved in a new dataset that can be used for further analysis. Subsetting is mainly used to analyze part of a dataset without carrying along variables or observations that are not relevant to the analysis.

Subsetting variables

In this approach, we extract only a few variables from the entire data set.

Syntax

The basic syntax for subsetting variables in SAS is:

KEEP var1 var2 ... ;
DROP var1 var2 ... ;

Here is a description of the parameters used:

  • var1 and var2 are the names of the variables to be kept (KEEP) or removed (DROP) in the new data set

Example

Consider the following SAS dataset, which contains the employee details of an organization. If we only want the name and department values from the dataset, we can use the following code.

/* Create the Employee dataset from inline data */
DATA Employee;
  INPUT empid ename $ salary DEPT $;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;
/* Keep only the name and department variables */
DATA OnlyDept;
  SET Employee;
  KEEP ename DEPT;
RUN;
PROC PRINT DATA=OnlyDept;
RUN;

When the above code executes, PROC PRINT lists all eight observations with only the ename and DEPT variables.

You can get the same result by dropping the unwanted variables. The following code illustrates this.

/* Create the Employee dataset from inline data */
DATA Employee;
  INPUT empid ename $ salary DEPT $;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;
/* Drop the ID and salary variables, leaving ename and DEPT */
DATA OnlyDept;
  SET Employee;
  DROP empid salary;
RUN;
PROC PRINT DATA=OnlyDept;
RUN;
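
KEEP and DROP also exist as dataset options, KEEP= and DROP=, written in parentheses after a dataset name. The following is a minimal sketch rather than part of the original example; it reuses the employee data, and the output names OnlyDeptKeep and OnlyDeptDrop are hypothetical. On the SET statement the option limits which variables are read in; on the DATA statement it limits which variables are written out.

/* Same employee data as in the examples above */
DATA Employee;
  INPUT empid ename $ salary DEPT $;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;
/* KEEP= on the input dataset: only ename and DEPT are read */
DATA OnlyDeptKeep;
  SET Employee (KEEP=ename DEPT);
RUN;
/* DROP= on the output dataset: empid and salary are not written */
DATA OnlyDeptDrop (DROP=empid salary);
  SET Employee;
RUN;
PROC PRINT DATA=OnlyDeptKeep;
RUN;
PROC PRINT DATA=OnlyDeptDrop;
RUN;

Both output datasets should contain the same two variables, ename and DEPT.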

Subsetting observations

In this approach, we extract only a few observations from the entire data set.

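Syntax

Observations are removed with an IF ... THEN DELETE statement. A procedure such as PROC FREQ or PROC PRINT can then be used to check which observations were selected for the new dataset.

The syntax for subsetting observations is: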

  IF Var Condition THEN DELETE ;

Here is a description of the parameters used:

  • Var is the name of the variable whose value is tested; when the condition is true, the observation is deleted.

Example

Consider the following SAS dataset, which contains the employee details of an organization. If we only want the data for employees with a salary of 700 or more, we can use the following code.
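
The condition can be any valid SAS expression, including a comparison on a character variable. As a minimal sketch under assumed names (the dataset NonIT is hypothetical, and the data mirror the employee records used on this page), the following step deletes every observation whose department is IT:

/* Employee data, as used elsewhere on this page */
DATA Employee;
  INPUT empid ename $ salary DEPT $;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;
/* Delete observations where the character variable DEPT equals IT */
DATA NonIT;
  SET Employee;
  IF DEPT = 'IT' THEN DELETE;
RUN;
PROC PRINT DATA=NonIT;
RUN;

With this data, the five observations for Dan, Ryan, Gary, Pranab and Rasmi remain.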

/* Create the Employee dataset from inline data */
DATA Employee;
  INPUT empid name $ salary DEPT $;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;
/* Delete observations with a salary below 700 */
DATA OnlyDept;
  SET Employee;
  IF salary < 700 THEN DELETE;
RUN;
PROC PRINT DATA=OnlyDept;
RUN;

When the above code executes, PROC PRINT lists the three observations with a salary of at least 700: Ryan (729.1), Gary (843.25) and Rasmi (722.5).
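
Variable and observation subsetting can also be combined in a single DATA step. The sketch below is an illustration rather than part of the original example, and the dataset name HighPaid is hypothetical. KEEP only affects what is written to the output dataset, so salary can still be tested in the IF statement even though it is not kept. PROC FREQ then counts the selected observations by department.

/* Employee data, as used in the examples above */
DATA Employee;
  INPUT empid ename $ salary DEPT $;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;
/* Subset observations and variables in one step */
DATA HighPaid;
  SET Employee;
  IF salary < 700 THEN DELETE;  /* salary is still available here */
  KEEP ename DEPT;              /* only these variables are written out */
RUN;
PROC PRINT DATA=HighPaid;
RUN;
/* Count the selected observations per department */
PROC FREQ DATA=HighPaid;
  TABLES DEPT;
RUN;

With this data, three observations remain (Ryan, Gary and Rasmi), so the frequency table should show FIN twice and HR once.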