/* Title: Deleting Select Cases From a Detail Individual Return File Created: August 2014 Author: Pam Borkowski-Valentin, Swarthmore College Office of Institutional Research */ Comment We need to change the NSC dates into standard dates As some fields will have blank cells (graduation, no record found), expect warnings. ALTER TYPE SearchDate (A8). ALTER TYPE EnrollmentBegin (A8). ALTER TYPE EnrollmentEnd (A8). ALTER TYPE GraduationDate (A8). RENAME VARIABLES RequesterReturnField= ID. EXECUTE. * Date and Time Wizard: EnrollmentBegin_Date. COMPUTE EnrollmentBegin_Date=date.dmy(number(substr(ltrim(EnrollmentBegin),7,2),f2.0), number(substr(ltrim(EnrollmentBegin),5,2),f2.0), number(substr(ltrim(EnrollmentBegin),1,4),f4.0)). VARIABLE LABELS EnrollmentBegin_Date 'Enrollment Begin'. VARIABLE LEVEL EnrollmentBegin_Date (SCALE). FORMATS EnrollmentBegin_Date (ADATE10). VARIABLE WIDTH EnrollmentBegin_Date(10). EXECUTE. * Date and Time Wizard: EnrollmentEnd_Date. COMPUTE EnrollmentEnd_Date=date.dmy(number(substr(ltrim(EnrollmentEnd),7,2),f2.0), number(substr(ltrim(EnrollmentEnd),5,2),f2.0), number(substr(ltrim(EnrollmentEnd),1,4),f4.0)). VARIABLE LABELS EnrollmentEnd_Date 'Enrollment End'. VARIABLE LEVEL EnrollmentEnd_Date (SCALE). FORMATS EnrollmentEnd_Date (ADATE10). VARIABLE WIDTH EnrollmentEnd_Date(10). EXECUTE. * Date and Time Wizard: SearchDate_Date. COMPUTE SearchDate_Date=date.dmy(number(substr(ltrim(SearchDate),7,2),f2.0), number(substr(ltrim(SearchDate),5,2),f2.0), number(substr(ltrim(SearchDate),1,4),f4.0)). VARIABLE LABELS SearchDate_Date 'Search Date'. VARIABLE LEVEL SearchDate_Date (SCALE). FORMATS SearchDate_Date (ADATE10). VARIABLE WIDTH SearchDate_Date(10). EXECUTE. * Date and Time Wizard: GraduationDate_Date. COMPUTE GraduationDate_Date=date.dmy(number(substr(ltrim(GraduationDate),7,2),f2.0), number(substr(ltrim(GraduationDate),5,2),f2.0), number(substr(ltrim(GraduationDate),1,4),f4.0)). VARIABLE LABELS GraduationDate_Date 'Graduation Date'. VARIABLE LEVEL GraduationDate_Date (SCALE). FORMATS GraduationDate_Date (ADATE10). VARIABLE WIDTH GraduationDate_Date(10). EXECUTE. Comment This syntax calculates the number of calendar days that the student was enrolled. * Date and Time Wizard: Calendar_days_enrolled. COMPUTE Calendar_days_enrolled=DATEDIF(EnrollmentEnd_Date, EnrollmentBegin_Date, "days"). VARIABLE LABELS Calendar_days_enrolled. VARIABLE LEVEL Calendar_days_enrolled (SCALE). FORMATS Calendar_days_enrolled (F5.0). VARIABLE WIDTH Calendar_days_enrolled(5). EXECUTE. Comment This syntax sets the term based on the Enrollment Begin Date Feel free to adjust the terms to your specifications Here, the Enrollment Begin Date for the Fall/Spring Term is August 15 to April 30 and the Enrollment Begin Date for the Summer Term is May 1 to August 14. COMPUTE EnrollmentBegin_Month=XDATE.MONTH(EnrollmentBegin_Date). EXECUTE. ALTER TYPE EnrollmentBegin_Month (F2.0). COMPUTE EnrollmentBegin_Day=XDATE.MDAY(EnrollmentBegin_Date). EXECUTE. ALTER TYPE EnrollmentBegin_Day (F2.0). IF (ANY(EnrollmentBegin_Month,5,6,7) | (EnrollmentBegin_Month=8 & EnrollmentBegin_Day < 15)) TermType=2. EXECUTE. DO IF (ANY(EnrollmentBegin_Month,9,10,11,12,1,2,3,4) | (EnrollmentBegin_Month=8 & EnrollmentBegin_Day >= 15)). RECODE TermType (MISSING=1). END IF. EXECUTE. VALUE LABELS TermType 1 'Spring or Fall' 2 'Summer'. Execute. COMMENT Way to delete #1 If you delete the only Y record that a student has, they will have no records in your return file This first way deletes all "N" records as well as the Y records that you have chosen to delete The resulting file will only have Y records (for both enrollment & for graduation, as we are making sure to keep those records as well) and then you can unduplicate based on your project criteria - - NOTE that there is a begin & end date range that you will need to specify for your particular project The begin enrollment date allows you to make sure you are not picking up courses (such as, for example, late ending summer courses) that you may not want in your file. Comment This first section does not delete the cases We are keeping: Graduated=Y Enrolled > 30 days for a fall or spring term Enrolled > 10 days for a summer term In date range. USE ALL. COMPUTE filter_$=((TermType=1 & Calendar_days_enrolled > 30) | (TermType=2 & Calendar_days_enrolled > 10) | (Graduated="Y")). VARIABLE LABELS filter_$ 'Count as enrolled or graduated or no record (FILTER)'. VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. FORMATS filter_$ (f1.0). FILTER BY filter_$. EXECUTE. USE ALL. COMPUTE filter_$=(RANGE (EnrollmentBegin_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) | RANGE(GraduationDate_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013))). VARIABLE LABELS filter_$ 'RANGE (EnrollmentBegin_Date,DATE.MDY(08,15,2012), DATE.MDY '+ '(12,31,2013)) | RANGE(GraduationDate_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) (FILTER)'. VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. FORMATS filter_$ (f1.0). FILTER BY filter_$. EXECUTE. Comment Here's the same, but it does delete cases Again, NOTE that there is a begin & end date range that you will need to specify for your particular project. FILTER OFF. USE ALL. SELECT IF ((TermType=1 & Calendar_days_enrolled > 30) | (TermType=2 & Calendar_days_enrolled > 10) | (Graduated="Y")). EXECUTE. FILTER OFF. USE ALL. SELECT IF (RANGE (EnrollmentBegin_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) | RANGE(GraduationDate_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013))). EXECUTE. Comment Way to delete #2 This will keep the N records, and change a RecordFound=Y to N if the person has only that one record We first need to get a count of the number of records people have. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=ID /Num_Cases=N. Comment This first section does not delete the cases We are keeping: Graduated=Y RecordFoundYN=N Enrolled > 30 days for a fall or spring term Enrolled > 10 days for a summer term Have only one RecordFound=Y record ( if does not meet date criteria, we will then change this to a =N record in another step) In date range - - NOTE that there is a begin & end date range that you will need to specify for your particular project The begin enrollment date allows you to make sure you are not picking up courses (such as, for example, late ending summer courses) that you may not want in your file. USE ALL. COMPUTE filter_$=(Graduated="Y" | (TermType=1 & Calendar_days_enrolled > 30) | (TermType=2 & Calendar_days_enrolled > 10) | RecordFoundYN="N" | Num_Cases=1). VARIABLE LABELS filter_$ 'Graduated="Y" | (TermType=1 & Calendar_days_enrolled > 30) | '+ '(TermType=2 & Calendar_days_enrolled > 10) | RecordFoundYN="N" | Num_Cases=1 (FILTER)'. VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. FORMATS filter_$ (f1.0). FILTER BY filter_$. EXECUTE. USE ALL. COMPUTE filter_$=(RANGE (EnrollmentBegin_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) | RANGE(GraduationDate_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) | RecordFoundYN="N" | Num_Cases=1). VARIABLE LABELS filter_$ 'RANGE (EnrollmentBegin_Date,DATE.MDY(08,15,2012), DATE.MDY '+ '(12,31,2013)) | RANGE(GraduationDate_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) | '+ 'RecordFoundYN="N" | Num_Cases=1 (FILTER)'. VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. FORMATS filter_$ (f1.0). FILTER BY filter_$. EXECUTE. Comment This section deletes the cases Again, NOTE that there is a begin & end date range that you will need to specify for your particular project. FILTER OFF. USE ALL. SELECT IF (Graduated="Y" | (TermType=1 & Calendar_days_enrolled > 30) | (TermType=2 & Calendar_days_enrolled > 10) | RecordFoundYN="N" | Num_Cases=1). EXECUTE. FILTER OFF. USE ALL. SELECT IF (RANGE (EnrollmentBegin_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) | RANGE(GraduationDate_Date,DATE.MDY(08,15,2012), DATE.MDY (12,31,2013)) | RecordFoundYN="N" | Num_Cases=1). EXECUTE. Comment Here, we change the Y records with too few days to N in the RecordFoundYN variable. DO IF (Num_Cases=1 & ((TermType=1 & Calendar_days_enrolled <= 30) | (TermType=2 & Calendar_days_enrolled <= 10))). RECODE RecordFoundYN ('Y'='N'). END IF. EXECUTE. DO IF (Num_Cases=1 & EnrollmentBegin_Date > 12/31/2013). RECODE RecordFoundYN ('Y'='N'). END IF. EXECUTE. Comment While we could always select "RecordFoundYN="Y"" when running frequencies, it could also be a good idea to delete some/all of the variables that you might run frequencies on just to ensure that these values won't accidentally be included Here are a few variables that you could set to missing for those who only had one Y record that was changed to N in the previous step NOTE: All listed variables must be in your SPSS file- If you list a varialbe in the syntax but delete it from your file for some reason, this won't work . DO IF (RecordFoundYN="N"). RECODE CollegeName, @2year4year, PublicPrivate (ELSE=" "). END IF. EXECUTE. DO IF (RecordFoundYN="N"). RECODE EnrollmentBegin_Date, EnrollmentEnd_Date, GraduationDate_Date, Calendar_days_enrolled, TermType (ELSE=SYSMIS). END IF. EXECUTE.