Is it necessary to put observations in a certain order? In a number of cases, yes. The most obvious case is when you are using the qualifier -in- to specify a subset in your data. For example,

**drop** in *1/100* /* Drops the observations from line 1 to line 100 */

**keep** in *30/l* /* Keeps the observations from line 30 to the last line, denoted by small letter l */

If the observations were in arbitrary order, then you wouldn’t know which ones were dropped or kept, would you? This is when -sort- and -gsort- come in handy. These two put the observations in a certain order. The -sort- command puts the observations in ascending order based on a specific variable or a set of variables. The basic syntax for -sort- is:

**sort** *varlist*

If *varlist* is only one variable, then Stata will sort the observations in ascending order based on that variable. If there are 2 variables, *var1* and *var2*, Stata will sort the observations according to *var1* first. Then, for observations with the same value of *var1*, Stata will sort them according to *var2*. If there are more than 2 variables, then the observations will be sorted by the first variable, then by the second variable within ties of the first, and so on. -gsort-, on the other hand, can sort the observations in either ascending or descending order. The basic syntax for -gsort- is:

**gsort** [+ or -] *varname* [+ or -] *varname* [+ or -] *varname* …

A plus sign (+) before the varname instructs Stata to put the observations in ascending order, while a minus sign (-) means descending order. For example, to sort the countries by their geographical region (*regn*) in alphabetical order and, within each region, by GDP per capita (*gdppc*) from highest to lowest:

**gsort** +*regn* -*gdppc*
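If you don't have the country data at hand, here is a small sketch of the same idea using the auto dataset that ships with Stata. The dataset and its variables (*make*, *foreign*, *price*) are my stand-ins for illustration, not the data used in this post:

```stata
* Load the example dataset that ships with Stata (stand-in data)
sysuse auto, clear

* Ascending sort: by foreign first, then by price among cars with the same foreign
sort foreign price
list make foreign price in 1/5

* Mixed order: foreign ascending, price from highest to lowest within each group
gsort +foreign -price
list make foreign price in 1/5
```

Once the data are sorted this way, a range such as in 1/5 refers to a well-defined set of observations, which is the whole point of sorting before using -in-.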

The -by varlist:- prefix also requires the observations to be sorted according to the varlist. But, as we have discussed in “_n, its big brother _N, and Super -bysort-,” this can be conveniently written as:

**bysort** *varlist***:** …

or

**by** *varlist*, **sort:** …
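As a quick sketch of the two forms, again on Stata's built-in auto dataset (my choice of example data; the new variable names *car_id*, *group_size*, and *price_rank* are made up for illustration):

```stata
* Load the example dataset that ships with Stata (stand-in data)
sysuse auto, clear

* Sort by foreign, then number the cars within each group
bysort foreign: generate car_id = _n

* Same idea written with the sort option of -by-
by foreign, sort: generate group_size = _N

* Variables in parentheses are used for sorting only, not for grouping,
* so this ranks cars by price within each foreign group
bysort foreign (price): generate price_rank = _n
```

The last line is handy when the order within each group matters, since plain -bysort varlist:- only guarantees the order of the groups themselves.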
