Saving variable labels before -collapse-

-collapse- literally collapses the dataset into a dataset of summary statistics. After -collapse-, the dataset in memory is lost unless -preserve- was issued first. Also, the labels of all variables in clist are replaced with "(stat) variable name", where stat can be mean, sum, etc. (see help collapse).
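As a quick illustration of both points, here is a minimal sketch using the auto dataset. The choice of price and mpg and the macro name lab are only for illustration.

```stata
sysuse auto, clear

* Keep a copy of the full dataset so that it can be brought back later
preserve

* Collapse to group means; only foreign, price and mpg remain
collapse (mean) price mpg, by(foreign)

* The original label of price has been replaced by "(mean) price"
local lab : variable label price
display "`lab'"
describe price mpg

* Bring back the original dataset, with its original labels
restore
```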

Instead of retyping all the variable labels, you can use the extended macro function var label (see help extended_fcn) to save the label of each variable in the varlist into a local macro before -collapse-. Afterwards, restore the labels using label var (see help label). Example:
sysuse auto, clear

foreach var of varlist * {
    local vlab`var': var label `var'
}

collapse price - gear_ratio, by(foreign)

foreach var of varlist * {
    label var `var' "`vlab`var''"
}

3 Responses

  1. […] not just use -collapse-, a built-in command that also converts data into means, sums, etc.? Sure, you may use -collapse-. […]

  2. See also the FAQ at

    Your code poses two potential problems.

    First, if the original variable label did not exist, you end up with an empty label. That is not necessarily an improvement. The work-around, illustrated in the FAQ, is to use the original variable name if there was no label.

    Second, if your variable names are near the length limit, the local macro names will be problematic, because your code prefixes four characters, “vlab”, to each variable name. That also applies to the code in the FAQ, but less so.

    An industrial strength program for this would call the macros holding the labels something like label1, label2, etc.

    Clearly, neither of these problems bites in the auto dataset, but they could bite elsewhere.
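Both concerns raised in the comment above can be handled in one pass. What follows is only a sketch of one possible approach, not the code from the FAQ. The macro names name1, label1, etc. and the counter i are illustrative choices. Labels go into numbered macros, an empty label falls back on the variable name, and labels are reapplied only to variables that still exist after -collapse-.

```stata
sysuse auto, clear

* Store names and labels in numbered macros: name1/label1, name2/label2, ...
local i = 0
foreach var of varlist * {
    local ++i
    local name`i' `var'
    local label`i' : variable label `var'
    * Fall back on the variable name when there is no label
    if `"`label`i''"' == "" {
        local label`i' `var'
    }
}
local n = `i'

collapse price-gear_ratio, by(foreign)

* Reapply labels, skipping variables dropped by -collapse- (here, make)
forvalues i = 1/`n' {
    capture confirm variable `name`i'', exact
    if !_rc {
        label variable `name`i'' `"`label`i''"'
    }
}
```

Because the macro names no longer include the variable name, their length does not depend on how long the variable names are.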
