Class05

Execucomp overview

  • Data source: company’s annual proxy (DEF14A SEC form)

  • The dataset starts from 1992

  • Universe

    • US firms

    • S&P500 from 1992 and S&P1500 from 1994

    • There is no ADR firms (ADRs do not need to submit proxy to SEC)

Web access

WRDS –> Compustat - Capital IQ –> Execucomp

Identifiers

  • GVKEY

  • CUSIP

Important variables

YEAR
This is the fiscal year in Compustat Fundamentals Annual (Quarterly)
CEOANN
Historical flag to indicate CEO in a particular year
TITLEANN
Historical flag to describe the title
It is useful to determin if the executive officer serves as chairman of the board
EXECID
Permanent ID for each executive
This ID does not change even the person moved to another company
EXECDIR
Flag to indicate whether executive officer is also a director
GENDER
It indicates female or male.
  • Understand CEOANN vs PCEO and AGE vs PAGE

EXEC_FULLNAME

YEAR

CEOANN

PCEO

AGE

PAGE

Timothy D. Cook

2010

CEO

49

59

Timothy D. Cook

2011

CEO

50

59

Timothy D. Cook

2012

CEO

CEO

51

59

Timothy D. Cook

2013

CEO

CEO

52

59

Timothy D. Cook

2014

CEO

CEO

53

59

Timothy D. Cook

2015

CEO

CEO

54

59

Timothy D. Cook

2016

CEO

CEO

55

59

Timothy D. Cook

2017

CEO

CEO

56

59

Timothy D. Cook

2018

CEO

CEO

57

59

Timothy D. Cook

2019

CEO

CEO

58

59

Timothy D. Cook

2020

CEO

CEO

59

59

Python

  • Import packages

import pandas as pd
import numpy as np
  • Import Execucomp data

url = 'https://www.dropbox.com/s/u1safsdjh4n7pt3/dataucomp.txt?dl=1'
data = pd.read_csv(url, sep='\t', encoding='ISO-8859-1')
len(data)
297233

Attention

You will get UnicodeDecodeError if you do not use encoding argument when importing data. This is caused by special charcater, e.g. é. See the link to learn about encoding: https://en.wikipedia.org/wiki/ISO/IEC_8859-1

  • Clean the data

data.columns = data.columns.str.lower()
data = data.drop_duplicates(['gvkey', 'year', 'execid'])
len(data)
297229
  • CEO duality

data['chair_id'] = np.where(data['titleann'].str.contains('chmn|chairman'), 1, 0)
data['duality'] = np.where((data['ceoann']=='CEO') & (data['chair_id']==1), 1, 0)
  • Percentage of female executives or CEOs

# Count number of executives
pct_female = (data.groupby(['gvkey', 'year', 'gender'])
    ['execid'].count().unstack())

# Fill missing values
pct_female = pct_female.fillna(0)

# Total number of executives
pct_female['total'] = pct_female['FEMALE'] + pct_female['MALE']

# Count number of female CEO
female_ceo = (data[(data['ceoann']=='CEO') & (data['gender']=='FEMALE')]
    .groupby(['gvkey', 'year'])['execid'].count().to_frame('n_female_ceo'))

# Merge datasets
pct_female = pct_female.join(female_ceo, how='left')

# Fill missing values
pct_female = pct_female.fillna(0)

# Percentage of female executives
pct_female['pct_female'] = pct_female['FEMALE'] / pct_female['total']

# Percentage of female CEOs
pct_female['pct_female_ceo'] = pct_female['n_female_ceo'] / pct_female['total']

# Calculate averge percentage by year
pct_female = (pct_female.groupby(level=1)
    [['pct_female', 'pct_female_ceo']].mean())

# Average during the sample period
pct_female.mean()
pct_female        0.068210
pct_female_ceo    0.004514
dtype: float64

Stata

  • Import Execucomp data

local data_url "https://www.dropbox.com/s/u1safsdjh4n7pt3/dataucomp.txt?dl=1"
import delimited "`data_url'", clear
  • Clean the data

duplicates drop gvkey year execid, force
  • CEO duality

gen chair_id = 1 if strpos(titleann, "chmn")>0 | strpos(titleann, "chairman")>0
gen duality = 1 if chair_id==1 & ceoann=="CEO"
replace duality = 0 if duality==.
  • Percentage of female executives or CEOs

/* Count total number of executives */
bysort gvkey year: egen n_total = count(execid)

/* Count number of female executives */
gen gender_id = 1 if gender=="FEMALE"
replace gender_id = 0 if gender=="MALE"
bysort gvkey year: egen n_female = sum(gender_id)

/* Count number of female CEO */
gen gender_ceo = 1 if gender=="FEMALE" & ceoann=="CEO"
replace gender_ceo = 0 if gender_ceo == .
bysort gvkey year: egen n_female_ceo = sum(gender_ceo)

/* Keep firm level data */
duplicates drop gvkey year, force

/* Percentage of female executives */
gen pct_female = n_female / n_total

/* Percentage of female CEO */
gen pct_female_ceo = n_female_ceo / n_total

/* Calculate average percentage by year */
collapse (mean) pct_female=pct_female pct_female_ceo=pct_female_ceo, by(year)

/* Average during the sample period */
tabstat pct_female pct_female_ceo, s(mean)