# Class05 ## Execucomp overview - Data source: company’s annual proxy (DEF14A SEC form) - The dataset starts from 1992 - Universe - US firms - S&P500 from 1992 and S&P1500 from 1994 - There is no ADR firms (ADRs do not need to submit proxy to SEC) ### Web access WRDS --> Compustat - Capital IQ --> Execucomp ### Identifiers - GVKEY - CUSIP ### Important variables
YEAR
This is the fiscal year in Compustat Fundamentals Annual (Quarterly)
CEOANN
Historical flag to indicate CEO in a particular year
TITLEANN
Historical flag to describe the title
It is useful to determin if the executive officer serves as chairman of the board
EXECID
Permanent ID for each executive
This ID does not change even the person moved to another company
EXECDIR
Flag to indicate whether executive officer is also a director
GENDER
It indicates female or male.
- Understand CEOANN vs PCEO and AGE vs PAGE |EXEC_FULLNAME |YEAR |CEOANN |PCEO |AGE |PAGE | |----------------|-----|-------|-----|----|-----| |Timothy D. Cook |2010 | |CEO |49 |59 | |Timothy D. Cook |2011 | |CEO |50 |59 | |Timothy D. Cook |2012 |CEO |CEO |51 |59 | |Timothy D. Cook |2013 |CEO |CEO |52 |59 | |Timothy D. Cook |2014 |CEO |CEO |53 |59 | |Timothy D. Cook |2015 |CEO |CEO |54 |59 | |Timothy D. Cook |2016 |CEO |CEO |55 |59 | |Timothy D. Cook |2017 |CEO |CEO |56 |59 | |Timothy D. Cook |2018 |CEO |CEO |57 |59 | |Timothy D. Cook |2019 |CEO |CEO |58 |59 | |Timothy D. Cook |2020 |CEO |CEO |59 |59 | ## Python - Import packages ```{jupyter-execute} import pandas as pd import numpy as np ``` - Import Execucomp data ```{jupyter-execute} url = 'https://www.dropbox.com/s/u1safsdjh4n7pt3/dataucomp.txt?dl=1' data = pd.read_csv(url, sep='\t', encoding='ISO-8859-1') len(data) ``` ```{attention} You will get UnicodeDecodeError if you do not use encoding argument when importing data. This is caused by special charcater, e.g. é. See the link to learn about encoding: https://en.wikipedia.org/wiki/ISO/IEC_8859-1 ``` - Clean the data ```{jupyter-execute} data.columns = data.columns.str.lower() data = data.drop_duplicates(['gvkey', 'year', 'execid']) len(data) ``` - CEO duality ```{jupyter-execute} data['chair_id'] = np.where(data['titleann'].str.contains('chmn|chairman'), 1, 0) data['duality'] = np.where((data['ceoann']=='CEO') & (data['chair_id']==1), 1, 0) ``` - Percentage of female executives or CEOs ```{jupyter-execute} # Count number of executives pct_female = (data.groupby(['gvkey', 'year', 'gender']) ['execid'].count().unstack()) # Fill missing values pct_female = pct_female.fillna(0) # Total number of executives pct_female['total'] = pct_female['FEMALE'] + pct_female['MALE'] # Count number of female CEO female_ceo = (data[(data['ceoann']=='CEO') & (data['gender']=='FEMALE')] .groupby(['gvkey', 'year'])['execid'].count().to_frame('n_female_ceo')) # Merge datasets pct_female = pct_female.join(female_ceo, how='left') # Fill missing values pct_female = pct_female.fillna(0) # Percentage of female executives pct_female['pct_female'] = pct_female['FEMALE'] / pct_female['total'] # Percentage of female CEOs pct_female['pct_female_ceo'] = pct_female['n_female_ceo'] / pct_female['total'] # Calculate averge percentage by year pct_female = (pct_female.groupby(level=1) [['pct_female', 'pct_female_ceo']].mean()) # Average during the sample period pct_female.mean() ``` ## Stata - Import Execucomp data ```{code-block} stata local data_url "https://www.dropbox.com/s/u1safsdjh4n7pt3/dataucomp.txt?dl=1" import delimited "`data_url'", clear ``` - Clean the data ```{code-block} stata duplicates drop gvkey year execid, force ``` - CEO duality ```{code-block} stata gen chair_id = 1 if strpos(titleann, "chmn")>0 | strpos(titleann, "chairman")>0 gen duality = 1 if chair_id==1 & ceoann=="CEO" replace duality = 0 if duality==. ``` - Percentage of female executives or CEOs ```{code-block} stata /* Count total number of executives */ bysort gvkey year: egen n_total = count(execid) /* Count number of female executives */ gen gender_id = 1 if gender=="FEMALE" replace gender_id = 0 if gender=="MALE" bysort gvkey year: egen n_female = sum(gender_id) /* Count number of female CEO */ gen gender_ceo = 1 if gender=="FEMALE" & ceoann=="CEO" replace gender_ceo = 0 if gender_ceo == . bysort gvkey year: egen n_female_ceo = sum(gender_ceo) /* Keep firm level data */ duplicates drop gvkey year, force /* Percentage of female executives */ gen pct_female = n_female / n_total /* Percentage of female CEO */ gen pct_female_ceo = n_female_ceo / n_total /* Calculate average percentage by year */ collapse (mean) pct_female=pct_female pct_female_ceo=pct_female_ceo, by(year) /* Average during the sample period */ tabstat pct_female pct_female_ceo, s(mean) ```