STAT6128

Key Topics in Social Science: Measurement and Data

Computer Workshop 4 - Social Mobility

The data

The data we shall be using today come from the 2006 Programme for International Student Assessment (PISA). These data are designed to be cross-nationally comparable across a wide selection of developed nations. Today we shall focus on occupations. Recall from the lectures that occupation is the primary outcome of interest for sociologists. However, we cannot measure social mobility itself in PISA: the data are cross-sectional, so we have no information on children’s eventual outcomes. Instead, we shall investigate the relationship between parental occupation and 15-year-old children’s occupational expectations (the job they expect to have when they are 30 years old). So, just for today, think of these expectations as if they were actual outcomes. (As an aside, some sociologists and economists argue that expectations mediate the link between social background and attainment in adulthood, so this type of analysis could be quite interesting for our understanding of intergenerational mobility.)

Start Stata and create a do file as you did last week: use ‘version’ to tell Stata which version you are using, use ‘cd’ to set the folder from which Stata opens data files and in which it saves your work, and use ‘use’ to open the Stata dataset PISA_IM, which you first need to download from Blackboard into the folder you name in the ‘cd’ command.
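As a rough sketch, the top of such a do file might look as follows. The version number and the folder path are placeholders (assumptions, not given in this handout); replace them with your own.

```stata
* Do-file header: a sketch with placeholder values
version 14                     // assumption: put the Stata version you actually use
cd "C:/STAT6128/workshop4"     // assumption: hypothetical folder holding PISA_IM.dta
use PISA_IM, clear             // open the workshop dataset from that folder
```

The clear option drops any data already in memory, so the do file can be rerun from the top without an error.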

Like last week, write the bold command lines into your do file and the italic ones into the command window.

Country code and sample size

Once you have opened your data set type

label list Country

You receive the error message ‘value label Country not found’. In other words, the data set does not record which value refers to which country, so the coding is given here:

Country code	Country name
208	Denmark
276	Germany
352	Iceland
380	Italy
410	Korea
442	Luxembourg
554	New Zealand
616	Poland
620	Portugal
792	Turkey

Type

tab Country

You see a table giving the 3-digit country code. Each number in this first column represents one country. The second column gives you the sample size per country, the third column the percentage of the sample per country.
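If you would rather see country names than codes, one option is to attach a value label yourself, using the country table above. This is only a sketch: the label name country_lbl is an arbitrary choice and does not exist in the dataset.

```stata
* Define a value label from the country table (the label name is hypothetical)
label define country_lbl 208 "Denmark" 276 "Germany" 352 "Iceland" ///
    380 "Italy" 410 "Korea" 442 "Luxembourg" 554 "New Zealand" ///
    616 "Poland" 620 "Portugal" 792 "Turkey"
* Attach the label to the Country variable
label values Country country_lbl
* Names are now shown; the nolabel option brings back the numeric codes
tab Country
tab Country, nolabel
```

The /// at the end of a line continues the command on the next line, which works in a do file but not in the command window.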

Measurement of Occupation (Ganzeboom Index)

As mentioned in the lecture, there are many different ways one can “measure” (or rank) occupations. The main method PISA uses is the Ganzeboom ISEI index of social class. This is a “continuous” measure of occupational prestige, and basically ranks occupations through their impact on people’s income.

To begin, we use this as our measure of occupation. The three variables of interest are:

Father’s occupation is labelled BFMJ

Mother’s occupation is labelled BMMJ

Child’s (expected) occupation is labelled BSMJ

Let us investigate BSMJ first. To find out more about the distribution of the variable BSMJ, type:

sum BSMJ, d

Something is wrong: more than the top 10% of the data are coded at a single value (“99”).

Normally, missing values in Stata are coded as ‘.’ and are therefore excluded by all commands. However, the original data were coded in SPSS, where missing values were given the value 99. Because the SPSS file was not transferred into Stata properly, these missing values now appear as the data point 99.

Type

label list BSMJ

You see that the values 97 and 99 of this variable are labelled as missing values.

If the SPSS data had been transferred properly into Stata format, these missing values would be coded ‘.’

We will do that now ourselves.

Type

gen bsmj=BSMJ

(you generate a variable that has exactly the same values as your original BSMJ variable)

replace bsmj=. if BSMJ>96

Now type

sum bsmj, d

Compare this with the output of the earlier sum command. You see that once missing values are properly coded in Stata (with a ‘.’), Stata leaves them out.

Sometimes you might want to see them though. In this case you can type

tab bsmj, m

The m option tells Stata that you want to see the missing values. You can see that 17% of values are missing for children’s expected occupation.

The variables BFMJ and BMMJ also use the values 97 and 99 for missing values. Try on your own to create variables bfmj and bmmj in which the missing values are coded properly as ‘.’. The solution is given below.

gen bfmj=BFMJ

replace bfmj=. if BFMJ>96

gen bmmj=BMMJ

replace bmmj=. if BMMJ>96
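The same recoding can be sketched more compactly with Stata’s mvdecode command, which turns listed numeric codes into missing values. This is an alternative to the gen/replace lines above, not the workshop’s own solution. The capture drop line simply removes any copy you have already created. Note that it recodes exactly 97 and 99, whereas the condition >96 would also catch 98 if that value occurred.

```stata
* Alternative sketch: build bsmj, bfmj and bmmj in one loop
foreach v in bsmj bfmj bmmj {
    local old = upper("`v'")     // name of the original variable, e.g. BSMJ
    capture drop `v'             // remove an earlier copy, if there is one
    clonevar `v' = `old'         // copy the values (and labels) of the original
    mvdecode `v', mv(97 99)      // recode 97 and 99 to system missing "."
}
```

Working on copies keeps the original variables untouched, so you can always go back and check what the raw codes were.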

We now want to see how children’s expected occupation is associated with their parents’ occupation. As our measures are “continuous”, we shall use OLS regression.

Firstly, we need to take into account PISA’s complex sampling design. We covered this last week. The PISA survey design uses clustered sampling: first schools are selected and then students within schools. Clustering typically increases standard errors, because students in the same school tend to resemble one another. We therefore need to tell Stata to take clustering into account.

Type:

svyset SCHOOLID [pw=W_FSTUWT]

This has set up the complex survey design. Now let us perform a regression, relating fathers’ occupation to the child’s expectation. We will estimate this model using all observations from all countries. Type:

svy: regress bsmj bfmj i.Country

The prefix i. before the variable Country indicates that this is a categorical variable. In this case, we have 10 countries (10 categories) in the variable Country. Hence Stata will create 9 dummy variables.

You should get something like the following output:

The table shows you that there are 788 schools in your data (Number of PSUs) and that the total sample size is 37,560 students.

Now interpret this table. Which country is the reference country? (Tip: look at the table with the country codes given beforehand)

The coefficient of interest is the one on bfmj. It is positive and statistically significant: a 1 point increase in the father’s Ganzeboom index is associated with a 0.234 point increase in the child’s (expected) Ganzeboom index.

Remember, last week we talked in the lecture briefly about how to interpret regression results. The Ganzeboom index lacks a natural metric (scale). How could we give some more meaning to our results here? We could express the change in the Ganzeboom index in terms of standard deviations.

Find the standard deviations of bfmj and bsmj by typing:

svy: mean bfmj

estat sd

svy: mean bsmj

estat sd

You will receive the following results:

	Mean	Standard deviation
bfmj	42.73	15.86
bsmj	60.59	16.81

Question:

If the father’s Ganzeboom index increases by one standard deviation, by how many standard deviations will the child’s index increase? You know that a 1 point increase in the father’s index is associated with a 0.234 point increase in the child’s index.

0.234*15.86=3.71

Hence, if the father’s index increases by one standard deviation, the child’s index increases by 3.71 points. We can express these 3.71 points in standard deviations of the child’s index:

3.71/16.81=0.22

Result:

If the father’s Ganzeboom index increases by one standard deviation, the child’s index increases by about 0.22 standard deviations.
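The same arithmetic can be checked in Stata with the display command, using the coefficient and standard deviations reported above:

```stata
* Effect of a one-SD increase in the father's index, in points of the child's index
display 0.234 * 15.86
* The same effect expressed in standard deviations of the child's index
display (0.234 * 15.86) / 16.81
```

The first line should print roughly 3.71 and the second roughly 0.22.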

In conclusion, from an intergenerational mobility perspective, our regression results show that children of fathers with higher-ranking occupations enter (or at least “expect to enter”) better jobs.

How does this vary across developed nations? To get a rough idea (and only this time ignoring the complex sampling design), type:

bysort Country: regress bsmj bfmj

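To repeat the country-by-country analysis while taking the complex sampling design into account, type:

tab Country, gen(C)

forval i=1(1)10 {

svy, subpop(C`i'): regress bsmj bfmj

}

The first command generates one dummy variable for each country (named C1 to C10); the loop then runs a svy: regress command for each of these countries in turn.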

This has reproduced the analysis for each individual country. Notice the relationship is weakest in Turkey (country 792) and Korea (country 410). It seems that the jobs children “expect” to enter in these countries are not strongly associated with their father’s occupation. On the other hand, in Poland (country 616) the relationship is particularly strong.

Alternative measure of occupation

Perhaps in this case another way of measuring occupation may also be suitable.

The PISA dataset contains an alternative measure of occupation: 4-digit ISCO codes. ISCO is the ILO’s classification of occupations; see the following webpage:

http://www.ilo.org/public/english/bureau/stat/isco/index.htm

This data is very interesting because of its detail. Occupations are defined into over 300 categories. However, for today we will convert this into a binary measurement (“Professional” and “Non-Professional” jobs). In other words, we will examine the relationship between whether a child is expecting to enter a professional job and whether the child’s parents have a professional job. (We could go further by using logistic regression to investigate this relationship. We will examine logistic regression in a later workshop.)

Let us start with this conversion. Create a variable called Student_Pro, which has the value 1 if the variable Student_Occ_ICSO is below 3000 (meaning the student aims to become a “Professional”) and 0 if Student_Occ_ICSO is 3000 or above. In addition, give Student_Pro a missing value ‘.’ if Student_Occ_ICSO is 9999. First, try to create the variable Student_Pro yourself. If you get stuck, the code is given below.

gen Student_Pro=.

replace Student_Pro=0 if Student_Occ_ICSO>2999

replace Student_Pro=1 if Student_Occ_ICSO<3000

replace Student_Pro=. if Student_Occ_ICSO==9999
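For comparison, the same variable can be sketched in a single command. The name Student_Pro2 is hypothetical, chosen only so that it does not overwrite Student_Pro. Unlike the step-by-step version, this sketch also leaves the new variable missing when Student_Occ_ICSO is itself ‘.’ (Stata treats ‘.’ as larger than any number, so a condition such as >2999 is true for missing values).

```stata
* (Student_Occ_ICSO < 3000) evaluates to 1 if true and 0 if false;
* the if qualifier leaves the result missing for code 9999 and for "."
gen byte Student_Pro2 = (Student_Occ_ICSO < 3000) ///
    if Student_Occ_ICSO != 9999 & !missing(Student_Occ_ICSO)
* Cross-tabulate the two versions, including missing values, to compare them
tab Student_Pro Student_Pro2, m
```

If the two versions disagree anywhere, the cross-tabulation shows it as cases off the main diagonal.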

Now create the variables Father_Pro and Mother_Pro using the same specification, again setting the code 9999 to missing:

gen Father_Pro=.

replace Father_Pro=0 if Father_Occ_ICSO>2999

replace Father_Pro=1 if Father_Occ_ICSO<3000

replace Father_Pro=. if Father_Occ_ICSO==9999

gen Mother_Pro=.

replace Mother_Pro=0 if Mother_Occ_ICSO>2999

replace Mother_Pro=1 if Mother_Occ_ICSO<3000

replace Mother_Pro=. if Mother_Occ_ICSO==9999

Now type the following:

svy:tabulate Father_Pro Student_Pro , row

svy:tabulate Mother_Pro Student_Pro , row

What do these results show?

Up to now, we have looked at all countries together. Now let’s examine Poland and Korea separately.

Start with Korea. Type:

svy:tabulate Father_Pro Student_Pro if Country==410, row

svy:tabulate Mother_Pro Student_Pro if Country==410, row

Then do the same for Poland (code 616).

What results do you find? Compare the tables.
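One caveat: with survey data, restricting the sample with if can give misleading standard errors, because Stata then disregards the observations outside the subsample when it works out the design. The row percentages themselves are unaffected, but if you want standard errors or tests, the subpop() option used in the loop earlier is the safer route. A sketch for Korea:

```stata
* Same tables for Korea, but keeping the full design information
svy, subpop(if Country==410): tabulate Father_Pro Student_Pro, row
svy, subpop(if Country==410): tabulate Mother_Pro Student_Pro, row
```

Replacing 410 with 616 gives the corresponding tables for Poland.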

Measurement Error

We shall finish this part of the workshop by briefly considering the role of measurement error. Firstly, recall from the lectures that children act as proxy respondents for their parents. That is, it is children who report their parents’ education and occupation (not the parents themselves). Children may not always report this correctly.

For this set of countries, however, data has been collected from both the parent and the child (note this was not done for all countries, and was not done in the PISA 2000 or 2003 waves). We can therefore investigate how well children report their parents’ occupation. In particular,

Parent_Report_Father_Occ_ICSO is fathers’ reports of their own occupation

Parent_Report_Father_Pro is fathers’ reports about whether they are a professional

Parent_Report_Mother_Occ_ICSO is mothers’ reports of their own occupation

Parent_Report_Mother_Pro is mothers’ reports about whether they are a professional

Let’s consider whether children can accurately report if their mother or father is a professional. Type (ALL ON ONE LINE):

tab Parent_Report_Father_Pro Father_Pro if Parent_Report_Father_Pro!=. & Father_Pro!=., col

Look at the main diagonal (top left to bottom right). If there were no measurement error, all observations would be in these cells. Instead, we can see some misclassification: children report their father to be a professional when he is not (and vice versa). This is of course assuming that parents accurately report their own occupation …
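To put a single number on this, one could compute the share of cases in which the child’s and the father’s reports agree. This is a sketch only: the variable name agree is made up, and it assumes Parent_Report_Father_Pro is coded 0/1 in the same way as Father_Pro.

```stata
* 1 if both reports agree, 0 if they differ, "." if either report is missing
gen byte agree = (Parent_Report_Father_Pro == Father_Pro) ///
    if !missing(Parent_Report_Father_Pro, Father_Pro)
* The mean of a 0/1 variable is the proportion of 1s, i.e. the agreement rate
sum agree
```

The same two lines, with Mother in place of Father, give the agreement rate for mothers.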
