(Continuing from last week’s post …)

We all jump to conclusions quickly. Journalists are the most famous for it — three of anything within a week is a trend. But, in fact, we all do it, journalist or not, in an attempt to understand the world’s complexity.

So it’s normal to look at a few census tracts and think, Gee, that’s an area where I know a lot of new Chinese immigrants live (or that’s what the neighbours say). And it’s an expensive area. And it looks like a lot of people are claiming they have low incomes there. So, it must be that …

The flaw is that we don’t look beyond that. We don’t look at the neighbourhood next door, one that also has a lot of new Chinese immigrants, one that’s also in an expensive area, but one where the rate of low-income households is not that high. Or the area that is expensive and that has a high rate of low-income households but where the level of new Chinese immigrants is low.

Academic researchers run what are called regression analyses to test whether there is really a connection between different variables.

It’s a standard feature of almost any study that attempts to prove anything. Do people who do strength training have better results in retaining memory functions than people who do yoga? You need to make sure that the strength-trainers and the yoga-ites are comparable on every other scale: a mix of ages and occupations, a similar range of diets, comparable levels of crossword-puzzle and other brain-boosting activities, and so on. Otherwise you run the risk of concluding that strength training is better, when actually the kind of people who like to do strength training also do a number of other related things that tend to boost memory.
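To make the strength-training example concrete, here is a minimal sketch with invented numbers (not from any study). It assumes memory scores depend only on how many puzzles a person does, and that the strength-trainers happen to do more puzzles. The helper names `solve_linear_system` and `ols` are made up for this illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + [float(c[i]) for c in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented data: 1 = strength training, 0 = yoga.
strength = [0, 0, 0, 0, 1, 1, 1, 1]
puzzles = [1, 2, 1, 2, 3, 4, 3, 4]          # weekly crossword puzzles
memory = [50 + 2 * p for p in puzzles]      # memory depends on puzzles only

naive = ols([strength], memory)                 # ignores puzzles
controlled = ols([strength, puzzles], memory)   # holds puzzles constant

print("naive strength effect:     ", round(naive[1], 3))
print("controlled strength effect:", round(controlled[1], 3))
```

The naive comparison credits strength training with a gap that disappears once puzzles enter the regression, which is the kind of mistake a regression with controls is meant to catch.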

My last post got kind of long, so I skipped talking about this.

But I’m adding this to make the point to everyone trying to understand Vancouver by looking at a few census tracts at a time. It can’t be done, or at least not to the level that any serious researcher would think was credible. You have to look at the region overall and figure out, for example, whether the number of new Chinese immigrants in a tract correlates significantly with the level of low-income households in that tract. (It’s also good if you can track these kinds of correlations over time: as more and more new Chinese immigrants move into an area, does the rate of this or that other factor increase as well, in something approaching lockstep?)
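A small sketch of why a handful of tracts can mislead. The eight tracts and their numbers are invented purely for illustration, and `pearson_r` is a helper written for this example rather than anything from the analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented numbers for eight hypothetical tracts (not census data).
recent_immigrants = [10, 50, 200, 220, 30, 180, 90, 120]
pct_low_income = [5, 7, 12, 4, 11, 6, 9, 8]

# Three tracts that someone might happen to notice.
picked = [0, 1, 2]
r_picked = pearson_r([recent_immigrants[i] for i in picked],
                     [pct_low_income[i] for i in picked])

# The same calculation over every tract.
r_all = pearson_r(recent_immigrants, pct_low_income)

print("three picked tracts:", round(r_picked, 2))
print("all eight tracts:   ", round(r_all, 2))
```

In this made-up data the three picked tracts line up almost perfectly, while the correlation across all eight is close to zero.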

This is true for some of the research showing up related to students or homemakers being listed on land titles as owners. On the face of it, it seems weird that people in those categories are listed as owners in expensive areas. But to make the case absolutely solidly, you’d need to look at what occupations are listed everywhere. Maybe it will show that a suspiciously high number of homemakers and students are buying only in particular high-end areas. But maybe it will show that a lot of homes throughout the region are allegedly owned by homemakers and students and that there’s barely a difference between the west side and anywhere else in town.

Yes, it’s a lot of work to do that. Hardly anyone is doing that kind of work. Well, except for one of my researcher friends, who ran the variables through a statistics program to see whether there were any distinct correlations. There weren’t, except for very minor effects. I’ve provided his full analysis below. I can’t understand more than a quarter of it, but maybe some of you can. I can understand enough to see that, yet again, there’s no smoking gun.

I hope everyone gets that I’m doing this because I believe that ideas should be tested. It’s dangerous to have everyone spouting the same conventional wisdom.

**CENSUS TRACT DATA SETS: CORRELATES OF SHELTER COSTS > INCOME IN THE 2011 NHS – ANALYSIS FOR VANCOUVER METRO**

**THE VARIABLE NAMES & DESCRIPTIONS**

| Variable name | Variable label |
|---|---|
| cmaname | CMA name |
| cmauid | CMA code |
| ctname | Census Tract name |
| ct_gnr | CT GNR (%) |
| TotPriHH | Total # of private HHs |
| AreaKm | Land area in square kilometres |
| juris_id | BCA jurisdiction code |
| HHdensity | Density of HH – per sq km |
| Owners | Total HHs – Owners |
| Renters | Total HHs – Renters |
| HHMedInc | Median HH total income |
| MedVal | Median value of dwellings |
| MedRent | Median monthly shelter costs for rented dwellings |
| Pct0_30 | Pct of HH paying 0-30% of income for shelter |
| Pct30_99 | Pct of HH paying 30-99% of income for shelter |
| Pct100up | Pct of HH paying 100% of income or more for shelter |
| PctOwn30up | % of owner HH spending 30% + of HH income on shelter costs |
| MedOwnPay | Median monthly shelter costs for owned dwellings |
| RecImm5 | Total # Recent Immigrants (Last 5 years) |
| RecImmInd | Total # of Recent Immigrants – India |
| RecImmChi | Total # of Recent Immigrants – China |
| RecImmPhil | Total # of Recent Immigrants – Philippines |
| RecImmIran | Total # of Recent Immigrants – Iran |
| ImmTot | Total # Immigrants |
| TotImmInd | Total # of Immigrants – India |
| TotImmChi | Total # of Immigrants – China |
| TotImmPhil | Total # of Immigrants – Philippines |
| TotImmIran | Total # of Immigrants – Iran |
| TotImmHK | Total # of Immigrants – Hong Kong |
| TotImmViet | Total # of Immigrants – Viet Nam |

**REGRESSIONS ON PCT OF HOUSEHOLDS (RENTERS AND OWNERS) W/ SHELTER PAYMENTS > INCOME**

These regressions test for correlates across the roughly 450 census tracts in the Vancouver CMA. The dependent variable is the estimated percentage of households in a tract with shelter costs greater than income (I am not sure how households with 0 income and 0 costs are treated). Like all regressions, each coefficient gives the effect of a higher X while keeping all other variables unchanged, which is hard in reality: think of increasing the number of recent immigrants while keeping the total number of immigrants unchanged.

This percentage is higher in tracts with more density, higher median house values, higher median rents, and lower median income. It falls with the total number of immigrants, but rises with the number of new (past 5 years) immigrants. It is higher still for recent Chinese immigrants, and lower for recent Indian and Filipino immigrants.

But these effects are pretty small. The mean across tracts is 7.2% of households (HH) reporting shelter costs > income, with a standard deviation of 3.78 percentage points.

The mean number of recent Chinese immigrants per tract is 80, with a standard deviation of 135. Increasing the number by 135 (a 168% increase) would raise the share of households in a census tract reporting shelter costs > income from 7.22% to 8.30% (an increase of 18%). Put another way, to increase the share of households reporting shelter costs > income by one percentage point, you would need to add 106 more recent Chinese immigrants to the tract. Half of this effect is common to all recent immigrants, so the marginal effect of their being Chinese is just half of it.
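The arithmetic behind these figures can be sketched as follows. The coefficients are copied from the Pct100up regression output. The way they are combined is an assumption on my part: one extra recent Chinese immigrant is taken to raise ImmTot, RecImm5 and RecImmChi by one each.

```python
# Coefficients from the Pct100up regression
# (percentage points per additional person in the tract).
coef_imm_tot = -0.0004567    # ImmTot
coef_rec_imm5 = 0.0051461    # RecImm5
coef_rec_chi = 0.0047709     # RecImmChi

mean_pct100up = 7.22         # mean tract share with shelter costs > income
sd_rec_chi = 135.2           # one standard deviation of RecImmChi

# Assumption: a new recent Chinese immigrant counts in all three variables.
combined_effect = coef_imm_tot + coef_rec_imm5 + coef_rec_chi

increase_pp = combined_effect * sd_rec_chi        # change in percentage points
new_level = mean_pct100up + increase_pp
relative_increase = increase_pp / mean_pct100up
people_per_point = 1 / combined_effect            # people needed for +1 point
chinese_share = coef_rec_chi / combined_effect    # part specific to China

print("combined effect per person:", round(combined_effect, 7))
print("increase for +1 SD (pp):   ", round(increase_pp, 2))
print("new level (%):             ", round(new_level, 2))
print("relative increase (%):     ", round(relative_increase * 100))
print("people per extra point:    ", round(people_per_point))
print("share specific to China:   ", round(chinese_share, 2))
```

Under that assumption the sum reproduces the figure of 106 people per percentage point, the 18% relative increase, and the "half" attributed to being Chinese. The implied level comes out near 8.5 rather than 8.30, so the exact level quoted should be treated as approximate.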

. areg Pct100up HHdensity MedVal MedRent HHMedInc ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 451

F( 10, 421) = 33.94

Prob > F = 0.0000

R-squared = 0.6048

Adj R-squared = 0.5776

Root MSE = 2.4276

———————————-

Pct100up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | .0003058 .000062 4.94 0.000 .000184 .0004276

MedVal | 3.14e-06 6.72e-07 4.67 0.000 1.82e-06 4.46e-06

MedRent | .0030059 .0005704 5.27 0.000 .0018846 .0041271

HHMedInc | -.0000774 .00001 -7.73 0.000 -.0000971 -.0000577

ImmTot | -.0004567 .0002237 -2.04 0.042 -.0008963 -.000017

RecImm5 | .0051461 .0014461 3.56 0.000 .0023036 .0079887

RecImmInd | -.0045674 .0018941 -2.41 0.016 -.0082904 -.0008443

RecImmChi | .0047709 .0021152 2.26 0.025 .0006133 .0089285

RecImmPhil | -.0070449 .0025268 -2.79 0.006 -.0120116 -.0020782

RecImmIran | .0050884 .0044768 1.14 0.256 -.0037113 .0138882

_cons | 6.10212 .7518196 8.12 0.000 4.624332 7.579907

————-+——————–

juris_id | F(19, 421) = 1.221 0.236 (20 categories)
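For readers unfamiliar with the columns in this table, here is a rough sketch of how the t statistic, p-value and 95% confidence interval relate to a coefficient and its standard error, using the RecImmChi row. It uses a normal approximation in place of the t distribution the software uses, so the last digits differ slightly from the table.

```python
from statistics import NormalDist

coef = 0.0047709       # RecImmChi coefficient from the table
std_err = 0.0021152    # its standard error

# t statistic: how many standard errors the estimate is from zero.
t_stat = coef / std_err

# Normal approximation to the critical value (about 1.96 for 95%).
z_crit = NormalDist().inv_cdf(0.975)
ci_low = coef - z_crit * std_err
ci_high = coef + z_crit * std_err

# Two-sided p-value under the same approximation.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print("t statistic:", round(t_stat, 2))
print("approx. 95% interval:", round(ci_low, 7), "to", round(ci_high, 7))
print("approx. p-value:", round(p_two_sided, 3))
```

A row counts as statistically significant at the usual 5% level when the interval excludes zero, which is the same thing as a p-value below 0.05.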

Vancouver: tract-level means and standard deviations

| Variable | Mean | Std. dev. |
|---|---|---|
| Pct100up | 7.22 | 3.78 |
| PctOwn30up | 27.62335 | 6.837 |
| PctRent30up | 40.77577 | 12.754 |
| RecImmChi | 79.8 | 135.2 |
| RecImmInd | 42.8 | 114 |
| RecImmPhil | 52.6 | 75.4 |
| HHdensity | 1993.744 | 2754.097 |
| HHMedInc | 68666.32 | 20319 |
| MedVal | 645885.4 | 318822 |
| MedRent | 1008.376 | 287.332 |

In the regression below, I add a dummy variable (=1 if the tract has a median value > 1.25M) and then interact this with the # of recent Chinese immigrants. The idea is to see whether it is the high house price / high Chinese immigration tracts that have uniquely higher rates. The main point is that there is no correlation between a higher # of recent Chinese immigrants in higher house price tracts and HH reporting shelter costs > income.
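A minimal sketch of how a dummy and an interaction term of this kind are typically built, and how the two coefficients are read together. The three tracts are hypothetical and the construction is my reading of the description, not the analyst's actual code. The two coefficients are taken from the output that follows.

```python
HIGH_VALUE_CUTOFF = 1_250_000   # median dwelling value threshold (1.25M)

# Hypothetical tracts for illustration only (not from the census file).
tracts = [
    {"MedVal": 600_000, "RecImmChi": 40},
    {"MedVal": 1_400_000, "RecImmChi": 300},
    {"MedVal": 1_300_000, "RecImmChi": 0},
]

for tract in tracts:
    # Dummy: 1 for a high-value tract, 0 otherwise.
    tract["hivalue"] = 1 if tract["MedVal"] > HIGH_VALUE_CUTOFF else 0
    # Interaction: the Chinese recent-immigrant count, but only in high-value tracts.
    tract["hival_chi"] = tract["hivalue"] * tract["RecImmChi"]

# Coefficients from the regression output that follows.
coef_rec_chi = 0.0048346
coef_hival_chi = -0.000499

# Effect of one more recent Chinese immigrant on Pct100up, by tract type.
slope_ordinary = coef_rec_chi
slope_high_value = coef_rec_chi + coef_hival_chi

print("slope in ordinary tracts:  ", slope_ordinary)
print("slope in high-value tracts:", round(slope_high_value, 7))
```

If high-value tracts with many recent Chinese immigrants were unusual, the interaction coefficient would be large relative to its standard error. Here it is tiny, so the two slopes are nearly identical.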

. areg Pct100up HHdensity MedVal MedRent HHMedInc hivalue hival_chi ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 451

F( 12, 419) = 28.15

Prob > F = 0.0000

R-squared = 0.6048

Adj R-squared = 0.5756

Root MSE = 2.4334

———————————-

Pct100up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | .0003059 .0000625 4.89 0.000 .000183 .0004288

MedVal | 3.20e-06 9.20e-07 3.48 0.001 1.39e-06 5.01e-06

MedRent | .0030105 .0005725 5.26 0.000 .0018852 .0041359

HHMedInc | -.0000777 .0000103 -7.56 0.000 -.0000979 -.0000575

hivalue | .0074929 1.300754 0.01 0.995 -2.549323 2.564309

hival_chi | -.000499 .0040146 -0.12 0.901 -.0083903 .0073923

ImmTot | -.0004627 .0002283 -2.03 0.043 -.0009114 -.0000139

RecImm5 | .0051772 .0014778 3.50 0.001 .0022723 .008082

RecImmInd | -.004567 .0019021 -2.40 0.017 -.0083058 -.0008282

RecImmChi | .0048346 .0021715 2.23 0.027 .0005663 .0091029

RecImmPhil | -.0070798 .0025449 -2.78 0.006 -.0120822 -.0020774

RecImmIran | .0050731 .0045187 1.12 0.262 -.003809 .0139552

_cons | 6.08185 .7996954 7.61 0.000 4.509935 7.653764

————-+——————–

juris_id | F(19, 419) = 1.214 0.241 (20 categories)

**REGRESSIONS ON PERCENTAGE OF OWNER HH’S PAYING > 30% OF INCOME ON SHELTER**

Here I look at owner HH paying more than 30% of income. What is striking is that this is not correlated with median tract house value. It falls with median tract income and is higher in tracts w/ more recent immigrants, but not especially Chinese recent immigrants, though it is correlated with more recent immigrants from India. The recent Chinese immigrant correlation is not present in this regression, mainly because the effect is less precisely estimated, so there is more noise in the connection.

. areg PctOwn30up HHdensity MedVal MedRent HHMedInc ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 452

F( 10, 422) = 22.72

Prob > F = 0.0000

R-squared = 0.4418

Adj R-squared = 0.4035

Root MSE = 5.1884

———————————-

PctOwn30up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | .0000429 .000132 0.33 0.745 -.0002164 .0003023

MedVal | 1.30e-06 1.43e-06 0.91 0.362 -1.50e-06 4.11e-06

MedRent | .0059343 .0011865 5.00 0.000 .0036021 .0082665

HHMedInc | -.0001558 .0000214 -7.29 0.000 -.0001978 -.0001138

ImmTot | .0002155 .0004744 0.45 0.650 -.0007169 .001148

RecImm5 | .0069896 .0030906 2.26 0.024 .0009148 .0130644

RecImmInd | .0105182 .0040467 2.60 0.010 .0025641 .0184723

RecImmChi | .0033269 .0045028 0.74 0.460 -.0055238 .0121776

RecImmPhil | -.0088441 .0054003 -1.64 0.102 -.0194589 .0017708

RecImmIran | -.0022972 .0090973 -0.25 0.801 -.0201788 .0155845

_cons | 28.38539 1.561183 18.18 0.000 25.31672 31.45405

————-+——————–

juris_id | F(19, 422) = 1.893 0.013 (20 categories)
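The point about precision can be checked by putting the RecImmChi rows from the two regressions side by side. This is only a reading aid built from the printed coefficients and standard errors; the dictionary layout is my own.

```python
# (coefficient, standard error) for RecImmChi in each regression.
estimates = {
    "Pct100up": (0.0047709, 0.0021152),
    "PctOwn30up": (0.0033269, 0.0045028),
}

# t statistic in each regression: coefficient divided by standard error.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

# How much wider the uncertainty is in the owner regression.
se_ratio = estimates["PctOwn30up"][1] / estimates["Pct100up"][1]

# 95% interval for the owner regression, as printed in the output.
owner_ci = (-0.0055238, 0.0121776)
earlier_estimate_inside = owner_ci[0] < estimates["Pct100up"][0] < owner_ci[1]

for name, t in t_stats.items():
    print(name, "t =", round(t, 2))
print("standard error ratio:", round(se_ratio, 2))
print("earlier estimate inside owner interval:", earlier_estimate_inside)
```

The standard error roughly doubles in the owner regression, and the earlier point estimate sits comfortably inside the owner interval, so the two results do not contradict each other. The owner regression simply cannot distinguish the effect from zero.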

. areg PctOwn30up HHdensity MedVal MedRent HHMedInc hivalue hival_chi ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 452

F( 12, 420) = 18.91

Prob > F = 0.0000

R-squared = 0.4426

Adj R-squared = 0.4015

Root MSE = 5.1971

———————————-

PctOwn30up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | .0000414 .0001329 0.31 0.756 -.0002199 .0003026

MedVal | 1.79e-06 1.93e-06 0.93 0.352 -1.99e-06 5.58e-06

MedRent | .0059687 .0011897 5.02 0.000 .0036302 .0083072

HHMedInc | -.0001588 .0000219 -7.26 0.000 -.0002017 -.0001158

hivalue | .4558599 2.745186 0.17 0.868 -4.940156 5.851876

hival_chi | -.0057133 .0085679 -0.67 0.505 -.0225547 .011128

ImmTot | .0001598 .0004855 0.33 0.742 -.0007946 .0011141

RecImm5 | .0072346 .0031536 2.29 0.022 .0010358 .0134333

RecImmInd | .0105596 .0040598 2.60 0.010 .0025795 .0185397

RecImmChi | .0040504 .0046194 0.88 0.381 -.0050297 .0131304

RecImmPhil | -.0091559 .0054349 -1.68 0.093 -.0198389 .0015271

RecImmIran | -.0021824 .0091348 -0.24 0.811 -.0201381 .0157733

_cons | 28.26762 1.640656 17.23 0.000 25.0427 31.49254

————-+——————–

juris_id | F(19, 420) = 1.908 0.012 (20 categories)

. areg PctOwn30up HHdensity MedRent HHMedInc hivalue hival_chi ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 452

F( 11, 421) = 20.56

Prob > F = 0.0000

R-squared = 0.4415

Adj R-squared = 0.4017

Root MSE = 5.1963

———————————-

PctOwn30up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | 1.34e-06 .0001257 0.01 0.991 -.0002458 .0002485

MedRent | .0060131 .0011885 5.06 0.000 .0036769 .0083493

HHMedInc | -.0001483 .0000187 -7.92 0.000 -.0001851 -.0001115

hivalue | 1.721937 2.384254 0.72 0.471 -2.964587 6.408461

hival_chi | -.00566 .0085664 -0.66 0.509 -.0224983 .0111782

ImmTot | .0001717 .0004853 0.35 0.724 -.0007821 .0011256

RecImm5 | .0067994 .0031182 2.18 0.030 .0006702 .0129287

RecImmInd | .0110327 .0040273 2.74 0.006 .0031166 .0189487

RecImmChi | .0048544 .0045372 1.07 0.285 -.0040641 .0137728

RecImmPhil | -.0090561 .005433 -1.67 0.096 -.0197352 .0016231

RecImmIran | -.0016543 .0091158 -0.18 0.856 -.0195724 .0162638

_cons | 28.70721 1.570989 18.27 0.000 25.61925 31.79517

————-+——————–

juris_id | F(19, 421) = 1.863 0.015 (20 categories)

**REGRESSIONS ON PERCENTAGE OF RENTER HH’S PAYING > 30% OF INCOME ON SHELTER**

Finally, the same exercise for the percentage of renter HH paying more than 30% of their income on shelter. As expected, there is a higher percentage of renter HH w/ shelter costs > 30% of income in tracts with higher median values and rents and lower median incomes. It is higher where there are more recent immigrants, but lower in tracts with more recent immigrants from India.

. areg PctRent30up HHdensity MedVal MedRent HHMedInc ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 452

F( 10, 422) = 18.45

Prob > F = 0.0000

R-squared = 0.3989

Adj R-squared = 0.3576

Root MSE = 10.1124

———————————-

PctRent30up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | .0001175 .0002572 0.46 0.648 -.000388 .0006231

MedVal | 9.53e-06 2.78e-06 3.43 0.001 4.06e-06 .000015

MedRent | .0137682 .0023125 5.95 0.000 .0092227 .0183138

HHMedInc | -.0004058 .0000416 -9.74 0.000 -.0004876 -.0003239

ImmTot | -.0008175 .0009246 -0.88 0.377 -.0026349 .0009999

RecImm5 | .0157228 .0060236 2.61 0.009 .0038827 .0275629

RecImmInd | -.0277157 .0078871 -3.51 0.000 -.0432186 -.0122128

RecImmChi | -.0115154 .0087761 -1.31 0.190 -.0287657 .0057349

RecImmPhil | -.0152668 .0105254 -1.45 0.148 -.0359555 .005422

RecImmIran | -.0260211 .017731 -1.47 0.143 -.060873 .0088309

_cons | 47.89108 3.042807 15.74 0.000 41.91013 53.87202

————-+——————–

juris_id | F(19, 422) = 1.483 0.087 (20 categories)
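Because the coefficients are per dollar, they look tiny. One rough way to compare them is to multiply each by one standard deviation of its variable, using the tract-level standard deviations reported earlier. This is a sketch of that scaling, not part of the original analysis.

```python
# Coefficients from the first PctRent30up regression.
coefficients = {
    "MedVal": 9.53e-06,
    "MedRent": 0.0137682,
    "HHMedInc": -0.0004058,
}

# Tract-level standard deviations from the summary statistics.
std_devs = {
    "MedVal": 318822.0,
    "MedRent": 287.332,
    "HHMedInc": 20319.0,
}

sd_pct_rent30up = 12.754   # standard deviation of the dependent variable

# Change in PctRent30up (percentage points) for a one-SD rise in each variable.
one_sd_effects = {
    name: coefficients[name] * std_devs[name] for name in coefficients
}

for name, effect in one_sd_effects.items():
    print(name, "one-SD effect:", round(effect, 2), "points,",
          round(effect / sd_pct_rent30up, 2), "SDs of PctRent30up")
```

On this scale median household income is the dominant correlate, at roughly eight percentage points per standard deviation, with median rent and median dwelling value each worth three to four points.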

. areg PctRent30up HHdensity MedVal MedRent HHMedInc hivalue hival_chi ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 452

F( 12, 420) = 15.35

Prob > F = 0.0000

R-squared = 0.3995

Adj R-squared = 0.3552

Root MSE = 10.1311

———————————-

PctRent30up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | .0001043 .0002591 0.40 0.688 -.000405 .0006135

MedVal | 7.85e-06 3.75e-06 2.09 0.037 4.71e-07 .0000152

MedRent | .0137653 .0023191 5.94 0.000 .0092068 .0183238

HHMedInc | -.0004007 .0000426 -9.40 0.000 -.0004844 -.0003169

hivalue | 2.444142 5.351347 0.46 0.648 -8.074617 12.9629

hival_chi | .0009968 .016702 0.06 0.952 -.031833 .0338267

ImmTot | -.0006908 .0009465 -0.73 0.466 -.0025512 .0011695

RecImm5 | .0149404 .0061474 2.43 0.016 .0028568 .0270239

RecImmInd | -.0275182 .0079141 -3.48 0.001 -.0430743 -.0119621

RecImmChi | -.0118728 .0090049 -1.32 0.188 -.029573 .0058274

RecImmPhil | -.0146099 .0105946 -1.38 0.169 -.0354349 .006215

RecImmIran | -.0254652 .0178071 -1.43 0.153 -.0604673 .0095369

_cons | 48.5118 3.198224 15.17 0.000 42.22528 54.79831

————-+——————–

juris_id | F(19, 420) = 1.354 0.146 (20 categories)

. areg PctRent30up HHdensity MedRent HHMedInc hivalue hival_chi ImmTot RecImm5 RecImmInd RecImmChi RecImmPhil RecImmIran if cmauid==933, absorb(juris_id)

Linear regression, absorbing indicators Number of obs = 452

F( 11, 421) = 16.22

Prob > F = 0.0000

R-squared = 0.3933

Adj R-squared = 0.3500

Root MSE = 10.1716

———————————-

PctRent30up | Coef. Std. Err. t P>|t| [95% Conf. Interval]

————-+——————–

HHdensity | -.000071 .0002461 -0.29 0.773 -.0005548 .0004128

MedRent | .0139598 .0023265 6.00 0.000 .0093867 .0185328

HHMedInc | -.0003548 .0000367 -9.68 0.000 -.0004269 -.0002827

hivalue | 7.987928 4.66708 1.71 0.088 -1.185753 17.16161

hival_chi | .0012302 .0167684 0.07 0.942 -.0317299 .0341903

ImmTot | -.0006385 .0009499 -0.67 0.502 -.0025057 .0012287

RecImm5 | .0130351 .0061038 2.14 0.033 .0010373 .0250328

RecImmInd | -.0254467 .0078832 -3.23 0.001 -.040942 -.0099513

RecImmChi | -.0083523 .0088814 -0.94 0.348 -.0258098 .0091052

RecImmPhil | -.0141726 .0106348 -1.33 0.183 -.0350766 .0067314

RecImmIran | -.0231531 .0178438 -1.30 0.195 -.0582271 .0119208

_cons | 50.4366 3.075147 16.40 0.000 44.39204 56.48116

————-+——————–

juris_id | F(19, 421) = 1.147 0.301 (20 categories)
