No Wonder So Many People are Depressed

K · Post by K » Sat Jul 07, 2018 2:27 pm

I think the caption and text shown on the previous page ( http://magpies.net/nick/bb/viewtopic.ph ... 43#1846543 ) give us enough clarification (sort of), David.

From the text, it seems Mr. P 'made' the plot (perhaps a student did it for him) after getting the idea from Mr. S-D's work. Mr. S-D got an econ degree out of obsession with the "trends" tool and wrote a book about it too (entitled Everybody Lies). Importantly, the source info reveals what the real search was, which of course raises the following issue...

Mugwump wrote:As I suspected, it is a very porous dataset. Instead of searching for "N***** jokes", people may nowadays search for "non-PC jokes", for "racist jokes", for "black deaths matter" jokes, or other combinations.
...

This is one reason the sketchy labelling of the curves and figure is so bad. The broader description of these curves as searches for "racist jokes", etc. really hides the issue. At the very least, he should have made an attempt to investigate different combinations, A OR B OR C... Apart from the issue of searches left out, there is also the issue of searches wrongly included. The claim is that in rap songs and African-American culture it's always spelled "n***a", which if true would eliminate that particular problem.

Here is what the "trends" tool gives for "n***** jokes". This should now match Mr. P's Fig 15-2. Does it?

HAL · Post by HAL » Sat Jul 07, 2018 2:30 pm

You can have a look at my source code. Suppose I said it does.

K · Post by K » Sun Jul 08, 2018 12:50 pm

It looks to my eye like that pic above from the "trends" website does match EN's Fig. 15-2. I don't think I like the way SAP smoothed it, but let me ignore that for now...

An obvious question is: why look for "n***** jokes" rather than just "n*****"? It's an offensive word, so both might be thought to indicate racism in the searcher. (The way the "trends" tool works, "n***** jokes" searches count as a subset of "n*****" searches, though it apparently lets one specify not to include cases including "jokes", if you want.)

Here is what search interest over time looks like for just "n*****". [As usual, log in to view (or out to avoid!).]

TBC

K · Post by K » Sun Jul 08, 2018 8:07 pm

[ctd]

The site also lets you compare the two searches, so you can see the relative frequency of searches. (The site does not give total numbers of searches, but just figures relative to the highest-search month.) Here's the pic. [Log in to view.]

It looks like "n***** jokes" searches are maybe 15% of "n*****" searches over the period 2004-present.
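For anyone wondering how a "maybe 15%" figure can be read off curves that only show relative values, here is a minimal sketch. It assumes the comparison view scales both curves against the single highest month, as described above. The monthly counts are made-up numbers purely for illustration (the site never reveals raw counts), and the name `scale_to_peak` is my own, not anything from the tool.

```python
def scale_to_peak(series_by_term):
    """Rescale every series so the highest monthly value across all terms is 100."""
    peak = max(max(values) for values in series_by_term.values())
    return {
        term: [round(100 * v / peak) for v in values]
        for term, values in series_by_term.items()
    }

# Hypothetical raw monthly counts (illustrative only; not real search data).
raw = {
    "term": [400, 500, 450, 300],
    "term jokes": [60, 80, 70, 40],
}

scaled = scale_to_peak(raw)

# Share of the narrower search relative to the broader one over the whole period.
share = sum(scaled["term jokes"]) / sum(scaled["term"])

print(scaled)                  # {'term': [80, 100, 90, 60], 'term jokes': [12, 16, 14, 8]}
print(f"share = {share:.0%}")  # share = 15%
```

Because both curves are divided by the same peak, the ratio between them survives the rescaling (up to rounding), which is why a rough percentage can be eyeballed from the comparison pic even without total numbers.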

Of course, I assume we all feel that attitudes to casual racism, etc. have changed, but do you think they've changed dramatically over such a short period as the last decade or so?

TBC

K · Post by K » Mon Jul 09, 2018 2:05 pm

[ctd]

The "n*****" relative search frequency looks quite flat, doesn't it? At least it seems so in two periods, 2004-2009(ish) and 2009(ish)-present, with the line in the second period lower than in the first.

If they're going to make grand claims about allegedly diminishing racism, one would think they should have thought about what would be a valid comparison: a reference or control or baseline search term, you could call it. One idea is to compare "n***** jokes" with simply "jokes". Since the above pic shows "n***** jokes" searches being just a fraction of "n*****" searches, I compared "n*****" searches and "jokes" searches. (This is almost as exciting as playing Fortnite... well, almost as exciting as I presume playing Fortnite must be, at least to teenage boys and AFL footballers.) Here is the result. (Perhaps look at this and the above pic together.) [Log in to view.]

Is it not suggesting something rather different from their claims?
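The point of a baseline term can be sketched in a few lines. This is only an illustration with invented figures, not the numbers behind the pic, and `ratio_to_baseline` is a hypothetical helper name of mine: if a term and its baseline fall at the same rate, the ratio between them stays flat, and the falling curve on its own tells us little.

```python
def ratio_to_baseline(term, baseline):
    """Period-by-period ratio of a term's relative interest to a baseline term's."""
    return [t / b for t, b in zip(term, baseline)]

# Hypothetical relative-interest figures (illustrative only; not real data).
term = [40, 30, 20, 10]      # looks like a steep decline when viewed alone
baseline = [80, 60, 40, 20]  # but the baseline term declined just as steeply

ratios = ratio_to_baseline(term, baseline)
print(ratios)  # [0.5, 0.5, 0.5, 0.5]
```

A flat ratio like this would mean the term only fell as fast as the baseline did, so any claim about a change specific to the term would need the ratio, not the raw curve, to move.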

TBC

K · Post by K » Tue Jul 10, 2018 3:03 pm

[ctd]

As well as comparing relative search frequencies for "n***** jokes", "jokes", and "n*****", we could compare "n*****" and "African American". Here is that pic.

Is there some sort of relevant official holiday or week near the start of each year? ... Okay, a hasty web search tells me that Martin Luther King Day is held on the third Monday of January. Does that explain the annual spike shown in the pic? [Update: looking at the search term "Martin Luther King" seems to confirm this explanation.]

TBC

K · Post by K » Wed Jul 11, 2018 8:19 pm

[ctd]

What about the book of Stephens-Davidowitz, the guy specializing in web-search data? I was disturbed that in the book he incorrectly describes (the psychologist) Kahneman as an economist; I don't think it's acceptable to get someone's profession wrong. (Perhaps it's worse to spell someone's name incorrectly, as Pinker does in his book, but then the author can always claim it was "just a typo".) Maybe even more disturbing is that Pinker wrote the foreword to S-D's book.

Anyway...

The table below is from S-D's book. It shows "the top five negative words used in searches" about blacks, Jews, etc. Isn't it interesting how racists seem to be convinced that other racial groups are racist?

I took the first line and used the "trends" tool website to make the chart below S-D's table. You'll see that the ordering seems to have changed, from "rude", "racist", "stupid", "ugly", "lazy" to "racist", "ugly", "stupid", "rude", "lazy".

The chart clearly shows that many, maybe all, of the relative search frequencies for these negative terms have increased, especially the leading search term, "racist black people".

Mugwump · Post by **Mugwump** » Wed Jul 11, 2018 9:25 pm

K · Post by K » Thu Jul 12, 2018 2:50 pm

^ Yes, there are some subtleties here. In general, I'm disturbed when authors of what is supposed to be non-fiction don't bother getting easily checkable information right. (Surely, if you're too lazy to do it yourself, you can hire some child to check the spelling of people's names, for example.) On the other hand, in the specific case of expertise and professions, perhaps one should not pigeon-hole or stereotype people.

The term S-D used was something like "Nobel Prize economist Kahneman", so in some sense it seems S-D was trying to squeeze in the fact that DK won the Nobel Prize in Economics without breaking stride. I'm quite sure Kahneman would not describe himself as an economist, though, and you'll often read things like "who won the Nobel Prize in Economics, although he is actually a psychologist". On Nobel Prizes in Economics, I'd be alarmed if John Nash were described as a "Nobel Prize economist", rather than a mathematician who won that Nobel Prize. (And one might suggest that many winners of the Nobel Peace Prize are anything but peaceful humans, but that is a separate topic.)

It's true this branch of psychology has been given the name "behavioural economics". But to describe one of its practitioners as a "behavioural economist" is rather jarring to my ear. This is a fascinating area, though the reliability and reproducibility of results are a major concern. I was sort of trying to pose a question in behavioural economics in this thread when I asked if people's attitudes to the US homicide rate over time were affected by all those different starting years in the different plots. I failed to get any responses, but never fear: no doubt we'll return to the topic of homicide some time soon, certainly before 2050.

Addendum (amusing quote):

"I will never know if my vocation as a psychologist was a result of my early exposure to interesting gossip, or whether my interest in gossip was an indication of a budding vocation." (Kahneman.)

K · Post by K » Fri Jul 13, 2018 3:59 pm

K · Post by K » Mon Jul 16, 2018 5:09 pm

K · Post by K » Fri Jul 20, 2018 4:44 pm

[CTD]

How is it possible for someone with an economics PhD to fail to get it, even when it's explained to him?
In football analogies, here's what Stephens-Davidowitz doesn't seem to realize:

Say the AFL refuses to tell us the teams' scores in their matches, but gives us the number of shots at goal of each team.

Statement 1: If team A has a greater number of shots at goal than team B, it's likely that team A has a greater score than team B.

Statement 2: The number of shots at goal team A records is roughly equal to the score of team A.

Statement 1 seems plausible; to see how good an estimate it gives, we'd have to look at data from past games. Statement 1 is analogous to Ellenberg's statements. But Statement 2 is clearly wrong; the nature of our game's scoring system makes this obvious.
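To put numbers on the difference between Statements 1 and 2, here is a small sketch. The match figures are invented, and I'm making the simplifying assumption that every shot at goal registers as either a goal (six points) or a behind (one point).

```python
GOAL_POINTS = 6    # a goal is worth six points
BEHIND_POINTS = 1  # a behind is worth one point

def score(goals, behinds):
    """Total score from goals and behinds."""
    return GOAL_POINTS * goals + BEHIND_POINTS * behinds

def shots(goals, behinds):
    """Shots at goal, assuming every shot ends up as a goal or a behind."""
    return goals + behinds

# Hypothetical match (illustrative numbers only): (goals, behinds) per team.
team_a = (12, 13)
team_b = (10, 10)

# Statement 1: more shots goes with a higher score (true in this example).
print(shots(*team_a) > shots(*team_b), score(*team_a) > score(*team_b))  # True True

# Statement 2: shots roughly equal the score (nowhere near true).
print(shots(*team_a), score(*team_a))  # 25 85
```

The ordering of the teams carries over from shots to scores here, but the shot count itself is nowhere near the score, which is the gap between the two statements.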

Okay, now stay with AFL, but switch to a different analogy. Say we're interested in interchange rotations, percentage of time on-field, etc.

Statement 2b: On average, each footballer plays 81.81...% of the game time (the remainder spent on the bench).

Statement 3: On average, 81.81...% of the team play the entire game.

Statement 2b seems reasonable; in fact, it must be true (unless we cheat or have a deluge of injuries). Statement 3 immediately sounds wrong; in fact, it can never be true (unless the bench is wiped out at the start and there are no rotations).

Stephens-Davidowitz not only falsely assumes Statement 1 is equivalent to Statement 2, but also falsely assumes Statement 2b is equivalent to Statement 3.
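The interchange analogy can be checked with a quick sketch. I'm assuming 22 players per team with 18 on the field at any time (which is where 81.81...% = 18/22 comes from), and the rotation schedule below is a hypothetical one invented just to make the point.

```python
from fractions import Fraction

SQUAD = 22                 # assumed players per team on match day
ON_FIELD = 18              # assumed players on the field at any time
BENCH = SQUAD - ON_FIELD   # four on the bench

# Hypothetical rotation: split the game into 22 equal slots; in slot k,
# players k, k+1, k+2, k+3 (mod 22) sit on the bench.
slots_played = [0] * SQUAD
for slot in range(SQUAD):
    benched = {(slot + i) % SQUAD for i in range(BENCH)}
    for player in range(SQUAD):
        if player not in benched:
            slots_played[player] += 1

# Statement 2b: average share of game time per player.
average_time = Fraction(sum(slots_played), SQUAD * SQUAD)

# Statement 3: share of players who play the entire game.
full_game_share = Fraction(sum(1 for s in slots_played if s == SQUAD), SQUAD)

print(average_time, float(average_time))  # 9/11, i.e. 0.8181...
print(full_game_share)                    # 0
```

The average comes out at 18/22 however the rotations are arranged, because the total on-field time is fixed at 18 players' worth of game. The share of players who go the full distance depends entirely on the rotation pattern, and in this schedule it is zero.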

K · Post by K » Tue Jul 24, 2018 4:31 am

If we attempt to be Kahneman, we might guess that SSD's mistakes were partly due to his intuition that the word "rough(ly)" would save everything. That might be the case if the problem were merely one of imprecision, but since the statement is not even approximately correct, no number of words like "rough", "estimate", etc. sprinkled in the text is going to help.

The one problem SSD saw ("the best quotes may be at the beginning of the book") may also have acted like a decoy, making it difficult for SSD to see the other problems.

How should we view all this? I guess it really means readers should be (even more) wary of the claims in his book. It's like we witnessed a spontaneous, accidental test, which he failed badly.

K · Post by K » Sun Jul 29, 2018 1:22 am

Mugwump wrote:...
His [Kahneman's] book "Thinking, Fast and Slow" is a masterpiece, combining a wealth of well-evidenced insightful material with utterly lucid exposition of complicated concepts. He shows very clearly that reason is a far more fragile instrument than we suppose, even in highly intelligent people.

The thing I like about Kahneman is his self-doubt, his self-criticism. His relationship with his collaborator Tversky, a very different personality, is also fascinating; it seems like it was in effect a platonic love affair, complicated by professional stresses and their personality differences. The manner of its ending reads like a tragedy. Have you read Lewis's book about this (The Undoing Project)? I haven't yet.

In general, I'm suspicious of Lewis's books, because he doesn't worry about letting the facts get in the way of a good story. That's definitely the case with those books that have been made into movies. Maybe he should just be a pulp-fiction writer. Or is it still a big leap for someone who can produce page-turners loosely based on truth to write completely fictional tales? But in this case, it seems that by accident he had some sort of connection with Tversky's family, so he had access to inside information that others would not have.

At this rate, we should perhaps start a VPT book-club thread.

The problem is that such a thing would surely be best done as two threads: one for great books and one for terrible books (just as there are separate "What made you happy?" and "What made you sad?" threads). But then, VPTers might violently disagree about which thread a book belongs in, or not know before the discussion begins...

This is an unofficial Bulletin Board - owned and run by its users. We welcome all fans of the Mighty Collingwood Football Club.
