The randomness of iTunes

In 1998, a rather awkward 25-year-old male walked into a CD store (this was in the day when music was sold on CDs, in stores, to 25-year-olds) and purchased Whitey Ford Sings the Blues by Everlast.  Here’s what the indubitable Wikipedia has to say about said album and artist:

Whitey Ford Sings the Blues was both a commercial and critical success (selling more than 3 million copies).  It was hailed for its blend of rap with acoustic and electric guitars, developed by Everlast together with producers Dante Ross and John Gamble (aka SD50).  The album’s genre-crossing lead single “What It’s Like” proved to be his most popular and successful song, although the follow up single, “Ends”, also reached the rock top 10.

Several years later Apple launched iTunes, which also proved to be a commercial and critical success, and the awkward male promptly loaded Whitey Ford Sings the Blues into the song library.  iTunes seemed to take a particular shine to this album, apparently favouring it with many more frequent plays, when iTunes was set to “shuffle”, than any of the other 100 or more albums in the collection.  At least that’s how it appeared to the awkward male, who seemed to notice it come up much more often than expected.

In a strange twist of fate I also just happen to have Whitey Ford Sings the Blues in my iTunes collection.  In another strange coincidence, just like that awkward male from a decade ago, I’ve noticed that iTunes tends to favour it over other albums in the song list when iTunes is set to shuffle.

Life is certainly full of strange coincidences, but does iTunes really favour certain songs/ artists/ albums over others?  Let’s test it scientifically…

I set iTunes to shuffle and counted the number of tracks I had to skip before I hit Whitey Ford Sings the Blues.  The results are below:

32, 65, 181, 67, 77, 152, 50, 46, 230, 64

In other words, Whitey Ford Sings the Blues played randomly 10 times in 964 attempts (i.e. 1.037% of the sample).  I have 119 albums in iTunes, so theoretically I should be hearing it 1/119=0.840% of the time.  So the sample is a little bit higher than expected, but statistically significantly higher?

This question can be answered using the probability mass function of the Binomial Distribution.  The probability of exactly 10 “successes” out of 964 “attempts”, given that the probability of a success is 1/119 is, using the very fine SpeedCrunch calculator:

binompmf(10; 964; 1/119) = 0.102 (i.e. 10.2%)

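This is well above the standard p=0.05 (5%) significance level.  (Strictly speaking, the p-value for “more often than expected” is the upper-tail probability of 10 or more successes rather than the probability of exactly 10, but the tail can only be larger than 0.102, so the conclusion stands.)  I have to conclude that there’s no evidence that Whitey Ford Sings the Blues plays any more or less frequently than any other album in my iTunes collection when the playlist is set to shuffle.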
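For anyone without SpeedCrunch to hand, the same arithmetic can be sketched in a few lines of Python.  This is only a rough check using the standard library: the function names are my own, and it assumes (as the post does) that every play is an independent draw with a 1-in-119 chance of landing on the album.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent attempts."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_upper_tail(k, n, p):
    """Probability of k or more successes in n independent attempts."""
    return 1.0 - sum(binom_pmf(i, n, p) for i in range(k))

# Tracks skipped before each appearance of the album (from the post).
skips = [32, 65, 181, 67, 77, 152, 50, 46, 230, 64]

attempts = sum(skips)   # 964 attempts in total
hits = len(skips)       # 10 appearances of the album
p_album = 1 / 119       # assumes each of the 119 albums is equally likely

pmf = binom_pmf(hits, attempts, p_album)
tail = binom_upper_tail(hits, attempts, p_album)

print(f"attempts = {attempts}, hits = {hits}")
print(f"P(exactly {hits}) = {pmf:.3f}")
print(f"P({hits} or more) = {tail:.3f}")
```

The first probability is the figure quoted from SpeedCrunch; the second is the one-sided tail, which is the quantity usually compared against the 5% level.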

Humans are very bad at gauging randomness.  Or rather, probably like most predators, we’re very good at detecting patterns, and tend to see patterns when they’re not really there.  Luckily we have statistics to sort it all out for us.

And Whitey Ford Sings the Blues is still an awesome album.

——

Once, back in primary school…

… my friends and I found ourselves in a blazing row over whether professional wrestling was real or fake.  The adherents insisted that the combat was actual (it looks convincing), whereas the sceptics (myself included) were adamant that it was all theatre (how could the wrestlers possibly survive the impacts?).  Of course, being boys, rather than calmly debate the relative merits, or seek an objective reference, we chose a Lord of the Flies decision-making stratagem of tribal violence and intimidation.  We debated the point using the time-honoured and gentlemanly tradition of casting the most slanderous aspersions on the probity of each other’s mother, and performing professional wrestling moves on each other, until a side relented.  Ultimately, the adherents secured victory through sheer force of numbers.  Consensus via mobocracy.

It was a defining moment in my personal development.  I was made aware that other people see the world differently to me; that people will cling grudgingly to an irrational belief, no matter how much logic you throw at them.  In fact, some people choose to define themselves as a person by that belief, and can get quite angry, even violent, when that belief is challenged.

It got worse.  As I grew older I was shocked to realise my world was awash with these irrational beliefs and dominated by people enslaved to them.  And I’m not just talking about religion.  Everything from, “I’ve got a sure-fire gambling strategy…” to, “Oh, I won’t vaccinate my child because they do more harm than good…” kind of thinking.  I developed a kind of Cole Sear Sixth Sense… I began to see Stupid People everywhere.  Or rather, smart people who chose to be ignorant.

It sometimes saddens me, although perhaps I do understand it.  An irrational belief in something for which there is no evidence, such as a loving, protective God, or the infallibility of television entertainment, could be a necessary coping mechanism.  A coping mechanism we need to give us hope, and keep us going, in the face of a cold universe utterly indifferent to our existence.  But it’s a child-like faith in the magic benevolence of Santa Claus.  An illusion.  Can the human race ever truly be free as long as we keep subjugating ourselves to such silly notions?

——

3G mobile data speeds in Sydney on the iPhone

If you’re interested in mobile Internet and the iPhone (and, really, who isn’t?), the Byteside ByteBlog has posted the results of its “Australian iPhone data test”.  The researchers measured 3G download speeds, upload speeds, and ping times, using the iPhone’s Speedtest.net application connected to four major mobile ISPs around Sydney.

We had concurrent access to four iPhone 3GS handsets, one on each of the four Australian networks — Telstra, Optus, Vodafone, and 3.  Travelling around the Sydney CBD and Sydney suburban areas, we ran close to 150 individual speed tests.  Tests ranged from Manly to Homebush, Annandale to North Sydney, and plenty in between.

Byteside have kindly made the raw data from their study available for download.  To analyse the results in more detail, I thought it would be an interesting exercise to try out the Instat statistical analysis package.  Instat can be downloaded for free (for non-commercial use) from the Statistical Services Centre.

Importing Byteside’s raw data (available as an Excel file) into Instat was fairly straightforward.  It just needed a bit of cleaning up.  To start proceedings, Instat’s Summary Tables was used to produce some descriptive statistics of download speeds.

Table 1: Statistical summary of download speeds by ISP

Mobile ISP  obs (no.)  min (kbps)  mean (kbps)  median (kbps)  max (kbps)  st.dev (kbps)
Optus             110           0         1637           1903        3654           1080
Telstra           149           0         2681           2416        6151           1554
Three             149           0          625            525        2290            471
Vodafone          151           0         1283           1125        3349            961
TOTAL             559           0         1550           1271        6151           1329

Telstra looks the clear winner in terms of average and maximum download speeds, followed by Optus, Vodafone and Three.  However, I don’t think a summary analysis can tell the whole story.  I wanted to have a look at the plots of the distributions and run a few tests.  Instat obliges with the first request by providing quite a comprehensive graphing capability.

Instat’s boxplot feature seems to cover all the basics (there’s also scope for plenty of customisation).  Below is a comparison of download speeds between the four carriers.

Figure 1: Boxplot of mobile download speeds by ISP

3G_download_speeds_boxplot

Now let’s have a look at the histograms…

Figure 2: Histogram of mobile download speeds by ISP

Fig. 2a: Optus
optus_download_hist
Fig. 2b: Telstra
telstra_download_hist
Fig. 2c: Three
three_download_hist
Fig. 2d: Vodafone
vodafone_download_hist

Telstra really does own the opposition, but not consistently so.  In fact the most frequent download speed for “The Big T” was less than 2Mbps.  All four carriers recorded their fair share of quite cruddy speeds, and all four recorded at least one instance where a download attempt didn’t work at all.  The reliability of 3G mobile broadband in this country (or Sydney, at least) has some way to go, apparently.

I also ran a couple of statistical tests.  As the data aren’t normally distributed, and the download speed variances aren’t equal across ISPs, I chose the non-parametric tests that Instat offers.  Firstly, to ask a somewhat redundant question, does network make a difference to download speed?

Kruskal-Wallis Test

Sample   n   Median   Ave rank    z
1    110  1902.50    305.0      1.81
2    149  2416.00    401.2     10.70
3    149   525.00    163.2    -10.30
4    151  1125.00    257.4     -2.01

H = 167.42 (adjusted for ties) with 3  d.f
Probability > 167.42  = 0

When you ask an obvious question, expect an obvious answer, I suppose.  (Samples 1 to 4 are Optus, Telstra, Three and Vodafone respectively, as the counts and medians show.)  We can comprehensively reject the null hypothesis that all the medians are equal.  Yes, download speeds depend on the network that you’re on.
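As a sanity check on the Instat output, the H statistic can be rebuilt from nothing more than the sample sizes and average ranks printed above.  The sketch below is my own (function names included) and uses the textbook formula without the tie correction, so it should land very close to, but not exactly on, the adjusted figure.  The p-value uses the closed-form chi-square tail for 3 degrees of freedom.

```python
from math import erfc, exp, pi, sqrt

def kruskal_wallis_h(sizes, mean_ranks):
    """Kruskal-Wallis H from group sizes and average ranks (no tie correction)."""
    n_total = sum(sizes)
    grand_mean_rank = (n_total + 1) / 2
    spread = sum(n * (r - grand_mean_rank) ** 2
                 for n, r in zip(sizes, mean_ranks))
    return 12.0 / (n_total * (n_total + 1)) * spread

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 d.f."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

# Values copied from the Kruskal-Wallis output (Optus, Telstra, Three, Vodafone).
sizes = [110, 149, 149, 151]
mean_ranks = [305.0, 401.2, 163.2, 257.4]

h_stat = kruskal_wallis_h(sizes, mean_ranks)
p_value = chi2_sf_3df(h_stat)

print(f"H = {h_stat:.2f}")
print(f"p = {p_value:.3g}")
```

The p-value is not literally zero, just far too small to show at the precision Instat prints.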

Looking at the histograms above, the Optus and Vodafone distributions share some similarities.  Is there a statistically significant difference between these two carriers?  Again, the answer looks fairly obvious before the test is even run, but what the hell…

Two-sample test for independent data

Sample   n   Median   Rank sum
optus_d 110  1902.50  16047.0
voda_d 151  1125.00  18144.0

Mann-Whitney U = 9942.00    U’ = 6668.00
Wilcoxon T     = 16047.00

On HO: Mean (for U) = 8305.00
Mean (for T) = 14410.00
s.d. = 602.09  (adjusted for tied ranks)

Hence z = 2.72      Significance level is 0.33% for one-sided test
Significance level is 0.66% for two-sided test

0.33% (p=0.0033, one-sided) is highly statistically significant.  So we must conclude that the Optus mobile data network does indeed provide faster download speeds on the iPhone than Vodafone in the Sydney areas covered in Byteside’s study.  That’s when the download does actually work, of course.
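The numbers in the Mann-Whitney output hang together nicely, and it is easy enough to retrace them.  The sketch below (my own helper names, standard library only) rebuilds U from the Optus rank sum and then the z score and significance levels.  The tie-adjusted standard deviation of 602.09 is taken straight from the Instat output rather than recomputed, since that needs the raw data.

```python
from math import erfc, sqrt

def mann_whitney_from_rank_sum(n1, n2, rank_sum_1, sd_u):
    """Return (U, mean of U under H0, z) for sample 1, given its rank sum."""
    u = rank_sum_1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    z = (u - mean_u) / sd_u
    return u, mean_u, z

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

# Values copied from the Instat output: Optus is sample 1, Vodafone sample 2.
u, mean_u, z = mann_whitney_from_rank_sum(
    n1=110, n2=151, rank_sum_1=16047.0, sd_u=602.09
)
p_one_sided = normal_upper_tail(z)
p_two_sided = 2 * p_one_sided

print(f"U = {u:.0f}, mean under H0 = {mean_u:.0f}, z = {z:.2f}")
print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
```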

So this post is really more a brief review of the Instat package than mobile network download speeds in Sydney.  I really enjoyed using it for the small amount of statistical analysis presented in this post.  Instat was capable and quite fun to drive.  Although Instat doesn’t have the polish or bells and whistles of a package like SPSS, it does have an intuitive GUI; a useful tool-set; and doesn’t come with the steep learning curve of R.  Free for non-commercial use, the price is right too.

Finally, in a conclusion in keeping with the overall theme of this post, Telstra is the obvious choice for mobile data carrier in terms of download speeds and reliability.  But you’ll certainly pay for the privilege.  Optus and Vodafone both represent good value alternatives.  Based on the results collected by Byteside, it’s hard to recommend Three.  It is also of particular concern that all four networks recorded nil or negligible speeds for a significant proportion of download attempts in the survey.

——

Benford’s Law and Census data

Would you like to conduct a simple statistical experiment?

Grab a list of measurements from a real-life source of data, such as heights of buildings, or lengths of rivers.  Now look at just the first digit of each of the measurements and record the frequency of the numbers 1 through to 9.  Intuitively, you’d expect that each leading digit occurs roughly 11% of the time (i.e. 1 chance in 9).  However, it is an interesting observation that the leading digit from such sources is often not uniformly distributed.  Surprisingly, a first digit of 1 tends to appear with a probability of about 30%, a leading digit of 2 tends to occur about 18% of the time, a 3 about 13%, and so on in a logarithmically decreasing pattern, with a leading digit of 9 often being observed less than 5% of the time.

Welcome to the bizarre world* of Benford’s Law [*not actually a bizarre world].

This phenomenon was first noted by the astronomer and mathematician Simon Newcomb in 1881.  The physicist Frank Benford re-stated the observation in 1938 and, in an odd twist of fate, it is after him that the law is named.  It is referred to as a “Law” but of course it won’t apply to all kinds of real-world lists of numbers.  Lottery results (they’re entirely random) or counts of fingers (since we’re talking about digits), by way of example, should be, I hope, somewhat more uniformly distributed.
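The percentages quoted earlier come from the usual statement of the law, which gives the probability of a leading digit d as log10(1 + 1/d).  A minimal sketch in Python (the function name is mine):

```python
from math import log10

def benford_probability(digit):
    """Probability that the leading digit equals `digit` under Benford's Law."""
    return log10(1 + 1 / digit)

benford = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"{d}: {100 * p:.1f}%")
```

The nine probabilities add up to 1 because the sum telescopes to log10(10).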

I thought it would be a fun exercise to test Benford’s Law using population data available from the Australian Bureau of Statistics’ website.  I downloaded the latest Census population counts across the 129 Statistical Local Areas (SLAs) in South Australia, and tallied the frequency of the leading digits.  The results of observed versus expected frequencies are summarised in Table 1 below.  For example, there were 3 SLAs that recorded a population with a leading digit of 9 (they were 934, 9205, and 9015).  If Benford’s Law holds there should be 6 such SLAs.  Note that one SLA had a population of zero, so was not included.

Table 1: Frequency of leading digit – observed vs. expected under Benford’s Law

Leading digit   No. times observed   No. times expected*
1               47 (36.7%)           39 (30.1%)
2               36 (28.1%)           23 (17.6%)
3               11 (8.6%)            16 (12.5%)
4               10 (7.8%)            12 (9.7%)
5               1 (0.8%)             10 (7.9%)
6               6 (4.7%)             9 (6.7%)
7               7 (5.5%)             7 (5.8%)
8               7 (5.5%)             7 (5.1%)
9               3 (2.3%)             6 (4.6%)
Total SLAs      128 (100%)           128 (100%)

*Expected numbers are rounded to nearest integer based on exact percentages predicted under Benford’s Law.

There are certainly some similarities between the two columns.  The observed leading digits 1 through to 9 of the SLA population counts do decrease in a logarithmic pattern, as Benford’s Law predicts.  However, the observed vs. expected frequencies by row are only really roughly comparable.  The observed frequencies appear to be a little bit more weighted towards the digits 1 (47 vs. 39) and 2 (36 vs. 23).  Also, the leading digit of 5 is a bit of a standout, with only one observed compared to ten expected.

Did the Census counts conform to Benford’s Law?  I wasn’t convinced, and decided to dig deeper with an appropriate statistical test.  Under the null hypothesis, both the observed and expected counts come from the same distribution (i.e. Benford-type logarithmic).  Under the alternative hypothesis (two-sided) they are different distributions.

Firing up the trusty statistical analysis package “R”, I entered the above matrix and ran Fisher’s Exact Test:

> census <- matrix(c(47,36,11,10,1,6,7,7,3,39,23,16,12,10,9,7,7,6), nrow=9)  # column 1 = observed, column 2 = expected

> fisher.test(census, workspace=2000000)

yielding the following output:

Fisher’s Exact Test for Count Data

data:  census
p-value = 0.0791
alternative hypothesis: two.sided

p=0.0791 is, I think, a bit on the low side, but technically not statistically significant at the standard p=0.05 level.  There is therefore not enough evidence to reject the null hypothesis in favour of the alternative.  On this test, at least, I had to conclude that the observed frequencies of Census populations by SLA in SA are consistent with a Benford-type logarithmic distribution (which is not quite the same thing as proving that they follow one).  Benford’s Law analysis is often used for fraud detection (for example, insurance claims and even the recent Iranian Presidential election), so it is a relief to know that the Australian Bureau of Statistics isn’t just making their data up at random!
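One caveat worth flagging: the test above treats the expected column as if it were a second observed sample, when it is really a set of theoretical counts.  The more conventional way to compare observed counts with a theoretical distribution is a chi-square goodness-of-fit test.  Below is a rough sketch of that alternative in Python.  It is not the method used in this post, the function names are my own, and it uses the unrounded expected counts (128 times each Benford probability) with 8 degrees of freedom.

```python
from math import exp, log10

# Observed leading-digit counts for the 128 SLAs (from Table 1).
observed = {1: 47, 2: 36, 3: 11, 4: 10, 5: 1, 6: 6, 7: 7, 8: 7, 9: 3}

def benford_expected(total):
    """Unrounded expected counts under Benford's Law for `total` observations."""
    return {d: total * log10(1 + 1 / d) for d in range(1, 10)}

def chi_square_statistic(obs, expected_counts):
    """Pearson's chi-square: sum of (O - E)^2 / E over all digits."""
    return sum((obs[d] - expected_counts[d]) ** 2 / expected_counts[d]
               for d in obs)

def chi2_sf_8df(x):
    """Upper-tail probability of a chi-square variable with 8 d.f. (closed form)."""
    half = x / 2
    return exp(-half) * (1 + half + half**2 / 2 + half**3 / 6)

expected = benford_expected(sum(observed.values()))
chi_sq = chi_square_statistic(observed, expected)
p_value = chi2_sf_8df(chi_sq)

print(f"chi-square = {chi_sq:.2f} on 8 d.f., p = {p_value:.4f}")
```

On these counts the statistic comes out at roughly 22.4, above the 5% cutoff of 15.51 for 8 degrees of freedom, so this comparison is rather less forgiving than the Fisher’s result.  Most of the gap comes from the surplus of 2s and the shortage of 5s, so the fit to Benford’s Law looks shakier than the test above suggests.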

——

I’m number one, baby, so why try harder?

The other day Whirlpool Discussion Forums member “billabong” posted that ISP aaNet is running a promotion using the flyer pictured below.  Readers will see that aaNet have sourced two customer satisfaction surveys (Whirlpool Australian Broadband Survey 2008 and Australian PC Authority Best ISP Award 2008), making it clear that aaNet finished number “1” in those polls.

Or did they?

As billabong points out, on the question of “Would you recommend your ISP to other people?” in the Whirlpool survey, aaNet actually finished 7th overall with 88.4% “Yes”.  In the “Value for Money” category of the PC Authority survey, aaNet also finished 7th with 4 out of 6 stars.  And they dropped to only 3 out of 6 stars in the “Overall” class.

So not quite “Number 1”.

I don’t actually have a real problem with aaNet cherry picking survey results, putting themselves in the best possible light for marketing purposes.  It’s just par for the course in advertising.  I expect all companies will do it to some degree.  Consumers should always be wary of the various shenanigans that go on when it comes to marketing departments and data.  That said, there can still be a certain elegance about it.  The way aaNet have crudely slapped a blue “Number 1” ribbon over results that they actually finished 7th in leaves a sour taste in my mouth.

Poor form, aaNet.  Poor form.

aaNetFlyer

——

Statistics experts label ISP filtering trials unscientific

Earlier this year I added my $AUD0.02 to the debate around the Australian government’s ill-conceived, and, in fact, ludicrous plan to compulsorily censor the internet (under the Orwellian moniker of Cleanfeed).  My arguments against it were more ethical/ philosophical/ common sense, objecting that Cleanfeed:

  • was not needed
  • was not wanted
  • will not work
  • has no mandate
  • will be too expensive
  • will break things
  • will not scale
  • was not transparent
  • was vulnerable to scope creep

All pretty sound arguments if you ask me* [*nobody asked me].  Enough to drop the project in its embryonic stages you would have thought.  But no, Cleanfeed trials were launched and now maintain an unstable orbit around the planet Stupid.  Viz this recent article in ARN:

Statistics experts label ISP filtering trials unscientific: Trials for mandatory filtering would never be accepted in an academic statistics journal

The Federal Government’s ISP filter trials lack proper methodology and are not representative, according to experts in statistics and testing from two of Australia’s leading universities.

The criticisms come after two of the nine ISPs participating revealed only 15 of their customers, which in one case was 1 per cent of the total, chose to have their Internet filtered.

The vast majority of ISPs also used an opt-in system that requires users wanting to be filtered to request it.

“I would not have confidence in any of the results they find because of the way the sample has been constructed,” expert in statistics and senior lecturer at the Queensland University of Technology, Dr Daniel Johnson, said.

So not only does Cleanfeed fail on ethical grounds, it now fails on hard scientific grounds.  As one of my statistical mentors, the legendary Professor Robert (Bob) F. Ling says, “When you find yourself in a hole, stop digging.”

——

The “V-shaped recovery” continues

…or is it more like a bullet ricocheting off the floor, about to hit us in the cojones?

Sixteen days late, but tonight I finally got around to updating my “Stock Market Seismometer” (see separate tab above) for the period ending July 2009.  It was a relief to see the control chart has moved out of “extremely over-sold” territory and into the “highly over-sold” range.  I had to go to all the trouble of changing the font colour from red to orange!  The index is now back to where it was in October last year, and continues its inexorable march upwards.

NOT FINANCIAL ADVICE.  For academic interest only.

——

EasyData: It’s data, and it’s easy

I wanted to share with you a terrific online data enquiry tool recently posted by my State government called EasyData.  EasyData is part of a wider business portal, but its particular focus is to present the latest regional South Australian economic, social and environmental indicators down to the Local Government (i.e. council) area.  Data sources include the Australian Bureau of Statistics and other agencies from all three levels of government.

I find the EasyData interface intuitive and easy to navigate, with plenty of relevant, useful and interesting information to explore.  And it looks great.  The functionality is also there for information analysts like me who might want to steal, er, export the data into our own reports, presentations and spreadsheets.

EasyData is certainly one of the better, if not the best, online regional profiles products that I’ve come across, and I’ve seen a few.  All the more remarkable given that, I understand, just a few key staff developed the whole thing in only a few months.  It’s a credit to those that put it all together (not me!).  Great effort.

Check out EasyData here: www.SouthAustralia.biz/EasyData

——

Can I get a Little MORE support around here?

In November last year, I blogged about the phone queue reporting and graphing page beta-released by my ISP, Internode.  The aim was to use the data presented on that page, with some basic queuing theory (Little’s Law), to determine the size of their helpdesk.  I theorised that a rough estimate for how many Internode support staff are on duty at any particular point in time could be given by:

Calls in Queue x 12.5 / Wait Time

Looking at the hourly averages, I concluded that, on the Saturday of my analysis, Internode helpdesk had 8 or so people on hand to assist with customers’ technical problems.  I have been informed that my estimate for that period was surprisingly accurate.
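For the curious, the estimate is simple enough to script.  In the sketch below I read the 12.5 as an average call handling time in minutes, which is my assumption here; only the constant itself is given above.  Under that reading, calls in queue divided by wait time gives the arrival rate (Little’s Law), and multiplying by handling time gives the number of staff kept busy.  The function names are my own.

```python
def parse_mins_secs(text):
    """Convert a 'mm:ss' string into minutes as a float."""
    mins, secs = text.split(":")
    return int(mins) + int(secs) / 60

def estimate_staff(calls_queued, wait_time, handling_mins=12.5):
    """Estimate staff on duty as calls in queue x 12.5 / wait time (minutes).

    handling_mins=12.5 is assumed to be the average call handling time.
    Returns None when the average queue length is zero (not enough data).
    """
    if calls_queued == 0:
        return None
    arrival_rate = calls_queued / parse_mins_secs(wait_time)  # calls per minute
    return arrival_rate * handling_mins

# Example: an average of 0.2 calls queued with an average wait of 30 seconds.
print(round(estimate_staff(0.2, "00:30")))
```

An hour with an average queue of 0.0 gives nothing to divide, so the function returns None rather than an estimate of zero staff.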

The graphs and hourly averages data were taken offline for a little bit, but they’ve recently been reinstated.  I thought it would be timely and interesting to have another look and see what’s changed over the intervening months.  Last Saturday evening I went through and analysed the hourly averages covering the time period from 8pm Friday (17 July 2009) to 8pm Saturday (18 July 2009).  Note that Internode’s residential technical support helpdesk is staffed from 7am to midnight, 7 days a week.  I then applied the same methodology from Can I get a Little support around here to estimate the number of support staff on duty (last column).

Table 1: Internode helpdesk phone queue – hourly averages

Time period         Avg. wait time (mins:secs)   Avg. calls queued (no.)   Support staff on duty (est.)
Friday, 8pm-9pm     00:21                        0.1                       4
9pm-10pm            00:22                        0.1                       3
10pm-11pm           00:21                        0.0                       not enough data
11pm-midnight       00:22                        0.0                       not enough data
Saturday, 7am-8am   00:22                        0.1                       3
8am-9am             00:30                        0.2                       5
9am-10am            00:22                        0.2                       7
10am-11am           00:26                        0.3                       9
11am-noon           00:22                        0.2                       7
noon-1pm            00:22                        0.2                       7
1pm-2pm             00:23                        0.2                       7
2pm-3pm             00:23                        0.2                       7
3pm-4pm             00:58                        0.7                       9
4pm-5pm             00:22                        0.2                       7
5pm-6pm             00:23                        0.2                       7
6pm-7pm             00:22                        0.1                       3
7pm-8pm             00:21                        0.1                       4

Looking through my small window of analysis, it appears that Internode have largely resolved any problems they were experiencing late last year/early this year in terms of extraordinarily long wait times.  Time spent in the phone queue has collapsed from around 10 minutes to less than 30 seconds.  However, this dramatic improvement doesn’t appear to be due to any significant increase in staff numbers.

——

Clint Doesn’t Like Surveys

I just wanted to share this, courtesy of a very talented cartoonist by the name of Jack Faulkner, whom I met on Twitter:

clint_doesnt_like_surveys

I’m nowhere near as awesome as Clint Eastwood of course, but I DO happen to like surveys.  As long as they’re scientifically rigorous.

Check out Jack’s gallery at twitpic.com/photos/jackfaulkner [some language/content occasionally a little bit NSFW]

——