Will the 43rd parliament of Australia survive the next two years?

Good question. I mean, what are the odds?

Australia’s 43rd parliament is a precarious one. Last year’s federal election resulted in a hung parliament in the House of Representatives, with a minority Labor government clinging to power with the help of the Greens and a few Independents. Now I don’t want to sound morbid but, Heaven forbid, should any one of the Honourable Members die, the resulting by-election could conceivably bring down the government. So finely balanced are the numbers.

So, I wonder, what IS the probability that this will happen before the next election, due in 2013?

Statisticians who devote themselves to thinking about exactly this sort of question are called Actuaries, and when calculating risk of mortality they reference a thing called a Life Table. Life tables list the probabilities that a person aged x will die before reaching, or survive to, age x+1. For example, looking at the most recent Life Tables for Australia, the probability that a female aged 40 will die before turning 41 is 0.00078 (i.e. 0.078%). Or looking at it in a more positive light, the chance a 40 year old female will live to see her 41st birthday is 1-0.00078, or a healthy 99.922%.

So turning our attention back to the 150-member House of Representatives, what is the probability that one or more of them might shuffle off this mortal coil in the next two years, resulting in a by-election and/or a change of government?

To be rigorous I should aggregate individual probabilities of survival based on each member’s age and gender. But I can’t be arsed, so I’ll talk about it in terms of generalities.

If I recall correctly, the average age of a politician is 51 years. Despite this being a modern society in the year 2011, our parliament is still a massive sausage-fest. Men in government significantly outnumber women. So I’ll just be lazy and use the qx column in the Australian Life Tables linked above for males aged 51 and 52.

From the Life Table, the probability that a male aged 51 years will survive to his 53rd birthday is (1-0.00332)*(1-0.00359)=0.993. Therefore the chance that every member of the 150 seat parliament survives for the next two years can be approximated at 0.993^150=0.349 (i.e. 34.9%). The likelihood that one or more parliamentarians die within the next two years is the complement of all surviving, calculated as 1-0.349=0.651 (65.1%).
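The arithmetic above is easy to check in a few lines of Python. This is only a sketch of the same back-of-envelope estimate: it assumes every one of the 150 members is a 51 year old male and that deaths are independent. The variable names are mine, not anything from the Life Tables.

```python
# Mortality rates (qx) for males aged 51 and 52, as quoted in the post.
Q_MALE_51 = 0.00332
Q_MALE_52 = 0.00359
MEMBERS = 150  # seats in the House of Representatives

# Probability one member survives both years.
p_survive_two_years = (1 - Q_MALE_51) * (1 - Q_MALE_52)

# The post rounds to 0.993 before raising to the 150th power.
p_all_survive = round(p_survive_two_years, 3) ** MEMBERS
p_at_least_one_death = 1 - p_all_survive

# Same calculation without the intermediate rounding.
p_at_least_one_death_unrounded = 1 - p_survive_two_years ** MEMBERS

print(f"All survive (rounded inputs): {p_all_survive:.3f}")
print(f"At least one death (rounded inputs): {p_at_least_one_death:.3f}")
print(f"At least one death (unrounded): {p_at_least_one_death_unrounded:.3f}")
```

Skipping the intermediate rounding nudges the answer down to roughly 64.6%, which doesn't change the conclusion.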

65%. That’s a pretty high risk that the parliament won’t see out its full term.

——

“iiNet fares best in TIO complaints…”

Or do they?

The above headline and corresponding story appeared in ARN on 4 May 2011. The article was reporting on the Telecommunications Industry Ombudsman’s (TIO) January to March 2011 statistics published here.

Now before I go any further, let me say I have grave concerns with the source data itself, let alone the way it’s being reported in the press. When talking about TIO complaint statistics, it is worthwhile to familiarise yourself with this recent research report from the University of Technology, Sydney (UTS).

Findings of the UTS research include (from the executive summary):

  • Over 80% (of carriage service providers surveyed) assert that the TIO accepts complaints which are out of jurisdiction, frivolous or vexatious.

Recommendations include:

  • Amend TIO policy and procedure to cease the multiple counting of complaints in statistics and recommence reporting disposition of complaints.
  • Amend TIO policy and procedure to refer to Level 1 Complaints as Contacts rather than Complaints.

So fair to say the TIO isn’t exactly highly regarded by the ISP industry. Indeed, Exetel have commenced legal action against the TIO for acting outside its charter. Or, in Exetel CEO John Linton’s own words, “Exetel is now going to take a court action to have the TIO closed as being a ‘criminal’ organisation based on a level of incompetence, lack of knowledge and unconstitutional actions that even Australians, who are mostly as apathetic as the governments they tolerate, shouldn’t have imposed on them.”

Well OK then…

Now if you still want to accept that all “complaints”, as reported by the TIO, are real, legitimate complaints then the fairest way to present the numbers is as a proportion of overall customer base. That’s going to be a bit tricky, because not all ISPs publish accurate customer numbers in the public domain. So I’ve had to estimate relative market share using Roy Morgan survey data and a methodology used in a previous blog post. When I refer to market share I mean ALL business, government AND home subscribers with ANY kind of internet access including dialup, DSL, cable, fibre, satellite and wireless (fixed & mobile). The Roy Morgan surveys were conducted between January 2010 and December 2010.

Then, to be consistent, I only looked at TIO complaints made against a provider’s internet services. That is, I excluded complaints made against landline and mobile phone services. I considered all internet services TIO complaints made between October 2010 and March 2011.

Then I derived a normalised score by dividing the relative market share by the proportion of complaints. The benchmark is a score of 1000 and a higher score is better. A score over 1000 means that an ISP recorded relatively fewer complaints as a proportion of its customer base than its peers. Less than 1000 means that an ISP was over-represented in TIO complaints relative to its size.

The TIO complaints leaderboard comes out as follows:

Table 1: TIO internet services complaints leaderboard, 2010/2011


iiNet did indeed do very well. With roughly an 11% market share, but only 6% of complaints, it gets a score of 1843. That’s well above the expected benchmark of 1000. But it didn’t fare best. The ISP that recorded the lowest number of TIO complaints as a proportion of number of customers was Internode, with a massive normalised score of 3448.
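For anyone wanting to reproduce the normalised score, here is a minimal sketch of the calculation as described above: market share divided by share of complaints, scaled so the benchmark is 1000. The function name is my own, and the inputs are the rounded iiNet figures quoted in the text, so the result (about 1833) differs slightly from the 1843 reported, presumably because the published score was based on unrounded shares.

```python
def normalised_score(market_share, complaint_share, benchmark=1000):
    """Score above `benchmark` means fewer complaints than market share would suggest."""
    return benchmark * market_share / complaint_share

# Rounded iiNet figures from the text: ~11% market share, ~6% of complaints.
iinet_score = normalised_score(0.11, 0.06)
print(f"iiNet (rounded inputs): {iinet_score:.0f}")

# An ISP whose complaint share matches its market share sits on the benchmark.
print(normalised_score(0.05, 0.05))
```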

——

The NBN, CVC and burst capacity

Late last year, NBN Co (the body responsible for rolling out Australia’s National Broadband Network) released more detail on its wholesale products and pricing. You can download their Product and Pricing Overview here. The pricing component that I wanted to analyse in this post is NBN Co’s additional charge for “Connectivity Virtual Circuit” (CVC) capacity.

CVC is bandwidth that ISPs will need to purchase from NBN Co, charged at the rate of $20 (ex-GST) per Mbps per month. Note that this CVC is on top of the backhaul and international transit required to pipe all those interwebs into your home. And just like backhaul and international transit, if an ISP doesn’t buy enough CVC from NBN Co to cover peak utilisation, its customers will experience a congested service.

The problem with the CVC tax, priced as it is by NBN Co, is that it punishes small players. By my calculations, an ISP of (say) 1000 subscribers will need to spend proportionally a lot more on CVC than an ISP of 1,000,000 subscribers if it wants to provide a service that delivers the speeds it promises.

Here come the statistics.

Consider the example NBN Co uses in the document I linked to above: a theoretical 12 megabit service with 10GB of monthly quota. 10GB per month at 12Mbps (taking 1GB as 1024MB) gives you 6,827 seconds (a bit under 2 hours) at full speed before you’re throttled off. There are 2,592,000 seconds in a 30-day month, so if I choose a random moment in time there is a 6827/2592000 = 0.263% chance that I’ll find you downloading at full speed.
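As a quick check of those figures, the sketch below redoes the sum. The only assumption beyond the text is the unit conversion: 1GB is taken as 1024MB and 1 byte as 8 bits, which is what reproduces the 6,827 second figure.

```python
QUOTA_GB = 10      # monthly quota in NBN Co's example
SPEED_MBPS = 12    # line speed in megabits per second

# Convert the quota to megabits (assumes 1GB = 1024MB, 8 bits per byte).
quota_megabits = QUOTA_GB * 1024 * 8

seconds_at_full_speed = quota_megabits / SPEED_MBPS
seconds_in_month = 30 * 24 * 60 * 60

# Chance of catching a given subscriber at full speed at a random moment.
p_full_speed = seconds_at_full_speed / seconds_in_month

print(f"Seconds at full speed: {seconds_at_full_speed:.0f}")
print(f"Seconds in a 30-day month: {seconds_in_month}")
print(f"Probability of full-speed download: {p_full_speed:.3%}")
```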

That’s on average. The probability would naturally be higher during peak times. But let’s assume in this example that our theoretical ISP has a perfectly balanced network profile (no peak or off-peak periods). It doesn’t affect the point I’ll labour to make.

A mid-sized ISP with (let’s say) 100,000 subscribers can expect, on average, to have only 100,000*0.263% = 263 of those customers downloading at full speed simultaneously at any particular second. However, the Binomial distribution tells us that there’s a relatively small, but not negligible, probability (about 5%) that there could be 290 or more customers downloading at the same time.

So an ISP of 100,000 subscribers planning purely on averages would buy enough CVC bandwidth to service 263 customers at any one time. But a statistician would advise the ISP to buy enough CVC bandwidth to service 290 subscribers, an additional (290-263)/263 = 10%, or find itself with a congested service about 5% of the time.

This additional “burst headroom”, as a percentage, increases as the size of the ISP decreases. From above, an ISP of 10,000 can expect to have about 26 customers (26.3 before rounding) downloading simultaneously at any random moment in time. But there’s about a 5% chance this could be 35 or more. This requires them to buy an additional (35-26.3)/26.3=33% in CVC over and above what was expected to cover peak bursts.
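Here is a rough sketch of how the burst headroom could be computed for ISPs of different sizes. It treats the number of simultaneous full-speed downloaders as Binomial(n, p) with p = 6827/2592000, and takes the "peak" to be the smallest count k with P(X <= k) >= 0.95. That quantile convention is my assumption, so the results may differ by a customer or so from the figures quoted above. The function names are hypothetical, and only the standard library is used.

```python
import math

P_FULL_SPEED = 6827 / 2592000  # chance a subscriber is at full speed at a random moment

def binom_log_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of exactly k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def binom_quantile(n, p, q=0.95):
    """Smallest k such that P(X <= k) >= q for X ~ Binomial(n, p)."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += math.exp(binom_log_pmf(k, n, p))
        if cumulative >= q:
            return k
    return n

def burst_headroom(subscribers, p=P_FULL_SPEED, q=0.95):
    """Return (expected downloaders, 95th percentile peak, extra CVC as a fraction)."""
    expected = subscribers * p
    peak = binom_quantile(subscribers, p, q)
    return expected, peak, (peak - expected) / expected

for size in (1_000, 10_000, 100_000, 1_000_000):
    expected, peak, extra = burst_headroom(size)
    print(f"{size:>9,} subscribers: expect {expected:7.1f}, plan for {peak:5d}, extra CVC {extra:.0%}")
```

The point to take from the output is the trend rather than any single number: the extra CVC needed, as a percentage, shrinks steadily as the subscriber base grows.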

The table below summarises, for ISPs of various sizes, how much additional CVC would need to be purchased over and above the expected amount, to provide an uncontended service 95%+ of the time.



Graphically it looks a bit like this…

As you can see, things only really start to settle down for ISPs larger than 100,000 subscribers. Any smaller than that and your relative cost of CVC per subscriber per month is disproportionately large.

….

Further reading:

Rebalancing NBNCo Pricing Model

NBN Pricing – Background & Examples, Part 1

——

Australian ISP Market Share, 2009-2010

The market research firm, Roy Morgan, has released its latest ISP satisfaction data, with an overwhelmingly positive result recorded for Internode and iiNet.

According to the latest Roy Morgan Internet Satisfaction data, Internode (93.4%) is still the top performer for customer satisfaction while iiNet (89.9%) appears to be closing the gap from 5.6% points in the 6 months to April 2010 to 3.5% points in the 6 months to May 2010.

Scrolling further down the Roy Morgan press release page, you’ll find individual ISP customer profiles available for purchase.  At the bottom of each report’s synopsis, you’ll see that a sample size has been included [for example, the Internode customer profile is based on a sample of 305 customers].  Now, I think that combining the sample sizes from these ISP profiles could provide a reasonable basis for estimating market share.

My results/estimates are presented in the table below.  Market share is ALL business, government and home subscribers with ANY kind of internet access including dialup, DSL, cable, fibre, satellite and wireless [fixed & mobile].  The Roy Morgan samples were taken between April 2009 and May 2010.

Table 1: Estimated Australian ISP market share, 2009-2010

Internet Service Provider   Roy Morgan sample (no.)   Est. market share 2009-2010 (%)
3 Internet                  322                       2.6%
AAPT                        342                       2.8%
Adam                        134                       1.1%
Chariot                     134                       1.1%
Dodo                        321                       2.6%
Exetel                      206                       1.7%
iiNet                       509                       4.2%
Internode                   305                       2.5%
iPrimus                     284                       2.3%
Netspace                    166                       1.4%
Optus                       2,099                     17.3%
Primus-AOL                  153                       1.3%
TADAust                     119                       1.0%
Telstra                     5,710                     46.9%
TPG                         539                       4.4%
Unwired                     175                       1.4%
Virgin                      158                       1.3%
Vodafone                    103                       0.8%
Westnet                     384                       3.2%
TOTAL                       12,163*                   100.0%
  • AAPT, Netspace and Westnet are owned by iiNet
  • Chariot is owned by TPG
  • Adam offers residential internet access in South Australia & Northern Territory only
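The market share estimates in Table 1 are simply each ISP's sample size as a percentage of the combined sample. A short sketch, using the sample sizes from the table (the dictionary and variable names are mine):

```python
# Roy Morgan sample sizes per ISP, copied from Table 1.
samples = {
    "3 Internet": 322, "AAPT": 342, "Adam": 134, "Chariot": 134,
    "Dodo": 321, "Exetel": 206, "iiNet": 509, "Internode": 305,
    "iPrimus": 284, "Netspace": 166, "Optus": 2099, "Primus-AOL": 153,
    "TADAust": 119, "Telstra": 5710, "TPG": 539, "Unwired": 175,
    "Virgin": 158, "Vodafone": 103, "Westnet": 384,
}

total_sample = sum(samples.values())

# Estimated share = ISP's sample as a percentage of the combined sample.
market_share = {isp: 100 * n / total_sample for isp, n in samples.items()}

for isp, share in sorted(market_share.items(), key=lambda item: -item[1]):
    print(f"{isp:<12} {share:5.1f}%")
print(f"{'TOTAL':<12} {total_sample:,}")
```

This treats the relative sample sizes as a stand-in for relative customer numbers, which is the working assumption behind the whole table.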

——

Big Blog Theory nomination

It was very nice to receive an email a couple of days ago, letting me know that The Bernoulli Trial has been nominated for The Big Blog Theory science blogging competition.  The Big Blog Theory “was created to celebrate National Science Week 2010 and to acknowledge the Australian bloggers out there who contribute to the communication and understanding of science online.”  Finalists will be announced on Friday, 9 July.

So thank you to whoever nominated me.

I’ve updated my Stock Market Seismometer (see separate page tab above) to bring the data up to the end of the Australian 2009-10 financial year.  Not good news for share market punters.  Half of the recovery recorded throughout most of a hopeful 2009 calendar year has been wiped out.  How much further it has to fall, or whether it can turn things around, I just don’t know.  I don’t think anyone does.

——

ISP Customer Service in 2009

A few weeks ago Whirlpool’s Australian Broadband Survey 2009 Report was released.  Last year I used the 2008 report to analyse the survey results specifically as they pertained to ISP customer service, so I thought it would be a good idea to update my analysis and see just how much the ISP customer service landscape has changed over the 12 month period.

My objective, as it was last year, was to take results from these three survey questions related to customer service

  1. When calling customer support, how long did you have to wait on the phone (or talk to an operator) before you spoke to the right person?
  2. How quickly have technical support issues typically taken to resolve?
  3. How would you rate their customer service?

and distil them down to a single score that can be used to rank providers.  I arbitrarily set the benchmark score across the whole industry to be 1000, with each individual ISP’s customer service ranked relative to that benchmark.  So an ISP score higher than 1000 is above the industry average.  Lower than 1000 is below average.

The methodology employed was exactly the same as last year, so no need to go into the details.  Without further ado, here are the updated results:

Stan’s Top Five ISPs for Customer Service in 2009 [2008 rank in brackets]

  1. Adam Internet [3]
  2. Westnet [1]
  3. Amnet [2]
  4. Internode [4]
  5. iiNet [6]

Congratulations go to local Adelaide-based outfit, Adam Internet.  Number 1 with a bullet in 2009.  Westnet (purchased by iiNet in 2008) has always prided itself on providing subscribers with a premium customer service experience, so it was very surprising to see them knocked off their coveted number 1 spot.  Also surprising to see aaNet slip out of the Top Five altogether, replaced by iiNet.

The overall results from the three customer service questions (equally weighted) are as follows:

Table 1: Australian ISP customer service scores
<1000: below average; 1000: average; >1000: above average

ISP             1. Time in queue   2. Speed of resolution   3. Rating of service   OVERALL CUSTOMER SERVICE SCORE
Telstra Cable   513                556                      731                    586
Telstra DSL     488                523                      724                    562
Optus Cable     605                886                      812                    748
Optus DSL       669                668                      809                    710
iiNet           1389               1410                     1101                   1284
Internode       1636               1803                     1176                   1488
TPG             757                665                      812                    739
Westnet         2249               2601                     1220                   1820
Adam            2796               2544                     1139                   1842
Exetel          1125               954                      919                    991
Netspace        559                958                      973                    777
aaNet           903                863                      904                    889
iPrimus         950                1273                     1025                   1066
Amnet           2792               2242                     1128                   1774
Telstra NextG   389                365                      645                    437
AAPT            879                604                      846                    755
Other ISPs      844                725                      938                    826
TOTAL           1000               1000                     1000                   1000
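The post doesn't spell out here how the three component scores are combined into the overall score. Checking the numbers, the OVERALL column appears to be consistent with a harmonic mean of the three components (a plain arithmetic average comes out noticeably higher). That is an inference from the table, not a statement of the original methodology, so treat the sketch below as a check rather than the definitive formula.

```python
from statistics import harmonic_mean, mean

# (time in queue, speed of resolution, rating of service, published overall score)
rows = {
    "Telstra Cable": (513, 556, 731, 586),
    "Internode": (1636, 1803, 1176, 1488),
    "Westnet": (2249, 2601, 1220, 1820),
    "Adam": (2796, 2544, 1139, 1842),
}

for isp, (queue, resolution, rating, published) in rows.items():
    components = [queue, resolution, rating]
    print(f"{isp:<14} harmonic {harmonic_mean(components):7.1f}  "
          f"arithmetic {mean(components):7.1f}  published {published}")
```

Because the component scores in the table are themselves rounded, expect the recomputed values to land within a point or so of the published ones rather than match exactly.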

It’s important to keep in mind that Whirlpool’s Australian Broadband Survey isn’t scientific.  Although it gets tens of thousands of responses, it only reflects the opinions of those who are aware of the Whirlpool site and motivated to express an opinion.  It is a self-select survey and, as such, the respondents’ attitudes may not be statistically representative of the ISP’s customer base.   In other words, take with a grain of salt.

Ranked from highest to lowest the results are as follows:
ISP (2009 score) (2008 score):

  • Adam Internet (1842) (1727)
  • Westnet (1820) (2132)
  • Amnet (1774) (1735)
  • Internode (1488) (1348)
  • iiNet (1284) (1081)
    ——
  • iPrimus (1066) (903)
  • Exetel (991) (992)
    ——
  • aaNet (889) (1204)
  • Netspace (777) (912)
  • AAPT (755) (808)
  • Optus Cable (748) (736)
  • TPG (739) (963)
  • Optus DSL (710) (672)
  • Telstra Cable (586) (711)
  • Telstra DSL (562) (676)
  • Telstra NextG (437) (562)
    ——
  • Other (826) (959)
    ——
  • AVERAGE (1000)

——

Benford’s Law in China

0. INTRODUCTION
About a week ago, Dan Silver, a management consultant in Taipei who has been studying China for 25 years, emailed me to say that he enjoyed my piece on the application of Benford’s Law to Australian population data.  Dan asked if I’d be willing to run the same tests on population data from China if he supplied the numbers.  It sounded like an interesting exercise, so I agreed.

What I found intrigued me.  The Chinese population data did not conform to Benford’s Law.  I stress that this departure from what was expected should not be taken as any kind of proof of fraudulent data entry (Benford’s Law often applies, but not always).  However, it was a surprising result.

1. BENFORD’S LAW
Tally the number of times the first digit 1 through to 9 occurs in any “real world” data source such as lengths of rivers or heights of buildings (unit of measurement doesn’t matter).  Intuitively you would expect the digits should be uniformly distributed with each one being observed 1/9th (about 11%) of the time.  Under Benford’s Law, however, a first digit of 1 tends to appear with a frequency of about 30%, a leading digit of 2 tends to occur about 18% of the time, a 3 about 13%, and so on in a logarithmically decreasing pattern, with a leading digit of 9 often being observed less than 5% of the time.
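The expected frequencies come from the standard Benford formula, P(d) = log10(1 + 1/d) for a leading digit d. The sketch below prints them and shows one simple way of tallying leading digits. The helper names are mine, the tally helper assumes positive whole numbers, and the sample values are made up purely to show the function working.

```python
import math
from collections import Counter

def benford_expected(digit):
    """Expected proportion of values whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)

def leading_digit(value):
    """Leading digit of a positive whole number, e.g. 334 -> 3."""
    return int(str(int(value))[0])

for d in range(1, 10):
    print(f"{d}: {benford_expected(d):.1%}")

# Illustrative values only, not real data.
example_values = [1200, 187, 2950, 13, 47000, 905]
print(Counter(leading_digit(v) for v in example_values))
```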

2. DATA
Dan provided land area (in square kilometres), along with two series of population data: “actual” (常住人口); and official/registered (戶籍人口) for each of the more than 350 administrative areas in China.  These data can be viewed here.  Dan notes that at the bottom of the spreadsheet are four cities which the Chinese government counts as provinces.  These are included to bring the population to the official total for China.  If you scan through the columns of figures, you’ll see that some administrative areas don’t have a land area recorded next to them, or an “actual” population.  However, they all have a registered population.  These gaps mean that the totals presented in the tables below won’t match.  The data are as at the end of 2007.

3.1 BENFORD ANALYSIS – CHINESE LAND AREA
So on to the analysis.  Let’s start by tallying the occurrences of first digits in the column of figures holding the land area data.  Table 1 summarises these frequencies (number and percentage) that were observed and expected under Benford’s Law.

Table 1: Chinese administrative land areas (sq kms)

Leading digit   No. times leading digit was observed   No. times leading digit was expected
1               133 (39.8%)                            101 (30.1%)
2               57 (17.1%)                             59 (17.6%)
3               28 (8.4%)                              42 (12.5%)
4               22 (6.6%)                              32 (9.7%)
5               21 (6.3%)                              26 (7.9%)
6               12 (3.6%)                              22 (6.7%)
7               22 (6.6%)                              19 (5.8%)
8               20 (6.0%)                              17 (5.1%)
9               19 (5.7%)                              15 (4.6%)
TOTAL           334 (100%)                             334 (100%)

I think the data are easier to digest when presented graphically.

Figure 1: Chinese administrative land areas (sq kms)


The leading digits of Chinese land area data by administrative area do decrease roughly logarithmically, although they don’t technically follow a Benford distribution when subjected to a Chi-square goodness of fit test (χ²=26.05 on 8 d.f.; p=0.00103).  The data are overweight in figures starting with a “1” and underweight in areas leading with digits “3” to “6”.  But as Benford’s Law predicts, “1” is the most common leading digit, followed by “2” with the remaining leading digits bringing up the rear.
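The Chi-square figures quoted above can be reproduced from the observed counts in Table 1. The sketch below uses only the standard library. For the p-value it relies on the closed-form upper tail of the Chi-square distribution that holds when the degrees of freedom are even (8 here), in place of a statistics package. The function name is hypothetical.

```python
import math

# Observed leading-digit counts for land area (digits 1 to 9), from Table 1.
observed = [133, 57, 28, 22, 21, 12, 22, 20, 19]
n = sum(observed)

# Expected counts under Benford's Law, unrounded.
expected = [n * math.log10(1 + 1 / d) for d in range(1, 10)]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_sq_upper_tail_even_df(x, df):
    """P(X > x) for a Chi-square variable with an even number of degrees of freedom."""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

df = len(observed) - 1  # 8 degrees of freedom
p_value = chi_sq_upper_tail_even_df(chi_sq, df)

print(f"chi-square = {chi_sq:.2f} on {df} d.f., p = {p_value:.5f}")
```

The same few lines can be pointed at the observed counts in Tables 2 and 3 to test the population series.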

3.2 BENFORD ANALYSIS – CHINESE ACTUAL POPULATION
Similarly, let’s look at the leading digits of “actual” populations by administrative areas.

Table 2: Chinese administrative area “actual” populations

Leading digit   No. times leading digit was observed   No. times leading digit was expected
1               39 (14.8%)                             79 (30.1%)
2               53 (20.2%)                             46 (17.6%)
3               44 (16.7%)                             33 (12.5%)
4               39 (14.8%)                             25 (9.7%)
5               28 (10.6%)                             21 (7.9%)
6               25 (9.5%)                              18 (6.7%)
7               15 (5.7%)                              15 (5.8%)
8               13 (4.9%)                              13 (5.1%)
9               7 (2.7%)                               12 (4.6%)
TOTAL           263 (100%)                             263 (100%)

When plotted, the difference between frequencies observed and what was expected under Benford’s Law is, I think, striking.

Figure 2: Chinese administrative area “actual” populations


Not even close to a Benford distribution.  The leading digit of “1” actually occurs less often than a leading digit of “2”, although after that things do decrease logarithmically.

3.3 BENFORD ANALYSIS – CHINESE REGISTERED POPULATION
Dan asked that I focus on the registered/official population figures (the highlighted column of raw data in the spreadsheet linked above).  Leading digits from the registered, official population figures by administrative areas are presented in Table 3 and Figure 3 below.

Table 3: Chinese administrative area “registered” populations

Leading digit   No. times leading digit was observed   No. times leading digit was expected
1               71 (19.6%)                             109 (30.1%)
2               67 (18.5%)                             64 (17.6%)
3               64 (17.6%)                             45 (12.5%)
4               46 (12.7%)                             35 (9.7%)
5               40 (11.0%)                             29 (7.9%)
6               25 (6.9%)                              24 (6.7%)
7               24 (6.6%)                              21 (5.8%)
8               14 (3.9%)                              19 (5.1%)
9               12 (3.3%)                              17 (4.6%)
TOTAL           363 (100%)                             363 (100%)


Figure 3: Chinese administrative area “registered” populations

Things look a bit better than the “actual” population data, but registered populations certainly don’t follow Benford’s Law either.  The leading digit of “1” is severely underweight compared to its predicted frequency and, in fact, digits “1” through to “3” from registered population data occur with almost equal frequency.

4. SUMMARY
Chinese land area data by Chinese government administrative area follow a logarithmically decreasing pattern, although technically not a Benford distribution.  Chinese population data (“actual” and registered) by the same areas don’t follow a logarithmically decreasing pattern at all, Benford or otherwise.

That these Chinese government data don’t conform to Benford’s Law should not be taken as any kind of proof or insinuation that something untoward is going on.  It’s something that I would say warrants further investigation, but I expect there’ll be a rational explanation for it.  It’s probably more indicative of my own gaps in understanding than anything else.  Having said that, it was a very interesting exercise and I’d like to thank Dan Silver for the opportunity to write about it.

——