Can Big Data measure progress and development?

Jose Ramon G. Albert

While Big Data are here to stay, they do not mean the end of official statistics. The challenge is to explore how to make use of nontraditional data sources, such as Big Data, to complement traditional data sources.

Big Data – what is it all about?

In 2008, Google established a near-real-time flu tracker called “Google Flu Trends” that monitors Google searches on the flu. An article in Nature (Ginsberg et al. 2009) reported that flu incidence estimates from Google correlate strongly with the official statistics released by the US Centers for Disease Control and Prevention (CDC). What was astonishing was that the Google statistics on flu incidence were aggregated with a delay of just one day, while official statistics from the CDC took a week to put together from administrative reports from hospitals. Equally astounding was that the flu tracker was quick, accurate, and cheap, while official statistics were not as timely and involved huge costs.

Since then, the term “Big Data” has become a buzzword in business, technology, and science, taking over the hype from “Data Mining” and other similar terms about the use of information from huge databases. Although there is no official definition of Big Data yet, they are thought of as the various digital data by-products of electronic devices (smartphones, tablets, laptops), social media, search engines, as well as sensors and tracking devices (including climate sensors and global positioning system or GPS). Big Data are characterized by the 3Vs: (high) volume, velocity, and variety.

Businesses have used Big Data to learn about customer preferences. For instance, Amazon uses its customer database to inform clients that “customers who bought Product A also bought Product B, Product C or Product D …” based on predictive modeling, association rules, and collaborative filtering. Social media data, such as tweets on Twitter, are examined in terms of the “polarity” (i.e., positive, negative, or neutral) of sentiments expressed about a product, such as a movie.
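To make the “customers who bought Product A also bought Product B” idea concrete, here is a minimal sketch of the three measures commonly used to score an association rule: support, confidence, and lift. The baskets are hypothetical toy data, and the function names are illustrative. This is not a description of how Amazon or any other retailer actually implements its recommendations.

```python
# Hypothetical toy baskets; each set is one customer's purchase.
baskets = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "B"},
    {"C", "D"},
    {"D"},
    {"A", "C"},
]


def count_containing(all_baskets, items):
    """Number of baskets that contain every item in `items`."""
    return sum(1 for basket in all_baskets if items <= basket)


def rule_metrics(all_baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(all_baskets)
    both = count_containing(all_baskets, antecedent | consequent)
    with_antecedent = count_containing(all_baskets, antecedent)
    with_consequent = count_containing(all_baskets, consequent)
    if with_antecedent == 0 or with_consequent == 0:
        raise ValueError("rule involves items that never occur")
    support = both / n                     # share of all baskets with A and B
    confidence = both / with_antecedent    # share of A-buyers who also bought B
    lift = confidence / (with_consequent / n)  # above 1: B is more likely given A
    return support, confidence, lift


support, confidence, lift = rule_metrics(baskets, {"A"}, {"B"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

In this toy example, three of the four buyers of A also bought B, and the lift is above 1, so buying A is associated with a higher chance of buying B than in the customer base as a whole.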

Big Data applications have gone beyond the realm of business. At the 44th Session of the United Nations (UN) Statistical Commission held in February 2013, a seminar titled “Big Data for Policy, Development and Official Statistics” was conducted where the High-Level Group (HLG) for the Modernisation of Statistical Production and Services released a white paper discussing issues on the use of Big Data in official statistics, especially the major concern over privacy, since such use could potentially allow “Big Brother” to watch over us.

This Policy Note provides basic information on Big Data and how they compare with traditional data sources of official statistics. Taking note of their limitations, it explores the possible developmental uses of this nontraditional data source in the Philippines, such as in disaster risk management. It also discusses privacy, analytics, and other issues that could lead to the misuse of Big Data.

Public policy, statistics development, and the data revolution

The information age has led us to recognize the importance of statistics for managing economies effectively. Countries desire to meet national development plans and accelerate progress in reducing poverty and other Millennium Development Goals (MDGs). The Report of the High-Level Panel (HLP) of Eminent Persons initiated discussions on the Post-2015 Development Agenda taking stock of the unfinished agenda from the MDGs, of the need to reach zero poverty, and of ensuring “Leaving No One Behind.” As the UN’s Open Working Group (OWG) on Sustainable Development Goals finalizes the Post-2015 Agenda, statisticians have been sought after for their experience and expertise.

Developing countries are crafting roadmaps for statistics development, especially in the context of the data revolution that involves use of new technologies and techniques to yield more reliable data for informing policymakers and measuring the progress of societies.

UN Global Pulse reports success in correlating tweets in Jakarta about the high price of rice with the actual price of rice (Letouzé 2012), as well as in relating mobile phone usage in Jakarta to real-time roadway traffic conditions in the Indonesian capital. Digital traces of mobile phone usage can also be examined to look into population movements and people’s behavior in the wake of disasters.

There is, however, some apprehension about Big Data since these are not tailor made for statistical purposes and thus could yield inaccurate statistics. But this is also true of administrative reporting systems, which are the basis of many official statistics. Producers of statistics within governments follow a code of practice embodied in the United Nations Fundamental Principles of Official Statistics (FPOS). Implicit in the first principle are the definition of official statistics, which is, “data about the economic, demographic, social and environmental situation,” and the mandate of national statistical systems to be auditors of a country’s socioeconomic performance.

Maintaining independence and adherence to professional conduct in the production and release of official statistics help guarantee the credibility of official statistics in the public arena.

While the quality of statistics involves various criteria, there has been more focus by national statistics offices (NSOs) on managing precision and accuracy over timeliness and other quality issues. Official statistics are almost exclusively based on surveys and censuses, as well as administrative data reporting systems of government programs that often arise from legislative mandates provided to NSOs.

A comparison between traditional data sources and Big Data is given in Table 1. It shows that Big Data are largely unstructured, unfiltered data exhaust from digital products. Their analytics can be poor, but traditional data sources have a fairly high cost of data collection, and the collection is typically conducted irregularly, with time lags that stakeholders find unreasonable in an age of fast data.

 

Moreover, data sources in official statistics have been “tried and tested” mechanisms for ensuring the credibility of official statistics. National income accounts and data on prices, among others, typically follow a Statistical Quality Assessment Framework to ensure integrity and credibility of resulting figures. Big Data, however, are like a tsunami of digital exhaust that can be a messy collage of data points collected for distinct purposes but whose accuracy is difficult to establish. Nevertheless, statistics from Big Data can be updated in near real time.

The rise of Big Data

The McKinsey Global Institute (Manyika et al. 2011) suggests that three factors contributed to the rise of Big Data: (a) sensors and electronic gadgets being interconnected to computing resources; (b) availability of data in the public domain (especially on social media); and (c) suitable technologies, mainly statistical models and methods for data mining. Internet penetration has certainly led to an increase in data in the public domain. In the Philippines, about one out of three persons (36%) had access to the Internet in 2012, a big rise from the 2% figure in 2000 (Figure 1). Similarly, mobile subscriptions have greatly increased and even surpassed the total population. In the Philippines, as of 2012, there were already 102 mobile subscriptions per 100 persons (Figure 2), which means that some people in the Philippines have more than one mobile subscription. [1]

[1] To count subscribers, we would have to remove double counts. However, telecommunication companies (hereinafter called ‘telcos’) do not have information about the identities of prepaid subscribers. Only postpaid subscribers have to register with the telcos; it is also assumed that each telco will not share its database of postpaid customers with other telcos.
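The penetration figures above can be checked against the Philippine rows of the International Telecommunication Union tables reproduced in this article. The short sketch below uses only those two rows; the variable and function names are illustrative. Note that the table lists 106.51 mobile subscriptions per 100 population for 2012, slightly above the 102 quoted in the text, but the conclusion is the same: subscriptions outnumber people.

```python
# Philippine rows copied from the ITU tables in this article (2000 to 2012).
years = list(range(2000, 2013))
ph_mobile_per_100 = [8.31, 15.33, 19, 27.25, 39.1, 40.52, 49.07,
                     64.52, 75.37, 82.26, 88.98, 99.09, 106.51]
ph_internet_pct = [1.98, 2.52, 4.33, 4.86, 5.24, 5.4, 5.74,
                   5.97, 6.22, 9, 25, 29, 36.24]


def first_year_above(values, threshold):
    """First year in which the series exceeds the threshold, or None."""
    for year, value in zip(years, values):
        if value > threshold:
            return year
    return None


# Year in which subscriptions first outnumbered the population.
first_over_100 = first_year_above(ph_mobile_per_100, 100)

# How many times larger the share of Internet users was in 2012 than in 2000.
internet_multiple = ph_internet_pct[-1] / ph_internet_pct[0]

# Subscriptions in excess of one per person in 2012. This is only a lower
# bound on multiple subscriptions, since some people have none at all.
excess_2012 = ph_mobile_per_100[-1] - 100

print(first_over_100, round(internet_multiple, 1), round(excess_2012, 2))
```

A subscriptions-per-100 figure above 100 cannot say how many distinct people own a phone. As the note above explains, removing double counts would require subscriber identities that the telcos do not hold for prepaid users.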

Figure 1. Percentage of individuals using the Internet in the ASEAN member-economies: 2000–2012

Source: International Telecommunication Union

Figure 2. Mobile telephone subscriptions per 100 inhabitants in the ASEAN member-economies, 2000–2012

Source: International Telecommunication Union

Fixed-telephone subscriptions per 100 population: ASEAN, 2000–2012
ASEAN Member State 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Brunei Darussalam 24.26 26.08 23.46 23.17 23.03 22.81 21.4 20.86 20.82 20.42 19.95 19.64 17.21
Cambodia 0.25 0.27 0.28 0.24 0.24 0.25 0.19 0.27 0.31 0.38 2.5 3.63 3.93
Indonesia 3.19 3.41 3.6 3.69 4.69 6.02 6.51 8.46 12.97 14.66 17.01 15.84 15.39
Lao P.D.R. 0.76 0.96 1.12 1.24 1.32 1.57 1.56 1.58 2.08 1.6 1.61 1.65 1.81
Malaysia 19.76 19.68 19.13 18.37 17.53 16.89 16.49 16.22 16.53 16.28 16.3 15.73 15.69
Myanmar 0.56 0.6 0.69 0.73 0.85 1 1.13 0.91 0.99 0.86 0.95 1 1.05
Philippines 3.94 4.18 4.09 4.04 4.08 3.92 4.16 4.43 4.51 4.46 3.57 3.74 4.07
Singapore 49.67 48.41 46.59 44.41 42.44 41.03 40.17 39.34 38.69 38.9 39.3 38.87 37.48
Thailand 8.97 9.59 10.28 10.28 10.47 10.73 10.73 10.63 11.17 10.87 10.29 9.94 9.51
Viet Nam 3.14 3.73 4.76 5.28 12.03 9.99 12.9 16.9 19.76 16.14 11.32 11.22
… – not available
Source: International Telecommunication Union

 

Mobile-cellular telephone subscriptions per 100 population: ASEAN, 2000–2012
ASEAN Member State 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Brunei Darussalam 28.63 42.17 44.35 50.15 56.11 63.32 80.44 95.99 102.79 104.69 108.62 109.02 113.95
Cambodia 1.07 1.79 2.99 3.85 6.55 7.95 12.7 18.79 30.39 44.31 56.74 94.19 128.53
Indonesia 1.76 3.08 5.44 8.48 13.71 20.9 28.02 40.43 60.01 68.92 87.79 102.46 114.22
Lao P.D.R. 0.24 0.54 0.99 2 3.58 11.36 17.12 24.59 32.94 51.61 62.59 84.05 64.7
Malaysia 21.87 30.87 37.08 44.69 57.6 75.63 73.93 87.07 101.5 108.47 119.74 127.48 141.33
Myanmar 0.03 0.05 0.1 0.13 0.19 0.26 0.42 0.49 0.72 0.97 1.14 2.38 10.3
Philippines 8.31 15.33 19 27.25 39.1 40.52 49.07 64.52 75.37 82.26 88.98 99.09 106.51
Singapore 70.12 74.36 80.1 84.07 91.21 97.53 103.78 125.19 132.3 138.69 145.4 150.12 152.13
Thailand 4.9 11.97 27.35 33.52 41.43 46.46 60.9 80.17 93.43 99.51 108.02 116.33 127.29
Viet Nam 0.97 1.53 2.3 3.29 5.89 11.29 22.03 52.02 85.7 111.37 125.29 141.6 147.66
Source: International Telecommunication Union

 

Percentage of individuals using the Internet: ASEAN, 2000–2012
ASEAN Member State 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Brunei Darussalam 9 12.92 15.33 19.6 29.72 36.47 42.19 44.68 46 49 53 56 60.27
Cambodia 0.05 0.08 0.23 0.26 0.3 0.32 0.47 0.49 0.51 0.53 1.26 3.1 4.94
Indonesia 0.93 2.02 2.13 2.39 2.6 3.6 4.76 5.79 7.92 6.92 10.92 12.28 15.36
Lao P.D.R. 0.11 0.18 0.27 0.33 0.36 0.85 1.17 1.64 3.55 6 7 9 10.75
Malaysia 21.38 26.7 32.34 34.97 42.25 48.63 51.64 55.7 55.8 55.9 56.3 61 65.8
Myanmar …  0 0 0.02 0.02 0.07 0.18 0.22 0.22 0.22 0.25 0.98 1.07
Philippines 1.98 2.52 4.33 4.86 5.24 5.4 5.74 5.97 6.22 9 25 29 36.24
Singapore 36 41.67 47 53.84 62 61 59 69.9 69 69 71 71 74.18
Thailand 3.69 5.56 7.53 9.3 10.68 15.03 17.16 20.03 18.2 20.1 22.4 23.7 26.5
Viet Nam 0.25 1.27 1.85 3.78 7.64 12.74 17.25 20.76 23.92 26.55 30.65 35.07 39.49

 

Fixed (wired)-broadband subscriptions per 100 inhabitants: ASEAN, 2000–2012
ASEAN Member State 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Brunei Darussalam …  0.56 0.8 1.09 1.74 2.21 2.39 3.05 4.35 5.08 5.42 5.7 4.81
Cambodia …  0 0 0 0.01 0.01 0.02 0.06 0.12 0.21 0.25 0.15 0.2
Indonesia 0 0.01 0.02 0.03 0.04 0.05 0.09 0.34 0.42 0.78 0.95 1.12 1.21
Lao P.D.R. …  0 0 0 0 0.01 0.01 0.02 0.05 0.07 0.09 0.1 0.11
Malaysia 0 0.02 0.08 0.44 1 1.87 2.85 3.82 4.83 5.55 6.49 7.43 8.41
Myanmar 0 0 0 0 0 0 0.01 0.01 0.02 0.04 0.04 0.02 0.01
Philippines 0 0.01 0.03 0.07 0.11 0.14 0.3 0.56 1.16 1.87 1.84 1.88 2.22
Singapore 1.76 3.75 6.53 9.8 12.46 14.6 17.08 18.94 21.12 23.58 24.98 25.61 25.44
Thailand 0 0 …  0.02 0.25 0.85 1.36 1.96 3.13 3.96 4.9 5.89 8.15
Viet Nam …  0 0 0.01 0.06 0.25 0.6 1.5 2.35 3.64 4.12 4.27 4.9
… – not available
Source: International Telecommunication Union

Moreover, the capacity to collect, store, retrieve, use, and reuse data has risen because of technologies and gadgets. In 2012, it was reported that the amount of data worldwide had doubled every 40 months since the 1980s, with about 2.5 quintillion (2.5 × 10^18) bytes of data being created daily.
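A doubling time is easier to grasp as an annual growth rate. The sketch below takes the 40-month doubling figure at face value and assumes smooth exponential growth, which is a simplification; the names are illustrative.

```python
DOUBLING_MONTHS = 40   # doubling time quoted in the text
DAILY_BYTES = 2.5e18   # 2.5 quintillion bytes per day, as quoted in the text


def growth_factor(months):
    """Multiple by which the data stock grows over `months`,
    assuming smooth exponential growth at the quoted doubling time."""
    return 2 ** (months / DOUBLING_MONTHS)


annual_growth = growth_factor(12) - 1   # implied growth per year
decade_multiple = growth_factor(120)    # three doublings in ten years
daily_exabytes = DAILY_BYTES / 1e18     # 1 exabyte = 10^18 bytes

print(f"Implied annual growth: {annual_growth:.1%}")
print(f"Growth over ten years: {decade_multiple:.0f}x")
print(f"Created daily: {daily_exabytes} exabytes")
```

Under these assumptions, doubling every 40 months corresponds to growth of roughly 23 percent a year, or an eightfold increase every decade.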

As regards social media usage, Forrester Research (Kaplan and Haenlein 2009) reports that as of the second quarter of 2008, three out of every four Internet surfers used social media, a significant rise from the estimated 56 percent in 2007. In the Philippines, as of February 2013, about a third of Filipinos were on Facebook. As of July 2012, there were around 9.5 million Twitter users in the Philippines. Posts on social media and other data shared and transmitted on the Web and by various electronic means, including tracking devices (such as mobile phones and GPS), are rising at an exponential rate in a variety of formats. With this phenomenal rise, the public need and expectation for “Knowing (information) in (Real) Time” have likewise intensified.

Use of Big Data in Philippine disaster risk management

In the Philippines, climate disasters are considered a very serious threat to the country’s development (Thomas et al. 2013). The Philippine government’s weather bureau reports that the annual average frequency of tropical cyclones entering the country between 1951 and 2010 has remained unchanged at 19–20 cyclones. However, their path has shifted toward central and southern Philippines in recent decades. The amount of rainfall that comes with these cyclones has also increased. To manage risks associated with natural hazards and disasters, the government started a flagship project called Nationwide Operational Assessment of Hazards (NOAH) in June 2012. Project NOAH involves the development of hydromet sensors (e.g., automatic rain gauges, water-level sensors, stream gauges) and high-resolution geohazard maps. The latter can give government lead time of about six hours or less to act and thus could minimize the damage to lives, property, and livelihood from natural hazards. Project NOAH uses topographic maps generated by light detection and ranging (LiDAR) for flood modeling but currently the maps generated are limited to selected locations around the country’s 18 major river basins. These maps and other weather information are shared publicly through the Project NOAH website and mirror sites.

These high-velocity and high-volume data have helped national and local governments become better prepared for disasters. In Cagayan de Oro City, there is evidence that better access to information has saved lives. In 2011, Typhoon Sendong led to 676 deaths in Cagayan de Oro City. A year later, a typhoon of similar strength (Pablo) had only one associated death reported. The heavy death toll from Super Typhoon Yolanda (Haiyan), whose direction was accurately predicted by Project NOAH, suggests the importance of having local chief executives understand disaster risk data. Otherwise, the information is of no use in minimizing the costs of disasters.

Issues on the use of Big Data

While there is reason to be excited about Big Data, there are issues that need to be examined. Much of Big Data includes personal data with precise, geolocation-based information. We are well aware that e-commerce sites are watching our shopping preferences; search engines are examining our browsing habits; social media sites are inspecting our personal data, including our social relationships and what we share; and mobile service providers are collecting data on who we talk or send text messages to and possibly what we say to them. Using an example from social media, Johannes Jutting of the PARIS21 Secretariat illustrated the ill effects of interlinked data during the 2013 National Convention on Statistics.

Privacy has legal, technological, and ethical issues. Users of technology would routinely tick a box to gain access to Web-generated data, unconscious or uninformed of the possible negative consequences. When Google Flu Trends was developed in 2008, did Google contact all its users for approval to use old search queries for this project? Even if that were possible, the time and cost required would have been enormous for Google. So, should users be asked to agree to any possible future use of their data?

Of course, there are other ways to protect privacy and confidentiality. Providers of data could opt out (but this can still leave a trace), and the same goes for anonymization (as “re-identification” is still possible).
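Why re-identification remains possible after names are removed can be illustrated with a simple check in the spirit of k-anonymity: count how many records share each combination of seemingly harmless attributes. The records, field names, and function names below are hypothetical and serve only as a sketch of the idea.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, other attributes kept.
records = [
    {"postcode": "1000", "birth_year": 1985, "sex": "F"},
    {"postcode": "1000", "birth_year": 1985, "sex": "F"},
    {"postcode": "1000", "birth_year": 1990, "sex": "M"},
    {"postcode": "1100", "birth_year": 1972, "sex": "M"},
    {"postcode": "1100", "birth_year": 1972, "sex": "M"},
]

QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def group_sizes(rows, keys):
    """Count how many rows share each combination of the given attributes."""
    return Counter(tuple(row[key] for key in keys) for row in rows)


def smallest_group(rows, keys):
    """The k in k-anonymity: size of the smallest group of look-alike rows."""
    return min(group_sizes(rows, keys).values())


def singled_out(rows, keys):
    """Rows whose attribute combination is unique and thus re-identifiable
    by anyone who knows those attributes from another source."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[key] for key in keys)] == 1]


print(smallest_group(records, QUASI_IDENTIFIERS))   # k for the full attribute set
print(singled_out(records, QUASI_IDENTIFIERS))      # the unique record(s)
print(smallest_group(records, ("postcode",)))       # k after keeping postcode only
```

In this toy data set, one record is unique on postcode, birth year, and sex, so it can be matched to a person even without a name. Releasing coarser data (here, postcode only) removes the unique record, at the cost of less detailed statistics.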

Entities with Big Data holdings would not desire to put their businesses at stake if the public gets more concerned about privacy issues. In the case of social media, much of what is in the public domain does not involve privacy. New applications such as “Whisper” have provided people with a tool to vent their “shout-outs” but without having them traced.

Ensuring the accuracy of Big Data is fraught with methodological challenges. Statistics guru Nate Silver (2013) points out that:

“[Big Data] is sometimes seen as a cure-all, … Chris Anderson… wrote in 2008 that … sheer volume … would obviate … need for theory, and even the scientific method…. [T]hese views are badly mistaken. … If the quantity of information is increasing by 2.5 quintillion bytes per day, the amount of useful information almost certainly isn’t. Most of it is just noise, and the noise is increasing faster than the signal.”

Some think that while Big Data may not be accurate, they are good enough. But how good is good enough? A recent article in Nature (Butler 2013) reports the discrepancy between the Google Flu Trends estimate of flu levels in the United States and the official estimate from the CDC (11% versus 6%). A study of Twitter and Foursquare data before, during, and in the aftermath of Hurricane Sandy (Grinberg et al. 2013) reveals that the greatest number of tweets about Hurricane Sandy came from Manhattan. This created the illusion that Manhattan was the area hardest hit by Hurricane Sandy, which it certainly was not.
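The size of the Google Flu Trends discrepancy is clearer when expressed relative to the official figure. The sketch below uses only the two percentages quoted above; the variable names are illustrative.

```python
# Figures quoted in the text (Butler 2013).
google_estimate = 11.0  # percent, Google Flu Trends
cdc_estimate = 6.0      # percent, official CDC estimate

absolute_gap = google_estimate - cdc_estimate        # in percentage points
ratio = google_estimate / cdc_estimate               # multiple of the official figure
relative_overestimate = absolute_gap / cdc_estimate  # gap as a share of the official figure

print(f"Gap: {absolute_gap:.0f} percentage points")
print(f"Google estimate is {ratio:.2f} times the CDC figure")
print(f"Relative overestimate: {relative_overestimate:.0%}")
```

A gap of 5 percentage points may sound small, but it means the search-based estimate was nearly double the official one.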

Mayer-Schonberger and Cukier (2013) discuss the dangers of predictive modeling with Big Data, citing as an example the movie “Minority Report”, which is about someone getting arrested for a crime he is supposedly going to commit. The authors point out that predictive modeling is already being used in the United States, as the following examples show:

(a) The parole boards in the United States are using “predictions” from data analysis to decide whether to give parole to inmates or not.

(b) The City of Memphis, Tennessee uses a program called Blue CRUSH (Crime Reduction Utilizing Statistical History) to focus police resources on certain areas at certain time periods. Crime incidence has reportedly fallen by a quarter since the program was introduced in 2006.

(c) The US Department of Homeland Security uses the Future Attribute Screening Technology to identify potential terrorists. This technology is reportedly 70% accurate but how this rate was obtained is baffling.
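One reason a bare accuracy figure such as “70% accurate” is hard to interpret is that its practical meaning depends on how rare the predicted behavior is. The sketch below is purely illustrative. It assumes that “70% accurate” means the screen correctly flags 70 percent of true cases and correctly clears 70 percent of everyone else, and it assumes a hypothetical base rate of 1 true case per 100,000 people screened. Neither assumption comes from the source.

```python
def share_of_flags_that_are_correct(base_rate, sensitivity, specificity):
    """Bayes' rule: probability that a flagged person is a true case."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


ASSUMED_BASE_RATE = 1 / 100_000   # hypothetical: 1 true case per 100,000 screened
ASSUMED_ACCURACY = 0.70           # assumed to apply to both kinds of error

ppv = share_of_flags_that_are_correct(
    ASSUMED_BASE_RATE, ASSUMED_ACCURACY, ASSUMED_ACCURACY
)
print(f"Share of flagged people who are true cases: {ppv:.4%}")
```

Under these assumptions, only about 2 in every 100,000 people flagged would be true cases; nearly all flags would be false alarms. Different assumptions give different numbers, which is why the basis of a reported accuracy rate matters.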

Conclusion

While Big Data are here to stay, the data revolution and the use of Big Data do not mean the end of official statistics. The challenge is to explore how to make use of nontraditional data sources, such as Big Data, to complement traditional data sources. There is also a need to identify legal protocols and institutional arrangements to access Big Data holdings for development purposes. Senator Bam Aquino has initiated efforts toward the establishment of a Big Data center in the Philippines. If it is to be similar to Jakarta’s Pulse Lab, a public-private partnership scheme would be required for its establishment.

Issues on privacy, security, intellectual property, accessibility for development purposes, and accountability have to be addressed to prevent the misuse of Big Data. While Big Data sources can be used to examine development issues, the right to privacy is an ethical issue that should not be overlooked. Even after legal issues have been resolved, investments in capacity building would be needed to harness Big Data and train users so that the official statistics community can help identify “signals” within “noise”, certify quality, and ultimately decipher truth from falsehood. This way, statistics can truly matter.

References

Butler, D. 2013. When Google got flu wrong. Nature. (accessed on August 15, 2013).

Fellegi, I. 1996. Characteristics of an effective statistical system. International Statistical Review 64:165–197.

Ginsberg, J., M. Mohebbi, R. Patel, L. Brammer, M. Smolinski, and L. Brilliant. 2009. Detecting influenza epidemics using search engine query data. Nature 457:1012–14. (accessed on August 15, 2013).

Grinberg, N., M. Naaman, B. Shaw, and G. Lotan. 2013. Extracting diurnal patterns of real world activity from social media. (accessed on August 30, 2013).

Kaplan, A. and M. Haenlein. 2009. Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons (2010) 53, 59–68. (accessed on December 15, 2013).

Letouzé, E. 2012. Big Data for development: Challenges and opportunities. UN Global Pulse. (accessed on August 15, 2013).

Manyika, J., M. Chui, J. Bughin, B. Brown, R. Dobbs, C. Roxburgh, and A.H. Byers. 2011. Big Data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute. (accessed on August 15, 2013).

Mayer-Schonberger, V. and K. Cukier. 2013. Big Data: A revolution that will transform how we live, work, and think. New York: Houghton Mifflin Harcourt Publishing Company.

Silver, N. 2013. The signal and the noise: Why so many predictions fail—but some don’t. United Kingdom: Penguin.

Thomas, V., J.R. Albert, and R. Perez. 2013. Climate-related disasters in Asia and the Pacific. ADB Economics Working Paper Series No. 358. Mandaluyong City: Asian Development Bank.

United Nations Economic Commission for Europe (UNECE). 2013. What does “Big Data” mean for official statistics? (accessed on February 15, 2013).

– Rappler.com

The article is a reprint from Philippine Institute for Development Studies (PIDS) Policy Notes 2014-06. Views expressed here do not reflect those of the PIDS.

Jose Ramon “Toots” Albert is a Senior Research Fellow of the PIDS. He finished a PhD in Statistics from the State University of New York at Stony Brook.  He is the 2014-2015 President of the Philippine Statistical Association Inc., the nation’s sole professional organization of statistical practitioners. He is also an elected member of the International Statistical Institute, and of the National Research Council of the Philippines, as well as a Fellow of the Social Weather Stations. From 2012-2014, he served as Secretary General of the now defunct National Statistical Coordination Board. He has served as consultant to various government agencies, private firms, and international organizations. He has worked for 19 countries spanning South East Asia, South Asia, East Asia, the Pacific, the Middle East, and Africa on poverty measurement and analysis, econometric methods, and survey data analysis.

 
