[OPINION] The dirty data, misleading graphs that muddle Duterte’s pandemic response

JC Punongbayan
'Until we have trustworthy data on cases, deaths, and tests to guide our policymakers, government might as well be crafting their policies by flipping coins, or leaving us all to our own devices'

So much of Filipinos’ lives now depend on the word of government officials who, in turn, shape their policies on the basis of the data: the number of COVID-19 cases, deaths, tests, and so on.

We need these data to be as clean and reliable as possible.

Yet more than 4 months since the first confirmed case of COVID-19, the government’s data are, in fact, dirtier and less reliable than ever. 

Worse, the Duterte government actively misleads and adds to the confusion with the way it presents some of its key statistics and graphs. (READ: Has change really come? Misleading graphs and how to spot them)

Absent proper data, Filipinos are effectively groping in the dark, and government officials are like the blind leading the blind.

Dirty, siloed data

First and foremost, the Department of Health (DOH) has failed to provide the people with COVID-19 data that’s clean, consistent, and fully open.

In recent weeks, analysts have noticed a growing number of discrepancies in the data, particularly in the DOH’s regular data drop.

The University of the Philippines COVID-19 Pandemic Response Team — a group of professors and experts specializing in various fields — summarized the anomalies in a policy note (see Figure 1). 

For instance, on some days, variables such as patients’ age or date of onset of COVID-19 inexplicably disappeared. In 45 cases, people’s sexes suddenly switched from male to female or vice versa. In 75 cases, patients became younger or older overnight. In at least one case, a dead person was later reported to be alive.
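Anomalies of this kind can be caught by comparing two successive data drops record by record. The sketch below is only an illustration of that idea, not the UP team's method. The field names (`sex`, `age`, `date_of_onset`, `died`) and the one-year tolerance for age changes are assumptions, and the DOH's actual column names may differ.

```python
# Hypothetical field names; the actual DOH data drop columns may differ.
IMMUTABLE_FIELDS = ("sex", "date_of_onset")

def compare_drops(previous, current, max_age_change=1):
    """Flag records that change implausibly between two data drops.

    Both arguments map a case ID to a dict of field values.
    """
    anomalies = []
    for case_id, old in previous.items():
        new = current.get(case_id)
        if new is None:
            anomalies.append((case_id, "record disappeared"))
            continue
        for field in IMMUTABLE_FIELDS:
            old_value, new_value = old.get(field), new.get(field)
            if old_value is not None and new_value is None:
                anomalies.append((case_id, f"{field} went missing"))
            elif old_value is not None and new_value != old_value:
                anomalies.append((case_id, f"{field} changed"))
        old_age, new_age = old.get("age"), new.get("age")
        if old_age is not None and new_age is not None:
            # A birthday can move age by one year; anything more is suspect.
            if abs(new_age - old_age) > max_age_change:
                anomalies.append((case_id, "age changed"))
        if old.get("died") and not new.get("died"):
            anomalies.append((case_id, "death reverted"))
    return anomalies
```

Running a check like this on every pair of consecutive releases would produce the sort of tally the policy note reports, provided case IDs stay stable across releases.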

Figure 1. Source: UP Resilience Institute, Facebook post (https://www.facebook.com/UPResilienceInstitute/posts/262034388536392).

Regions and dates are also encoded using varied formats, making data cleanup needlessly hard for researchers. Cases reported by the DOH and local government units (LGUs) also don’t always tally, leading some researchers to rely more on LGUs than the DOH itself, supposedly a data clearinghouse.
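To give a sense of the cleanup this forces on researchers, here is a minimal sketch of date normalization in Python. The list of formats is an assumption chosen for illustration; the article does not say which formats actually appear in the data drops.

```python
from datetime import datetime, date

# Assumed formats, for illustration only; the real data drops may use others.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%y", "%B %d, %Y")

def parse_date(raw):
    """Return a date for the first format that matches, or None if none do."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # try the next candidate format
    return None
```

Even this does not settle everything: a value like 05/06/2020 parses cleanly as May 6 under the assumed month-first format, yet could have been meant as June 5. Only a consistent format at the source removes that doubt.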

On some days, datasets on individual cases would disappear completely, robbing researchers of any chance to parse the data at the LGU level. On other days still, entire data drops never came at all.

Finally, the DOH has not been fully open with its data, with some private sector groups enjoying privileged access relative to others. 

This is unhealthy. Said the UP team, “Without access to full government data entrusted to select private sector groups, the task for an independent corroboration — the hallmark of any scientific undertaking — becomes impossible, to the detriment of public welfare and interest.”

“Fresh” vs “late” cases

The lack of proper data casts much doubt on the DOH’s daily reports. 

For instance, we should all be keeping an eye on the country’s “epidemic curve,” or the daily number of new COVID-19 cases. The Philippines’ epidemic curve is supposed to be hump-shaped and — if we are indeed suppressing the spread of the virus — supposed to be going down by now. 

But the latest shape of the epidemic curve is far from hump-shaped and declining. In fact, of late it seems to have skyrocketed (Figure 2).

Figure 2. Far from flattening the curve (interactive line chart: https://datawrapper.dwcdn.net/G0vnp/3/).

Rather than present this graph in their briefings, the DOH opted instead to change the way they report the data.

On May 28 they started to distinguish between “fresh” and “late” cases: fresh cases are those processed in the past 3 days; late cases are those processed 4 or more days ago.
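The rule itself is simple enough to write down. The sketch below assumes that "processed" refers to the date a test result was released and that the 3-day window is counted back from the report date; the DOH's exact operational definition may differ.

```python
from datetime import date

FRESH_WINDOW_DAYS = 3  # from the DOH definition described above

def classify_case(result_date, report_date):
    """Label a case 'fresh' or 'late' by how long ago its result was processed."""
    lag = (report_date - result_date).days
    if lag < 0:
        raise ValueError("result_date is after report_date")
    return "fresh" if lag <= FRESH_WINDOW_DAYS else "late"
```

Under this reading, a result released on May 25 and reported on May 28 counts as fresh, while one released on May 24 counts as late. The label describes the reporting delay, not when the person actually fell ill.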

They justified this by citing backlogs. But backlogs are not new, and one can’t help but ask: How accurate were past data pronouncements, as well as the policies that they supposedly informed?

On May 31 the DOH announced, “Starting June 1, no more late cases will be reported until the remaining operational laboratories submit their complete line lists.” But no sooner had they said this than, on June 2, they said they’d continue the “fresh” vs “late” distinction.

The government sows even more confusion with its graphs. 

On June 1 they produced the following graph (Figure 3) mixing fresh cases, late cases, recoveries, and deaths. But by failing to show the entire epidemic curve, the graph ultimately fails to convey our progress against COVID-19 — defeating the purpose of graphing in the first place.

Figure 3. Source: Laging Handa briefing on June 1, 2020.

Ostensibly to give a more accurate picture of the epidemic curve, the DOH likes to show an alternative graph of the number of new cases by date of onset (Figure 4).

But data analysts say nobody outside the DOH can replicate this. 

Figure 4. Source: DOH briefing on May 30, 2020.

If you use the DOH’s data drops alone, you will in fact arrive at an epidemic curve with a completely different shape (Figure 5). 

Figure 5. Source: Prof. Peter Cayton.

The DOH also claimed that on June 1 it recorded 1.6 deaths per day, the “lowest number of mortalities in a single day since early March.” But this is an illusion; one data analyst says many fatalities in late May won’t be reported until middle or end of June (Figure 6). 


“Department of Health in the Philippines @DOHgovph is once again publishing misleading data. For the month of May, there’s no single day of reporting where daily fatality was less than 4 in a day, yet they declared the average was 1.6 per day? #COVID19PH” (Drei, @_drei, June 4, 2020, https://twitter.com/_drei/status/1268334450795884544)

Figure 6.

Why these blatant discrepancies? Does the DOH know something we don’t know? What information are they hiding from us? And, most importantly, what’s the real shape of the epidemic curve? 

Your guess is as good as mine — or the DOH’s.

Testing capacity lies

Government has also conspicuously distorted testing data. 

On June 1 Presidential Spokesperson Harry Roque presented a graph (Figure 7) and confidently said, “Naabot na po natin ang ating target na 30,000 PCR tests per day. Ang original target po ay [30,000] by May 30, pero nung May 20 po eh nakaabot na po tayo sa 32,100 tests per day. Nalampasan po natin ang ating target.” 

(We have already reached our target of 30,000 PCR tests per day. The original target was 30,000 by May 30, but we already reached 32,100 tests per day as early as May 20. We have exceeded our target.)

But this is at odds with DOH’s testing data, which show 9,795 tests as of May 20 — almost 70% lower than Roque’s claim.

Even supposing testing capacity (not actual tests) did increase to 32,100, why were only 9,795 tests conducted that day? What explains the underutilization of testing capacity?
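The size of the gap is easy to check from the two figures quoted above. This is just the arithmetic, using Roque's 32,100 and the DOH's 9,795; the variable names are mine.

```python
claimed_capacity = 32_100  # tests per day, per Roque's June 1 briefing
actual_tests = 9_795       # DOH testing data for May 20

# How far actual tests fall short of the claimed figure
shortfall = (claimed_capacity - actual_tests) / claimed_capacity
# Share of the claimed capacity that was actually used
utilization = actual_tests / claimed_capacity

print(f"Shortfall vs. claim: {shortfall:.1%}")             # 69.5%
print(f"Share of claimed capacity used: {utilization:.1%}")  # 30.5%
```

In other words, even on the most charitable reading, less than a third of the claimed capacity was being used that day.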

Figure 7. Date: May 25, 2020.

Roque added, “Anim na beses na po ang inilaki ng ating testing capacity (PCR) mula 5,000 noong April 15, sa loob lamang ng isang buwan.” (We managed to increase our PCR testing capacity sixfold in just a month, from 5,000 last April 15.)

But in fact actual tests increased by only 3.6 times, not 6.

For one reason or another, government keeps missing its own testing targets. Previously they aimed to conduct 8,000 tests by the end of April, but achieved that only in mid-May.

Worse, they choose to hide their incompetence by deliberately fudging numbers — hoping perhaps no one would notice. 

It’s all about trust

On June 1, the Duterte government placed Metro Manila under general community quarantine (GCQ), in a bid to reopen the economy and ease the dire economic slump we are now in.

Reopening the economy boils down to a matter of trust: before Filipinos can go out of their homes to work or travel, they should be able to trust that the risks of contracting COVID-19 are minimal — and enough policies and safeguards are in place to minimize such risks.

Data is our primary weapon in this battle. But until we have trustworthy data on cases, deaths, and tests to guide our policymakers, government might as well be crafting their policies by flipping coins, or leaving us all to our own devices.

What’s new, though? – Rappler.com

The author is a PhD candidate and teaching fellow at the UP School of Economics. His views are independent of the views of his affiliations. Thanks to Prof. Peter Cayton for useful comments and suggestions. Follow JC on Twitter (@jcpunongbayan) and Usapang Econ.

JC Punongbayan

JC Punongbayan, PhD is a senior lecturer at the UP School of Economics. His views are independent of the views of his affiliations.