If you’ve ever been to a party, then you’ve probably been introduced to the concept of small talk. That awkward chit-chat between two people who don’t really know each other but find themselves trapped in the same room with nowhere to go but through the ice. And we often have our ice breakers ready too. “How do you know the host?”, “I like your shoes, where did you get them?” and the perennial “What do you do?”.
There’s nothing scarier than that last question. Trust us, we’re data analysts. In fact, if I had a superpower, it would probably be turning people’s faces blank whenever I utter the word “data” next to “analyst”. But I get it, nobody really knows what the hell that even means.
Plus, after the heavily moustached rugby player has just told his story of the championship-winning drop kick, it just doesn’t quite have the same ring to it.
But, over the past year and a half, the weirdest thing has happened. In some twisted turn of events, people’s ears have started to prick up whenever I mention my job title. And I know what’s coming next too. A question about COVID, and something along the lines of “…and what the bloody hell does data modelling mean? They keep harping on about it in the press conferences”.
What ensues is a drawn-out, often tipsy conversation where I try to explain data flows, Excel spreadsheets, statistical software and data visualisation, all while watching the light slowly die in their eyes. There’s no fantastical story about atomic-powered AIs; the reality of COVID modelling is very boring. Well, not for those of us who work in the data space, but I digress.
Data modelling has become something of a buzzword, as it’s thrown around like a greasy rugby ball at year 7 lunch in those 11am press conferences. “These new restrictions are based on the data”, Gladys Berejiklian will say as she goes on to mandate that nobody can leave their house, masks must be worn, and chicken can only be cooked in the oven between the hours of 7am and 3pm on a Tuesday.
It leaves many people scratching their heads, wondering what actually goes on behind those curtains covered in red wattles. Well, fear no longer, the analytics department is here!
So, what is statistical modelling? A statistical model, put simply, is a mathematical representation of observed data. When looking at data, it is often easy to spot a pattern, such as case numbers climbing a little faster each day. All modelling does is allow the observer to describe these patterns mathematically. That description opens the door to forecasts based on previous trends.
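To make that a little more concrete, here is a minimal sketch of what "a mathematical representation of a pattern" can look like. It assumes the counts grow roughly exponentially and fits a straight line to their logarithms by least squares. The function names and the numbers are ours, made up purely for illustration; this is not the modelling used by any health agency.

```python
import math

def fit_exponential_trend(counts):
    """Fit counts[t] ~ a * exp(b * t) by least squares on log(counts).

    Assumes every count is greater than zero.
    """
    n = len(counts)
    days = list(range(n))
    logs = [math.log(c) for c in counts]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx                      # daily growth rate on the log scale
    a = math.exp(mean_y - b * mean_x)  # fitted count on day 0
    return a, b

def forecast(a, b, day):
    """Project the fitted trend forward to a given day index."""
    return a * math.exp(b * day)

# Made-up counts that double every day (illustrative only, not real data)
observed = [10, 20, 40, 80, 160]
a, b = fit_exponential_trend(observed)
print(round(forecast(a, b, 5)))  # projected count for the next day
```

Real data is never this tidy, so a fit like this is only as good as the assumption that the recent trend continues.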
While we’re hearing our lockdown rules being dictated off the back of “the modelling”, it helps to see the raw data for ourselves if we want to understand these decisions. When government agencies talk about complicated statistical models and the maths and data behind them, it’s easy to get overwhelmed and decide to “leave it to the professionals”. In Australia, though, we are lucky enough to have this data publicly available and easily accessible, so anyone can look at it themselves and, with a little research, begin building their own crude statistical models.
The maths and data behind these models aren’t an overly complicated secret warehouse of numbers. In reality, they are the work of a group of passionate professionals relaying their findings to those in charge. These findings are used to inform the decisions made around the pandemic and lockdown restrictions (not always to best effect, as we’ve seen recently in NSW).
We are lucky enough in Australia to have this level of data transparency from government agencies. Thanks to open data policies from NSW Health and Transport for NSW, information about the current pandemic and the public transport system is available to everyone. Most of us use this data in our everyday lives without really noticing, through live feeds in applications such as Google Maps and TripView.
This data is easily available from government sites such as Opendata.transport.nsw and data.nsw.gov.au. Accessing these sites is as simple as creating an account, after which you are given an individual auth code that allows the data to be pulled through an API. That said, most of the data can also be downloaded as a simple Excel file.
From here, the sky’s the limit. You can build out simple graphs in Excel. Or you can create in-depth data visualisation dashboards and run statistical models like us:
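For the curious, pulling data with an auth code usually looks something like the sketch below. The URL is a placeholder and the "apikey" header format is an assumption on our part, so check the documentation of the portal you are using for the real endpoint and header. The helper names are ours too.

```python
import urllib.request

# Placeholder endpoint, for illustration only (not a real dataset URL)
EXAMPLE_URL = "https://example.com/open-data/v1/some-dataset"

def build_request(url, auth_code):
    """Attach the auth code as a request header.

    The "apikey <code>" format is an assumption; portals differ.
    """
    headers = {"Authorization": f"apikey {auth_code}"}
    return urllib.request.Request(url, headers=headers)

def fetch(url, auth_code):
    """Send the request and return the raw response bytes."""
    with urllib.request.urlopen(build_request(url, auth_code), timeout=30) as resp:
        return resp.read()

# Build (but do not send) a request to see what would go over the wire
req = build_request(EXAMPLE_URL, "YOUR-AUTH-CODE")
print(req.get_header("Authorization"))
```

Calling `fetch` with a real endpoint and your own auth code would return the raw response, which you could then save to a file or parse.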
Using the publicly available data, we were able to graph the distribution of COVID cases across Australia, look at things like the average income of those infected, their background, and how the different strains of COVID move through society.
With this information, we were then able to pull top-level insights, such as the fact that COVID case numbers correlate directly with the number of cars on the roads in affected areas. Fewer cars, fewer cases. We can also identify and predict areas of risk using demographic and sociographic profiles. Different people move around society in different ways, and as a result, the virus will spread throughout their postcodes differently too.
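If you want to check a relationship like "fewer cars, fewer cases" yourself, the usual first step is a correlation coefficient. Below is a minimal sketch of a Pearson correlation in plain Python. The weekly figures are made up for illustration and are not our data or anyone else's; a value near 1 means the two series rise and fall together.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up weekly figures for one hypothetical area (NOT real data)
cars_on_road = [1000, 900, 700, 500, 400]
new_cases = [50, 46, 33, 24, 20]

r = pearson(cars_on_road, new_cases)
print(f"correlation: {r:.2f}")
```

Excel's CORREL function does the same calculation if you would rather stay in a spreadsheet.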
Interestingly, the NSW government has become much more effective at containing COVID outbreaks. In fact, high case numbers today are a direct result of a more infectious strain infecting twice as many people within the same postcode as in the previous outbreak, despite harsher restrictions. And that’s just the tip of the proverbial iceberg. Using the data, we can run all sorts of statistical modelling, including trying to replicate the Rt score (the effective reproduction number) often used to predict future case numbers.
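As a rough idea of what an Rt-style calculation involves, here is a deliberately crude sketch: it divides the cases in the most recent few days by the cases in the same-sized window one "generation" earlier. The 4-day window and 4-day generation interval are assumptions chosen for illustration, and this is neither the official method nor the one behind our own dashboards. The numbers are made up.

```python
def crude_rt(daily_cases, window=4, generation_days=4):
    """Ratio of cases in the latest window to cases one generation earlier.

    A value above 1 suggests the outbreak is growing, below 1 shrinking.
    """
    if len(daily_cases) < window + generation_days:
        raise ValueError("not enough days of data")
    recent = sum(daily_cases[-window:])
    end = len(daily_cases) - generation_days
    earlier = sum(daily_cases[end - window:end])
    if earlier == 0:
        raise ValueError("no cases in the earlier window")
    return recent / earlier

# Made-up daily counts (illustrative only): 10 a day, then 20 a day
example_cases = [10, 10, 10, 10, 20, 20, 20, 20]
print(crude_rt(example_cases))
```

Proper Rt estimates account for reporting delays and uncertainty, which is where the statistical software earns its keep.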
Over the coming months we will continue to work with the data, publish updates and new findings, and give insights into the life of a data analyst. We aren’t health experts, we don’t work for the government, but thanks to Australia’s extensive open data policies we are able to gain in-depth insights into the data and modelling that is dictating NSW COVID restrictions. Many of the processes we used to arrive at these insights were very straightforward – anyone can replicate them if they have a computer, the internet and Excel.
In fact, a data analyst’s job is becoming less and less specialised. We’re all doing data analytics in our day-to-day as we navigate graphs, numbers and statistics. It’s a reality of the modern world. So maybe we should stop being so scared of the numbers, embrace our inner nerds and fire up Excel. We might all be a little more understanding and cooperative if we did!
Thanks for coming! If you found this post of interest, please like, share or, better still, follow us on LinkedIn: https://www.linkedin.com/company/spark-foundry-australia