DataCafé

30 episodes - English - Latest episode: 11 months ago

Welcome to the DataCafé: a special-interest Data Science podcast with Dr Jason Byrne and Dr Jeremy Bradley, interviewing leading data science researchers and domain experts in all things business, stats, maths, science and tech.

Mathematics, Science, Education, datacafé, data science, data analytics, advanced analytics, business intelligence, optimisation, machine learning, artificial intelligence, data engineering

Episodes

Science Communication with physicist Laurie Winkless, author of "Sticky" & "Science and the City"

June 02, 2023 07:00 - 36 minutes - 25.3 MB

A key part of the scientific method is communicating the insights to an audience, for any field of research or problem context. This is where the ultimate value comes from: by sharing the cutting-edge results that can improve our understanding of the world and help deliver new innovations in people's lives. Effective science communication sits at the intersection of data, research, and the art of storytelling. In this episode of the DataCafé we have the pleasure of welcoming Laurie Winkless...

A Culture of Innovation

September 06, 2022 07:00 - 33 minutes - 23.3 MB

Culture is a key enabler of innovation in an organisation. Culture underpins the values that are important to people and the motivations for their behaviours. When these values and behaviours align with the goals of innovation, it can lead to high performance across teams that are tasked with the challenge of leading, inspiring and delivering innovation. Many scientists and researchers are faced with these challenges in various scenarios, yet may be unaware of the level of influence that com...

Scaling the Internet

July 30, 2022 09:00 - 45 minutes - 31.2 MB

Do you have multiple devices connected to your internet fighting for your bandwidth? Are you asking your children (or even neighbours!) to get off the network so you can finish an important call? Recent lockdowns caused huge network contention as everyone moved to online meetings and virtual classrooms. This is an optimisation challenge that requires advanced modelling and simulation to tackle. How can a network provider know how much bandwidth to provision to a town or a city to cope with p...

[Bite] Documenting Data Science Project Work

June 29, 2022 06:00 - 16 minutes - 11.5 MB

Do you ever find yourself wondering what data you used in a project? When was it obtained and where is it stored? Or even just how to run a piece of code that produced a previous output and needs to be revisited? Chances are the answer is yes. And it’s likely you have been frustrated by not knowing how to reproduce an output, rerun a codebase, or even who to talk to in order to obtain a refresh of the data, in some way, shape, or form. The problem that a lot of project teams face, an...

Landing Data Science Projects: The Art of Change Management & Implementation

May 31, 2022 05:00 - 29 minutes - 20.6 MB

Are people resistant to change? And if so, how do you manage that when trying to introduce and deliver innovation through Data Science? In this episode of the DataCafé we discuss the challenges faced when trying to land a data science project. There are a number of potential barriers to success that need to be carefully managed. We talk about "change management" and aspects of employee behaviours and stakeholder management that influence the chances of landing a project. This is especially...

[Bite] Version Control for Data Scientists

May 05, 2022 15:00 - 15 minutes - 10.7 MB

Data scientists usually have to write code to prototype software, be it to preprocess and clean data, engineer features, build a model, or deploy a codebase into a production environment or other use case. The evolution of a codebase is important for a number of reasons, and this is where version control can help, such as: collaborating with other code developers (due diligence in coordination and delegation); generating backups; recording versions; tracking changes; experimenting and testing ...

Deep Learning Neural Networks: Building Trust and Breaking Bias

April 07, 2022 05:00 - 51 minutes - 35.3 MB

We explore one of the key issues around Deep Learning Neural Networks - how can you prove that your neural network will perform correctly? Especially if the neural network in question is at the heart of a mission-critical application, such as making a real-time control decision in an autonomous car. Similarly, how can you establish if you've trained your neural network at the heart of a loan decision agent  with a prebuilt bias? How can you be sure that your black box is going to adapt to cr...

[Bite] Wordle: Winning against the algorithm

March 14, 2022 23:00 - 11 minutes - 7.91 MB

The grey, green and yellow squares taking over social media in the last few weeks are an example of the fascinating field of study known as Game Theory. In this bite episode of DataCafé we talk casually about Wordle - the internet phenomenon currently challenging players to guess a new five-letter word each day. Six guesses inform players which letters they have gotten right and whether they are in the right place. It’s a lovely example of the different ways people approach game strategy through...

Series 2 Introduction

March 14, 2022 22:00 - 5 minutes - 3.83 MB

Looks like we might be about to have a new Series of DataCafé! Recording date: 15 Feb 2022. Intro music by Music 4 Video Library (Patreon supporter). Thanks for joining us in the DataCafé. You can follow us on Twitter @DataCafePodcast and feel free to contact us about anything you've heard here or think would be an interesting topic in the future.

[Bite] Why Data Science projects fail

June 21, 2021 07:00 - 19 minutes - 13.3 MB

Data Science in a commercial setting should be a no-brainer, right? Firstly, data is becoming ubiquitous, with gigabytes being generated and collected every second. And secondly, there are new and more powerful data science tools and algorithms being developed and published every week. Surely just bringing the two together will deliver success... In this episode, we explore why so many Data Science projects fail to live up to their initial potential. In a recent Gartner report, it is antic...

Data Science for Good

May 31, 2021 19:00 - 36 minutes - 24.9 MB

What's the difference between a commercial data science project and a Data Science project for social benefit? Often so-called Data Science for Good projects bring together many people from different backgrounds under a common motivation to have a positive effect. We talk to a Data Science team that was formed to tackle the unemployment crisis that is coming out of the pandemic and help people to find excellent jobs in different industries for which they have a good skills m...

[Bite] Data Science and the Scientific Method

May 03, 2021 05:00 - 17 minutes - 12 MB

The scientific method consists of systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses. But what does this mean in the context of Data Science, where a wealth of unstructured data and variety of computational models can be used to deduce an insight and inform a stakeholder's decision? In this bite episode we discuss the importance of the scientific method for data scientists. Data science is, after all, the application of scientif...

Data Science on Mars

April 19, 2021 05:00 - 58 minutes - 40.3 MB

On 30 July 2020 NASA launched the Mars 2020 mission from Earth carrying a rover called Perseverance and a rotorcraft called Ingenuity, to land on and study Mars. The mission so far has been a resounding success, touching down in Jezero Crater on 18 February 2021, and sending back data and imagery of the Martian landscape since then. The aim of the mission is to advance NASA's scientific goals of establishing if there was ever life on Mars, what its climate and geology are, and to pave the wa...

[Bite] How to hire a great Data Scientist

April 05, 2021 05:00 - 14 minutes - 9.83 MB

Welcome to the first DataCafé Bite: a bite-size episode where Jason and Jeremy drop in for a quick chat about a relevant or newsworthy topic from the world of Data Science. In this episode, we discuss how to hire a great Data Scientist, which is a challenge faced by many companies and is not easy to get right. From endless coding tests and weird logic puzzles, to personality quizzes and competency-based interviews, there are many examples of how companies try to assess how a candidate hand...

Bayesian Inference: The Foundation of Data Science

March 23, 2021 06:00 - 42 minutes - 29.1 MB

In this episode we talk about all things Bayesian. What is Bayesian inference and why is it the cornerstone of Data Science? Bayesian statistics embodies the Data Scientist and their role in the data modelling process. A Data Scientist starts with an idea of how to capture a particular phenomenon in a mathematical model - maybe derived from talking to experts in the company. This represents the prior belief about the model. Then the model consumes data around the problem - historical data, r...
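
As a rough illustration of the prior-then-data idea described in this blurb, here is a minimal sketch of a conjugate Beta-Binomial update. It is not taken from the episode: the function names, the prior parameters, and the observed counts are all hypothetical and chosen only to show how a prior belief can be revised once data arrives.

```python
def update_beta(prior_alpha, prior_beta, successes, failures):
    """Return the Beta posterior parameters after observing binary outcomes.

    Assumes a Beta(prior_alpha, prior_beta) prior on an unknown success rate
    and independent success/failure observations (a Binomial likelihood).
    """
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior belief: the success rate is probably around 20%.
prior = (2.0, 8.0)

# Hypothetical data: 30 successes and 70 failures.
posterior = update_beta(*prior, successes=30, failures=70)

print(f"prior mean:     {beta_mean(*prior):.3f}")      # 0.200
print(f"posterior mean: {beta_mean(*posterior):.3f}")  # 0.291
```

In this toy setup the posterior mean sits between the prior mean (0.20) and the observed rate (0.30), and it moves closer to the observed rate as more data is added.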

Apple Tasting: Reinforcement learning for quality control

February 22, 2021 06:00 - 35 minutes - 24.4 MB

Have you ever come home from the supermarket to discover one of the apples you bought is rotten? It's likely your trust in that grocer was diminished, or you might stop buying that particular brand of apples altogether. In this episode, we discuss how the quality controls in a production line need to use smart sampling methods in order to avoid sending bad products to the customer, which could ruin the reputation of both the brand and seller. To do this we describe a thought experiment c...

Optimising the Future

January 04, 2021 06:00 - 35 minutes - 24.7 MB

As we look ahead to a new year, and reflect on the last, we consider how data science can be used to optimise the future. But to what degree can we trust past experiences and observations, essentially relying on historical data to predict the future? And with what level of accuracy? In this episode of the DataCafé we ask: how can we optimise our predictions of future scenarios to maximise the benefit we can obtain from them while minimising the risk of unknowns? Data Science is made up of...

US Election Special

November 01, 2020 18:00 - 31 minutes - 21.9 MB

What exciting data science problems emerge when you try to forecast an election? Many, it turns out! We're very excited to turn our DataCafé lens on the current Presidential race in the US as an exemplar of statistical modelling right now. Typically, state election polls ask around 1000 people in a state of maybe 12 million people how they will vote (or even if they have voted already) and return a predictive result with an estimated polling error of about 4%. In this episode, we loo...
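
To put the polling numbers in this blurb in context, the sketch below computes the textbook sampling margin of error for a poll of 1000 people drawn from 12 million. It assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion of 0.5; none of these assumptions come from the episode, and the function name is hypothetical.

```python
import math


def sampling_margin_of_error(sample_size, proportion=0.5, z=1.96, population=None):
    """Approximate half-width of a confidence interval for a polled proportion.

    Assumes simple random sampling. If `population` is given, the finite
    population correction is applied (it barely matters for large populations).
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population is not None:
        standard_error *= math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error


margin = sampling_margin_of_error(1000, population=12_000_000)
print(f"{margin:.1%}")  # 3.1%
```

Under these assumptions, sampling alone gives roughly 3.1%, and the size of the state makes almost no difference. A quoted error of about 4% would be consistent with extra uncertainty on top of pure sampling error, though that reading is an assumption here rather than something the blurb states.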

Forecasting Solar Radiation Storms

October 19, 2020 06:00 - 40 minutes - 27.5 MB

What are solar storms? How are they caused? And how can we use data science to forecast them? In this episode of DataCafé we talk about the Sun and how it drives space weather, and the efforts to forecast solar radiation storms that can have a massive impact here on Earth. On a regular day, the Sun has a constant stream of charged particles, or plasma, coming off its surface into the solar system, known as the solar wind. But in times of high activity it can undergo much more explosive ph...

Entrepreneurship in Data Science

September 19, 2020 17:00 - 50 minutes - 34.9 MB

How do you get your latest and greatest data science tool to make an impact? How can you avoid wasting time building a supposedly great data product only to see it fall flat on launch? In this episode, we discuss how you need to start with the idea before you get to a data product. As all good entrepreneurs know, if you can't sell the idea, you're certainly not going to be able to sell the product. We take inspiration from a particular way of thinking about software engineering called Lean ...

Viruses: Keep Calm and Use Statistics

August 12, 2020 05:00 - 46 minutes - 32.3 MB

What is a virus? How can we spot human viruses in danger of becoming pandemics? How can we use statistics to understand their origins and transmission? This turns out to be a hard problem - not least because there can be many hundreds or thousands of slightly modified strains of a virus in a small sample of blood. It is of great importance which version of a virus will become a pandemic in a population and which will merely peter out. Viral geneticists have to be expert statisticians to be ...

Changepoint Detection: Secret Weapon of the Data Scientist

July 13, 2020 18:00 - 31 minutes - 21.9 MB

How can we spot a change in a jet engine vibration that might mean it’s about to fail catastrophically? How can a service forecast adapt to unexpected changes brought about by a pandemic? How might we spot an increase in rate of change of pollution in the atmosphere? The answer to all these questions is changepoints, or rather changepoint detection.  Common to all these systems is a set of ordered data, usually a time series of observations or measurements that may be noisy but have some un...

Inventory Optimisation: Reducing waste, Improving availability

June 01, 2020 06:00 - 30 minutes - 21.1 MB

How do big grocery retailers maintain product availability for their customers day after day while minimising food wastage and storage costs? The answer is Inventory Optimisation, the science of maintaining sufficient stock levels of a set of products so that customers see an appropriate level of availability when they walk into your store. The trade-off: it’s hard because it often costs money to maintain a large inventory of products, because of space that is given over to bulky stock as t...

Optimal Control in Price Decision Making

June 01, 2020 06:00 - 27 minutes - 18.9 MB

Optimal Control is the science of making decisions in a way that optimises a key quantity such as revenue, customer satisfaction, or quality of service. Cake example: Bertrand has a cake. He likes cake a lot, but he can overeat cake sometimes, in which case he doesn’t enjoy it so much. He would like to work out how much cake he should eat today, the next day, and the day after, so that he maximises his overall enjoyment of the cake, possibly making it last a long time (but not so long that it goes st...

Vehicle Routing Problem for Electric Vehicles

June 01, 2020 06:00 - 39 minutes - 26.9 MB

How can we generate efficient routes for large fleets of vehicles that have to make many thousands of deliveries a day while taking into account breaks, shift patterns and traffic conditions? Now let's make those vehicles electric and we need to take into account vehicle battery charge level, recharging station locations and anticipated energy efficiency. It's a challenging problem! The Vehicle Routing Problem (VRP) is the optimisation problem that describes all manifested delivery operations...

Multi-Armed Bandits: Learning better decisions

June 01, 2020 06:00 - 23 minutes - 16.4 MB

Have you ever wondered why you keep getting adverts for products that you've only just bought and now don't need? The online advert auto-server is probably using a multi-armed bandit learner that needs a little algorithmic improvement. We speak to Ciara Pike-Burke about her work on trying to make multi-armed bandits smarter and more useful. The multi-armed bandit problem is a classic reinforcement learning problem where we are given a slot machine with n arms (bandits) with each arm having ...
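
For readers new to the multi-armed bandit setting mentioned in this blurb, the sketch below shows one common baseline strategy, epsilon-greedy. It is not the method discussed in the episode or in the guest's research. The assumption that each arm pays out 1 with a fixed hidden probability, the function name, and every parameter value are hypothetical choices for illustration.

```python
import random


def epsilon_greedy(reward_probabilities, pulls=5000, epsilon=0.1, seed=0):
    """Play a simulated Bernoulli bandit with an epsilon-greedy learner.

    reward_probabilities: hidden chance that each arm pays out 1 (assumed).
    Returns per-arm pull counts, per-arm estimated payout rates, and the
    total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probabilities)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0

    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: try a uniformly random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1 if rng.random() < reward_probabilities[arm] else 0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


counts, estimates, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, [round(e, 2) for e in estimates], total_reward)
```

The `epsilon` parameter controls the trade-off between exploring arms that might be better and exploiting the arm that currently looks best. A fixed value keeps exploring forever, which is one reason more refined strategies exist.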

Twitter Mentions

@laurie_winkless 1 Episode
@theplanetaryguy 1 Episode