In this episode of Data Dialogues, we explain how smart data can help organizations combat the growing digital threat of identity fraud. Aparna Sheth, product leader for Equifax’s Identity and Fraud Solutions Group, interviews Cori Shen, who leads a data science team responsible for data, machine learning, and AI-driven product innovations that solve identity and fraud challenges.


Jump ahead to these highlights:


0:40 - Cori’s role and team responsibilities


0:54 - Consumers shift from a digital-first to a digital-only business environment


1:37 - Fraud has multiplied


2:18 - New fraud opportunities emerge during unprecedented economic conditions


3:36 - How to use data and analytics to solve fraud


4:41 - How smart data works


8:40 - Role of digital signals and bureau data


10:00 - Explaining graph networks


10:58 - How to make the insights actionable and examples


14:38 - Our smart data approach



Podcast Transcription


Aparna:

Welcome to Data Dialogues.  Today, we are discussing how smart data can help organizations fight the evolving challenges of identity fraud. My name is Aparna Sheth. I'm a product leader here at Equifax in our identity and fraud solutions group. And I'm so happy to have Cori Shen here with me, who leads our data science team. Hi, Cori, would you like to share more about what you do?


Cori:

Sure. Thanks, Aparna. Happy to be here too. And I'm glad that we can discuss this topic together. I'm Cori Shen. I lead our identity and fraud data science team for Equifax.


Aparna:

Alright. So speaking of identity and fraud, 2020 has been quite a year. COVID accelerated digital transformation across the board. We saw a stark paradigm shift take place last year, where we went from a “digital first” to a “digital only” business environment. And this was, of course, brought on by abrupt shelter-in-place orders.


Cori:

That's right, Aparna. I totally agree with you. You know, consumers were forced to do everything online, from buying groceries to ordering food. And of course they're doing all their financial transactions online. You know, last year 80% of my grocery shopping was done through a mobile app.


Aparna:

Oh wow. Yeah, I know. And we saw during this pandemic that not only did new fraud schemes emerge, but the existing types of fraud also multiplied. Right?


Cori:

That's absolutely true. You probably saw this report coming from the Federal Trade Commission, right? The report shows they received, I think, about 275,000 fraud complaints last year. And also, when we track the fraud trends in our own data, we see that the authorized user abuse risk in 2020 went up by over 23% compared to 2019 and 2018.


Aparna:

Wow. The other factor, of course, was the unprecedented unemployment rates and economic downturn. And to combat that, as we all know, Congress passed trillion-plus dollars of stimulus relief packages to help struggling families and boost the economy. We saw new fraud schemes in March exploiting PPP, which is the Paycheck Protection Program, as well as the expanded unemployment insurance program. So as millions of Americans were applying for help, we had these international and national criminal rings that were working relentlessly to steal these funds, using sophisticated methods of identity theft.


Cori:

That's right, Aparna. You know, with all the relief money that went to the market in 2020, I think it really made fraudsters go all out on it. As a matter of fact, these fraud schemes might be new, but the underlying fraud challenges are the same ones, like synthetic IDs and compromised IDs, which have been around for years. And I think that's why now, more than ever, we need something better in identity and fraud prevention.


Aparna:

I couldn't agree more. So let's talk about how we can use data and analytics to solve this, right? There is just so much data out there. Not just related to our credit file, but also to every digital interaction that we make as individuals, be it on social media or when we shop online. So how do we sort through these billions of interactions and use analytics to really drive those insights that can be used to mitigate these growing challenges?


Cori:

This is a great question.  Because if we look at today's digital paradigm, managing big data from multiple sources is no longer a challenge. What matters most is how to make sense of big data and how to intelligently and efficiently assemble multi-source data for the right insights. And we will call it smart data because we want data to talk, and we want data to be able to offer recommendations.


Aparna:

I love it. Smart data. I mean, it sounds fantastic, right? But it's easier said than done, isn't it? Let's take synthetic identities, for example. We know that many of these have been in the system for a while and they look like legitimate people. Very often their identity information is complete, and it matches what systems have. As a matter of fact, sometimes they even have a matched social media profile. That's why these fake identities look like real people and can be used to create fake businesses and defraud the system of millions of dollars in PPP or unemployment claims. Right? So even if we do identity verification matches from multiple sources, we may not be able to catch them. So what should we do?


Cori:

Ah, what should we do? This is exactly the right question. I totally agree with you. If we're just talking about matching identities from multiple sources, it is not smart data. Smart data has two components: insights and connections. We think a really effective way to build smart data is to connect the useful insights from a graph network perspective. Let me take synthetic ID detection, for example. Here is how you can build it. First, build useful insights from multiple sources. You want to search for the abnormal signals throughout an identity's lifecycle. To do so, you will need consumer activity data from multiple sources and from multiple systems. For example, the consumer applies for credit cards or loans. The consumer checks their credit online. They're enrolling. They're logging into an online system. They're making payments. They're making purchases from e-commerce sites. All these different data points are consumer activity data.
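
As a rough illustration of that first step, the sketch below gathers activity events from several sources into one chronological timeline per identity, which is the kind of lifecycle view that could later be searched for abnormal signals. It is only a sketch under stated assumptions: the record layout, the source names, the activity labels, and the helper functions `build_timelines` and `sources_seen` are all hypothetical and do not come from the episode or from any Equifax product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records: (identity_id, source, activity, date).
# Source names and activity labels are illustrative assumptions only.
EVENTS = [
    ("id-001", "credit_applications", "credit_card_application", date(2019, 3, 1)),
    ("id-001", "ecommerce", "purchase", date(2019, 5, 12)),
    ("id-001", "online_account", "login", date(2019, 4, 2)),
    ("id-002", "credit_applications", "mortgage_application", date(2020, 1, 15)),
    ("id-002", "online_account", "enrollment", date(2020, 1, 10)),
]


def build_timelines(events):
    """Group events by identity and sort each group chronologically."""
    timelines = defaultdict(list)
    for identity_id, source, activity, when in events:
        timelines[identity_id].append((when, source, activity))
    for timeline in timelines.values():
        timeline.sort()  # tuples sort by date first
    return dict(timelines)


def sources_seen(timeline):
    """Return the set of data sources that contributed to one timeline."""
    return {source for _, source, _ in timeline}


if __name__ == "__main__":
    for identity_id, timeline in build_timelines(EVENTS).items():
        print(identity_id, sorted(sources_seen(timeline)))
        for when, source, activity in timeline:
            print("  ", when.isoformat(), source, activity)
```

Keeping the source name on every event means any later signal can be traced back to the system it came from, which matters when the data is assembled from multiple systems.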


We all know that we cannot listen to what fraudsters say; we need to watch what they do. Fraudsters will give you a fake ID and tell you, “Hey, everything's good. Everything matched. And I want to borrow $50,000.” But when you have the power of consumer activity data, you can look closely into their activities, and you will find out a lot of secrets about them. Here are some examples. All the synthetic ID outliers appear at an early stage. You will see some synthetic IDs apply for mortgages and shop for luxury cars. However, when you look at the activity pattern for a regular, legitimate consumer at that early stage, you will often see they only apply for cell phones, apartments, internet service, and credit cards, these types of starter programs. Another example: sometimes synthetic ID can be a very patient game. This means that, you know, fraudsters can wait for a couple years to build their cred...