Applications of Data Science


Case: Fraud detection

Suppose you work in fraud detection at a large bank, and you'd like to use data to determine the probability that a transaction is fraudulent.

To answer this question, you might start by gathering information about each purchase, such as the amount, date, location, purchase type, and cardholder's address.

You'll need many examples of such transactions as well as a label that tells you whether the transaction is valid or fraudulent. Luckily, you'll have this information in the database.

These records are called training data and are used to build a model.

Each time a transaction occurs, you'll give this information to the model you've built and you'll receive the probability of the transaction being fraudulent.
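The train-then-predict flow described above can be sketched in a few lines. This is a minimal illustration, not the bank's actual method: the notes do not name a model, so logistic regression is assumed here as one simple choice, and the feature names (`amount`, `is_foreign`), the toy rows, and the function names are all hypothetical.

```python
import math

# Hypothetical toy training data, invented for illustration only.
# Features: (amount in thousands, is_foreign flag). Label: 1 = fraudulent, 0 = valid.
TRAINING_DATA = [
    ((0.02, 0), 0),
    ((0.05, 0), 0),
    ((0.10, 0), 0),
    ((0.08, 1), 0),
    ((1.50, 1), 1),
    ((2.00, 1), 1),
    ((1.80, 0), 1),
    ((2.50, 1), 1),
]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Return the model's estimated probability that a transaction is fraudulent."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(data, learning_rate=0.5, epochs=2000):
    """Fit a logistic regression model with stochastic gradient descent."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            # Gradient of the log loss with respect to z is (prediction - label).
            error = predict_proba(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Build the model once from labeled history, then score each new transaction.
weights, bias = train(TRAINING_DATA)
small_local = predict_proba(weights, bias, (0.03, 0))
large_foreign = predict_proba(weights, bias, (2.20, 1))
print(f"small local purchase:   {small_local:.3f}")
print(f"large foreign purchase: {large_foreign:.3f}")
```

On this toy data the large foreign purchase should score well above the small local one. A real system would need far more examples and features.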

What do we need for machine learning to work its magic?

First, a data science problem begins with a well-defined question. In the previous example, our question was: what is the probability that this transaction is fraudulent?

Next, we need some data to analyze. We had months of transactions and associated metadata that had already been identified as fraudulent or valid.

Finally, we need additional data every time we want to make a new prediction. We need the same type of information on every new purchase so that we can label it as fraudulent or valid.

Case: Smart Watch

Suppose you're trying to build a smart watch to monitor physical activity. You want to be able to auto-detect different activities such as walking or running.

Your smart watch is equipped with an accelerometer that monitors motion in three dimensions. The data generated by this sensor is the basis of your machine learning problem. You could ask several volunteers to use your watch and record when they're running or walking.

You could then develop an algorithm that recognizes accelerometer data as representing one of those two states: walking or running.
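One way such an algorithm might look is sketched below. Everything in it is an assumption made for illustration: the notes do not specify a method, so a nearest-centroid rule on a single feature (how much the acceleration magnitude varies within a window) is used, and the "recordings" are synthetic sine waves rather than real volunteer data.

```python
import math


def magnitude(sample):
    """Overall acceleration from one (x, y, z) accelerometer reading."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def window_feature(window):
    """Standard deviation of acceleration magnitude over a window of readings."""
    mags = [magnitude(s) for s in window]
    mean = sum(mags) / len(mags)
    return math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))


def synthetic_window(amplitude, n=20):
    """Hypothetical stand-in for a recording: gravity plus a periodic bounce."""
    return [(0.0, 0.0, 1.0 + amplitude * math.sin(2 * math.pi * i / 5))
            for i in range(n)]


def fit_centroids(labeled_windows):
    """Average the feature value for each activity label."""
    sums, counts = {}, {}
    for window, label in labeled_windows:
        sums[label] = sums.get(label, 0.0) + window_feature(window)
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centroids, window):
    """Pick the activity whose average feature is closest to this window's."""
    feature = window_feature(window)
    return min(centroids, key=lambda label: abs(centroids[label] - feature))


# Labeled windows play the role of the volunteers' recordings.
training = [
    (synthetic_window(0.2), "walking"),
    (synthetic_window(0.3), "walking"),
    (synthetic_window(0.8), "running"),
    (synthetic_window(1.0), "running"),
]
centroids = fit_centroids(training)

gentle = classify(centroids, synthetic_window(0.25))
vigorous = classify(centroids, synthetic_window(0.9))
print(gentle, vigorous)
```

Real accelerometer data is noisier and usually calls for more features and a stronger model, but the workflow is the same: labeled recordings in, activity label out.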

Internet of Things (IoT)

Your smart watch is part of a fast-growing field called the Internet of Things, also known as IoT, which is often combined with data science.

Things in IoT are gadgets that are not standard computers but still have the ability to transmit data.

This includes:

  1. Smart watches
  1. Internet-connected home security systems
  1. Electronic toll collection systems
  1. Building energy management systems
  1. Much, much more!

IoT data is a great resource for data science projects.

Case: Image recognition

Let's tackle another example. A key task for self-driving cars is to identify when an image contains a human.

What would the data set be for this problem? We can represent the picture with a matrix of numbers where each number represents a pixel.

However, this approach would probably fail if we fed the matrix into a traditional machine learning model. There's simply too much input data. We need more advanced algorithms such as deep learning.
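To make the matrix representation and the size problem concrete, here is a small sketch. The 4x4 grayscale image and the helper names are hypothetical, and the 1280x720 color frame is just an assumed example resolution, not a figure from any particular camera.

```python
# Hypothetical 4x4 grayscale image: each number is one pixel's brightness (0-255).
image = [
    [0,   0, 255,   0],
    [0, 255, 255,   0],
    [0,   0, 255,   0],
    [0, 255, 255, 255],
]


def flatten(matrix):
    """Turn the pixel matrix into one long list of input values for a model."""
    return [pixel for row in matrix for pixel in row]


def input_size(width, height, channels):
    """Number of input values needed to describe one image."""
    return width * height * channels


features = flatten(image)
print(len(features))  # 16 inputs for the tiny image

# An assumed 1280x720 color frame has three channels (red, green, blue) per pixel.
hd_inputs = input_size(1280, 720, 3)
print(hd_inputs)  # 2764800 inputs for a single frame
```

Even one modest-resolution frame produces millions of input values, which is far more than the handful of columns in the fraud example.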

Deep Learning

In deep learning, multiple layers of many neurons work together to draw complex conclusions.
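The idea of layers of neurons feeding into one another can be sketched as a tiny forward pass. This is only an illustration under stated assumptions: the weights and biases are made-up constants rather than learned values, the network has just one hidden layer, and the ReLU and sigmoid activations are common choices that the notes do not specify.

```python
import math


def relu(x):
    """Pass positive values through and clip negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases, activation):
    """Each neuron takes a weighted sum of all inputs, adds a bias, then activates."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hypothetical, hand-picked parameters: 3 inputs -> 2 hidden neurons -> 1 output.
HIDDEN_WEIGHTS = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
HIDDEN_BIASES = [0.0, 0.1]
OUTPUT_WEIGHTS = [[1.0, -1.5]]
OUTPUT_BIASES = [0.2]


def forward(inputs):
    """Run the inputs through the hidden layer, then the output layer."""
    hidden = dense_layer(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)
    output = dense_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, sigmoid)
    return output[0]


score = forward([1.0, 0.5, -1.0])
print(f"{score:.4f}")  # about 0.6225
```

In practice, networks stack many such layers with thousands or millions of weights, and training adjusts those weights from labeled examples.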

Deep learning requires much more training data than traditional machine learning models, but it can also learn relationships that traditional models cannot.

Deep learning is used to solve data-intensive problems such as image classification or language understanding.

You're now familiar with several different applications of data science.