Fraunhofer USA Center Mid-Atlantic – AI/ML Classification & Prediction Division: Competence Areas

AI/ML Classification and Prediction

Data collection

Data collection is the foundation of the pyramid: the stage where you identify what data you need and what is available. If the goal is a user-facing product, are all relevant interactions logged? If the data comes from a sensor, what is being captured, and how does it arrive? Without data, no machine learning or AI solution can learn or predict outcomes.

Data flow

Identify how the data flows through the system. Is there a reliable stream, or an established extract, transform, and load (ETL) process? Where is the data stored, and how easy is it to access and analyze?
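
As a rough illustration of what an ETL process involves, the following minimal sketch extracts rows from CSV text, transforms them, and loads them into a database. Everything in it is an assumption made for the example: the sensor fields, the sample rows, the Celsius conversion, and the `readings` table are hypothetical, and an in-memory SQLite database stands in for whatever storage a real system uses.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; a real source would be a file, API, or stream.
RAW_CSV = """sensor_id,reading,unit
a1,21.5,C
a2,,C
a3,70.7,F
"""

def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Drop rows with no reading and convert all readings to Celsius."""
    cleaned = []
    for row in rows:
        if not row["reading"]:
            continue  # skip rows where the sensor reported nothing
        value = float(row["reading"])
        if row["unit"] == "F":
            value = (value - 32) * 5 / 9
        cleaned.append((row["sensor_id"], round(value, 2)))
    return cleaned

def load(records, conn):
    """Write the cleaned records into a (hypothetical) readings table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, celsius REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?)", records)
    conn.commit()

def run_etl(text, conn):
    """Run all three steps and return what ended up in storage."""
    load(transform(extract(text)), conn)
    query = "SELECT sensor_id, celsius FROM readings ORDER BY sensor_id"
    return conn.execute(query).fetchall()

if __name__ == "__main__":
    print(run_etl(RAW_CSV, sqlite3.connect(":memory:")))
```

Keeping the three steps as separate functions makes it easier to see where data is dropped or altered on its way to storage.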

Explore and transform

This is a time-consuming and often underestimated stage of the data science project life cycle. At this point, you may discover that you are missing data, that your machine sensors are unreliable, or that you are not tracking relevant information about customers. If so, you may be forced to return to data collection and ensure the foundation is solid before moving forward.
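
One simple way to surface missing data early is to measure, for each field, how often it is absent. The sketch below is only an illustration: the function name `missing_rate`, the customer fields, and the sample records are all hypothetical.

```python
def missing_rate(records, fields):
    """Return, per field, the fraction of records where it is absent or empty."""
    total = len(records)
    rates = {}
    for field in fields:
        missing = sum(1 for r in records if r.get(field) in (None, ""))
        rates[field] = missing / total if total else 0.0
    return rates

# Hypothetical customer records with gaps in "age" and "region".
records = [
    {"customer_id": 1, "age": 34, "region": "east"},
    {"customer_id": 2, "age": None, "region": "west"},
    {"customer_id": 3, "region": ""},
    {"customer_id": 4, "age": 51, "region": "east"},
]

rates = missing_rate(records, ["customer_id", "age", "region"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%} missing")
```

A field with a high missing rate is a signal to revisit how that data is collected before building anything on top of it.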

Business intelligence and analytics

Once you can reliably explore and clean data, you can start building what is traditionally thought of as business intelligence or analytics: defining key metrics to track, identifying how seasonality impacts product sales and operations, segmenting users based on demographic factors, and the like.
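
To make the seasonality example concrete, one basic approach (an assumption here, not a prescribed method) is to compare each month's average sales with the overall average. The function name `monthly_index` and the sales figures below are hypothetical toy data.

```python
from collections import defaultdict
from statistics import mean

def monthly_index(sales):
    """Map each month to its average sales divided by the overall average.

    `sales` is a list of (month, amount) pairs. A value above 1.0 means the
    month tends to sell more than average, below 1.0 less.
    """
    by_month = defaultdict(list)
    for month, amount in sales:
        by_month[month].append(amount)
    overall = mean(amount for _, amount in sales)
    return {month: mean(values) / overall
            for month, values in sorted(by_month.items())}

# Hypothetical toy data: (month number, units sold).
sales = [(1, 100), (1, 100), (6, 50), (6, 50), (12, 150), (12, 150)]

index = monthly_index(sales)
for month, value in index.items():
    print(f"month {month}: {value:.2f} x average")
```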

Now is the time to determine:

  • The features or attributes to include in machine learning models
  • The training data the machine will need to learn
  • What you want to predict and automate
  • How to create the labels from which the machine will learn
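
The decisions listed above can be sketched in code. The example below assumes a fraud-detection setting: the transaction fields, the chosen features, and the set of confirmed fraud IDs are all hypothetical and stand in for whatever your own data provides.

```python
def build_dataset(transactions, confirmed_fraud_ids):
    """Turn raw transactions into feature vectors and 0/1 labels.

    Features (assumed for illustration): amount, foreign flag, hour of day.
    Label: 1 if the transaction was later confirmed as fraud, else 0.
    """
    features, labels = [], []
    for tx in transactions:
        features.append([tx["amount"], 1 if tx["foreign"] else 0, tx["hour"]])
        labels.append(1 if tx["id"] in confirmed_fraud_ids else 0)
    return features, labels

# Hypothetical raw transactions and investigation results.
transactions = [
    {"id": "t1", "amount": 25.0, "foreign": False, "hour": 14},
    {"id": "t2", "amount": 980.0, "foreign": True, "hour": 3},
]
confirmed_fraud_ids = {"t2"}

X, y = build_dataset(transactions, confirmed_fraud_ids)
print(X)
print(y)
```

Here the label comes from a separate source of truth (confirmed investigations), which is what allows the machine to learn what you want to predict.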

Machine learning and benchmarking

To avoid real-world disasters, create a framework for A/B testing or experimentation before a model trained on sample data is used to make real predictions, and deploy models incrementally. Model validation and experimentation can provide a rough estimate of the effects of changes before you implement them. Establish a very simple baseline or benchmark for performance tracking. For example, if you are building a credit card fraud detection system, build test data from transactions already known to be fraudulent, then compare those known outcomes with your model's predictions to verify that it detects fraud accurately.
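
A minimal sketch of such a benchmark for the fraud example follows. It assumes a deliberately naive baseline (flag every transaction above a fixed amount) and scores both the baseline and a model on the same labeled test data using precision and recall. The labels, amounts, threshold, and model predictions are hypothetical values chosen for illustration.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical test set: known outcomes and transaction amounts.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
amounts = [900, 40, 1200, 75, 30, 1500, 60, 20]

# Naive baseline (an assumption): flag anything above a fixed amount.
baseline_pred = [1 if amount > 500 else 0 for amount in amounts]

# Hypothetical output of the model under evaluation.
model_pred = [1, 0, 0, 1, 0, 1, 1, 0]

baseline_scores = precision_recall(y_true, baseline_pred)
model_scores = precision_recall(y_true, model_pred)
print("baseline precision/recall:", baseline_scores)
print("model precision/recall:", model_scores)
```

If a model cannot clearly beat a baseline this simple on known cases, that is a reason to hold back deployment and revisit the earlier stages.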

Artificial intelligence

After you reach this stage, you can improve processes, predictions, outcomes, and insights by deepening your knowledge of, and experience with, new methods and techniques in machine learning and deep learning.