CN3105: Data Science for Chemical Engineers
Machine learning fundamentals and their applications to chemical engineering problems.
What is machine learning and how does it work?
Machine learning is a method of data analysis that focuses on using data and algorithms to imitate the way humans learn, gradually improving accuracy over time. Rather than being explicitly programmed with rules, a machine learning model discovers patterns in data and uses them to make predictions or decisions. In chemical engineering, ML enables data-driven modelling of complex processes where first-principles mechanistic models are expensive or impractical to build.
Machine learning (ML) is a subfield of Artificial Intelligence (AI), and Deep Learning is in turn a subfield of ML: AI ⊃ ML ⊃ Deep Learning. Classical ML algorithms learn from data and make decisions based on observed patterns, but they typically rely on human-engineered features and on human correction when decisions are wrong. Deep Learning uses multi-layer artificial neural networks that learn useful features directly from raw data, which reduces (but does not eliminate) the need for human intervention.
Arthur Samuel's classic definition: a field of study concerned with giving computers the ability to learn without being explicitly programmed. Samuel demonstrated this by writing a checkers program that learned from a very large number of self-play games, teaching itself which moves lead to wins and which lead to losses, without being told the rules of good play.
Supervised learning: learn a function f(x) mapping inputs x to outputs y from a labelled training set {(xᵢ, yᵢ)}. If y is continuous, the task is regression (e.g., predicting reactor yield). If y is categorical, the task is classification (e.g., fault detection: normal vs. faulty). This is the primary paradigm covered in CN3105.
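The regression case above can be sketched in a few lines. This is a minimal illustration, not the course's reference implementation: it fits a straight line y ≈ a + b·x by ordinary least squares using only the Python standard library. The temperature and yield values, and the names `fit_line` and `predict`, are made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: y ~ a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical labelled training set: reactor temperature (K) -> yield fraction.
temps = [300.0, 310.0, 320.0, 330.0]
yields = [0.50, 0.55, 0.60, 0.65]

a, b = fit_line(temps, yields)

def predict(x):
    """Learned function f(x) applied to a new, unseen input."""
    return a + b * x

print(f"slope={b:.4f}, intercept={a:.4f}, f(325 K)={predict(325.0):.3f}")
```

Because the output is continuous, this is regression. Replacing the continuous yield with a label such as normal/faulty would turn the same labelled-data setup into a classification task, which needs a different model (for example logistic regression).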
Unsupervised learning: discover patterns or structure in a data set without any label information. Clustering groups similar data points (e.g., customer segmentation, identifying similar operating regimes in a plant). Dimensionality reduction reduces the number of features while retaining the most important information (e.g., PCA for process data visualisation).
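The clustering idea above can be illustrated with a small k-means sketch. It assumes two hypothetical, already-scaled process variables per data point and, as a simplification, initialises the centroids from the first k points. The data and the helper names `assign` and `kmeans` are illustrative only.

```python
import math

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def kmeans(points, k, n_iter=20):
    # Simplification: use the first k points as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid = coordinate-wise mean of the cluster members.
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: the centroids no longer move
        centroids = new_centroids
    return assign(points, centroids), centroids

# Hypothetical scaled operating points; no labels are provided.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]

labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

No labels are supplied at any point. The two groups emerge purely from the distances between points, which is what separates this from the supervised setting.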
Reinforcement learning: a learning algorithm (the agent) takes actions by interacting with an environment and receives a reward or penalty for each action. It learns a policy that maximises cumulative reward over time. Applications include optimising control policies for chemical reactors and scheduling production in a plant.
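The reward-driven loop above can be sketched with tabular Q-learning on a toy problem. Everything here is an assumption made for illustration: a four-state chain in which the agent steps left or right and is rewarded only on reaching the last state, a uniformly random exploration policy (acceptable here because Q-learning is off-policy), and the chosen learning rate and discount factor.

```python
import random

N_STATES = 4        # states 0..3; state 3 is the terminal, rewarded state
ACTIONS = (0, 1)    # 0 = step left, 1 = step right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (illustrative)

def step(state, action):
    """Toy environment: move along the chain, reward 1 on reaching the end."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(n_episodes=500, seed=0):
    rng = random.Random(seed)
    # q[s][a] estimates the cumulative discounted reward of action a in state s.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(n_episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward = step(state, action)
            # The terminal row of q is never updated, so it contributes 0 here.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Greedy policy: in each non-terminal state pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this toy chain the learned greedy policy should step right in every non-terminal state, since that is the shortest route to the reward. A real reactor-control or scheduling problem would have far larger state and action spaces and would typically need function approximation rather than a table.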
Q1. In the context of ChBE process data, 'feature engineering' refers to: