Before starting with machine learning, let us go over some terms that are frequently used in the field.
Put simply, learning is the process of converting experience into knowledge or expertise. In machine learning, we write programs so that a computer can learn from the available inputs. The input to a learning algorithm is called the training dataset. Once an algorithm has been designed and tuned on the given input to achieve the best accuracy it can, we say that the model is built. Hence, we can say
model = algorithm(input data)
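As a minimal sketch of this relation, assume the algorithm is ordinary least squares for a straight line and the input data is a small toy dataset. The function name fit_line and the data values are hypothetical and chosen only for illustration.

```python
# Sketch of "model = algorithm(input data)" for one-variable linear regression.
# fit_line and the toy data below are illustrative assumptions.

def fit_line(xs, ys):
    """Learning algorithm: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x):
        # The built model: maps a new input to a predicted value.
        return slope * x + intercept

    return model


# Training dataset (the "experience"); here y = 2x + 1 exactly.
training_x = [1.0, 2.0, 3.0, 4.0]
training_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(training_x, training_y)   # model = algorithm(input data)
print(model(5.0))                          # prediction for an unseen input: 11.0
```

The same pattern holds for other algorithms: the fitting function takes training data and returns something that can make predictions on new inputs.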
Why Machine Learning?
In machine learning, experts develop general-purpose algorithms that can be used on large classes of learning problems. To solve a specific task, you only need to feed task-specific data to the algorithm, so that in effect you are programming by example.
The computer uses this data as its source of information, compares its output with the desired output, and adjusts the model to maximize accuracy.
System Modeling
System modeling is an important part of applying machine learning in any field of study. Generally, the modeling phase consists of the following steps:
- Feature engineering (feature normalization, feature selection, etc.)
- Algorithm selection
- Training, model validation, and model selection
- Applying the trained model to unseen data
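The steps above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the data is synthetic, the algorithm is a hand-written k-nearest-neighbours classifier, feature engineering is limited to min-max normalization, and model selection means picking k by validation accuracy. All function and variable names are hypothetical.

```python
import random

def min_max_scale(rows, lo, hi):
    """Feature normalization: rescale each feature using ranges from the training set."""
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
            for row in rows]

def knn_predict(train_x, train_y, x, k):
    """k-nearest-neighbours majority vote using squared Euclidean distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_x, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train_x, train_y, xs, ys, k):
    """Fraction of examples in (xs, ys) that the k-NN model labels correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(xs, ys))
    return hits / len(ys)

# Synthetic two-class data; the two features have very different scales,
# which is why normalization matters for a distance-based algorithm.
random.seed(0)
data = []
for label, (cx, cy) in enumerate([(1.0, 100.0), (3.0, 300.0)]):
    for _ in range(30):
        data.append(([random.gauss(cx, 0.5), random.gauss(cy, 50.0)], label))
random.shuffle(data)
train, valid, unseen = data[:36], data[36:48], data[48:]

# Step 1: feature engineering (normalization only, fitted on the training set).
train_raw = [x for x, _ in train]
train_y = [y for _, y in train]
lo = [min(col) for col in zip(*train_raw)]
hi = [max(col) for col in zip(*train_raw)]
train_x = min_max_scale(train_raw, lo, hi)

# Steps 2 and 3: algorithm chosen (k-NN), then validation and model selection over k.
valid_x = min_max_scale([x for x, _ in valid], lo, hi)
valid_y = [y for _, y in valid]
best_k = max((1, 3, 5), key=lambda k: accuracy(train_x, train_y, valid_x, valid_y, k))

# Step 4: apply the selected model to unseen data.
unseen_x = min_max_scale([x for x, _ in unseen], lo, hi)
predictions = [knn_predict(train_x, train_y, x, best_k) for x in unseen_x]
print("selected k:", best_k)
print("predictions:", predictions)
```

Note that the normalization ranges come from the training set only and are reused for the validation and unseen data, so no information from those sets leaks into training.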
Learning Problems
There are a large number of learning problems, which can be categorized by their objectives. Here, we categorize them as follows:
Regression: If you are trying to predict a real value based on past observations, you probably need to apply regression. For example, predict the value of a stock tomorrow given its past performance.
Binary Classification: If you are trying to predict a simple yes/no response, you probably need binary classification. For example, predict whether a user review of a new product is positive or negative.
Multiclass Classification: If you are trying to put an example into one of a number of classes, you probably need multiclass classification. For example, predict whether a news story is about entertainment, sports, politics, religion, etc.
Ranking: If you are trying to put a set of objects in order of relevance, you probably need ranking. For example, predict the order in which to show web pages in response to a user query.
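Ranking is the least familiar of these outputs, so here is a minimal sketch of what it looks like. The relevance function is a hypothetical hand-written score (a count of shared words) standing in for a learned model, and the pages and query are made up.

```python
def relevance(query, page):
    """Hypothetical score: number of distinct query words that appear in the page."""
    page_words = set(page.lower().split())
    return sum(word in page_words for word in set(query.lower().split()))

# Toy "web pages" and a user query (illustrative data only).
pages = [
    "stock prices fall on monday",
    "machine learning tutorial for beginners",
    "learning to cook pasta",
]
query = "machine learning tutorial"

# The output of a ranking problem is an ordering, not a single value or label.
ranked = sorted(pages, key=lambda p: relevance(query, p), reverse=True)
for position, page in enumerate(ranked, start=1):
    print(position, page)
```

In a real ranking system, the scoring function would be learned from data, such as which results users found relevant for past queries.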
The Key Features of Machine Learning
- Automatic discovery of patterns
- Prediction of likely outcomes
- Creation of actionable information
- Ability to analyze potentially large volumes of data
Automatic Discovery: Machine learning works by building models. A model is the result of an algorithm acting on a set of data. Most models can be generalized to new data; applying a model to new data is called scoring.
Prediction: Many machine learning models are predictive. For example, a model can predict income based on demographic factors such as education. Each prediction has an associated probability or confidence, which is how likely the prediction is to be true. Some predictive models generate rules, which are conditions that imply a given outcome. For example, a rule might state that a person who has a bachelor's degree and lives in a certain neighborhood is likely to have a higher income than the regional average. Each rule has an associated support, indicating the percentage of the population that satisfies the rule.
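To make support and confidence concrete, here is a small sketch on made-up records. It assumes the common association-rule definitions: support is the fraction of all records that satisfy both the condition and the outcome, and confidence is the fraction of records matching the condition that also show the outcome. Exact definitions vary between tools, and the field names below are hypothetical.

```python
# Toy population (illustrative data only).
people = [
    {"degree": True,  "area": "north", "high_income": True},
    {"degree": True,  "area": "north", "high_income": True},
    {"degree": True,  "area": "north", "high_income": True},
    {"degree": True,  "area": "north", "high_income": False},
    {"degree": True,  "area": "south", "high_income": False},
    {"degree": False, "area": "north", "high_income": False},
    {"degree": False, "area": "south", "high_income": True},
    {"degree": False, "area": "south", "high_income": False},
]

def condition(person):
    """Rule condition: has a bachelor's degree and lives in the 'north' area."""
    return person["degree"] and person["area"] == "north"

matching = [p for p in people if condition(p)]
satisfying = [p for p in matching if p["high_income"]]

support = len(satisfying) / len(people)       # 3 of 8 records satisfy the full rule
confidence = len(satisfying) / len(matching)  # 3 of the 4 matching records have high income

print(f"support = {support:.3f}, confidence = {confidence:.2f}")
# prints: support = 0.375, confidence = 0.75
```

A rule with high confidence but very low support describes only a small slice of the population, so both numbers are worth reading together.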
Grouping: Machine learning models can identify natural groupings in data. For example, a model might identify the segment of the population that has an income within a specific range, a good driving record, and a habit of leasing a car every year.
Actionable Information: Machine learning can extract actionable information from large volumes of data. For instance, a town planner can use a model that predicts income from demographics to plan low-income housing, and a business can use customer groupings to design promotions aimed at high-value customers.
Machine Learning and Statistics
Machine learning and statistics overlap considerably, and many machine learning techniques can be placed in a statistical framework. The two differ, however, in the assumptions they make about data.
Statistical models usually make strong assumptions about the data and, based on those assumptions, make strong statements about the results. If the assumptions are flawed, the validity of the model becomes questionable. Machine learning methods, on the other hand, typically make weak assumptions about the data. Consequently, machine learning cannot make such strong statements about its results.
Despite this, machine learning can still produce good results on a wide variety of data. Traditional statistical methods require significant user interaction to validate model correctness, which makes them difficult to automate. Machine learning requires less user interaction and less knowledge of the data, which makes it easier to automate than traditional statistical methods.
What Can Machine Learning Do and Not Do?
Machine learning is a powerful tool that can identify patterns and relationships in data, but it does not replace the need for understanding your business, data, or analytical methods. It can confirm or qualify empirical observations and find new patterns that are not immediately discernible. However, it is important to note that predictive relationships discovered through machine learning are not causal relationships.
For example, machine learning might determine that males with incomes between $50,000 and $65,000 who subscribe to certain magazines are likely to buy a product.
This information can be used to develop a marketing strategy, but it should not be assumed that the identified population will buy the product because they belong to this population. Machine learning yields probabilities, not exact answers, and rare events can occur.