In machine learning problems, the final classification often depends on a large number of factors. These factors are variables called features. The more features there are, the harder it becomes to visualize the training set and work with it. Dimensionality Reduction is the process of reducing the number of features under consideration, either by selecting a subset of the original features (feature selection) or by deriving a smaller set of new features from them (feature extraction).
Before learning the techniques of Dimensionality Reduction, let's understand why it is important to apply it to our dataset.
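To make the idea of selecting a subset of features concrete, here is a minimal sketch that drops features whose values never change, since a constant column cannot help separate classes. The function names (`select_by_variance`, `project`), the toy data, and the threshold are all assumptions made for illustration; real projects usually rely on a library and on more careful criteria.

```python
from statistics import pvariance

def select_by_variance(rows, threshold):
    """Return the indices of columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]

def project(rows, keep):
    """Keep only the selected feature columns in every row."""
    return [[row[i] for i in keep] for row in rows]

# Hypothetical toy data: the middle column is constant, so it carries no information.
data = [
    [1.0, 5.0, 10.0],
    [2.0, 5.0, 20.0],
    [3.0, 5.0, 15.0],
    [4.0, 5.0, 30.0],
]

keep = select_by_variance(data, threshold=0.0)
reduced = project(data, keep)
print(keep)     # indices of the features that were kept
print(reduced)  # the same samples with fewer features
```

This covers only the selection flavor of reducing features. Approaches that build new features out of combinations of the old ones work differently and are not shown here.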
1) The abundance of redundant and irrelevant features
2) With a fixed number of training samples…
“the field of study that gives computers the ability to learn without being explicitly programmed”.
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E”.
Example : playing checkers
E = the experience of playing many games of checkers
T = the task of playing checkers
P = the probability that the program will win the next game
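The E, T, P framing can be tried out on a task far simpler than checkers. The sketch below is an illustrative assumption, not part of the original example: the task T is deciding whether a number lies at or above a hidden threshold, the experience E is a growing list of labeled numbers, and the performance measure P is accuracy on a fixed set of test points. The names `HIDDEN_THRESHOLD`, `learn_threshold`, and `accuracy` are hypothetical.

```python
HIDDEN_THRESHOLD = 30  # hypothetical rule; the learner only sees labeled examples

def true_label(x):
    return x >= HIDDEN_THRESHOLD

def learn_threshold(examples):
    """Estimate the threshold as the midpoint between the largest
    negative example and the smallest positive example seen so far."""
    negatives = [x for x, label in examples if not label]
    positives = [x for x, label in examples if label]
    return (max(negatives) + min(positives)) / 2

def accuracy(threshold, test_points):
    """P: fraction of test points classified the same way as the hidden rule."""
    correct = sum((x >= threshold) == true_label(x) for x in test_points)
    return correct / len(test_points)

# E: labeled examples, revealed to the learner a few at a time.
experience = [(x, true_label(x)) for x in [5, 95, 20, 60, 28, 32]]
test_points = list(range(0, 100, 5))

# Measure P after 2, 4, and 6 examples of experience.
scores = [accuracy(learn_threshold(experience[:n]), test_points) for n in (2, 4, 6)]
print(scores)
```

On this toy data the accuracy rises as more examples are seen, which is what "learning" means under the definition above. With different data the improvement need not be this steady.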
Supervised Learning : In the supervised learning…
People sometimes find it hard to distinguish between Data Science and Data Mining; this article aims to clear up the difference between the two.
Data Mining is the process of discovering patterns in large data sets using methods at the intersection of machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science and statistics whose overall goal is to extract information from a data set (with intelligent methods) and transform it into a comprehensible structure for further use.
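As a small illustration of "discovering patterns", the sketch below counts which pairs of items appear together in a handful of shopping baskets and keeps the pairs that occur often enough. The data, the `frequent_pairs` helper, and the `min_support` cutoff are assumptions made up for this example; real data mining works on far larger data sets and uses more efficient algorithms.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Count co-occurring item pairs and keep those seen at least min_support times."""
    counts = Counter()
    for basket in transactions:
        # Sort and de-duplicate so each pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical transaction data.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "eggs"],
]

patterns = frequent_pairs(transactions, min_support=3)
print(patterns)
```

The resulting dictionary is the "comprehensible structure" in miniature: raw baskets go in, and a short list of item pairs that tend to be bought together comes out.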
Data Analysis | Machine Learning | Passionate about solving business problems with data-driven approaches. 📊📈