Mastering Machine Learning Algorithms

This is the code repository for Mastering Machine Learning Algorithms, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.

About the Book

Machine learning is a subset of AI that aims to make modern-day computer systems smarter and more intelligent. The real power of machine learning resides in its algorithms, which make even the most difficult things capable of being handled by machines. However, with the advancement in the technology and requirements of data, machines will have to be smarter than they are today to meet the overwhelming data needs; mastering these algorithms and using them optimally is the need of the hour.

Instructions and Navigation

All of the code is organized into folders. Each folder starts with a number followed by the application name. For example, Chapter02.

The code will look like the following:

from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.7,
random_state=1)

There are no strict prerequisites for this book; however, it's important to have basic intermediate Python knowledge with a specific focus on NumPy. Whenever necessary, I will provide instructions/references to install specific packages and exploit more advanced functionalities. As Python is based on a semantic indentation, the published version can contain incorrect newlines that raise exceptions when executing the code. For this reason, I invite all readers without deep knowledge of this language to refer to the original source code provided with the book. All the examples are based on Python 3.5+. I suggest using the Anaconda distribution (https://www. anaconda. com/download/), which is probably the most complete and powerful one for scientific projects. The majority of the required packages are already built in and it's very easy to install the new ones (sometimes with optimized versions). However, any other Python distribution can be used. Moreover, I invite readers to test the examples using Jupyter (formerly known as IPython) notebooks so as to avoid rerunning the whole example when a change is made. If instead an IDE is preferred, I suggest PyCharm, which offers many built-in functionalities that are very helpful in data-oriented and scientific projects (such as the internal Matplotlib viewer). A good mathematics background is necessary to fully understand the theoretical part. In particular, basic skills in probability theory, calculus, and linear algebra are required. However, I advise you not to give up when a concept seems too difficult. The reference sections contain many useful books, and the majority of concepts are explained quite well on Wikipedia too. When something unknown is encountered, I suggest reading the specific documentation before continuing. In many cases, it's not necessary to have complete knowledge and even an introductory paragraph can be enough to understand their rationale.

Related Products

Suggestions and Feedback

Click here if you have any feedback or suggestions.