My Attempt at Understanding Deep Learning
This blog is a very amateur attempt at deep learning from a computational mathematician's perspective. I hope to present the topic as lucidly as possible, as per my understanding, so bear with me, dear readers ❤
Deep Learning 101
Deep learning is a subfield of machine learning that focuses on training artificial neural networks with multiple layers to learn hierarchical features from input data. These neural networks are designed to automatically adapt and improve their performance as more data is fed into the system. The main advantage of deep learning is its ability to discover intricate patterns and relationships in complex data, such as images, speech, or text.
To understand DL one must have a clear understanding of linear algebra (check the NPTEL videos on LA for AI/ML, or check out my post on SVD for a comprehensive treatment of linear algebra in the context of AI/ML). Deep learning is often used in fields such as computer vision, natural language processing, and speech recognition to solve complex problems that were previously challenging for traditional machine learning algorithms. One of the key reasons for its popularity is the advancement in processing abilities and the reduction in computing hardware costs, which has allowed researchers to train larger and more complex neural networks, leading to more accurate results.
Basic Math Setup
To navigate the landscape one must at least be aware of preliminary concepts from linear algebra, calculus, and probability theory. (In case you need material, do hit me up on Instagram or email me.) Deep learning relies heavily on linear algebra for tasks such as matrix operations and transformations. To fix the basic nomenclature, we will call the space of possible data points X and the space of possible labels Y. In deep learning, the input data (from X) is processed through a series of layers, each consisting of interconnected artificial neurons. These neurons perform mathematical operations, such as multiplying inputs by weights, applying activation functions, and passing the outputs to the next layer. The output of the final layer is then compared to the desired output (from Y) using a loss function, which measures the difference between the predicted and actual output.
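To make this concrete, here is a minimal sketch of one such layer-by-layer pass in Python. The layer sizes, the ReLU activation, and the squared-error loss are illustrative assumptions on my part, not anything canonical:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Illustrative shapes: 3 input features, a hidden layer of 4 neurons, 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # weights and biases, layer 1
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # weights and biases, layer 2

x = np.array([0.5, -1.2, 3.0])   # a point from the input space X
y = np.array([1.0])              # its label from the label space Y

# Each layer multiplies inputs by weights, adds biases, applies an
# activation, and passes the result on to the next layer.
h = relu(W1 @ x + b1)
y_hat = W2 @ h + b2

# The loss function measures the gap between prediction and label.
loss = 0.5 * np.sum((y_hat - y) ** 2)
print(loss)
```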
In deep learning, the mathematical operations involved in training the neural network are optimized through a process called backpropagation. This process involves calculating the gradients of the loss function with respect to the weights and biases of the network, and then updating these parameters in the direction that minimizes the loss.
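PyTorch's autograd does exactly this bookkeeping for us. The sketch below (sizes and the learning rate are again illustrative choices of mine) computes the gradient of the loss with respect to the weights and biases and takes one descent step:

```python
import torch

torch.manual_seed(0)
W = torch.randn(1, 3, requires_grad=True)   # weights
b = torch.zeros(1, requires_grad=True)      # bias

x = torch.tensor([0.5, -1.2, 3.0])
y = torch.tensor([1.0])

y_hat = W @ x + b                       # forward pass
loss = 0.5 * (y_hat - y).pow(2).sum()   # squared-error loss

loss.backward()                         # backpropagation: fills W.grad, b.grad

with torch.no_grad():                   # one gradient-descent update
    lr = 0.01
    W -= lr * W.grad
    b -= lr * b.grad
```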
At the end of the day, the accurate models are the ones that best mimic the processing pattern of the brain. In this attempt, the computational scientist needs to be aware of the feature selection process, which involves identifying and selecting the most relevant features from the input data that will contribute to the accuracy of the model. Deep neural networks are a type of ANN with multiple layers between the input and output layers.
Neural networks are a method in Artificial Intelligence (AI) that teaches computers to process data in a way that is inspired by the workings of the brain. A well-optimized algorithm also requires less computing power.
Ever wondered how certain OTT streaming platforms know your preferences so well, and most of the time recommend films/movies/web shows that you end up enjoying? Gives a friendly vibe, I know! Well, that's the power of deep learning algorithms at work.
Jaccard similarity is a measure that can be used to recommend movies or shows based on what others with a similar viewing history have watched.
But Jaccard similarity often fails when tastes are polar opposites. Say two users, Neha and Keya, have very different preferences and rarely watch the same type of content. Neha enjoys “Lords of the Rings” whereas Keya is in love with “Its Ok Not to be Okay”. Both watched the two shows and gave ratings as follows:
Neha: “Lords of the Rings” — 4.5 stars, “Its Ok Not to be Okay” — 2.5 stars
Keya: “Lords of the Rings” — 1 star, “Its Ok Not to be Okay” — 5 stars
What Jaccard similarity might do is recommend shows that have been watched or liked by users with similar overall watch histories, which may not align with Neha's and Keya's individual tastes. To avoid this issue, more advanced recommendation algorithms such as collaborative filtering or deep neural networks can be used. The Jaccard similarity coefficient, also known as the Jaccard index, is a measure of the similarity between two sets, defined as the size of their intersection divided by the size of their union.
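Here is a quick sketch of that definition in Python, using Neha's and Keya's (made-up) watch histories. Since both users watched both shows, the sets are identical, and Jaccard reports perfect similarity even though their ratings are opposites:

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

# Hypothetical watch histories: both users watched both shows,
# so by set overlap alone they look identical.
neha = {"Lords of the Rings", "Its Ok Not to be Okay"}
keya = {"Lords of the Rings", "Its Ok Not to be Okay"}
print(jaccard(neha, keya))  # 1.0, despite their opposite ratings
```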
Cosine similarity considers the ratings as well as the common viewing habits when checking similarity. This allows for more accurate recommendations that take into account individual preferences and not just overlap in what was watched.
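A minimal sketch with Neha's and Keya's rating vectors from above. The mean-centering step at the end (often called adjusted cosine) is an extra refinement I am assuming here, beyond the plain formula:

```python
import numpy as np

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their norms."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

neha = np.array([4.5, 2.5])   # ratings for the two shows, in the same order
keya = np.array([1.0, 5.0])
print(cosine_similarity(neha, keya))  # ~0.65: clearly not identical tastes

# Mean-centering each user's ratings first (adjusted cosine) makes the
# opposition explicit: the centered vectors point in opposite directions.
print(cosine_similarity(neha - neha.mean(), keya - keya.mean()))  # -1.0
```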
Data sparsity is a common challenge in recommendation systems, especially when dealing with user-item interactions. To address it, techniques like matrix factorization and content-based filtering can be applied. As an aside, deep learning models of the same flavor also play a crucial role in weather forecasting: they can analyze large amounts of historical weather data and learn complex patterns and relationships to make more accurate predictions. This is especially important in areas like agriculture, where accurate forecasts help farmers make informed decisions about planting, irrigation, and harvesting.
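Here is a minimal sketch of matrix factorization by gradient descent on the observed entries only. The tiny ratings matrix, latent rank, learning rate, and iteration count are all illustrative assumptions:

```python
import numpy as np

# A tiny, partly-empty user x item ratings matrix; np.nan marks unobserved pairs.
R = np.array([[4.5, np.nan, 2.5],
              [1.0, 5.0, np.nan],
              [np.nan, 4.0, 3.5]])
mask = ~np.isnan(R)

rng = np.random.default_rng(0)
k = 2                                          # latent rank
U = 0.1 * rng.normal(size=(R.shape[0], k))     # user factors
V = 0.1 * rng.normal(size=(R.shape[1], k))     # item factors

lr, reg = 0.05, 0.01
for _ in range(2000):
    E = np.where(mask, R - U @ V.T, 0.0)       # error on observed entries only
    U += lr * (E @ V - reg * U)                # descent steps with L2 regularization
    V += lr * (E.T @ U - reg * V)

print(np.round(U @ V.T, 2))   # reconstructed matrix: predictions fill the blanks
```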
Gray sheep users, once separated out by such algorithms, can be targeted with personalized recommendations based on their specific preferences, ensuring that even those with unique tastes can find content they enjoy. Overall, advanced recommendation algorithms such as collaborative filtering and deep neural networks can overcome the limitations of traditional similarity metrics like the Jaccard index. They take into account individual preferences and viewing habits, providing more accurate and personalized recommendations to users. They are also capable of handling data sparsity and targeting gray sheep users, making them invaluable tools for designing effective recommendation systems.
Now the question arises: how do we train these deep learning models for recommendation systems? It’s a multi-step process involving data preprocessing, model selection, and hyperparameter tuning.
First, we need to preprocess the data, which may involve handling missing values, encoding categorical features, and scaling numerical features.
Next, we need to select an appropriate deep learning model, such as a neural collaborative filtering (NCF) model or a deep neural network (DNN) model.
Then, we need to tune the hyperparameters of the model, such as the learning rate, the number of epochs, the batch size, and the regularization parameters.
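Putting the three steps together, here is a minimal sketch in PyTorch. The embedding size, layer widths, and hyperparameter values are illustrative assumptions of mine, not recommendations:

```python
import torch
import torch.nn as nn

class SimpleRecommender(nn.Module):
    """A small NCF-style model: embed user and item, score the pair with an MLP."""
    def __init__(self, n_users, n_items, dim=16):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, user_ids, item_ids):
        pair = torch.cat([self.user_emb(user_ids), self.item_emb(item_ids)], dim=-1)
        return self.mlp(pair).squeeze(-1)

# Step 1 (preprocessing) is assumed done: user/item ids encoded as integers,
# missing values handled, numerical features scaled.
model = SimpleRecommender(n_users=1000, n_items=500)   # step 2: model selection

# Step 3: the hyperparameters named above, gathered in one place.
hparams = {"lr": 1e-3, "epochs": 10, "batch_size": 64, "weight_decay": 1e-5}
optimizer = torch.optim.Adam(model.parameters(),
                             lr=hparams["lr"],
                             weight_decay=hparams["weight_decay"])  # L2 regularization
loss_fn = nn.MSELoss()   # squared error between predicted and actual ratings
```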
Understanding Generative AI
Understanding generative AI is crucial in the field of deep learning. Generative AI refers to the ability of an algorithm or model to generate new data samples that are similar to the training data it has learned from. This capability enables generative models to create new images, audio, text, or other types of data that resemble the original dataset. Models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) have revolutionized computer vision, natural language processing, and creativity in AI, allowing for innovative solutions in domains such as art, music, and even drug discovery. I personally followed an array of tutorials, research papers, and online courses to gain a solid understanding of generative AI.
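For a flavor of what a GAN looks like structurally, here is a minimal sketch in PyTorch. The layer sizes, the 2-D stand-in "data", and the training length are illustrative assumptions; a real image GAN would use convolutional layers and far more training:

```python
import torch
import torch.nn as nn

# Generator: maps random noise to a fake data sample.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
# Discriminator: scores how "real" a sample looks (probability via sigmoid).
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

real = torch.randn(64, 2) + 3.0   # stand-in "real" data: a shifted Gaussian blob
for _ in range(100):
    fake = G(torch.randn(64, 8))
    # Discriminator step: tell real (label 1) apart from fake (label 0).
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
    # Generator step: try to fool the discriminator into outputting 1 on fakes.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```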
Now comes the application in CV, i.e., computer vision. First, what do we mean by CV? And how is deep learning used in CV?
Computer vision is a field of study and research that focuses on enabling computers to understand, interpret, and analyze visual information from images or videos. Deep learning, with its multi-layer neural networks that extract high-level features from visual data, has revolutionized the field by achieving state-of-the-art performance in tasks such as image classification, object detection, image segmentation, and image generation. We are aware of image segmentation and face recognition techniques using PyTorch (Coursera courses may guide you if you're a beginner here).
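For instance, here is a minimal image-classification sketch in PyTorch. The choice of ResNet-18 and the random stand-in image are my assumptions; any torchvision classifier works the same way (note that torchvision must be installed and the pretrained weights are downloaded on first use):

```python
import torch
from torchvision import models

# Load a ResNet-18 pretrained on ImageNet and put it in inference mode.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# A stand-in batch of one 224x224 RGB image; in real use you would load
# and normalize an actual photo here.
x = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    logits = model(x)            # scores over the 1000 ImageNet classes
print(logits.argmax(dim=1))      # index of the predicted class
```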
Now let’s talk about regression models!
Regression models are statistical models used to analyze the relationship between a dependent variable and one or more independent variables. They are widely used in various fields, including economics, finance, social sciences, and even computer vision. In computer vision, regression models can be used to predict continuous variables based on visual features extracted from images or videos. For example, in object detection tasks, regression models can be used to predict the location and size of objects in an image. This information can then be used for various applications such as autonomous driving, surveillance systems, and augmented reality.
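Here is a minimal sketch of that idea in PyTorch. The feature dimension and the four-number box encoding (x, y, width, height) are illustrative assumptions:

```python
import torch
import torch.nn as nn

# A regression head: from a feature vector describing an image region,
# predict four continuous values encoding a bounding box (x, y, width, height).
box_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 4))

features = torch.randn(8, 256)    # stand-in visual features for 8 regions
target_boxes = torch.rand(8, 4)   # ground-truth boxes for those regions

pred = box_head(features)
loss = nn.functional.smooth_l1_loss(pred, target_boxes)  # common box-regression loss
loss.backward()                   # gradients flow back into the head's weights
```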