Interview with a Kaggle Master, GANs & Much More!

Chitwan Manchanda
4 min read · May 19, 2022

1. Exclusive Interview with 2x Kaggle Master Gilles Vandewiele!

“I think one of the nice things about the data science field is that it is so multi-disciplinary and that anyone who aspires to become a data scientist can do so.” (Gilles Vandewiele) Golden words! As a beginner in data science, this quote gives me a lot of hope, given that I, like many other data science aspirants, don’t come from a scientific or technical background.

While a data scientist should have some experience in each of the steps of such a pipeline, we cannot expect everyone to be an expert in all of those steps. The main reason why Kaggle is a better learning environment than the real world is that your boundaries are pushed further by other competitors: you want to end up high in the competition and thus create a solution that is better than the other solutions (of which there are often thousands); in the real world, you create a solution that fulfills the needs of the clients and then you are done. I typically start out with a schematic drawing of my solution, which helps to structure my post and also gives me an overview of the components that need to be discussed. I think some of the most valuable learning experiences on Kaggle are made in a team, as you learn from others.

Categories: Kaggle

Level: Beginner

Link to the entire article: https://www.analyticsvidhya.com/blog/2020/11/exclusive-interview-with-gilles-vandekaggle-grandmaster-series-exclusive-interview-with-kaggle-rank-147-and-competitions-master-gilles-vandewielewiele/

2. Getting started with Kaggle using Facial Detection

This article will guide you through getting started with Kaggle using the OpenCV (Open Source Computer Vision) library in Python. Kaggle is often referred to as the Airbnb for Data Scientists. If you are interested in venturing into Machine Learning and want to learn by trying out some of the readily available algorithms and libraries, then Kaggle is the right place to start. Let’s go over the code performing facial detection using OpenCV.

Categories: face detection, facial detection, OpenCV

Level: Advanced

Link to the entire article: https://www.analyticsvidhya.com/blog/2021/04/getting-started-with-kaggle-using-facial-detection/

3. Training Neural Network with Keras and basics of Deep Learning

The model can be sequential, meaning that the layers are stacked one on top of the other with a single input and a single output. Because a model built with the functional API is a data structure, it’s simple to save it as a single file that can be used to rebuild the exact model without necessarily knowing the source code. Convolution layers can be created in a variety of ways using the Convolution Layer class. A callback is a type of object that can be used to accomplish tasks at different points of the training process (e.g., at the start/end of an epoch, before/after a single batch). The data is usually in raw format and structured in directories, and it must be preprocessed before it can be supplied to the model for fitting.
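
To make the callback idea concrete, here is a minimal, library-free sketch of a training loop that calls hooks at the start/end of each epoch and before/after each batch. It is not Keras code: the hook names are only loosely modeled on Keras's Callback class, and the `fit` function, the `LossHistory` class, and the toy "loss" are hypothetical names and values made up for illustration.

```python
class Callback:
    """Base class: every hook is a no-op unless a subclass overrides it."""

    def on_epoch_begin(self, epoch):
        pass

    def on_epoch_end(self, epoch, logs):
        pass

    def on_batch_begin(self, batch):
        pass

    def on_batch_end(self, batch, logs):
        pass


class LossHistory(Callback):
    """Hypothetical callback that records the average loss of every epoch."""

    def __init__(self):
        self.epoch_losses = []

    def on_epoch_end(self, epoch, logs):
        self.epoch_losses.append(logs["loss"])


def fit(batches, epochs, callbacks):
    """Toy loop: the 'loss' of a batch is its mean, divided by (epoch + 1)."""
    for epoch in range(epochs):
        for cb in callbacks:
            cb.on_epoch_begin(epoch)
        batch_losses = []
        for index, batch in enumerate(batches):
            for cb in callbacks:
                cb.on_batch_begin(index)
            loss = sum(batch) / len(batch) / (epoch + 1)  # stand-in for a real update
            batch_losses.append(loss)
            for cb in callbacks:
                cb.on_batch_end(index, {"loss": loss})
        epoch_loss = sum(batch_losses) / len(batch_losses)
        for cb in callbacks:
            cb.on_epoch_end(epoch, {"loss": epoch_loss})


history = LossHistory()
fit(batches=[[1.0, 3.0], [2.0, 4.0]], epochs=2, callbacks=[history])
print(history.epoch_losses)
```

The training loop never needs to know what a callback does; it only promises to call each hook at the right moment, which is what makes things like logging or early stopping easy to plug in.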

Categories: Keras, Neural Network

Level: Advanced

Link to the entire article: https://www.analyticsvidhya.com/blog/2021/11/training-neural-network-with-keras-and-basics-of-deep-learning/

4. Logistic Regression using Python and Excel

The ratio of p to (1-p), where p is the probability of success, is called the odds. In simple linear regression, the model estimates the continuous response variable y as a linear function of the explanatory variable x. However, when the response variable is discrete, taking the value 1 or 0 (True or False, Success or Failure), estimation is based on the probability of success.
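
As a rough illustration of how these pieces fit together, the sketch below computes the odds, the log-odds (logit), and its inverse, the sigmoid. It assumes the standard logistic regression model, in which the log-odds are a linear function of x; the function names and the `intercept` and `slope` parameters are hypothetical and do not come from the linked article.

```python
import math


def odds(p):
    """Odds of success: the ratio of p to (1 - p)."""
    return p / (1.0 - p)


def logit(p):
    """Log-odds, the quantity logistic regression models as linear in x."""
    return math.log(odds(p))


def sigmoid(z):
    """Inverse of the logit: maps any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, intercept, slope):
    """Probability of success for input x, given hypothetical coefficients."""
    return sigmoid(intercept + slope * x)


print(odds(0.8))                          # a probability of 0.8 is odds of about 4 to 1
print(predict_probability(2.0, -1.0, 0.5))
```

With an intercept of -1.0 and a slope of 0.5, an input of x = 2.0 gives a linear score of 0, which the sigmoid maps to a probability of 0.5.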

Categories: Advanced MS Excel, Blogathon, Logistic Regression, Python

Level: Beginner

Link to the entire article: https://www.analyticsvidhya.com/blog/2022/02/logistic-regression-using-python-and-excel/

5. Train Your First GAN Model | Let’s Talk About GANs Part 2

The Generator’s goal is to generate a fake image that resembles the given distribution (a set of images). It does so with the following procedure: a set of input vectors (random noise) is passed through the Generator’s neural network, which creates a whole new image by multiplying the Generator’s weight matrix with the input noise. Once we are satisfied with the accuracy of the Generator, we save the weights of the Generator, remove the Discriminator from the network, and use that weight matrix to generate further new images by passing it a different random noise matrix each time. In order to optimize the parameters of GANs, we need a cost function that tells the network how much it needs to improve by calculating the difference between actual and predicted values. The first term of the cost function handles the label value “1”: if the actual value is “1” and the predicted value is “~0”, then log(~0) tends to negative infinity and the loss is very high; if the predicted value is also “~1”, then log(~1) is close to “0” and the loss is very small.
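
The behaviour of that first term can be checked numerically. The sketch below assumes the cost function is the binary cross-entropy, which the excerpt describes but does not name; the function name and the `eps` clipping constant are hypothetical choices for illustration.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss for one prediction; assumes y_true is 0 or 1 and y_pred is in [0, 1]."""
    # Clip the prediction so that log(0) is never evaluated.
    y_pred = min(max(y_pred, eps), 1.0 - eps)
    first_term = y_true * math.log(y_pred)               # active when the label is 1
    second_term = (1 - y_true) * math.log(1.0 - y_pred)  # active when the label is 0
    return -(first_term + second_term)


# Label 1, prediction close to 1: log(~1) is near 0, so the loss is small.
print(binary_cross_entropy(1, 0.99))

# Label 1, prediction close to 0: log(~0) is very negative, so the loss is large.
print(binary_cross_entropy(1, 0.01))
```

The leading minus sign turns the negative logarithms into a positive loss, so a confident wrong prediction is penalized far more heavily than a confident correct one.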

Categories: Blogathon, GAN, Python, Pytorch

Level: Intermediate

Link to the entire article: https://www.analyticsvidhya.com/blog/2021/05/%e2%80%8atrain-your-first-gan-model-lets-talk-about-gans-part-2%e2%80%8a/

Conclusion

I hope you found this blog post insightful. Please do share it with your friends & family and subscribe to my blog Keeping Up With Data Science for more informative content on Data Science straight to your inbox. You can reach out to me on Twitter & LinkedIn. I am quite active there & I will be happy to have a conversation with you. Please feel free to drop your feedback in the comments; it helps me improve the quality of my work. I will keep on sharing more content as I grow & mature as a Data Scientist. Until next time, Keep Hustling & Keep Up with Data Science. Happy Learning 🙂

Chitwan Manchanda

Currently working as an ML Engineer-II at Turing. I like to write about Data Science and ML; check out my work here: bit.ly/KeepingUpWithDS