Shedding light on the 1x1 convolution operation, which appears in the Network in Network paper by Lin et al. and in Google's Inception architecture

Photo by Liam Charmer on Unsplash

Having read the Network in Network (NiN) paper by Lin et al. a while ago, I came across a peculiar convolution operation that allowed cascading of parameters across channels, enabling the network to learn complex interactions by pooling cross-channel information. They called it the “cross channel parametric pooling layer” (if I remember correctly) and showed that it is equivalent to convolution with a 1x1 kernel.
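To make that equivalence concrete, here is a tiny NumPy sketch of my own (the shapes, variable names and random weights are purely illustrative, not taken from the paper): a 1x1 convolution is just one learned linear map across channels, applied independently at every pixel.

```python
import numpy as np

# Toy feature map: 8 input channels over a 4x4 spatial grid (illustrative shapes).
feature_map = np.random.randn(8, 4, 4)

# A 1x1 convolution producing 3 output channels is just a 3x8 weight matrix
# applied at every pixel, i.e. a weighted "pooling" across channels.
weights = np.random.randn(3, 8)

# Apply the same cross-channel linear map at each spatial location.
out = np.einsum('oc,chw->ohw', weights, feature_map)
print(out.shape)  # (3, 4, 4): spatial size unchanged, channels mixed
```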

Skimming over the details at the time (as I often do with such esoteric terminology), I never thought I would be writing about this operation, let alone providing my own thoughts on its workings. But as it goes…


The high computational cost of inferring from a complicated (and often intractable) ‘true posterior distribution’ has always been a stumbling block in the Bayesian framework. However (and thankfully), there are certain inference techniques that are able to achieve a reasonable approximation of this intractable posterior with something… tractable. See what I did there?

Photo by Antoine Dautry on Unsplash

One such approximate inference technique that has gained popularity in recent times is Variational Bayes (VB). Its relatively low computational cost and good empirical approximations have driven the intuition behind successful models like Variational Auto-encoders and more. …
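For a rough sense of what “approximating with something tractable” means, the standard variational identity below (written in generic notation, not tied to any particular model) splits the log-evidence into the ELBO, which VB maximises over a tractable family q(z), plus the KL gap to the true posterior:

```latex
\log p(x) \;=\;
\underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]}_{\text{ELBO: tractable to optimise}}
\;+\;
\underbrace{\mathrm{KL}\!\left(q(z)\,\|\,p(z \mid x)\right)}_{\text{gap to the true posterior}}
```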


Photo by Thomas Martinsen on Unsplash

This year marks the 50th anniversary of the paper by Rudolf E. Kálmán that conferred upon the world the remarkable idea of the Kalman Filter. Appreciation for the beauty (and simplicity) of this filtering technique often gets lost in technical, verbose definitions like the one found on Wikipedia:

In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone. …
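If that prose feels dense, the whole filter boils down to two short steps. Here is a generic, textbook-style predict/update sketch in NumPy (the variable names and the linear-Gaussian set-up are my own illustration, not tied to any particular application):

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter.
    x: state estimate, P: state covariance, z: new measurement.
    F, H, Q, R: transition, measurement, process- and measurement-noise models."""
    # Predict: propagate the state and its uncertainty forward in time.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q

    # Update: blend the prediction with the noisy measurement.
    y = z - H @ x_pred                    # innovation
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

How much the update trusts the measurement versus the prediction is governed entirely by the noise models Q and R, which is where most of the practical tuning happens.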


Picture by Franki Chamaki on Unsplash

If a picture says a thousand words, your data probably says a lot more! You just need to listen.

A few months ago, working on a coursework assignment, we were posed a problem: develop an analytical model capable of detecting malicious or fraudulent transactions from credit-card data logs. Excited, we went to the drawing board to start building our classification model. What started as a seemingly ‘grades-in-the-bag’ assignment quickly turned into a ‘not-so-straightforward’ project. The complications lay in certain artefacts of the supplied data set which, if nothing else, made us question our naivety!

The thing with…


Bayesian Inference is based on generative thinking

In the Machine Learning domain, we often find ourselves in pursuit of inferring a functional relationship between the attribute variable(s) (i.e. the features or simply the input data, x) and their associated response (i.e. the target variable, y). Being able to learn this relationship allows one to build a model that can predict a response, given any set of such attribute variables (i.e. the test data).
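In Bayesian terms, that “functional relationship” is captured by a posterior over model parameters (written generically here as θ, my own notation for concreteness), and predictions for a test input come from averaging over that posterior:

```latex
\underbrace{p(\theta \mid x, y)}_{\text{posterior}} \;=\;
\frac{\overbrace{p(y \mid x, \theta)}^{\text{likelihood}}\;\overbrace{p(\theta)}^{\text{prior}}}{p(y \mid x)},
\qquad
p(y_* \mid x_*, x, y) \;=\; \int p(y_* \mid x_*, \theta)\, p(\theta \mid x, y)\, d\theta
```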


Estimating vital signs like heart rate, breathing rate and SpO2 levels from facial videos using computer vision.

Photo by Ryan Stone on Unsplash

Let's start with a small exercise. But first, I have to ask you to put on your fitness tracker. You do own one, right? Oh… you thought the exercise was mental? Disastrous. The point is, if you’ve been even a tiny bit observant with the fitness device that sits on your wrist at the moment, you will have noticed a shiny light at the back (usually green). Now, if you’ve been curious or a bit more observant, you’ll know that this light is used as a medium (pun unintended) to extract the pulse signal from your wrist! And that’s where…
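To give a flavour of the idea (without spoiling the article), here is a hedged NumPy sketch of how one might pull a heart-rate estimate out of such a light signal; the face-video array, its shape, the RGB channel ordering and the frame rate are all hypothetical placeholders of my own, not the article's pipeline.

```python
import numpy as np

def estimate_heart_rate(frames, fps):
    """frames: hypothetical stack of cropped face frames, shape (T, H, W, 3) in RGB;
    fps: frames per second. Returns a rough heart-rate estimate in beats per minute."""
    # The blood-volume pulse modulates how much green light the skin reflects,
    # so average the green channel over the face in every frame.
    green = frames[..., 1].mean(axis=(1, 2))
    green = green - green.mean()                 # remove the DC offset

    # Find the dominant frequency in a plausible heart-rate band.
    spectrum = np.abs(np.fft.rfft(green))
    freqs = np.fft.rfftfreq(len(green), d=1.0 / fps)
    band = (freqs > 0.7) & (freqs < 4.0)         # roughly 42-240 beats per minute
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak * 60.0                           # Hz -> beats per minute
```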


Photo by Volkan Olmez on Unsplash

When we think of an image, it's almost natural to think of a two-dimensional (2-D) representation, right? Well, read the previous sentence carefully: ‘almost’ is the keyword.

In a recent paper, Scaling down deep learning, Sam Greydanus introduced a rather cool 1-D image dataset called MNIST-1D, labelling it “a minimalist, low-memory, and low-compute alternative to classic deep learning benchmarks.” The benchmark being referred to is the MNIST dataset of hand-written digits, which is quite well known within machine learning (ML) and, more generally, the data science space. Unlike the original MNIST…
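As a rough, illustrative comparison of the two formats (the 1-D length is from memory of the paper's default setup, so treat it as approximate):

```python
import numpy as np

# Classic MNIST: each digit is a 28x28 grid of pixels (a 2-D image).
mnist_example = np.zeros((28, 28))

# MNIST-1D: each example is a short one-dimensional signal instead
# (40 points in the default setup, if I recall correctly), still labelled 0-9.
mnist1d_example = np.zeros(40)

print(mnist_example.size, mnist1d_example.size)  # 784 vs 40
```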


Photo by Matthew Osborn on Unsplash

Back in high school, we all had that phase of derision in response to frequent reprimands from our teachers to scrupulously demonstrate the steps leading to a solution, especially while working out questions in an examination. “It promotes readability and calls attention to your intuition!”, they said. I vividly remember our Mathematics teacher bellowing at a classroom full of iffy students, reiterating the importance of this so-called ‘step-wise’ approach to presenting our solutions. …

Anwesh Marwade

A final-year student pursuing a Master's in Data Science and Computer Vision. A motivated learner with a liking for Sports Analytics. anweshcr7.github.io
