Blog posts

2024

Introduction to Unit Testing

5 minute read

Published:

Why use Unit Testing?

Unit testing is mandatory, not optional, for any serious project. It is also a good negative indicator: it points out poor-quality code with relatively high accuracy. If the code is hard to unit test, that is a strong sign that the code needs improvement.

Harris Detector

5 minute read

Published:

Widely used for corner detection in computer vision

Sequence Types

9 minute read

Published:

Built-in Sequences in Python

Python provides several built-in sequence types, implemented in C, that fall into two categories based on how they store their contents: container sequences and flat sequences.
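As a rough illustration of the two categories, the sketch below contrasts a list (a container sequence, which holds references to objects of any type) with an `array.array` (a flat sequence, which stores values of a single primitive type). The variable names are made up for this example.

```python
from array import array

# Container sequence: a list stores references to objects of any type.
mixed = [1, "two", 3.0]

# Flat sequence: an array stores the values themselves, all of one
# primitive type ("d" means double-precision float).
numbers = array("d", [1.0, 2.0, 3.0])

# A typed array rejects items of the wrong type.
try:
    numbers.append("four")
    accepted = True
except TypeError:
    accepted = False

print(mixed, numbers, accepted)
```

Flat sequences are more compact, but they are limited to primitive values such as bytes, integers, and floats.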

Dunder Methods

1 minute read

Published:

What are special methods?

They are ‘predefined’ methods that allow you to customize the behavior of classes. They are also referred to as “dunder” (double-underscore) methods.
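A minimal sketch of the idea, using a hypothetical `Playlist` class: defining `__len__`, `__getitem__`, and `__repr__` lets instances work with `len()`, indexing, iteration, and `repr()`.

```python
class Playlist:
    """Hypothetical class used only to illustrate special methods."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Called by len(playlist).
        return len(self._songs)

    def __getitem__(self, index):
        # Called by playlist[index]; also enables iteration and `in`.
        return self._songs[index]

    def __repr__(self):
        # Called by repr(playlist) and by the interactive console.
        return f"Playlist({self._songs!r})"


playlist = Playlist(["intro", "verse", "outro"])
size = len(playlist)
first = playlist[0]
print(playlist, size, first)
```

Note that the interpreter calls these methods for you; user code normally writes `len(playlist)` rather than `playlist.__len__()`.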

KNN-Graph

3 minute read

Published:

Let’s say we have a dataset of 10 molecules (A through J), and we want to construct a 3-NN graph (k=3) where k indicates the number of neighbours.
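The construction can be sketched as follows. This is only an illustration under stated assumptions: random 2D points stand in for the ten molecules, and Euclidean distance stands in for whatever molecular distance the real pipeline uses. The function name `knn_graph` is hypothetical.

```python
import math
import random

labels = list("ABCDEFGHIJ")

# Assumption: random 2D coordinates stand in for the molecules.
rng = random.Random(0)
points = {label: (rng.random(), rng.random()) for label in labels}


def knn_graph(points, k):
    """Map each node to its k nearest neighbours (brute force)."""
    graph = {}
    for node, coords in points.items():
        others = [other for other in points if other != node]
        # Sort the remaining nodes by distance to the current node.
        others.sort(key=lambda other: math.dist(coords, points[other]))
        graph[node] = others[:k]
    return graph


graph = knn_graph(points, k=3)
for node, neighbours in graph.items():
    print(node, "->", neighbours)
```

The resulting graph is directed: B can be among the three nearest neighbours of A without A being among those of B.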

TMAP

5 minute read

Published:

The general process of TMAP looks like this: [image]

LSH Forest

3 minute read

Published:

Locality-sensitive hashing (LSH) lets us focus on pairs that are likely to be similar instead of looking at every possible pair. It therefore reduces the computation time required by a brute-force approach.
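To make the idea concrete, here is a small sketch of the classic banding scheme for LSH (a simpler relative of LSH Forest, not the forest structure itself). The signatures are invented toy values, and `candidate_pairs` is a hypothetical helper name.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: toy signatures of length 6, invented for this example.
signatures = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [1, 2, 3, 9, 9, 9],
    "C": [7, 7, 7, 7, 7, 7],
    "D": [8, 8, 8, 4, 5, 6],
}


def candidate_pairs(signatures, bands):
    """Return pairs whose signatures agree on at least one whole band."""
    rows = len(next(iter(signatures.values()))) // bands
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for name, signature in signatures.items():
            # Items with identical values in this band share a bucket.
            key = tuple(signature[band * rows:(band + 1) * rows])
            buckets[key].append(name)
        for names in buckets.values():
            candidates.update(combinations(sorted(names), 2))
    return candidates


pairs = candidate_pairs(signatures, bands=2)
print(sorted(pairs))  # [('A', 'B'), ('A', 'D')]
```

Only two of the six possible pairs come out as candidates, so only those two need a full similarity comparison.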

MinHashing

4 minute read

Published:

TL;DR: MinHashing is a way to reduce the dimensionality of fingerprints. Instead of comparing 512-bit vectors directly (the vectors obtained by applying the fingerprint to a molecule in SMILES format), MinHash reduces these vectors to smaller hashes that preserve the similarity between molecules.
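A minimal MinHash sketch, under stated assumptions: each fingerprint is represented as the set of its on-bit positions, the hash family is a simple random linear function modulo a prime, and the two bit sets are invented. The function names are hypothetical and not taken from any library.

```python
import random


def minhash_signature(on_bits, num_hashes=64, seed=0):
    """Return a MinHash signature for a non-empty set of integers."""
    rng = random.Random(seed)
    prime = 2_147_483_647  # 2**31 - 1
    # Assumption: hash functions of the form (a * x + b) mod prime.
    params = [(rng.randrange(1, prime), rng.randrange(0, prime))
              for _ in range(num_hashes)]
    # For each hash function, keep the smallest hash over all on-bits.
    return [min((a * x + b) % prime for x in on_bits) for a, b in params]


def estimated_similarity(sig_a, sig_b):
    """Fraction of signature positions where the two signatures agree."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Assumption: on-bit positions of two made-up fingerprints.
bits_a = set(range(0, 40))
bits_b = set(range(10, 50))

sig_a = minhash_signature(bits_a)
sig_b = minhash_signature(bits_b)

exact = len(bits_a & bits_b) / len(bits_a | bits_b)  # Jaccard similarity
estimate = estimated_similarity(sig_a, sig_b)
print(exact, estimate)
```

Both signatures must be built with the same seed so that they use the same hash functions. The estimate approximates the Jaccard similarity and generally gets closer as `num_hashes` grows.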