From Data Scraping to Classification, LSTM vs Transformers

Robot photo by Phillip Glickman on Unsplash. Developer persona, beetle, and fox images by Open Clipart, tortoise by creozavr, ladybug by Prawny, bee by gustavorezende, and the other beetle by Clker, all on Pixabay. Composition by author.

Machine learning (ML) techniques for Natural Language Processing (NLP) offer impressive results these days. Libraries such as Keras, PyTorch, and HuggingFace NLP make applying the latest research and models in the area a (relatively) easy task. In this article, I implement and compare two different NLP-based classifier model architectures using data from the Firefox browser issue tracker.

I previously built a similar issue report classifier. Back then, I found the deep-learning-based LSTM (Long Short-Term Memory) model architecture performed very well for the task. I later added an Attention mechanism on top of the LSTM, which improved the results. …
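To give a rough idea of what such a model looks like, here is a minimal Keras sketch of an LSTM text classifier. It is an illustration only, not the exact architecture or data pipeline from the article, and it assumes the issue report texts have already been tokenized into integer sequences and padded to a fixed length.

import tensorflow as tf
from tensorflow.keras import layers

vocab_size = 20000   # assumed vocabulary size
max_len = 200        # assumed padded sequence length

# Embedding -> LSTM -> sigmoid output for a binary label
# (e.g. does the issue belong to a given component or not).
model = tf.keras.Sequential([
    layers.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, 128),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, validation_split=0.1, epochs=3)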


An Explanation Between Cartoons and Greek Symbols

If you spend a bit more time reading about cryptocurrencies, blockchains, or many other related technologies, you will likely run into the term zero-knowledge proofs (ZKP). To me, the term sounds like a paradox. How can you prove anything with zero knowledge? How do you even know what to prove, if you have zero knowledge?

So I tried to build myself some understanding of the topic. In this article, I share that understanding: what zero knowledge means in a ZKP, what the proof is really about, and how the two relate to each other.

I start with…


The data structure within Bitcoin, Amazon Dynamo DB, ZFS, …

This article explores what Merkle trees are, and how they are used in practice in different systems, including Bitcoin, Amazon’s Dynamo DB, and the ZFS filesystem. The basic concept is quite simple, but some of the clever applications are not so obvious.

First, let’s start with the concept of Merkle trees. As I said, it is not too complicated in its basic form.

What is a Merkle Tree

A Merkle tree is fundamentally just a hierarchical set of hash values, building from a set of actual data items (the Merkle leaves) through intermediate hashes (the Merkle branches) up to the Merkle root that summarizes all the data…
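As a minimal Python sketch of that idea: hash the data items to get the leaves, then repeatedly hash pairs of hashes until one root remains. Real systems add details this skips, such as domain separation between leaf and branch hashes; duplicating the last hash on odd-sized levels is one common convention (Bitcoin uses it), not the only one.

import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(items: list[bytes]) -> bytes:
    # Leaf level: hash of each actual data item.
    level = [sha256(item) for item in items]
    while len(level) > 1:
        # If the level has an odd number of nodes, duplicate the last one.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Branch level: hash of each pair of child hashes concatenated.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

print(merkle_root([b"tx1", b"tx2", b"tx3"]).hex())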


Applying the Poisson Distribution to events over time and area, with examples

I find probability distributions are often useful tools to know and understand, but the explanations of them are not always very intuitive. The Poisson distribution is one I have run into quite often, most recently when preparing for some AWS Machine Learning certification questions. Since this is not the first time I have run into it, I figured it would be nice to understand it better. In this article, I explore when, where, and how to apply it. …
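As a quick illustration of the kind of question the distribution answers (the numbers here are made up, not from the article): if events arrive at an average rate of 4 per day, the probability of seeing exactly k events on a given day comes from the Poisson probability mass function.

import math

def poisson_pmf(k: int, lam: float) -> float:
    # P(X = k) = lam^k * e^(-lam) / k!
    return (lam ** k) * math.exp(-lam) / math.factorial(k)

lam = 4.0  # assumed average rate: 4 events per time interval
for k in range(8):
    print(f"P(exactly {k} events) = {poisson_pmf(k, lam):.3f}")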


Making Sense of Big Data

Techniques for Testing Autonomous Cars and other ML-Based Systems

Testing machine learning (ML)-based systems requires different approaches compared to traditional software. With traditional software, the specification and its relation to the implementation are typically quite explicit: “When the user types a valid username and matching password, they are successfully logged in”. Very simple to understand, deterministic, and easy to write a test case for.

ML-based systems are quite different. Instead of clearly defined inputs and logical flows based on explicit programming statements, an ML-based system is based on potentially huge input spaces with probabilistic outcomes from largely black-box components (the models). In this article, I take a look at metamorphic…
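As a rough sketch of what a metamorphic test can look like (the model and transformation here are hypothetical placeholders, not taken from the article): the idea is that even when we cannot say what the correct output is, we can say that a certain change to the input should leave the output unchanged, or change it in a predictable way. For example, slightly brightening camera images should not meaningfully change a driving model's predictions.

import numpy as np

def brightness_metamorphic_test(model, images: np.ndarray,
                                delta: float = 10.0, tolerance: float = 0.1) -> bool:
    """Metamorphic relation: slightly brightening the input images should not
    change the model's outputs much (e.g. predicted steering angles or class
    probabilities). `model` is a placeholder with a Keras-style predict()."""
    original = model.predict(images)
    brighter = np.clip(images + delta, 0.0, 255.0)  # pixel values assumed in 0..255
    changed = model.predict(brighter)
    # The relation holds if all outputs stay within the chosen tolerance.
    return bool(np.all(np.abs(original - changed) <= tolerance))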


Getting Started

Exploring the relations between machine learning metrics

The terminology of a specific domain is often difficult to get started with. Coming from a software engineering background, I find machine learning has many such terms that I need to remember in order to use the tools and read the articles.

Some basic terms are Precision, Recall, and F1-Score. These relate to getting a finer-grained idea of how well a classifier is doing, as opposed to just looking at overall accuracy. Writing an explanation forces me to think it through, and helps me remember the topic myself. That’s why I like to write these articles.

I am looking at a binary classifier in this…
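As a small worked sketch of how these metrics relate for a binary classifier (the confusion-matrix counts below are made up for illustration):

# Made-up confusion-matrix counts: true/false positives and negatives.
tp, fp, fn, tn = 40, 10, 20, 30

precision = tp / (tp + fp)  # of everything predicted positive, how much was right
recall = tp / (tp + fn)     # of all actual positives, how much was found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"f1={f1:.2f} accuracy={accuracy:.2f}")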


How well does LIME work in practice?

Machine learning algorithms can produce impressive results in classification, prediction, anomaly detection, and many other hard problems. Understanding what the results are based on is often complicated, since many algorithms are black boxes with little visibility into their inner workings. Explainable AI is a term referring to techniques for providing human-understandable explanations of ML algorithm outputs.

Explainable AI is interesting for many reasons, including being able to reason about the algorithms used and the data used to train them, and to better understand how to test systems built on such algorithms.

LIME, or Local Interpretable Model-Agnostic Explanations, is one technique…
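As a rough sketch of how LIME is typically used via the lime Python package (the dataset and model below are just placeholders, not from the article; check the package documentation for the exact parameters):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Train a small placeholder classifier to have something to explain.
data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)

# LIME perturbs the instance, queries the model on the perturbations, and
# fits a local interpretable model whose weights serve as the explanation.
explanation = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
print(explanation.as_list())  # (feature condition, weight) pairs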

Teemu Kanstrén

PhD. Technology research and software engineering. Typically I write too long, because I try to understand something myself.
