Home | Hippocampus's Garden

24 posts tagged with "python"

Elo vs Bradley-Terry: Which is Better for Comparing the Performance of LLMs?

March 17, 2024 | 4 min read

Chatbot Arena updated its LLM ranking method from Elo to Bradley-Terry. What changed? Let's dig into the differences.

Simulating Real-Time Chats using Flask's Server-Sent Events

November 06, 2023 | 2 min read

Discover the power of Flask's Server-Sent Events for a better developer experience when building chatbots.

Calculating Color Histogram of Image Tensor: OpenCV vs PyTorch

August 10, 2022 | 3 min read

Two ways to calculate color histogram: OpenCV-based and PyTorch-based.

Fast Way to Get Top-K Elements from Numpy Array

May 14, 2022 | 1 min read

An optimized NumPy implementation of a top-k function.

Meet Pandas: Converting DataFrame to CSR Matrix

March 08, 2022 | 2 min read

This post shows how to convert a DataFrame of user-item interactions to a compressed sparse row (CSR) matrix, the most common format for sparse matrices.

How to Setup Jupyter in Pipenv / Poetry

August 11, 2021 | 2 min read

It's so easy for me to forget how to set up Jupyter in a newly created Poetry / Pipenv environment. So, here it is.

Stats with Python: Multiple Linear Regression

March 31, 2021 | 3 min read

This post steps forward to multiple linear regression. The method of least squares is revisited, this time with linear algebra.

Stats with Python: Simple Linear Regression

March 22, 2021 | 5 min read

This post summarizes the basics of simple linear regression: the method of least squares and the coefficient of determination.

Stats with Python: Sample Correlation Coefficient is Biased

February 24, 2021 | 6 min read

Is the sample correlation coefficient an unbiased estimator? No! This post visualizes how large its bias is and shows how to fix it.

Stats with Python: Rank Correlation

February 06, 2021 | 8 min read

The correlation coefficient is a familiar statistic, but there are several variations whose differences should be noted. This post recaps the definitions of these common measures.

Stats with Python: Finite Population Correction

January 29, 2021 | 6 min read

When you sample from a finite population without replacement, beware the finite population correction. The samples are not independent of each other.

Stats with Python: Unbiased Variance

January 17, 2021 | 6 min read

What is unbiased sample variance? Why divide by n-1? With a little programming with Python, it's easier to understand.

Creating a Face Swapping Model in 10 Minutes

January 13, 2021 | 3 min read

Let's re-implement face swapping in 10 minutes! This post shows a naive solution using a pre-trained CNN and OpenCV.

How I Built 🍣This Sushi Does Not Exist🍣

December 19, 2020 | 3 min read

Lightweight GAN has opened the way for generating fine images with ~100 training samples and affordable computing resources. This post presents "This Sushi Does Not Exist" and how I built it with GAE.

Custom Objective for LightGBM

November 22, 2020 | 6 min read

If you want to use a custom loss function with a modern GBDT model, you'll need the first- and second-order derivatives. This post shows how to implement them, using LightGBM as an example.

Meet Pandas: Group-wise Sampling

October 13, 2020 | 2 min read

This post introduces how to sample groups from a dataset, which is helpful when you want to avoid data leakage.

Different Measures of Feature Importance Behave Differently

September 05, 2020 | 7 min read

This post compares the behaviors of different feature importance measures in tricky situations.

Meet Pandas: Query Dataframe

August 25, 2020 | 2 min read

This post introduces the Pandas `query` method, which allows us to query dataframes in an SQL-like manner.

Learning to Play Slime Volleyball with PFRL

August 08, 2020 | 5 min read

This post introduces PFRL, a new reinforcement learning library, and uses it to learn to play the Slime Volleyball game on Colaboratory.

Meet Pandas: Grouping and Boxplot

June 14, 2020 | 2 min read

This post summarizes how to group data by a variable and draw boxplots of the groups using Pandas and Seaborn.

Reproducing Deep Double Descent

June 13, 2020 | 7 min read

Double descent is one of the mysteries of modern machine learning. I reproduced the main results of the recent paper by Nakkiran et al. and posed some questions that occurred to me.

Meet Pandas: loc, iloc, at & iat

April 27, 2020 | 2 min read

Have you ever confused the Pandas methods `loc`, `at`, and `iloc` with each other? It's no longer confusing once you have this table in mind.

PageRank Explained: Theory, Algorithm, and Some Experiments

March 12, 2020 | 9 min read

How does Google's PageRank work? Its theory and algorithm are explained, followed by numerical experiments.

Downsizing StyleGAN2 for Training on a Single GPU

March 04, 2020 | 3 min read

Want to generate realistic images with a single GPU? This post demonstrates how to downsize StyleGAN2 with only slight performance degradation.
