Posts
-
Benchmarking Postgres on Hetzner servers with pgbench
For an upcoming project, I conducted a comprehensive benchmark of Postgres on both Hetzner VMs and dedicated servers. I won’t go into a detailed analysis of the results (as I am also not a Postgres expert), but I thought sharing them here might be valuable for some of you.
pgbench setup
# AlmaLinux 9
dnf -y module enable postgresql:16
dnf install -y postgresql-server postgresql-contrib postgresql-devel
postgresql-setup --initdb
systemctl start postgresql
sudo -u postgres createdb pgbench_test
sudo -u postgres pgbench -i -s 10 pgbench_test
sudo -u postgres pgbench -c 30 -j 4 -T 120 pgbench_test # write
sudo -u postgres pgbench -S -c 30 -j 4 -T 120 pgbench_test # read
The two benchmark runs each generate a load of 30 parallel client connections, driven by 4 worker threads, for a duration of 120 seconds.
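To compare several servers, it can help to pull only the throughput figure out of each run. The following is a minimal sketch, not part of the original setup: the helper name extract_tps is hypothetical, and it assumes the pgbench summary contains a line starting with "tps = <number>", as recent pgbench versions print.

```shell
#!/usr/bin/env bash
# Sketch: extract the transactions-per-second value from pgbench output.
# Assumption: the summary contains a line of the form "tps = <number> ...".
# extract_tps is a hypothetical helper name, not a pgbench feature.
extract_tps() {
  # Print the third field of the first line that starts with "tps =".
  awk '$1 == "tps" && $2 == "=" { print $3; exit }'
}
```

It reads from standard input, so a run could be piped into it, for example: sudo -u postgres pgbench -c 30 -j 4 -T 120 pgbench_test | extract_tps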
-
Versioning with mkdocs when not using GitHub
Mike is the go-to tool for versioning MkDocs sites. It allows you to easily manage multiple versions of your site and provides a simple way to deploy them to a static hosting service. However, mike is heavily focused on GitHub Pages, which may not be suitable for all use cases. For example, if you want to host your site on a different platform or use a different versioning strategy, it can get a bit complicated.
-
Integrating Umami into Astro website
There are many website analytics applications available today. Aside from the dominant but privacy-concerning Google Analytics, I have been using Umami for some time as my personal choice.
After building my company website with Astro, I wanted to integrate Umami into it. Normally, this is as simple as adding a <script></script> tag into the <head></head> section of the overall layout template.
Problems
However, after doing that, I encountered two issues:
-
Installing AlmaLinux 9 on Hetzner VPS
Background
I recently ordered an AX42 VPS from Hetzner. The idea was to go with AlmaLinux 9 as the OS. When starting off, I realized Hetzner only provides installers for AlmaLinux 8.
After the first install, I upgraded to Alma 9 and quickly realized that there might be a reason for the limitations: the VPS didn’t boot properly anymore. I did some research and found that there is an issue with grub2 in AlmaLinux 9 (at least with Hetzner’s images/networking). Although Hetzner provides Alma 9 installers in general, they prevent you from installing Alma 9 when attempting to do so via installimage using their Rescue System.
-
Ansible - using Woodpecker as an alternative to Semaphore
Ansible can be used in many ways: most people likely execute their playbooks on their local machines. Then there is Ansible Automation Platform (AAP) (formerly Ansible Tower) and its smaller sibling AWX. Both are the standard in larger organizations because they allow for a more controlled way of running Ansible playbooks and offer RBAC capabilities. These two are also really “heavy” and require a lot of resources to run. For some time now, both have also required a Kubernetes backend.
-
Asserting container health when using docker with Ansible
When using Docker with Ansible, the docker_container and docker_compose modules are usually used to manage containers. While these modules work fine and provide a lot of options, the Ansible task itself will only check if this specific task was successful or not. This means it will only check for valid inputs and if the container was started or not. It will not check if the container is actually in a healthy state after starting up.
-
Performing PostgreSQL major version upgrades - in Docker
As an advocate of microservices, I am also running various PostgreSQL databases in Docker containers. This works quite well, and updating is easy, as long as one stays within the major version that was used when the container was first created.
However, when a major version upgrade is needed, things get a bit more complicated. PostgreSQL provides a documentation page on how to upgrade from one major version to another. It has a link to the pg_upgrade executable, which somehow combines all steps into one command. Yet it still requires a backup, having the executable installed in the first place, and so on.
-
Generic website pull request previews using S3 buckets
tl;dr: see the example repo and full code at the end of the post.
Pull request previews of websites are neat: they provide a direct way to inspect changes to a website before they are merged into the main branch. Yet setting up a CI/CD workflow that achieves this is not always trivial and depends on the specific CI/CD provider. This might be the main reason why people rely on Netlify for this task. Netlify does exactly this: it builds a website for each pull request and provides a link to the preview. It is easy to set up as it only requires linking a GitHub repository to Netlify. As long as the repository is public and one is okay with all other restrictions, this is a great solution. Yet, this won’t work for private repositories and repositories which do not live on GitHub.
-
Quarto - Metropolis theme
As a fan of the Metropolis beamer theme (by Matthias Vogelgesang), I created a theme port for use within {xaringan} a few years ago (pat-s/xaringan-metropolis). Fast-forward to today, quarto is the new kid on the block when it comes to presentations in R. Recently it was time for my first presentation in the quarto era, and I used the opportunity to also create a port of the Metropolis theme for quarto.
-
Announcing "rcli" - a command line tool to install and switch between R versions
GitHub repo: pat-s/rcli
Introduction
Ever since I started doing R development, I have been missing a generic way to install any R version on the command line. “Why does one need multiple R versions?” - one might ask.
-
Transitioning from x86 to arm64 on macOS - experiences of an R user
- Update 2022-10-01: Update section listing tools for R version switching
- Update 2022-04-01: Update gfortran section
- Update 2022-02-28: Add section describing how to enable openMP support
- Update 2021-12-15: Add section describing how to install rJava from source
- Update 2021-12-06: Add section describing how to deal with rJava
- Update 2021-11-19: Add section mentioning new libRblas.veclib
- Update 2021-11-14: Removed .R suffix from .Renviron.d files
Note: To avoid (questionable) third-party discussion tools, please post your thoughts and comments in an issue at https://codeberg.org/pat-s/pat-s.me/issues.
-
Running RStudio Server/Workbench as a desktop app on macOS
Introduction
In my work as an R consultant/scientist, I often work on/with RStudio Server (soon to be RStudio Workbench) instances. These have several advantages:
- The workload is executed on the server and not your local machine
- The environment can be centrally managed for many users and prevents OS-related issues on Windows/macOS/Linux user machines
- Sessions keep running in the background even if the local machine is powered off
- Often RStudio Server instances are way quicker than local RStudio Desktop installations
However, there is also (at least) one downside: it runs in a browser (tab). This comes with the issue that the keybindings of your browser of choice will conflict with RStudio’s keybindings. Using keybindings is a great way to improve productivity and the working experience in general, not only in RStudio.
-
Reproducibility of parallel tasks in R
Reproducibility is important. More important than ever. However, making a project reproducible is not as trivial as it sounds.
-
Using ccache to speed up R package checks on Travis CI
Introduction
Continuous integration checking for R packages is usually done on Travis CI because the R community has established a community-driven build framework for R there. In case you are not aware, there are also other tools that try to simplify the CI tasks for R even more.
-
Emoji support for Notion.so on Linux
Notion.so is a great tool for various tasks. I use it as a personal wiki but also for work-related notes.
Unfortunately, there is no native support for Linux, and even though this point has been raised quite often by the community, the Notion team has not provided a Linux desktop app yet. Maybe it will never be shipped.
The Linux desktop world can be evil when you have to make money selling applications: there are a lot of distributions with many different packaging standards and a small user base (compared to macOS and Windows).
-
i3wm: Introducing my Linux desktop setup
Introduction
After switching from Windows/macOS to Linux about 1.5 years ago, I tried many different Linux distributions and desktop environments. In this post I do not want to go into detail about why I am running Arch Linux (Manjaro) but rather talk about the “visual engine” in the background: the desktop environment. Strictly speaking, i3wm is not even a desktop environment but rather “only” a tiling window manager. The key point is that most navigation is done via the keyboard. Windows cannot be resized with the mouse (by default) as they do not open in floating mode but only in full size unless they are split in half (or further) by opening other applications.
-
Arch Linux setup guide tailored towards data science, R and spatial analysis
This guide reflects my view on how to set up a working Arch Linux system tailored towards data science, R and spatial analysis. If you have suggestions for modifications, please open an issue at https://codeberg.org/pat-s/pat-s.me. Enjoy the power of Linux!
-
Convert NEWS.md to ASCII NEWS
Maybe you know that for some packages in R you can view the NEWS file in the help pane of RStudio:
If you click on it, you can instantly see the NEWS file of a package. This is pretty neat if one of your most used packages was updated (maybe there was even a major version change!) and you want to see what changed.
To make this possible, the maintainer needs to have an ASCII NEWS file when submitting the package to CRAN. While having an ASCII NEWS file was the standard some years ago, nowadays almost everyone uses a NEWS.md file written in Markdown. This is good because it’s easier to read in the browser than a plain text file.
-
Auto-mount network shares (cifs, sshfs, nfs) on-demand using autofs
Introduction
At work I usually have to connect to several servers. Some are Windows servers, some are Linux servers. On my local Linux machines (running Kubuntu 17.10 at the time of writing) I usually used /etc/fstab entries. However, the fstab way does not mount on boot and always needs manual re-mounting. I was told that there have been times in which automatic mounting during boot using fstab was working, but I never managed to get it working although I tried several mount options like _netdev and others. Since I often have to re-mount the network shares (whenever there was a network disconnect), an option to automatically re-connect and mount the folders on boot was highly sought after.
-
Introducing R Package 'oddsratio'
You are dealing with statistical models (GLMs or GAMs) with a binomial response variable? Then the oddsratio package will improve your analysis routine!
This package simplifies the calculation of odds ratios in binomial models. For GAMs, it also provides you with the power to insert your results into the smooth functions of your predictors! But let’s start with some basics…
GLM
The concept of odds ratio calculation
The standard approach to calculate odds ratios in Generalized Linear Models (GLMs) is to exponentiate the function coefficients using exp(coef(model)). Since the coefficients are returned in log odds, exponentiating converts them to odds. But wait! We want odds ratios showing the change in odds for a specific predictor change! Usually you just create a vector which stores the increments of your predictors you want to calculate odds ratios for. Next, you have to remove the first value of the coef output (which is usually the intercept) because you only want to calculate odds ratios for your predictors! Then you multiply the coefficients with your increment values.