<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Arian Farid</title>
        <link>https://arianfarid.me/</link>
        <description>Arian Farid's blog covering software development and complex systems.</description>
        <lastBuildDate>Sun, 26 Apr 2026 02:17:56 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <image>
            <title>Arian Farid</title>
            <url>https://arianfarid.me/images/avatar.jpeg</url>
            <link>https://arianfarid.me/</link>
        </image>
        <copyright>Copyright © 2026 Arian Farid</copyright>
        <item>
            <title><![CDATA[Simple Key-Value Store Built in Golang]]></title>
            <link>https://arianfarid.me/articles/simple-kv-cache</link>
            <guid isPermaLink="false">https://arianfarid.me/articles/simple-kv-cache</guid>
            <pubDate>Sat, 13 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[An append-only log, built with zero dependencies in Golang, and how I integrate it into my workflow with Hammerspoon.]]></description>
            <author>Arian Farid</author>
            <category>Golang</category>
            <category>Algorithms</category>
            <category>Cache</category>
            <category>Store</category>
            <category>Hammerspoon</category>
        </item>
        <item>
            <title><![CDATA[Bitwise DNA Compression in Rust: Small Footprint with Fast Reverse Complements]]></title>
            <link>https://arianfarid.me/articles/dna-compression</link>
            <guid isPermaLink="false">https://arianfarid.me/articles/dna-compression</guid>
            <pubDate>Tue, 17 Jun 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[How I used Rust to compress DNA sequences with 4-bit encodings, enabling fast, rotation-based complementary base lookups directly on the packed bits.]]></description>
            <content:encoded><![CDATA[<p>DNA datasets are massive. A single human genome can take several gigabytes even in its simplest stored form. Some storage formats can scale to <a href="https://medium.com/precision-medicine/how-big-is-the-human-genome-e90caa3409b0" target="_blank" rel="noreferrer">200 GB for a single genome alone</a>. As DNA sequencing becomes cheaper, roughly <a href="https://www.genome.gov/about-genomics/fact-sheets/Genomic-Data-Science" target="_blank" rel="noreferrer">40 exabytes of genomic data are produced per year</a>.</p>
<p>Storing these sequences efficiently is a critical challenge, and the ability to analyze very large sequences is increasingly important. In this post, we will explore a method that compresses DNA to 4 bits per nucleotide in pure Rust and lets us generate complementary bases without leaving the compressed form.</p>
<p>This technique is especially useful in DNA analytical pipelines, where performance and memory constraints are critical. By minimizing the footprint of each sequence, we simultaneously reduce storage overhead and in-memory costs, without sacrificing speed or the ability to operate directly on compressed data.</p>
<h2 id="background" tabindex="-1">Background <a class="header-anchor" href="#background" aria-label="Permalink to &quot;Background&quot;">&ZeroWidthSpace;</a></h2>
<h3 id="dna-bases-and-iupac-codes" tabindex="-1">DNA Bases and IUPAC Codes <a class="header-anchor" href="#dna-bases-and-iupac-codes" aria-label="Permalink to &quot;DNA Bases and IUPAC Codes&quot;">&ZeroWidthSpace;</a></h3>
<p>There are <a href="https://genome.ucsc.edu/goldenPath/help/iupac.html" target="_blank" rel="noreferrer">15 IUPAC codes</a>. The most familiar are &quot;A&quot;, &quot;G&quot;, &quot;C&quot;, and &quot;T&quot;, which represent the four standard DNA bases. However, DNA sequencing often produces ambiguous results, and the remaining 11 codes cover these cases. For example, &quot;R&quot; can represent &quot;G&quot; <em>or</em> &quot;A&quot;, while &quot;N&quot; can represent <em>any</em> nucleotide.</p>
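<p>To make the idea of ambiguity codes concrete, here is a minimal sketch of one way to fit all 15 codes into 4 bits. The bit layout, the constants, and the function names <code>encode</code>, <code>complement</code>, and <code>pack_pair</code> are assumptions chosen for illustration, not necessarily the encoding used later in this post. Each standard base gets one bit, so an ambiguity code is simply the union of the bases it can stand for, and with this particular layout the complement becomes a rotation of the nibble by two positions.</p>

```rust
// Hypothetical bit layout (an assumption for illustration):
// one bit per standard base, so ambiguity codes are unions of bits.
const A: u8 = 0b0001;
const C: u8 = 0b0010;
const T: u8 = 0b0100;
const G: u8 = 0b1000;

/// Encode one IUPAC symbol (ASCII byte) as a 4-bit mask.
/// Returns 0 for any byte that is not one of the 15 codes.
fn encode(symbol: u8) -> u8 {
    match symbol.to_ascii_uppercase() {
        b'A' => A,
        b'C' => C,
        b'T' => T,
        b'G' => G,
        b'R' => A | G,
        b'Y' => C | T,
        b'S' => C | G,
        b'W' => A | T,
        b'K' => G | T,
        b'M' => A | C,
        b'B' => C | G | T,
        b'D' => A | G | T,
        b'H' => A | C | T,
        b'V' => A | C | G,
        b'N' => A | C | G | T,
        _ => 0,
    }
}

/// Complement a 4-bit code by rotating the nibble two positions.
/// With the layout above this swaps A with T and C with G,
/// and it maps ambiguity codes correctly as well (R becomes Y, N stays N).
fn complement(code: u8) -> u8 {
    ((code << 2) | (code >> 2)) & 0x0F
}

/// Pack two symbols into one byte, first symbol in the high nibble.
fn pack_pair(first: u8, second: u8) -> u8 {
    (encode(first) << 4) | encode(second)
}
```

<p>Under these assumptions, <code>complement(encode(b'R'))</code> equals <code>encode(b'Y')</code>, and <code>pack_pair(b'A', b'G')</code> stores two nucleotides in a single byte, half the size of the ASCII text.</p>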
<details>
<summary>Click to show all 15 IUPAC codes</summary>
<p>|  | Symbol | Bases |
|</p>
]]></content:encoded>
            <author>Arian Farid</author>
            <category>Rust</category>
            <category>Systems Programming</category>
            <category>Algorithms</category>
            <category>DNA</category>
            <category>Compression</category>
            <category>Bitwise</category>
            <category>Bioinformatics</category>
        </item>
    </channel>
</rss>