<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://mukulkadel.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://mukulkadel.com/" rel="alternate" type="text/html" /><updated>2026-05-23T18:31:05+00:00</updated><id>https://mukulkadel.com/feed.xml</id><title type="html">Mukul Kadel</title><subtitle>Software developer&apos;s blog — technology, programming, book reviews, and more.</subtitle><author><name>Mukul Kadel</name></author><entry><title type="html">Understanding ArgoCD: GitOps for Kubernetes</title><link href="https://mukulkadel.com/argocd-gitops-kubernetes/" rel="alternate" type="text/html" title="Understanding ArgoCD: GitOps for Kubernetes" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/argocd-gitops-kubernetes</id><content type="html" xml:base="https://mukulkadel.com/argocd-gitops-kubernetes/"><![CDATA[<p>With Kubernetes, you can apply manifests manually with <code>kubectl apply</code> or trigger deployments from a CI pipeline. Both approaches work — until someone applies a hotfix directly to the cluster and now production no longer matches what’s in git. ArgoCD solves this by making git the single source of truth: the cluster continuously reconciles itself to match whatever is in the repository.</p>

<h2 id="what-is-gitops">What is GitOps?</h2>

<p><strong>GitOps</strong> is an operational model where:</p>

<ol>
  <li>The desired state of your system lives in git</li>
  <li>An agent continuously compares desired state to actual state</li>
  <li>Any drift is automatically corrected (or flagged)</li>
</ol>

<p>With ArgoCD, you push a change to git and the cluster catches up — you never <code>kubectl apply</code> to production directly.</p>
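
<p>The compare-and-correct loop can be sketched in a few lines of shell. This is only a toy illustration of the idea, not how ArgoCD is implemented: the <code>reconcile</code> function and its two file arguments (one standing in for git, one for the live cluster) are made up for the example.</p>

<pre><code class="language-bash"># Toy reconcile step: compare desired state with actual state, fix drift.
# Both arguments are plain files here (hypothetical stand-ins for a git
# checkout and the live cluster).
reconcile() {
  desired="$1"
  actual="$2"
  if cmp -s "$desired" "$actual"; then
    echo "in sync"
  else
    cp "$desired" "$actual"   # overwrite the drifted state with the desired one
    echo "drift corrected"
  fi
}
</code></pre>

<p>Running <code>reconcile desired.yaml live.yaml</code> on a timer gives the same shape of behaviour as the three steps above: a manual edit to the "live" file is reverted on the next pass.</p>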

<h2 id="installing-argocd">Installing ArgoCD</h2>

<pre><code class="language-bash">$ kubectl create namespace argocd

$ kubectl apply -n argocd -f \
  https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# wait for pods to be ready
$ kubectl wait --for=condition=available deployment \
  -l app.kubernetes.io/name=argocd-server \
  -n argocd --timeout=120s

# port-forward the UI
$ kubectl port-forward svc/argocd-server -n argocd 8080:443
</code></pre>

<p>Get the initial admin password:</p>

<pre><code class="language-bash">$ kubectl get secret argocd-initial-admin-secret -n argocd \
  -o jsonpath='{.data.password}' | base64 -d
r8Kf9pXqZmN2wQ4T
</code></pre>

<p>Visit <code>https://localhost:8080</code>, log in as <code>admin</code>, and you’re in.</p>

<p>Install the CLI:</p>

<pre><code class="language-bash">$ brew install argocd

$ argocd login localhost:8080 \
  --username admin \
  --password r8Kf9pXqZmN2wQ4T \
  --insecure
</code></pre>

<h2 id="creating-your-first-application">Creating Your First Application</h2>

<p>An ArgoCD <strong>Application</strong> links a git repository path to a cluster namespace. Here’s the YAML approach:</p>

<pre><code class="language-yaml">apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default

  source:
    repoURL: https://github.com/myorg/k8s-manifests
    targetRevision: main
    path: apps/myapp

  destination:
    server: https://kubernetes.default.svc
    namespace: production

  syncPolicy:
    automated:
      prune: true       # delete resources removed from git
      selfHeal: true    # revert manual changes to the cluster
    syncOptions:
      - CreateNamespace=true
</code></pre>

<pre><code class="language-bash">$ kubectl apply -f myapp-argocd.yaml
application.argoproj.io/myapp created

$ argocd app get myapp
Name:               myapp
Project:            default
Server:             https://kubernetes.default.svc
Namespace:          production
URL:                https://localhost:8080/applications/myapp
Repo:               https://github.com/myorg/k8s-manifests
Target:             main
Path:               apps/myapp
Sync Policy:        Automated (Prune, Self Heal)
Sync Status:        Synced to main (a3f2c9d)
Health Status:      Healthy
</code></pre>

<h2 id="sync-policies">Sync Policies</h2>

<p>ArgoCD can sync manually or automatically.</p>

<p><strong>Manual sync</strong> — ArgoCD shows drift but waits for you to approve:</p>

<pre><code class="language-bash">$ argocd app sync myapp
</code></pre>

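<p><strong>Automated sync</strong> — ArgoCD applies changes on its own once it notices a new commit. By default it polls the repository every few minutes, so expect a short delay after a git push unless you configure a webhook. Use <code>prune: true</code> to also delete resources that were removed from git, and <code>selfHeal: true</code> to revert any manual <code>kubectl</code> edits.</p>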

<p>For production, many teams prefer <strong>automated sync with manual promotion</strong>: automated in staging, manual approval gate for production. You can set up ArgoCD sync windows to restrict when automated syncs are allowed.</p>

<h2 id="repository-structure-patterns">Repository Structure Patterns</h2>

<p>ArgoCD works with raw manifests, Helm charts, or Kustomize. A common layout for multiple apps and environments:</p>

<pre><code>k8s-manifests/
├── apps/
│   ├── myapp/
│   │   ├── base/          # shared manifests
│   │   └── overlays/
│   │       ├── staging/   # kustomize patches for staging
│   │       └── production/
│   └── api-gateway/
└── argocd/
    └── applications/      # ArgoCD Application YAMLs
</code></pre>

<p>Kustomize source:</p>

<pre><code class="language-yaml">source:
  repoURL: https://github.com/myorg/k8s-manifests
  path: apps/myapp/overlays/production
  targetRevision: main
</code></pre>

<p>Helm chart source:</p>

<pre><code class="language-yaml">source:
  repoURL: https://github.com/myorg/k8s-manifests
  path: helm/myapp
  targetRevision: main
  helm:
    valueFiles:
      - values/production.yaml
</code></pre>

<h2 id="app-of-apps-pattern">App of Apps Pattern</h2>

<p>Managing dozens of ArgoCD Application resources manually doesn’t scale. The <strong>App of Apps</strong> pattern uses one root Application to manage all others:</p>

<pre><code class="language-yaml"># root-app.yaml
spec:
  source:
    path: argocd/applications   # directory of Application YAMLs
</code></pre>

<p>Push a new Application YAML to <code>argocd/applications/</code> and ArgoCD picks it up automatically — no manual <code>kubectl apply</code> needed.</p>

<h2 id="checking-sync-status">Checking Sync Status</h2>

<pre><code class="language-bash"># list all apps
$ argocd app list
NAME    CLUSTER        NAMESPACE   PROJECT  STATUS  HEALTH   SYNCPOLICY
myapp   in-cluster     production  default  Synced  Healthy  Auto-Prune

# see what's out of sync (before syncing)
$ argocd app diff myapp

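# trigger a sync now, deleting resources that are no longer in git
$ argocd app sync myapp --prune

# roll back to a previously deployed revision (take the ID from `argocd app history`;
# rollback is refused while automated sync is enabled)
$ argocd app rollback myapp &lt;history-id&gt;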
</code></pre>

<h2 id="conclusion">Conclusion</h2>

<p>ArgoCD enforces a discipline that’s hard to achieve with pipelines alone: every change to the cluster goes through git, drift is visible and correctable, and the history of every deployment is a git commit. The combination of automated sync, self-healing, and pull-request-based change management gives you Kubernetes deployments that are auditable, reproducible, and recoverable. Pair it with Helm or Kustomize for per-environment configuration and you have a complete GitOps workflow.</p>]]></content><author><name>Mukul Kadel</name></author><category term="wiki" /><category term="Programming" /><category term="argocd" /><category term="gitops" /><category term="kubernetes" /><category term="k8s" /><category term="devops" /><category term="continuous delivery" /><category term="cloud native" /><category term="deployment" /><summary type="html"><![CDATA[Learn how ArgoCD implements GitOps by syncing Kubernetes clusters to a Git repository, with a full setup walkthrough and real-world deployment patterns.]]></summary></entry><entry><title type="html">awk Command Cheat Sheet with Real Examples</title><link href="https://mukulkadel.com/awk-command-cheat-sheet/" rel="alternate" type="text/html" title="awk Command Cheat Sheet with Real Examples" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/awk-command-cheat-sheet</id><content type="html" xml:base="https://mukulkadel.com/awk-command-cheat-sheet/"><![CDATA[<p><code>awk</code> is one of those Unix tools that looks cryptic at first glance and then becomes indispensable. It reads input line by line, splits each line into fields, and lets you run a tiny program against each one — making it perfect for log parsing, report generation, and quick data transformations without writing a script.</p>

<h2 id="basic-structure">Basic Structure</h2>

<p>Every <code>awk</code> program follows the same pattern:</p>

<pre><code>awk 'pattern { action }' file
</code></pre>

<ul>
  <li><strong>pattern</strong> — a condition that must be true for the action to run (optional)</li>
  <li><strong>action</strong> — what to do with the matching line (optional; default is <code>print</code>)</li>
</ul>

<p>If you omit the pattern, the action runs on every line. If you omit the action, <code>awk</code> prints the matching lines.</p>
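
<p>The three combinations are easy to compare side by side. The sketch below runs each one against the same made-up three-line input held in a shell variable (<code>data</code> is just a name chosen for the example):</p>

<pre><code class="language-bash"># Sample input: name and a number per line (made-up data)
data='alice 42
bob 17
carol 8'

# Pattern only: the default action prints each matching line
echo "$data" | awk '/bob/'                # bob 17

# Action only: runs on every line
echo "$data" | awk '{ print $1 }'         # alice, bob, carol (one per line)

# Pattern and action: the action runs only on matching lines
echo "$data" | awk '/^c/ { print $2 }'    # 8
</code></pre>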

<h2 id="fields-and-the-field-separator">Fields and the Field Separator</h2>

<p><code>awk</code> splits each line on whitespace by default. Fields are numbered <code>$1</code>, <code>$2</code>, …, and <code>$0</code> is the entire line.</p>

<pre><code class="language-bash">$ echo "Alice 30 Engineer" | awk '{ print $1, $3 }'
Alice Engineer
</code></pre>

<p>Change the field separator with <code>-F</code>:</p>

<pre><code class="language-bash">$ echo "root:x:0:0:root:/root:/bin/bash" | awk -F: '{ print $1, $7 }'
root /bin/bash
</code></pre>

<p>The separator can also be multi-character or a regex. Here a bracket expression splits on either <code>=</code> or <code>:</code>:</p>

<pre><code class="language-bash">$ echo "key=value:extra" | awk -F'[=:]' '{ print $2 }'
value
</code></pre>

<h2 id="built-in-variables">Built-in Variables</h2>

<table>
  <thead>
    <tr>
      <th>Variable</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>$0</code></td>
      <td>The full current line</td>
    </tr>
    <tr>
      <td><code>$1</code>, <code>$2</code>, …</td>
      <td>Individual fields</td>
    </tr>
    <tr>
      <td><code>NF</code></td>
      <td>Number of fields on the current line</td>
    </tr>
    <tr>
      <td><code>NR</code></td>
      <td>Current line (record) number</td>
    </tr>
    <tr>
      <td><code>FS</code></td>
      <td>Input field separator (default: whitespace)</td>
    </tr>
    <tr>
      <td><code>OFS</code></td>
      <td>Output field separator (default: space)</td>
    </tr>
    <tr>
      <td><code>RS</code></td>
      <td>Input record separator (default: newline)</td>
    </tr>
    <tr>
      <td><code>ORS</code></td>
      <td>Output record separator (default: newline)</td>
    </tr>
    <tr>
      <td><code>FILENAME</code></td>
      <td>Name of the current input file</td>
    </tr>
  </tbody>
</table>

<pre><code class="language-bash">$ awk '{ print NR, NF, $0 }' /etc/hosts
1 2 127.0.0.1 localhost
2 2 ::1 localhost
3 2 255.255.255.255 broadcasthost
</code></pre>

<h2 id="pattern-matching">Pattern Matching</h2>

<p>Match lines containing a regex:</p>

<pre><code class="language-bash">$ awk '/ERROR/' app.log
2024-01-15 ERROR connection refused on port 5432
2024-01-15 ERROR disk quota exceeded
</code></pre>

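<p>Negate a whole-line match with <code>!</code> (to negate a match on a single field, use <code>!~</code>, as in <code>$3 !~ /DEBUG/</code>):</p>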

<pre><code class="language-bash">$ awk '!/DEBUG/' app.log
</code></pre>

<p>Match a specific field against a pattern:</p>

<pre><code class="language-bash">$ awk '$3 ~ /ERROR/' app.log
</code></pre>

<p>Numeric comparisons work directly:</p>

<pre><code class="language-bash">$ awk '$5 &gt; 1000' access.log
</code></pre>

<h2 id="begin-and-end-blocks">BEGIN and END Blocks</h2>

<p><code>BEGIN</code> runs before any input is read. <code>END</code> runs after all input is processed. Both are optional.</p>

<pre><code class="language-bash">$ awk 'BEGIN { print "=== Report ===" } { print $1 } END { print "=== Done ===" }' names.txt
=== Report ===
Alice
Bob
Carol
=== Done ===
</code></pre>

<p>A classic use: summing a column.</p>

<pre><code class="language-bash">$ awk '{ sum += $3 } END { print "Total:", sum }' sales.txt
Total: 48320
</code></pre>

<h2 id="printing-and-formatting">Printing and Formatting</h2>

<p><code>print</code> separates items with <code>OFS</code> (default space). <code>printf</code> works just like C:</p>

<pre><code class="language-bash">$ awk '{ printf "%-20s %5d\n", $1, $2 }' data.txt
Alice                   30
Bob                     25
</code></pre>

<p>Set <code>OFS</code> to change the output delimiter:</p>

<pre><code class="language-bash">$ echo "Alice 30 Engineer" | awk 'BEGIN { OFS="," } { print $1, $2, $3 }'
Alice,30,Engineer
</code></pre>

<p>Print the last field regardless of how many there are:</p>

<pre><code class="language-bash">$ echo "a b c d e" | awk '{ print $NF }'
e
</code></pre>

<p>Print all but the first field:</p>

<pre><code class="language-bash">$ echo "timestamp INFO message goes here" | awk '{ $1=""; print $0 }'
 INFO message goes here
</code></pre>

<h2 id="conditionals-and-loops">Conditionals and Loops</h2>

<p><code>awk</code> has full <code>if/else</code>, <code>for</code>, and <code>while</code> support inside action blocks:</p>

<pre><code class="language-bash">$ awk '{ if ($2 &gt;= 90) print $1, "PASS"; else print $1, "FAIL" }' scores.txt
Alice PASS
Bob FAIL
Carol PASS
</code></pre>

<pre><code class="language-bash">$ awk 'BEGIN { for (i=1; i&lt;=5; i++) print i }'
1
2
3
4
5
</code></pre>

<h2 id="arrays">Arrays</h2>

<p><code>awk</code> arrays are associative (like a hash map). You don’t declare them — just use them:</p>

<pre><code class="language-bash">$ awk '{ count[$1]++ } END { for (user in count) print user, count[user] }' access.log
alice 42
bob 17
carol 8
</code></pre>

<p>Check if a key exists with <code>in</code>:</p>

<pre><code class="language-bash">$ awk '$1 in seen { next } { seen[$1]=1; print }' file.txt
</code></pre>

<p>This prints only the first occurrence of each value in column 1 — a simple dedup.</p>
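
<p>The same array idiom extends from counting to per-key totals. A minimal sketch, assuming a made-up two-column input of region and amount (the <code>sales</code> variable and its values are invented for illustration):</p>

<pre><code class="language-bash"># Sample input: region and amount (made-up data)
sales='north 100
south 250
north 50'

# total[] is keyed by region. The order of a for-in loop is unspecified,
# so the result is piped through sort to get stable output.
echo "$sales" | awk '{ total[$1] += $2 } END { for (r in total) print r, total[r] }' | sort
</code></pre>

<p>With the sample data this prints <code>north 150</code> followed by <code>south 250</code>.</p>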

<h2 id="common-one-liners">Common One-Liners</h2>

<p><strong>Print lines between two patterns (inclusive):</strong></p>

<pre><code class="language-bash">$ awk '/START/,/END/' file.txt
</code></pre>

<p><strong>Count lines matching a pattern:</strong></p>

<pre><code class="language-bash">$ awk '/ERROR/ { count++ } END { print count }' app.log
14
</code></pre>

<p><strong>Remove duplicate lines (preserving order):</strong></p>

<pre><code class="language-bash">$ awk '!seen[$0]++' file.txt
</code></pre>

<p><strong>Print every other line (the even-numbered ones):</strong></p>

<pre><code class="language-bash">$ awk 'NR % 2 == 0' file.txt
</code></pre>

<p><strong>Sum the second column of a CSV:</strong></p>

<pre><code class="language-bash">$ awk -F, '{ sum += $2 } END { print sum }' data.csv
</code></pre>

<p><strong>Convert whitespace-delimited output to CSV:</strong></p>

<pre><code class="language-bash">$ ps aux | awk 'NR&gt;1 { print $1","$2","$11 }'
root,1,/sbin/launchd
_windowserver,188,/System/Library/PrivateFrameworks/SkyLight.framework/...
</code></pre>

<p><strong>Find lines longer than 80 characters:</strong></p>

<pre><code class="language-bash">$ awk 'length($0) &gt; 80' source.c
</code></pre>

<p><strong>Print unique values of a field:</strong></p>

<pre><code class="language-bash">$ awk -F: '!seen[$1]++ { print $1 }' /etc/passwd
root
nobody
daemon
</code></pre>

<h2 id="multiple-files-and-filename">Multiple Files and FILENAME</h2>

<p>When processing multiple files, <code>FILENAME</code> tells you which file the current line came from:</p>

<pre><code class="language-bash">$ awk '{ print FILENAME, NR, $0 }' *.log
access.log 1 GET /index.html 200
error.log 2 PHP Warning: ...
</code></pre>

<p>Reset a counter per file using <code>FNR</code> (file-relative record number) vs <code>NR</code> (global):</p>

<pre><code class="language-bash">$ awk 'FNR==1 { print "--- File:", FILENAME }' *.log
--- File: access.log
--- File: error.log
</code></pre>

<h2 id="conclusion">Conclusion</h2>

<p><code>awk</code> covers an enormous range of tasks — anywhere from a quick column extraction to a multi-pass report with totals and grouping. The key mental model is: pattern gates the action, fields are <code>$1</code>…<code>$NF</code>, and <code>BEGIN</code>/<code>END</code> handle setup and teardown. Once those three ideas are solid, most <code>awk</code> one-liners read naturally. For anything more complex, <code>sed</code> handles in-place substitution and <code>jq</code> handles JSON — but for structured plaintext, <code>awk</code> is still the sharpest tool in the box.</p>]]></content><author><name>Mukul Kadel</name></author><category term="wiki" /><category term="unix" /><category term="awk" /><category term="unix" /><category term="command line" /><category term="linux" /><category term="text processing" /><category term="cheatsheet" /><category term="terminal" /><category term="scripting" /><summary type="html"><![CDATA[A practical awk reference covering field splitting, patterns, built-in variables, and real one-liners for text processing on Linux and macOS.]]></summary></entry><entry><title type="html">cron and crontab — Scheduling Jobs on Linux and macOS</title><link href="https://mukulkadel.com/cron-crontab-scheduling-guide/" rel="alternate" type="text/html" title="cron and crontab — Scheduling Jobs on Linux and macOS" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/cron-crontab-scheduling-guide</id><content type="html" xml:base="https://mukulkadel.com/cron-crontab-scheduling-guide/"><![CDATA[<p><code>cron</code> is the backbone of task automation on Unix systems. Whether you’re rotating logs at midnight, sending a daily report, or polling an API every five minutes, cron is almost certainly the right tool for the job — and it’s been reliable enough to ship on every major Linux and macOS system for decades.</p>

<h2 id="how-cron-works">How cron works</h2>

<p><code>crond</code> is a daemon that runs in the background and wakes up every minute to check if any scheduled job is due. Jobs are defined in a <strong>crontab</strong> (cron table) — a plain text file with one job per line. Each line specifies a schedule and a command.</p>

<pre><code class="language-bash">$ crontab -l
# no crontab for mukul
</code></pre>

<p>To edit your crontab:</p>

<pre><code class="language-bash">$ crontab -e
</code></pre>

<p>This opens the file in <code>$EDITOR</code> (usually <code>vi</code> or <code>nano</code>). Changes take effect immediately after saving.</p>

<h2 id="crontab-syntax">Crontab syntax</h2>

<p>Every cron entry follows this structure:</p>

<pre><code>* * * * * /path/to/command
│ │ │ │ │
│ │ │ │ └─ Day of week  (0–7, 0 and 7 are Sunday)
│ │ │ └─── Month        (1–12)
│ │ └───── Day of month (1–31)
│ └─────── Hour         (0–23)
└───────── Minute       (0–59)
</code></pre>

<p>A <code>*</code> means “every valid value”. Here are some examples:</p>

<table>
  <thead>
    <tr>
      <th>Schedule</th>
      <th>Expression</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Every minute</td>
      <td><code>* * * * *</code></td>
    </tr>
    <tr>
      <td>Every day at midnight</td>
      <td><code>0 0 * * *</code></td>
    </tr>
    <tr>
      <td>Every Monday at 9 AM</td>
      <td><code>0 9 * * 1</code></td>
    </tr>
    <tr>
      <td>Every 5 minutes</td>
      <td><code>*/5 * * * *</code></td>
    </tr>
    <tr>
      <td>First day of every month</td>
      <td><code>0 0 1 * *</code></td>
    </tr>
    <tr>
      <td>Every weekday at 6:30 PM</td>
      <td><code>30 18 * * 1-5</code></td>
    </tr>
  </tbody>
</table>

<h3 id="step-values">Step values</h3>

<p><code>*/n</code> means “every n units”. <code>*/15</code> in the minute field means every 15 minutes.</p>

<h3 id="range-and-list">Range and list</h3>

<ul>
  <li>A range: <code>1-5</code> (Monday through Friday)</li>
  <li>A list: <code>1,3,5</code> (Monday, Wednesday, Friday)</li>
  <li>Combined: <code>0 9 * * 1,3,5</code> — 9 AM on Mon, Wed, Fri</li>
</ul>
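
<p>To see which values a field actually matches, it can help to expand it by hand. The helper below is a rough sketch, not part of cron: <code>expand_field</code> is a hypothetical function that handles only <code>*</code>, <code>*/n</code>, a single range, and a plain list. It does not cover combinations such as <code>1-5/2</code> or day names, and it relies on <code>seq</code>, which is present on Linux and macOS but is not required by POSIX.</p>

<pre><code class="language-bash"># expand_field FIELD MIN MAX
# Prints, one per line, the values a single cron field would match.
# Hypothetical helper for illustration; supports *, */n, a-b and a,b,c only.
expand_field() {
  field="$1"
  min="$2"
  max="$3"
  case "$field" in
    '*')   seq "$min" "$max" ;;                    # every valid value
    '*/'*) seq "$min" "${field#*/}" "$max" ;;      # step: start at MIN, add n each time
    *-*)   seq "${field%-*}" "${field#*-}" ;;      # range a-b
    *)     echo "$field" | tr ',' '\n' ;;          # list a,b,c (or a single value)
  esac
}

expand_field '*/15' 0 59    # 0 15 30 45
expand_field '1-5' 0 7      # 1 2 3 4 5
expand_field '1,3,5' 0 7    # 1 3 5
</code></pre>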

<h2 id="practical-examples">Practical examples</h2>

<h3 id="run-a-backup-script-nightly-at-2-am">Run a backup script nightly at 2 AM</h3>

<pre><code>0 2 * * * /home/mukul/scripts/backup.sh &gt;&gt; /var/log/backup.log 2&gt;&amp;1
</code></pre>

<p>The <code>&gt;&gt; ... 2&gt;&amp;1</code> appends both stdout and stderr to a log file. Without redirection, cron tries to mail the output to the crontab’s owner, and on machines with no mail delivery configured that output is simply lost.</p>

<h3 id="clear-a-temp-directory-every-sunday-at-3-am">Clear a temp directory every Sunday at 3 AM</h3>

<pre><code>0 3 * * 0 rm -rf /tmp/myapp/cache/*
</code></pre>

<h3 id="restart-a-service-if-it-crashes-crude-watchdog">Restart a service if it crashes (crude watchdog)</h3>

<pre><code>*/5 * * * * pgrep myapp || systemctl restart myapp
</code></pre>

<h3 id="pull-latest-code-and-restart-an-app-every-hour">Pull latest code and restart an app every hour</h3>

<pre><code>0 * * * * cd /srv/myapp &amp;&amp; git pull &amp;&amp; systemctl restart myapp
</code></pre>

<h2 id="environment-in-cron">Environment in cron</h2>

<p>Cron jobs run with a minimal environment — no <code>~/.bashrc</code>, no <code>PATH</code> beyond <code>/usr/bin:/bin</code>. This is the most common source of “it works manually but not in cron” bugs.</p>

<p>Fix it by setting <code>PATH</code> at the top of your crontab:</p>

<pre><code>PATH=/usr/local/bin:/usr/bin:/bin:/home/mukul/.local/bin

0 2 * * * backup.sh
</code></pre>

<p>Or use absolute paths everywhere inside the script.</p>

<h2 id="system-wide-cron-directories">System-wide cron directories</h2>

<p>On Linux, you don’t always need <code>crontab -e</code>. Drop a script into one of these directories and cron will run it automatically:</p>

<pre><code class="language-bash">/etc/cron.hourly/
/etc/cron.daily/
/etc/cron.weekly/
/etc/cron.monthly/
</code></pre>

<p>Files in these directories must be executable and must not have an extension (no <code>.sh</code>).</p>

<pre><code class="language-bash">$ sudo cp myscript /etc/cron.daily/myscript
$ sudo chmod +x /etc/cron.daily/myscript
</code></pre>

<h2 id="macos-specifics">macOS specifics</h2>

<p>macOS ships with <code>cron</code> but Apple recommends <code>launchd</code> for new automations. That said, <code>crontab</code> still works fine on macOS for most use cases. One gotcha: full disk access permissions can block cron from accessing certain directories under modern macOS security restrictions. If a cron job silently fails, check System Settings → Privacy &amp; Security → Full Disk Access and add <code>cron</code> or your terminal emulator.</p>

<p>You can also use <code>launchd</code> plists in <code>~/Library/LaunchAgents/</code> for finer-grained scheduling (e.g., run at login, retry on failure), but for simple recurring jobs <code>crontab</code> is less boilerplate.</p>

<h2 id="debugging-cron-jobs">Debugging cron jobs</h2>

<p><strong>Check the system mail:</strong></p>
<pre><code class="language-bash">$ mail
# or
$ cat /var/spool/mail/$USER
</code></pre>

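<p><strong>Check syslog for cron activity</strong> (the path below is the Debian/Ubuntu default; RHEL-family systems log to <code>/var/log/cron</code>):</p>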
<pre><code class="language-bash">$ grep CRON /var/log/syslog | tail -20
May 23 02:00:01 hostname CRON[12345]: (mukul) CMD (/home/mukul/scripts/backup.sh)
</code></pre>

<p><strong>Test the command manually with a clean environment:</strong></p>
<pre><code class="language-bash">$ env -i HOME=$HOME PATH=/usr/bin:/bin bash -c 'your_command_here'
</code></pre>

<p>This strips most environment variables and closely mimics what cron sees.</p>

<h2 id="common-mistakes">Common mistakes</h2>

<ul>
  <li><strong>Forgetting <code>2&gt;&amp;1</code></strong> — error output silently disappears</li>
  <li><strong>Relative paths</strong> — use absolute paths or set <code>PATH</code> in the crontab</li>
  <li><strong>No newline at end of file</strong> — some cron implementations ignore the last line without a trailing newline</li>
  <li><strong>Editing <code>/etc/crontab</code> directly instead of <code>crontab -e</code></strong> — system crontab has an extra user field; mixing formats causes parse errors</li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p><code>cron</code> is one of those tools you can learn in 20 minutes and rely on for years. The syntax is compact but expressive enough for nearly every scheduling pattern you’ll encounter. Once you get the hang of redirecting output, setting <code>PATH</code>, and testing jobs in a clean environment, you’ll stop wrestling with silent failures and start trusting your scheduled jobs to just run.</p>]]></content><author><name>Mukul Kadel</name></author><category term="Tutorials" /><category term="unix" /><category term="cron" /><category term="crontab" /><category term="scheduling" /><category term="unix" /><category term="linux" /><category term="macos" /><category term="automation" /><category term="sysadmin" /><category term="terminal" /><summary type="html"><![CDATA[A practical guide to cron and crontab for scheduling recurring tasks on Linux and macOS, with syntax breakdowns and real-world examples.]]></summary></entry><entry><title type="html">DNS Deep Dive: A Records, CNAME, MX, TTL Explained</title><link href="https://mukulkadel.com/dns-deep-dive-records-explained/" rel="alternate" type="text/html" title="DNS Deep Dive: A Records, CNAME, MX, TTL Explained" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/dns-deep-dive-records-explained</id><content type="html" xml:base="https://mukulkadel.com/dns-deep-dive-records-explained/"><![CDATA[<p>DNS is the phonebook of the internet — a distributed, hierarchical system that translates human-readable domain names into IP addresses. When something goes wrong with DNS, everything breaks: websites won’t load, emails won’t send, and services silently fail in confusing ways. Understanding DNS properly — not just “add an A record pointing to your server IP” — is one of those foundational pieces of knowledge that pays off every time you deploy something.</p>

<h2 id="how-dns-resolution-works">How DNS Resolution Works</h2>

<p>When you type <code>example.com</code> into a browser, here’s what actually happens:</p>

<pre><code>Browser
  ↓
Local DNS cache (checked first)
  ↓
Recursive resolver (usually your ISP or 8.8.8.8)
  ↓
Root nameserver (.) — knows where .com nameservers are
  ↓
TLD nameserver (.com) — knows where example.com nameservers are
  ↓
Authoritative nameserver (ns1.example.com) — returns the actual record
  ↑
Answer propagates back up and is cached at each layer
</code></pre>

<p>The whole process typically takes 20–120 ms for a cold lookup. Subsequent lookups are answered from cache in under 1 ms.</p>

<h2 id="dns-record-types">DNS Record Types</h2>

<h3 id="a-record">A Record</h3>

<p>Maps a hostname to an IPv4 address. The most fundamental record.</p>

<pre><code>example.com.    300    IN    A    93.184.216.34
www.example.com. 300   IN    A    93.184.216.34
</code></pre>

<p>You can have multiple A records for the same hostname — DNS returns all of them, and the client picks one (usually the first, sometimes randomly). This is the simplest form of load balancing, though it has no health checking.</p>

<h3 id="aaaa-record">AAAA Record</h3>

<p>Maps a hostname to an IPv6 address. Same concept as A, but for IPv6.</p>

<pre><code>example.com.    300    IN    AAAA    2606:2800:220:1:248:1893:25c8:1946
</code></pre>

<h3 id="cname-record">CNAME Record</h3>

<p>An alias that maps one hostname to another. The target must be a hostname, never an IP address, and it should ultimately resolve to an A (or AAAA) record.</p>

<pre><code>www.example.com.    3600    IN    CNAME    example.com.
blog.example.com.   3600    IN    CNAME    mysite.netlify.app.
</code></pre>

<p><strong>The CNAME constraint</strong>: you cannot put a CNAME on the root domain (<code>example.com</code>) because the root needs SOA and NS records, which can’t coexist with a CNAME. Use an A record for the root and CNAME for subdomains. Some DNS providers offer “CNAME flattening” or “ALIAS” records to work around this.</p>

<p><strong>CNAME chain length</strong>: avoid chaining CNAMEs (<code>a</code> → <code>b</code> → <code>c</code>). Each hop costs a DNS lookup and adds latency.</p>

<h3 id="mx-record">MX Record</h3>

<p>Specifies mail servers for a domain, with a priority value (lower = higher priority):</p>

<pre><code>example.com.    3600    IN    MX    10    mail1.example.com.
example.com.    3600    IN    MX    20    mail2.example.com.
</code></pre>

<p>When email is sent to <code>user@example.com</code>, the sender’s mail server looks up the MX records and tries them in priority order.</p>
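
<p>That ordering is easy to reproduce by hand. The sketch below uses a made-up sample in the two-column shape that <code>dig +short example.com MX</code> prints (priority, then host); the <code>mx</code> variable is just a stand-in so the example runs without a network lookup.</p>

<pre><code class="language-bash"># Sample MX answers: priority, then mail host (made-up data)
mx='20 mail2.example.com.
10 mail1.example.com.'

# Lowest priority number first, which is the order a sender tries them in
echo "$mx" | sort -n | awk '{ print $2 }'
</code></pre>

<p>With the sample data this prints <code>mail1.example.com.</code> first and <code>mail2.example.com.</code> second.</p>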

<h3 id="txt-record">TXT Record</h3>

<p>Arbitrary text. Used for domain verification, SPF, DKIM, and DMARC:</p>

<pre><code>; SPF — which servers are allowed to send email for this domain
example.com.    TXT    "v=spf1 include:_spf.google.com ~all"

; Domain ownership verification (Google Search Console, etc.)
example.com.    TXT    "google-site-verification=abc123..."

; DKIM public key (for email signing)
mail._domainkey.example.com.    TXT    "v=DKIM1; k=rsa; p=MIGfMA0..."
</code></pre>

<h3 id="ns-record">NS Record</h3>

<p>Nameserver — delegates authority for a zone to specific nameservers:</p>

<pre><code>example.com.    NS    ns1.cloudflare.com.
example.com.    NS    ns2.cloudflare.com.
</code></pre>

<p>NS records are set at your domain registrar and point to whoever hosts your DNS zone.</p>

<h3 id="soa-record">SOA Record</h3>

<p>Start of Authority — metadata about the DNS zone itself (primary nameserver, admin email, serial number, refresh intervals). You rarely set this manually; your DNS provider manages it.</p>

<h3 id="ptr-record">PTR Record</h3>

<p>Reverse DNS — maps an IP address back to a hostname. Used by mail servers to verify that sending servers are legitimate.</p>

<pre><code>34.216.184.93.in-addr.arpa.    PTR    example.com.
</code></pre>
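
<p>The PTR name is just the IPv4 octets in reverse order with <code>.in-addr.arpa.</code> appended. A small sketch of that transformation, where <code>reverse_name</code> is a hypothetical helper written for this example and does no validation of its input:</p>

<pre><code class="language-bash"># reverse_name IPV4
# Prints the in-addr.arpa name that a PTR lookup for the address would query.
# Hypothetical helper; assumes a well-formed dotted-quad IPv4 address.
reverse_name() {
  echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa." }'
}

reverse_name 93.184.216.34    # 34.216.184.93.in-addr.arpa.
</code></pre>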

<h2 id="ttl-time-to-live">TTL: Time to Live</h2>

<p>TTL is how long (in seconds) resolvers and clients should cache a DNS record before asking again.</p>

<pre><code>example.com.    300    IN    A    93.184.216.34
                 ↑
           TTL = 300 seconds = 5 minutes
</code></pre>

<p><strong>Implications</strong>:</p>

<ul>
  <li><strong>Low TTL (60–300s)</strong>: DNS changes propagate quickly. Good before a migration, but generates more DNS traffic.</li>
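  <li><strong>High TTL (3600–86400s)</strong>: Changes are slow to propagate (up to 24 hours at the top of that range), but most lookups are served from cache, so resolution is faster for end users and your nameservers see less traffic.</li>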
</ul>

<p><strong>Best practice</strong>: lower the TTL to 300 before a planned IP change, wait for the old TTL to expire, make the change, then raise the TTL back after confirming everything works.</p>

<h2 id="debugging-dns-with-dig">Debugging DNS with <code>dig</code></h2>

<p><code>dig</code> is your primary DNS debugging tool:</p>

<pre><code class="language-bash"># Basic A record lookup
$ dig example.com

;; ANSWER SECTION:
example.com.    300    IN    A    93.184.216.34

# Lookup a specific record type
$ dig example.com MX
$ dig example.com TXT
$ dig example.com NS

# Use a specific resolver (bypass your local cache)
$ dig @8.8.8.8 example.com

# Check the full resolution chain (+trace)
$ dig +trace example.com

# Short output
$ dig +short example.com
93.184.216.34

# Reverse DNS lookup
$ dig -x 93.184.216.34
</code></pre>

<h2 id="debugging-with-nslookup">Debugging with <code>nslookup</code></h2>

<p><code>nslookup</code> is available on Windows and macOS by default:</p>

<pre><code class="language-bash">$ nslookup example.com
Server:    192.168.1.1
Address:   192.168.1.1#53

Non-authoritative answer:
Name:    example.com
Address: 93.184.216.34

# Query a specific record type
$ nslookup -type=MX example.com

# Query a specific server
$ nslookup example.com 8.8.8.8
</code></pre>

<h2 id="common-dns-pitfalls">Common DNS Pitfalls</h2>

<p><strong>“My DNS change isn’t propagating”</strong> — DNS propagation isn’t magic. What’s actually happening is that resolvers around the world are holding cached copies of the old record until their TTL expires. Check the old TTL before making changes.</p>

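<p><strong>CNAME on the root domain</strong> — A CNAME cannot coexist with other records at the same name, and the zone apex always carries SOA and NS records, so a CNAME on <code>example.com</code> is invalid. Use an A record for <code>example.com</code> and a CNAME for <code>www.example.com</code>.</p>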

<p><strong>Missing the trailing dot</strong> — In zone files, hostnames are written with a trailing dot (<code>example.com.</code>) to indicate they’re absolute. Without it, the name is treated as relative and the zone name is appended: <code>mail</code> becomes <code>mail.example.com.</code>, and a target written as <code>example.com</code> becomes <code>example.com.example.com.</code>. <code>dig</code> adds trailing dots in output; most DNS UIs handle it for you.</p>

<p><strong>Forgetting SPF/DKIM/DMARC</strong> — Email without these records gets flagged as spam or rejected outright. Set them up when you configure MX records.</p>

<h2 id="conclusion">Conclusion</h2>

<p>DNS is deceptively simple on the surface — point a domain at an IP, done — but the details matter when things go wrong or when you’re setting up email, CDNs, or multi-region infrastructure. The record types you’ll use most are A, CNAME, MX, and TXT. Keep TTLs low before planned changes, use <code>dig</code> to inspect what resolvers actually see (not what your DNS panel shows), and remember that “DNS propagation” is just cache TTL expiry — not something that happens on its own schedule.</p>]]></content><author><name>Mukul Kadel</name></author><category term="wiki" /><category term="dns" /><category term="networking" /><category term="a record" /><category term="cname" /><category term="mx" /><category term="ttl" /><category term="domain" /><category term="web" /><category term="devops" /><category term="sysadmin" /><category term="dig" /><category term="nslookup" /><summary type="html"><![CDATA[A complete guide to how DNS resolution works, the most important record types, TTL and caching, and how to debug DNS issues with dig and nslookup.]]></summary></entry><entry><title type="html">Docker Basics: Containers, Images, and Volumes Explained</title><link href="https://mukulkadel.com/docker-basics-containers-images-volumes/" rel="alternate" type="text/html" title="Docker Basics: Containers, Images, and Volumes Explained" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/docker-basics-containers-images-volumes</id><content type="html" xml:base="https://mukulkadel.com/docker-basics-containers-images-volumes/"><![CDATA[<p>If you’ve heard “just throw it in a container” more times than you can count, it’s time to understand what that actually means. Docker packages your app and all its dependencies into an isolated unit that runs the same way everywhere — on your laptop, a CI server, or production. This post walks through the three building blocks: images, containers, and volumes.</p>

<h2 id="images-vs-containers">Images vs Containers</h2>

<p>An <strong>image</strong> is a read-only blueprint. A <strong>container</strong> is a running instance of that blueprint. The relationship is the same as a class and an object in code — one image can spawn many containers.</p>

<p>Pull the official Nginx image and you’ve downloaded a layered filesystem snapshot:</p>

<pre><code class="language-bash">$ docker pull nginx:alpine
alpine: Pulling from library/nginx
Digest: sha256:2d194184...
Status: Downloaded newer image for nginx:alpine
</code></pre>

<p>Spin up a container from it:</p>

<pre><code class="language-bash">$ docker run -d -p 8080:80 --name web nginx:alpine
c3a1f9e7b2d4...

$ docker ps
CONTAINER ID   IMAGE          COMMAND                  CREATED         STATUS         PORTS                  NAMES
c3a1f9e7b2d4   nginx:alpine   "/docker-entrypoint.…"   5 seconds ago   Up 4 seconds   0.0.0.0:8080-&gt;80/tcp   web
</code></pre>

<p><code>-d</code> runs it in the background. <code>-p 8080:80</code> maps host port 8080 to container port 80. Visit <code>http://localhost:8080</code> and you’ll see Nginx’s welcome page.</p>

<h2 id="building-your-own-image">Building Your Own Image</h2>

<p>A <strong>Dockerfile</strong> is a recipe for your image. Each instruction adds a layer.</p>

<pre><code class="language-dockerfile">FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["python", "app.py"]
</code></pre>

<p>Build it:</p>

<pre><code class="language-bash">$ docker build -t myapp:latest .
[+] Building 12.3s
 =&gt; [1/5] FROM python:3.12-slim
 =&gt; [2/5] WORKDIR /app
 =&gt; [3/5] COPY requirements.txt .
 =&gt; [4/5] RUN pip install --no-cache-dir -r requirements.txt
 =&gt; [5/5] COPY . .
 =&gt; exporting to image
</code></pre>

<p>The <code>COPY requirements.txt</code> step is deliberately before <code>COPY . .</code> — Docker caches each layer. If you change application code but not <code>requirements.txt</code>, the pip install layer is reused and rebuilds are fast.</p>

<h2 id="essential-container-commands">Essential Container Commands</h2>

<pre><code class="language-bash"># list running containers
$ docker ps

# list all containers including stopped ones
$ docker ps -a

# stream logs
$ docker logs -f web

# open a shell inside a running container
$ docker exec -it web sh

# stop and remove
$ docker stop web &amp;&amp; docker rm web

# remove the image
$ docker rmi nginx:alpine
</code></pre>

<h2 id="volumes-persisting-data">Volumes: Persisting Data</h2>

<p>Containers are ephemeral — when you remove one, all data written inside it disappears. <strong>Volumes</strong> solve this by mounting a storage location from the host (or a Docker-managed directory) into the container.</p>

<pre><code class="language-bash"># named volume — Docker manages the location
$ docker run -d \
  -v pgdata:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=secret \
  --name db \
  postgres:16

# bind mount — explicit host path
$ docker run -d \
  -v /home/mukul/site:/usr/share/nginx/html:ro \
  -p 8080:80 \
  nginx:alpine
</code></pre>

<p>Named volumes survive container removal. Bind mounts are great for local development because you edit files on the host and the container picks up changes immediately.</p>

<p>List and inspect volumes:</p>

<pre><code class="language-bash">$ docker volume ls
DRIVER    VOLUME NAME
local     pgdata

$ docker volume inspect pgdata
[
  {
    "Name": "pgdata",
    "Mountpoint": "/var/lib/docker/volumes/pgdata/_data",
    ...
  }
]
</code></pre>

<h2 id="cleaning-up">Cleaning Up</h2>

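<p>Docker accumulates unused images, stopped containers, and dangling volumes quickly. Note that <code>docker system prune</code> leaves volumes alone unless you pass <code>--volumes</code>.</p>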

<pre><code class="language-bash"># remove all stopped containers, unused networks, dangling images
$ docker system prune
WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all dangling images
  - all dangling build cache

Total reclaimed space: 1.23GB

# nuclear option — also removes unused images
$ docker system prune -a
</code></pre>

<h2 id="conclusion">Conclusion</h2>

<p>Images are immutable blueprints, containers are running instances, and volumes are where persistent data lives. With just <code>docker build</code>, <code>docker run</code>, and a well-written Dockerfile, you can package any application into a portable, reproducible artifact. Once you’re comfortable running single containers, the natural next step is coordinating multiple services with Docker Compose.</p>]]></content><author><name>Mukul Kadel</name></author><category term="Programming" /><category term="Tutorials" /><category term="docker" /><category term="containers" /><category term="devops" /><category term="images" /><category term="volumes" /><category term="dockerfile" /><category term="linux" /><category term="cloud" /><category term="tutorial" /><summary type="html"><![CDATA[Learn how Docker containers, images, and volumes work with practical examples you can run right now on any Linux or macOS machine.]]></summary></entry><entry><title type="html">Docker Compose — Multi-Container App Setup Guide</title><link href="https://mukulkadel.com/docker-compose-multi-container-guide/" rel="alternate" type="text/html" title="Docker Compose — Multi-Container App Setup Guide" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/docker-compose-multi-container-guide</id><content type="html" xml:base="https://mukulkadel.com/docker-compose-multi-container-guide/"><![CDATA[<p>Most real applications aren’t a single process — they’re a web server, a database, a cache, maybe a background worker, all talking to each other. Running each container with long <code>docker run</code> commands and manually wiring them together gets painful fast. Docker Compose lets you define the entire stack in one YAML file and bring it up with a single command.</p>

<h2 id="the-docker-composeyml-file">The <code>docker-compose.yml</code> File</h2>

<p>Compose describes your application as a set of <strong>services</strong>, each of which maps to a container. Here’s a practical example: a FastAPI backend backed by PostgreSQL and Redis.</p>

<pre><code class="language-yaml">services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgresql://user:secret@db:5432/appdb
      REDIS_URL: redis://cache:6379
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    volumes:
      - .:/app

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: appdb
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d appdb"]
      interval: 5s
      timeout: 5s
      retries: 5

  cache:
    image: redis:7-alpine
    volumes:
      - redisdata:/data

volumes:
  pgdata:
  redisdata:
</code></pre>

<p>Services can reference each other by <strong>service name</strong> as the hostname — <code>db</code> and <code>cache</code> are resolvable from the <code>api</code> container because Compose creates a shared network automatically.</p>
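
<p>The <code>healthcheck</code> on <code>db</code> is what lets <code>api</code> wait for a usable database rather than just a started container. As a loose analogy only (Docker's real semantics differ: <code>retries</code> counts consecutive failures before a container is marked unhealthy), the idea can be sketched as a retry loop in bash. <code>wait_healthy</code> is a hypothetical helper, not a Docker or Compose command:</p>

<pre><code class="language-bash"># Hypothetical helper: run a check command up to RETRIES times,
# sleeping INTERVAL seconds between attempts.
wait_healthy() {
  retries=$1
  interval=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if "$@"; then
      echo "healthy after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "unhealthy"
  return 1
}

# "true" stands in for a real probe such as pg_isready.
wait_healthy 5 0 true
# prints: healthy after 1 attempt(s)
</code></pre>

<p>With the Compose file above you never write this loop yourself; <code>condition: service_healthy</code> does the waiting for you.</p>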

<h2 id="starting-and-stopping">Starting and Stopping</h2>

<pre><code class="language-bash"># start everything in the background
$ docker compose up -d
[+] Running 3/3
 ✔ Container myapp-db-1     Healthy
 ✔ Container myapp-cache-1  Started
 ✔ Container myapp-api-1    Started

# tail logs from all services
$ docker compose logs -f

# tail just the api service
$ docker compose logs -f api

# stop and remove containers (volumes are preserved)
$ docker compose down

# stop AND remove volumes (wipes the database)
$ docker compose down -v
</code></pre>

<h2 id="rebuilding-after-code-changes">Rebuilding After Code Changes</h2>

<p>When you change your <code>Dockerfile</code> or source code, Compose won’t automatically pick it up unless you rebuild.</p>

<pre><code class="language-bash">$ docker compose up -d --build
</code></pre>

<p>For local development with a bind mount (<code>- .:/app</code>), code changes reflect immediately without rebuilding. Only changes to dependencies or to the <code>Dockerfile</code> itself need a rebuild.</p>

<h2 id="running-one-off-commands">Running One-Off Commands</h2>

<pre><code class="language-bash"># open a postgres shell
$ docker compose exec db psql -U user -d appdb

# run database migrations (assuming Alembic; substitute your own migration tool)
$ docker compose run --rm api alembic upgrade head

# run tests inside the api container
$ docker compose run --rm api pytest
</code></pre>

<p><code>exec</code> runs a command in an already-running container. <code>run</code> spins up a fresh container for the command and exits when done. <code>--rm</code> cleans it up afterwards.</p>

<h2 id="scaling-a-service">Scaling a Service</h2>

<pre><code class="language-bash">$ docker compose up -d --scale api=3
</code></pre>

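<p>This starts three instances of the <code>api</code> service. The fixed host mapping <code>"8000:8000"</code> from the example file lets only one container bind host port 8000, so drop the host port (for example <code>- "8000"</code>) before scaling and let Docker assign free ports. You’d typically put a load balancer (Nginx or Traefik) in front of the instances, but it’s a fast way to test horizontal scaling locally.</p>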

<h2 id="environment-files">Environment Files</h2>

<p>Hardcoding secrets in <code>docker-compose.yml</code> is fine for local dev, but for shared repos or staging environments, use a <code>.env</code> file:</p>

<pre><code class="language-bash"># .env
POSTGRES_PASSWORD=secret
REDIS_URL=redis://cache:6379
</code></pre>

<p>Compose reads <code>.env</code> automatically. Reference variables with <code>${VAR_NAME}</code>:</p>

<pre><code class="language-yaml">environment:
  DATABASE_URL: postgresql://user:${POSTGRES_PASSWORD}@db:5432/appdb
</code></pre>

<p>Add <code>.env</code> to <code>.gitignore</code> and commit a <code>.env.example</code> with placeholder values instead.</p>
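
<p>One way to keep <code>.env.example</code> in sync is to derive it from <code>.env</code> by blanking out the values. This is a sketch under the assumption that the file holds simple <code>KEY=value</code> lines; <code>env_example</code> and the <code>changeme</code> placeholder are made-up names:</p>

<pre><code class="language-bash"># Hypothetical helper: replace every value with a placeholder,
# leaving keys, comments, and blank lines untouched.
env_example() {
  sed -E 's/^([A-Za-z_][A-Za-z0-9_]*)=.*/\1=changeme/'
}

printf 'POSTGRES_PASSWORD=secret\nREDIS_URL=redis://cache:6379\n' | env_example
# prints:
# POSTGRES_PASSWORD=changeme
# REDIS_URL=changeme
</code></pre>

<p>Feed it your real <code>.env</code> on stdin and save the output as <code>.env.example</code>.</p>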

<h2 id="useful-inspection-commands">Useful Inspection Commands</h2>

<pre><code class="language-bash"># show running containers and their status
$ docker compose ps

# show which host ports are mapped
$ docker compose port api 8000
0.0.0.0:8000

# show resource usage
$ docker compose stats
</code></pre>

<h2 id="conclusion">Conclusion</h2>

<p>Docker Compose turns a sprawling set of <code>docker run</code> commands into a single, readable file that anyone on the team can spin up identically. The <code>depends_on</code> with healthchecks ensures services start in the right order, named volumes keep data safe across restarts, and the built-in network means services find each other by name without any manual configuration. Once you’re running containers locally with Compose, moving to production with Kubernetes or a managed container service becomes a much smaller leap.</p>]]></content><author><name>Mukul Kadel</name></author><category term="Tutorials" /><category term="Programming" /><category term="docker" /><category term="docker compose" /><category term="containers" /><category term="devops" /><category term="multi-container" /><category term="yaml" /><category term="networking" /><category term="volumes" /><category term="tutorial" /><summary type="html"><![CDATA[Learn how to define, link, and run multi-container applications with Docker Compose using a practical web app and database example.]]></summary></entry><entry><title type="html">`find` — Advanced File Search with Real Examples</title><link href="https://mukulkadel.com/find-command-advanced-guide/" rel="alternate" type="text/html" title="`find` — Advanced File Search with Real Examples" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/find-command-advanced-guide</id><content type="html" xml:base="https://mukulkadel.com/find-command-advanced-guide/"><![CDATA[<p>The <code>find</code> command is one of the most powerful tools in the Unix toolkit — and one of the most underused. Most developers reach for it when they want to locate a file by name, but <code>find</code> can filter by size, age, permissions, and file type, and it can pipe results directly into other commands. 
Once you get comfortable with it, you’ll find yourself replacing a lot of ad-hoc scripting with a single well-crafted <code>find</code> invocation.</p>

<h2 id="basic-syntax">Basic Syntax</h2>

<pre><code class="language-bash">find [path] [expression]
</code></pre>

<p>The path is where to start searching (recursively). The expression is a combination of tests and actions.</p>

<pre><code class="language-bash">$ find . -name "*.log"
./app/logs/access.log
./app/logs/error.log
</code></pre>

<p>Start with <code>.</code> to search the current directory tree, or give an absolute path like <code>/var/log</code>.</p>

<h2 id="searching-by-name">Searching by Name</h2>

<p><code>-name</code> is case-sensitive. Use <code>-iname</code> for a case-insensitive match.</p>

<pre><code class="language-bash">$ find /etc -name "nginx.conf"
/etc/nginx/nginx.conf

$ find . -iname "readme*"
./README.md
./docs/readme.txt
</code></pre>

<p>Use shell wildcards (<code>*</code>, <code>?</code>) inside quotes to prevent the shell from expanding them before <code>find</code> sees them.</p>

<h2 id="searching-by-type">Searching by Type</h2>

<p><code>-type</code> filters results by file type:</p>

<table>
  <thead>
    <tr>
      <th>Flag</th>
      <th>Matches</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>-type f</code></td>
      <td>Regular files</td>
    </tr>
    <tr>
      <td><code>-type d</code></td>
      <td>Directories</td>
    </tr>
    <tr>
      <td><code>-type l</code></td>
      <td>Symbolic links</td>
    </tr>
    <tr>
      <td><code>-type s</code></td>
      <td>Sockets</td>
    </tr>
  </tbody>
</table>

<pre><code class="language-bash">$ find . -type d -name "__pycache__"
./src/__pycache__
./tests/__pycache__
</code></pre>

<h2 id="searching-by-size">Searching by Size</h2>

<p><code>-size</code> accepts a number with a unit suffix:</p>

<table>
  <thead>
    <tr>
      <th>Suffix</th>
      <th>Unit</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>c</code></td>
      <td>Bytes</td>
    </tr>
    <tr>
      <td><code>k</code></td>
      <td>Kilobytes</td>
    </tr>
    <tr>
      <td><code>M</code></td>
      <td>Megabytes</td>
    </tr>
    <tr>
      <td><code>G</code></td>
      <td>Gigabytes</td>
    </tr>
  </tbody>
</table>

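<p>Prefix with <code>+</code> for “greater than” or <code>-</code> for “less than”. GNU <code>find</code> rounds sizes up to whole units before comparing, so <code>-size -1k</code> matches only empty files:</p>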

<pre><code class="language-bash">$ find /var/log -type f -size +100M
/var/log/syslog.1

$ find . -type f -size -1k
./config/.gitkeep
./config/empty.conf
</code></pre>
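
<p>The unit you pick changes which files match, because GNU <code>find</code> rounds each file size up to a whole unit before comparing. The sketch below assumes GNU findutils and coreutils (<code>truncate</code>) and uses a throwaway directory:</p>

<pre><code class="language-bash">demo=$(mktemp -d)
touch "$demo/empty.conf"              # 0 bytes
truncate -s 1 "$demo/one-byte.txt"    # 1 byte, which rounds up to 1k

find "$demo" -type f -size -1k        # lists only empty.conf
find "$demo" -type f -size -1024c     # lists both files (byte-exact comparison)

rm -r "$demo"
</code></pre>

<p>When you mean “smaller than N bytes”, the <code>c</code> suffix is the safer choice.</p>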

<h2 id="searching-by-modification-time">Searching by Modification Time</h2>

<p><code>-mtime</code> filters by last modification time in days. <code>-mmin</code> works the same way in minutes.</p>

<pre><code class="language-bash"># Files modified in the last 24 hours
$ find . -type f -mtime -1

# Files not touched in over 30 days
$ find /tmp -type f -mtime +30

# Files modified in the last 60 minutes
$ find . -type f -mmin -60
</code></pre>

<p>There are three time flags:</p>

<ul>
  <li><code>-mtime</code> — last modification time (content changed)</li>
  <li><code>-atime</code> — last access time (file was read)</li>
  <li><code>-ctime</code> — last status change time (inode metadata such as permissions or ownership changed; writing to the file updates it too)</li>
</ul>
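
<p>A quick way to see the <code>-mtime</code> tests in action is to backdate a file. This sketch assumes GNU <code>touch</code> (for the relative <code>-d</code> date) and works in a throwaway directory:</p>

<pre><code class="language-bash">demo=$(mktemp -d)
touch "$demo/fresh.txt"                     # modified just now
touch -d "40 days ago" "$demo/stale.txt"    # backdated modification time

find "$demo" -type f -mtime -1     # lists fresh.txt
find "$demo" -type f -mtime +30    # lists stale.txt

rm -r "$demo"
</code></pre>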

<h2 id="searching-by-permissions">Searching by Permissions</h2>

<p><code>-perm</code> matches files by their permission bits:</p>

<pre><code class="language-bash"># Files with exactly 644 permissions
$ find . -type f -perm 644

# Files that are world-writable (dangerous to have in web roots)
$ find /var/www -type f -perm -o+w
</code></pre>

<p>The <code>-</code> prefix means “at least these bits are set.” Without it, <code>find</code> requires an exact match.</p>
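
<p>The difference between the exact and the “at least” form is easiest to see with two files. A small sketch in a throwaway directory (file names are made up for illustration):</p>

<pre><code class="language-bash">demo=$(mktemp -d)
touch "$demo/exact" "$demo/open"
chmod 644 "$demo/exact"
chmod 666 "$demo/open"

find "$demo" -type f -perm 644     # lists exact only
find "$demo" -type f -perm -644    # lists both: 666 includes every bit of 644
find "$demo" -type f -perm -o+w    # lists open only (world-writable)

rm -r "$demo"
</code></pre>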

<h2 id="combining-tests-with-and-or-not">Combining Tests with AND, OR, NOT</h2>

<p>By default, multiple tests are ANDed together. Use <code>-o</code> for OR and <code>!</code> for NOT:</p>

<pre><code class="language-bash"># .log OR .tmp files
$ find . \( -name "*.log" -o -name "*.tmp" \) -type f

# Everything that is NOT a directory
$ find . ! -type d
</code></pre>
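
<p>The parentheses matter because AND binds tighter than OR. Without them, a trailing <code>-type f</code> applies only to the last <code>-name</code> test. A sketch in a throwaway directory, with a directory deliberately named like a log file:</p>

<pre><code class="language-bash">demo=$(mktemp -d)
mkdir "$demo/old.log"                      # a directory, not a file
touch "$demo/app.log" "$demo/scratch.tmp"

# Parsed as: *.log OR (*.tmp AND regular file)
find "$demo" -name "*.log" -o -name "*.tmp" -type f        # 3 results, including old.log

# Grouped: (*.log OR *.tmp) AND regular file
find "$demo" \( -name "*.log" -o -name "*.tmp" \) -type f  # 2 results, files only

rm -r "$demo"
</code></pre>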

<h2 id="executing-commands-with--exec">Executing Commands with <code>-exec</code></h2>

<p>This is where <code>find</code> gets genuinely powerful. <code>-exec</code> runs a command for each matched file. Use <code>{}</code> as the placeholder for the filename and terminate the command with <code>\;</code>:</p>

<pre><code class="language-bash"># Delete all .pyc files
$ find . -name "*.pyc" -exec rm {} \;

# Change ownership on all files in a web directory
$ find /var/www -type f -exec chown www-data:www-data {} \;

# Print the size of each matching file
$ find . -name "*.log" -exec du -sh {} \;
4.0K    ./app/logs/access.log
128M    ./app/logs/error.log
</code></pre>

<p>Replace <code>\;</code> with <code>+</code> to batch arguments into a single command invocation (faster for large result sets):</p>

<pre><code class="language-bash">$ find . -name "*.pyc" -exec rm {} +
</code></pre>

<h2 id="using-xargs-for-better-performance">Using <code>xargs</code> for Better Performance</h2>

<p><code>xargs</code> is often more flexible than <code>-exec ... +</code>. Pipe <code>find</code> output into it:</p>

<pre><code class="language-bash">$ find . -name "*.log" -print0 | xargs -0 grep "ERROR"
</code></pre>

<p><code>-print0</code> and <code>-0</code> use the null byte as a delimiter, which handles filenames with spaces correctly.</p>
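
<p>To see what goes wrong without the null delimiter, compare the two pipelines on a filename that contains a space. This sketch uses a throwaway directory and <code>basename</code> as a harmless stand-in for the real command:</p>

<pre><code class="language-bash">demo=$(mktemp -d)
touch "$demo/app server.log"

# Whitespace-delimited: xargs splits the path into two arguments.
find "$demo" -name "*.log" | xargs -n 1 basename
# prints two bogus names: app, then server.log

# Null-delimited: the path arrives as one argument.
find "$demo" -name "*.log" -print0 | xargs -0 -n 1 basename
# prints: app server.log

rm -r "$demo"
</code></pre>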

<h2 id="excluding-directories-with--prune">Excluding Directories with <code>-prune</code></h2>

<p><code>-prune</code> tells <code>find</code> to skip a directory entirely:</p>

<pre><code class="language-bash"># Search everything except node_modules
$ find . -name "node_modules" -prune -o -name "*.js" -print

# Exclude multiple directories
$ find . \( -name ".git" -o -name "vendor" -o -name "node_modules" \) -prune -o -type f -print
</code></pre>

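<p>The explicit <code>-print</code> at the end is necessary. Without it, <code>find</code> applies an implicit <code>-print</code> to the whole expression, so the pruned directory names themselves show up in the output alongside the real matches.</p>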
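
<p>The effect of the explicit <code>-print</code> can be checked in a throwaway directory. A minimal sketch, with made-up file names:</p>

<pre><code class="language-bash">demo=$(mktemp -d)
mkdir "$demo/node_modules"
touch "$demo/app.js" "$demo/node_modules/lib.js"

# No explicit action: the implicit -print covers the whole expression,
# so the pruned directory is listed next to app.js.
find "$demo" -name node_modules -prune -o -name "*.js"

# Explicit -print on the right-hand branch: only app.js is listed.
find "$demo" -name node_modules -prune -o -name "*.js" -print

rm -r "$demo"
</code></pre>

<p>In both cases <code>lib.js</code> is never visited, because <code>find</code> does not descend into the pruned directory.</p>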

<h2 id="practical-one-liners">Practical One-Liners</h2>

<pre><code class="language-bash"># Find and delete all empty directories
$ find . -type d -empty -delete

# Find duplicate filenames (same name, different location)
$ find . -type f -printf "%f\n" | sort | uniq -d

# List the 10 largest files under /var
$ find /var -type f -printf "%s %p\n" | sort -rn | head -10

# Find files owned by a specific user
$ find /home -user alice -type f

# Find SUID binaries (common security audit)
$ find / -perm -4000 -type f 2&gt;/dev/null
</code></pre>

<h2 id="conclusion">Conclusion</h2>

<p><code>find</code> rewards learning its flags properly. The combination of type filtering, time-based searches, permission checks, and <code>-exec</code> makes it a complete filesystem query language. The key patterns to internalize are: use <code>-print0 | xargs -0</code> for safe pipelining, use <code>-prune</code> to skip large directories, and always quote your wildcard patterns. Most tasks that feel like they need a shell loop can be handled more cleanly with a single <code>find</code> command.</p>]]></content><author><name>Mukul Kadel</name></author><category term="wiki" /><category term="unix" /><category term="find" /><category term="unix" /><category term="linux" /><category term="command line" /><category term="file search" /><category term="terminal" /><category term="cheatsheet" /><category term="sysadmin" /><category term="xargs" /><summary type="html"><![CDATA[Master the Unix find command with real examples covering name, size, time, permissions, and exec — the most powerful file search tool in your terminal.]]></summary></entry><entry><title type="html">GitHub Actions CI/CD: Build, Test, and Deploy a Project</title><link href="https://mukulkadel.com/github-actions-cicd-guide/" rel="alternate" type="text/html" title="GitHub Actions CI/CD: Build, Test, and Deploy a Project" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/github-actions-cicd-guide</id><content type="html" xml:base="https://mukulkadel.com/github-actions-cicd-guide/"><![CDATA[<p>Every time you push code and spend the next ten minutes manually running tests and deploying by hand, you’re doing work a machine could do for you. GitHub Actions lets you define exactly that automation in YAML files that live in your repository — no external CI service required. This guide walks through building a real pipeline that tests on every pull request and deploys on every merge to <code>main</code>.</p>

<h2 id="how-github-actions-works">How GitHub Actions Works</h2>

<p>Actions are triggered by <strong>events</strong> (push, pull_request, schedule, etc.) and run <strong>workflows</strong> — YAML files in <code>.github/workflows/</code>. Each workflow has <strong>jobs</strong>, each job runs on a <strong>runner</strong> (a fresh VM), and each job has <strong>steps</strong> that run sequentially.</p>

<pre><code>Event → Workflow → Jobs (parallel by default) → Steps (sequential)
</code></pre>

<h2 id="a-basic-ci-workflow">A Basic CI Workflow</h2>

<p>Create <code>.github/workflows/ci.yml</code>:</p>

<pre><code class="language-yaml">name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: |
          pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run tests
        run: pytest --tb=short -q

      - name: Lint
        run: ruff check .
</code></pre>

<p>Push this file and GitHub immediately starts running the workflow. Every subsequent pull request will show a green or red status check before you can merge.</p>

<h2 id="caching-dependencies">Caching Dependencies</h2>

<p>By default, every run reinstalls dependencies from scratch. Caching the pip download cache cuts minutes off each run:</p>

<pre><code class="language-yaml">      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
</code></pre>

<p>The cache key includes a hash of <code>requirements.txt</code>, so the cache is invalidated whenever dependencies change.</p>

<h2 id="matrix-builds-test-across-multiple-versions">Matrix Builds: Test Across Multiple Versions</h2>

<pre><code class="language-yaml">jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install and test
        run: |
          pip install -r requirements.txt
          pytest
</code></pre>

<p>This runs three parallel jobs — one per Python version — and reports them all in the pull request.</p>

<h2 id="a-deployment-workflow">A Deployment Workflow</h2>

<p>Separation of concerns: the CI workflow runs on every PR, the deploy workflow runs only on pushes to <code>main</code>. Create <code>.github/workflows/deploy.yml</code>:</p>

<pre><code class="language-yaml">name: Deploy

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production

    steps:
      - uses: actions/checkout@v4

      - name: Build Docker image
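        run: docker build -t myapp:${{ github.sha }} .

      - name: Log in to Docker Hub
        uses: docker/login-action@v3
        with:
          # secret names below are placeholders; use the names you defined in repo settings
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Push image
        run: |
          docker tag myapp:${{ github.sha }} myuser/myapp:latest
          docker push myuser/myapp:latest

      - name: Deploy to server
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.SSH_HOST }}
          username: ${{ secrets.SSH_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}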
          script: |
            docker pull myuser/myapp:latest
            docker compose up -d --no-deps api
</code></pre>

<h2 id="managing-secrets">Managing Secrets</h2>

<p>Never hardcode credentials in workflow files. Store them in <strong>Settings → Secrets and variables → Actions</strong> and reference them as <code>${{ secrets.SECRET_NAME }}</code>. Secret values are masked in logs.</p>

<p>For per-environment secrets (staging vs production), use <strong>Environments</strong> (<code>environment: production</code> in the job), which lets you require manual approval before a deployment job runs.</p>

<h2 id="useful-workflow-patterns">Useful Workflow Patterns</h2>

<p><strong>Run a job only on specific file changes:</strong></p>

<pre><code class="language-yaml">on:
  push:
    paths:
      - "src/**"
      - "requirements.txt"
</code></pre>

<p><strong>Share data between jobs:</strong></p>

<pre><code class="language-yaml">jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image_tag: ${{ steps.tag.outputs.tag }}
    steps:
      - id: tag
        run: echo "tag=${{ github.sha }}" &gt;&gt; $GITHUB_OUTPUT

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying ${{ needs.build.outputs.image_tag }}"
</code></pre>

<p><strong>Conditional steps:</strong></p>

<pre><code class="language-yaml">      - name: Notify Slack on failure
        if: failure()
        uses: slackapi/slack-github-action@v1
        with:
          payload: '{"text":"Deploy failed on ${{ github.repository }}"}'
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
</code></pre>

<h2 id="conclusion">Conclusion</h2>

<p>GitHub Actions gives you a full CI/CD pipeline without leaving your repository. The CI workflow on pull requests catches regressions before they merge, the deployment workflow automates what would otherwise be a manual, error-prone release process, and secrets management keeps credentials out of your codebase. Start with a simple test job and add deployment steps incrementally — a working pipeline you understand beats a sophisticated one that nobody maintains.</p>]]></content><author><name>Mukul Kadel</name></author><category term="Tutorials" /><category term="Programming" /><category term="github actions" /><category term="ci/cd" /><category term="devops" /><category term="automation" /><category term="yaml" /><category term="github" /><category term="deployment" /><category term="testing" /><category term="pipeline" /><summary type="html"><![CDATA[Set up a complete CI/CD pipeline with GitHub Actions that builds, tests, and deploys your app automatically on every push to main.]]></summary></entry><entry><title type="html">grep, egrep, and ripgrep — Pattern Matching Cheat Sheet</title><link href="https://mukulkadel.com/grep-egrep-ripgrep-cheat-sheet/" rel="alternate" type="text/html" title="grep, egrep, and ripgrep — Pattern Matching Cheat Sheet" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/grep-egrep-ripgrep-cheat-sheet</id><content type="html" xml:base="https://mukulkadel.com/grep-egrep-ripgrep-cheat-sheet/"><![CDATA[<p><code>grep</code> is one of the most-used tools on any Unix system. It searches text for lines matching a pattern — and while the name stands for “Global Regular Expression Print,” it’s really just a fast line filter. <code>ripgrep</code> (<code>rg</code>) is a modern replacement that’s dramatically faster for searching code, so this cheat sheet covers both.</p>

<h2 id="basic-syntax">Basic Syntax</h2>

<pre><code class="language-bash">$ grep "pattern" file.txt
$ grep "pattern" file1.txt file2.txt
$ command | grep "pattern"
</code></pre>

<h2 id="essential-flags">Essential Flags</h2>

<table>
  <thead>
    <tr>
      <th>Flag</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>-i</code></td>
      <td>Case-insensitive match</td>
    </tr>
    <tr>
      <td><code>-v</code></td>
      <td>Invert: print lines that do NOT match</td>
    </tr>
    <tr>
      <td><code>-n</code></td>
      <td>Show line numbers</td>
    </tr>
    <tr>
      <td><code>-c</code></td>
      <td>Count matching lines (don’t print them)</td>
    </tr>
    <tr>
      <td><code>-l</code></td>
      <td>Print only filenames that have a match</td>
    </tr>
    <tr>
      <td><code>-L</code></td>
      <td>Print filenames with NO match</td>
    </tr>
    <tr>
      <td><code>-r</code></td>
      <td>Recursive: search all files in a directory</td>
    </tr>
    <tr>
      <td><code>-R</code></td>
      <td>Recursive, following symlinks</td>
    </tr>
    <tr>
      <td><code>-w</code></td>
      <td>Match whole words only</td>
    </tr>
    <tr>
      <td><code>-x</code></td>
      <td>Match whole lines only</td>
    </tr>
    <tr>
      <td><code>-A N</code></td>
      <td>Print N lines after each match</td>
    </tr>
    <tr>
      <td><code>-B N</code></td>
      <td>Print N lines before each match</td>
    </tr>
    <tr>
      <td><code>-C N</code></td>
      <td>Print N lines before and after each match</td>
    </tr>
    <tr>
      <td><code>-E</code></td>
      <td>Extended regex (same as <code>egrep</code>)</td>
    </tr>
    <tr>
      <td><code>-P</code></td>
      <td>Perl-compatible regex (PCRE) — GNU grep only</td>
    </tr>
    <tr>
      <td><code>-F</code></td>
      <td>Fixed string, no regex (faster for literals)</td>
    </tr>
    <tr>
      <td><code>-o</code></td>
      <td>Print only the matched part, not the full line</td>
    </tr>
    <tr>
      <td><code>-q</code></td>
      <td>Quiet: exit 0 if match found, 1 otherwise</td>
    </tr>
    <tr>
      <td><code>--color</code></td>
      <td>Highlight matches (often default)</td>
    </tr>
  </tbody>
</table>

<h2 id="grep-vs-egrep-vs-fgrep">grep vs egrep vs fgrep</h2>

<ul>
  <li><code>grep</code>: basic regex (BRE). Grouping is written <code>\(</code> <code>\)</code>, and GNU grep also accepts <code>\+</code>, <code>\?</code>, and <code>\|</code> as backslashed extensions</li>
  <li><code>egrep</code> / <code>grep -E</code>: extended regex (ERE). <code>(</code>, <code>)</code>, <code>+</code>, <code>|</code> work without backslashes</li>
  <li><code>fgrep</code> / <code>grep -F</code>: no regex, literal string matching; fastest</li>
</ul>

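<p>In practice, prefer <code>grep -E</code> over <code>egrep</code> (and <code>grep -F</code> over <code>fgrep</code>). The behavior is the same, but <code>egrep</code> and <code>fgrep</code> are deprecated, and GNU grep 3.8 and later print a warning when they are invoked.</p>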
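
<p>To make the difference between the three modes concrete, here is a small sketch that runs one pattern through each of them. The sample lines are made up for illustration:</p>

<pre><code class="language-bash"># Hypothetical sample input: four short lines
sample() { printf 'aab\nab\nb\na+b\n'; }

# ERE: + means "one or more of the previous item"
sample | grep -E 'a+b'    # prints: aab, ab

# BRE: an unescaped + is a literal plus sign
sample | grep 'a+b'       # prints: a+b

# Fixed string: the whole pattern is literal text
sample | grep -F 'a+b'    # prints: a+b
</code></pre>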

<h2 id="basic-examples">Basic Examples</h2>

<pre><code class="language-bash">$ grep "ERROR" app.log
2024-01-15 10:23:11 ERROR Connection refused
2024-01-15 10:25:44 ERROR Timeout after 30s
</code></pre>

<p>Case-insensitive:</p>

<pre><code class="language-bash">$ grep -i "error" app.log
</code></pre>

<p>Count matches:</p>

<pre><code class="language-bash">$ grep -c "ERROR" app.log
14
</code></pre>

<p>Show surrounding context (5 lines before and after):</p>

<pre><code class="language-bash">$ grep -C 5 "segfault" dmesg.log
</code></pre>

<h2 id="recursive-search">Recursive Search</h2>

<pre><code class="language-bash">$ grep -rn "TODO" ./src/
./src/api/handler.go:42:  // TODO: add retry logic
./src/db/conn.go:17:      // TODO: use connection pool
</code></pre>

<p>Restrict to a file type with <code>--include</code>:</p>

<pre><code class="language-bash">$ grep -rn --include="*.py" "import requests" .
./scripts/fetch.py:3:import requests
./api/client.py:1:import requests
</code></pre>

<p>Exclude a directory:</p>

<pre><code class="language-bash">$ grep -rn --exclude-dir=".git" --exclude-dir="node_modules" "console.log" .
</code></pre>

<h2 id="extended-regex-with--e">Extended Regex with <code>-E</code></h2>

<p>Match lines containing “foo” or “bar”:</p>

<pre><code class="language-bash">$ grep -E "foo|bar" file.txt
</code></pre>

<p>One or more digits:</p>

<pre><code class="language-bash">$ grep -E "[0-9]+" file.txt
</code></pre>

<p>Match IPv4-shaped strings (<code>\b</code> is a GNU extension, and the octet values are not range-checked):</p>

<pre><code class="language-bash">$ grep -E "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log
</code></pre>
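
<p>The pattern above accepts any run of one to three digits per group, so a string like <code>999.1.1.1</code> passes. If that matters, one option is to spell out the valid range for each octet. This is a sketch with made-up sample lines, and it anchors the pattern so that it only accepts lines consisting of a single address:</p>

<pre><code class="language-bash"># One octet: 250-255, 200-249, 100-199, or 0-99
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'
# Four octets separated by dots, anchored to the whole line
ipv4="^${octet}(\.${octet}){3}\$"

# Hypothetical input: two valid addresses and two invalid ones
candidates() { printf '192.168.1.1\n999.1.1.1\n10.0.0.256\n8.8.8.8\n'; }

candidates | grep -E "$ipv4"    # prints 192.168.1.1 and 8.8.8.8
</code></pre>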

<p>Anchors: <code>^</code> start of line, <code>$</code> end:</p>

<pre><code class="language-bash">$ grep "^ERROR" app.log        # lines starting with ERROR
$ grep "\.json$" filelist.txt  # lines ending with .json
</code></pre>

<h2 id="whole-word-and-whole-line-matching">Whole-Word and Whole-Line Matching</h2>

<pre><code class="language-bash">$ grep -w "log" file.txt     # matches "log" but not "logging" or "catalog"
$ grep -x "exact line" file.txt  # only lines that are exactly "exact line"
</code></pre>

<h2 id="printing-only-the-match--o">Printing Only the Match (<code>-o</code>)</h2>

<p>Extract just the matching portion from each line:</p>

<pre><code class="language-bash">$ grep -oE "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" access.log | sort | uniq -c | sort -rn
     342 192.168.1.1
      87 10.0.0.15
      12 172.16.0.3
</code></pre>

<p>This extracts IPs, counts occurrences, and sorts by frequency — all without <code>awk</code>.</p>

<h2 id="using-grep-in-scripts">Using grep in Scripts</h2>

<p><code>grep -q</code> exits silently with code 0 (match) or 1 (no match):</p>

<pre><code class="language-bash">if grep -q "ERROR" app.log; then
  echo "Errors found, alerting oncall"
fi
</code></pre>

<p><code>-l</code> returns only filenames — useful for batch processing:</p>

<pre><code class="language-bash"># --null / -0 keep filenames with spaces intact; on macOS/BSD sed, use sed -i ''
$ grep -rl --null "deprecated_function" ./src/ | xargs -0 sed -i 's/deprecated_function/new_function/g'
</code></pre>

<h2 id="ripgrep-rg--the-faster-alternative">ripgrep (<code>rg</code>) — The Faster Alternative</h2>

<p>Install:</p>

<pre><code class="language-bash">$ brew install ripgrep    # macOS
$ sudo apt install ripgrep  # Ubuntu
</code></pre>

<p><code>rg</code> is often several times faster than <code>grep -r</code> on large codebases (the exact speedup depends on the repository and the pattern) because it:</p>

<ul>
  <li>Uses SIMD for matching</li>
  <li>Skips binary files automatically</li>
  <li>Respects <code>.gitignore</code> by default</li>
  <li>Searches in parallel across CPU cores</li>
</ul>

<p>The interface mirrors <code>grep</code> in most ways:</p>

<pre><code class="language-bash">$ rg "TODO" ./src/
src/api/handler.go:42:  // TODO: add retry logic
src/db/conn.go:17:      // TODO: use connection pool
</code></pre>

<h3 id="rg-specific-advantages">rg-specific Advantages</h3>

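<p><strong>Respects <code>.gitignore</code></strong>: skips anything your ignore files list (typically <code>node_modules</code> and <code>dist</code>), and separately skips hidden files and directories such as <code>.git</code>. Override with <code>--no-ignore</code> and <code>--hidden</code>, or <code>-uu</code> for both.</p>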

<p><strong>File type filtering</strong> with <code>-t</code>:</p>

<pre><code class="language-bash">$ rg -t py "import os"       # only .py files
$ rg -t js "console.log"     # only .js files
$ rg --type-list             # see all supported types
</code></pre>

<p><strong>Fixed string</strong> (faster, no regex):</p>

<pre><code class="language-bash">$ rg -F "function()" .
</code></pre>

<p><strong>Count per file:</strong></p>

<pre><code class="language-bash">$ rg -c "ERROR" logs/
logs/app.log:14
logs/worker.log:3
</code></pre>

<p><strong>Show only filenames:</strong></p>

<pre><code class="language-bash">$ rg -l "TODO" .
</code></pre>

<p><strong>Multiline mode:</strong></p>

<pre><code class="language-bash">$ rg -U "foo\nbar" .
</code></pre>

<p><strong>Replace in output</strong> (doesn’t modify files — use with <code>sed</code> or <code>perl</code> for that):</p>

<pre><code class="language-bash">$ rg -r "new_name" "old_name" .
</code></pre>

<h2 id="common-one-liners">Common One-Liners</h2>

<p><strong>Find all files containing a function name:</strong></p>

<pre><code class="language-bash">$ grep -rl "handleRequest" ./src/
</code></pre>

<p><strong>Count total lines of Python code (excluding blank lines and comments):</strong></p>

<pre><code class="language-bash">$ grep -r --include="*.py" -vE "^[[:space:]]*(#|$)" . | wc -l
</code></pre>

<p><strong>Find lines with two or more consecutive spaces:</strong></p>

<pre><code class="language-bash">$ grep -E " {2,}" file.txt
</code></pre>

<p><strong>Check if a process is running:</strong></p>

<pre><code class="language-bash">$ pgrep -x nginx || grep -q nginx /proc/*/comm 2&gt;/dev/null &amp;&amp; echo "running"
</code></pre>

<p><strong>Find all TODO, FIXME, HACK, and XXX markers:</strong></p>

<pre><code class="language-bash">$ rg "TODO|FIXME|HACK|XXX" --color always | less -R
</code></pre>

<p><strong>Search compressed logs:</strong></p>

<pre><code class="language-bash">$ zgrep "ERROR" /var/log/nginx/error.log.gz
</code></pre>

<h2 id="grep-exit-codes">grep Exit Codes</h2>

<table>
  <thead>
    <tr>
      <th>Code</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>0</code></td>
      <td>Match found</td>
    </tr>
    <tr>
      <td><code>1</code></td>
      <td>No match</td>
    </tr>
    <tr>
      <td><code>2</code></td>
      <td>Error (bad pattern, missing file)</td>
    </tr>
  </tbody>
</table>

<p>These are useful in shell scripts: <code>grep -q "pattern" file &amp;&amp; echo "found"</code>.</p>
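
<p>Because “no match” and “error” are different codes, a script can tell them apart instead of treating every non-zero status the same way. A minimal sketch, where <code>check_log</code> is a hypothetical helper name and <code>-s</code> silences grep’s own message about unreadable files:</p>

<pre><code class="language-bash"># check_log PATTERN FILE: print which of grep's three outcomes occurred.
# Pass "-" as FILE to read from standard input.
check_log() {
  local rc=0
  grep -qs -- "$1" "$2" || rc=$?
  case "$rc" in
    0) echo "match" ;;
    1) echo "no match" ;;
    *) echo "error" ;;
  esac
}

printf 'ERROR boom\n' | check_log "ERROR" -    # prints: match
printf 'all good\n'   | check_log "ERROR" -    # prints: no match
check_log "ERROR" /nonexistent/app.log         # prints: error
</code></pre>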

<h2 id="conclusion">Conclusion</h2>

<p><code>grep</code> with <code>-E</code>, <code>-n</code>, <code>-r</code>, and <code>-C</code> covers most daily search needs. For code search, switch to <code>ripgrep</code> — it’s faster, <code>.gitignore</code>-aware, and has better defaults. The fundamentals — anchors, character classes, alternation — transfer directly between the two. Know how to extract just the match with <code>-o</code>, how to count with <code>-c</code>, and how to use exit codes in scripts, and you’ll have <code>grep</code> fully in your toolkit.</p>]]></content><author><name>Mukul Kadel</name></author><category term="wiki" /><category term="unix" /><category term="grep" /><category term="ripgrep" /><category term="regex" /><category term="unix" /><category term="linux" /><category term="command line" /><category term="search" /><category term="cheatsheet" /><category term="terminal" /><summary type="html"><![CDATA[A complete reference for grep, egrep, and ripgrep covering flags, regex syntax, recursive search, and real-world one-liners for searching code and logs.]]></summary></entry><entry><title type="html">gRPC vs REST: Which One Should You Use?</title><link href="https://mukulkadel.com/grpc-vs-rest-comparison/" rel="alternate" type="text/html" title="gRPC vs REST: Which One Should You Use?" /><published>2026-05-22T18:30:00+00:00</published><updated>2026-05-22T18:30:00+00:00</updated><id>https://mukulkadel.com/grpc-vs-rest-comparison</id><content type="html" xml:base="https://mukulkadel.com/grpc-vs-rest-comparison/"><![CDATA[<p>REST has been the default API style for over a decade, and for good reason — it’s simple, human-readable, and works everywhere. But REST has real limitations when you’re building internal microservices, need bidirectional streaming, or care deeply about payload size and latency. gRPC addresses those limitations with a strongly-typed, binary protocol built on HTTP/2. 
Knowing which to reach for saves you from both over-engineering and painting yourself into a corner.</p>

<h2 id="what-is-grpc">What is gRPC?</h2>

<p>gRPC is a remote procedure call framework from Google. Instead of defining routes and JSON schemas, you define a <strong>service contract</strong> in a <code>.proto</code> file using Protocol Buffers, and the framework generates client and server code in your language of choice.</p>

<pre><code class="language-protobuf">syntax = "proto3";

service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User);
  rpc CreateUser (CreateUserRequest) returns (User);
}

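message GetUserRequest {
  string user_id = 1;
}

// Used by the Python client later in this article.
message ListUsersRequest {
  int32 limit = 1;
}

// Illustrative fields only; the article does not specify this message.
message CreateUserRequest {
  string name = 1;
  string email = 2;
}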

message User {
  string id = 1;
  string name = 2;
  string email = 3;
  int64 created_at = 4;
}
</code></pre>

<p>Run <code>protoc</code> on this file and you get generated stubs in Python, Go, Java, TypeScript, or any other supported language. The client calls <code>stub.GetUser(request)</code> like a normal function call.</p>
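
<p>For Python, one common way to run the compiler is through the <code>grpcio-tools</code> package. The sketch below assumes the contract above is saved as <code>user.proto</code> (a filename chosen to match the <code>user_pb2</code> imports used later in this article), and <code>gen_stubs</code> is a hypothetical wrapper name:</p>

<pre><code class="language-bash"># One-time setup: pip install grpcio grpcio-tools

# gen_stubs PROTO_FILE: write the generated Python modules next to the .proto file.
gen_stubs() {
  local proto="$1"
  local dir
  dir="$(dirname "$proto")"
  if [ ! -f "$proto" ]; then
    echo "no such file: $proto"
    return 1
  fi
  # --python_out emits the message classes (user_pb2.py);
  # --grpc_python_out emits the client stub and servicer base class (user_pb2_grpc.py)
  python -m grpc_tools.protoc --proto_path="$dir" \
    --python_out="$dir" --grpc_python_out="$dir" "$proto"
}

# Usage: gen_stubs ./user.proto
</code></pre>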

<h2 id="side-by-side-comparison">Side-by-Side Comparison</h2>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>REST</th>
      <th>gRPC</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Protocol</strong></td>
      <td>HTTP/1.1 or HTTP/2</td>
      <td>HTTP/2 only</td>
    </tr>
    <tr>
      <td><strong>Data format</strong></td>
      <td>JSON (text)</td>
      <td>Protocol Buffers (binary)</td>
    </tr>
    <tr>
      <td><strong>Schema</strong></td>
      <td>Optional (OpenAPI)</td>
      <td>Required (<code>.proto</code>)</td>
    </tr>
    <tr>
      <td><strong>Code generation</strong></td>
      <td>Optional</td>
      <td>Built-in</td>
    </tr>
    <tr>
      <td><strong>Browser support</strong></td>
      <td>Native</td>
      <td>Requires grpc-web proxy</td>
    </tr>
    <tr>
      <td><strong>Streaming</strong></td>
      <td>SSE / WebSockets (bolted on)</td>
      <td>First-class (4 modes)</td>
    </tr>
    <tr>
      <td><strong>Payload size</strong></td>
      <td>Larger</td>
      <td>Smaller (often several times, depending on the data)</td>
    </tr>
    <tr>
      <td><strong>Latency</strong></td>
      <td>Higher</td>
      <td>Lower</td>
    </tr>
    <tr>
      <td><strong>Human readability</strong></td>
      <td>Easy to inspect with curl</td>
      <td>Binary (need grpcurl)</td>
    </tr>
    <tr>
      <td><strong>Ecosystem</strong></td>
      <td>Massive</td>
      <td>Growing</td>
    </tr>
  </tbody>
</table>

<h2 id="performance">Performance</h2>

<p>gRPC’s performance advantage comes from two sources:</p>

<ol>
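  <li><strong>Binary encoding</strong>: Protocol Buffers are more compact than JSON because field names are replaced by small numeric tags and numbers are stored in binary. As a rough illustration, a <code>User</code> object with 5 fields might be 200 bytes as JSON and 40 bytes as protobuf.</li>
  <li><strong>HTTP/2 multiplexing</strong>: many requests share one TCP connection as independent streams, which avoids the per-request connection overhead and request-level head-of-line blocking of HTTP/1.1 (a lost packet can still stall every stream at the TCP layer).</li>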
</ol>

<p>For high-frequency internal service calls — tens of thousands per second — this compounds into measurable throughput gains and latency reductions.</p>
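
<p>To get a feel for the encoding difference, you can count bytes by hand. The sketch below writes this article’s sample user once as JSON and once as hand-encoded protobuf wire bytes (a one-byte tag and a one-byte length in front of each string). Treat it as an illustration rather than a benchmark: string-heavy messages like this one shrink only modestly, while numeric fields and long field names are where protobuf saves the most.</p>

<pre><code class="language-bash"># The same user as JSON text
json_user() {
  printf '%s' '{"id":"abc123","name":"Mukul Kadel","email":"mukul@example.com"}'
}

# The same user as protobuf wire bytes, written by hand with octal escapes:
#   \012 = tag for field 1 (string), \006 = length 6
#   \022 = tag for field 2 (string), \013 = length 11
#   \032 = tag for field 3 (string), \021 = length 17
proto_user() {
  printf '\012\006abc123\022\013Mukul Kadel\032\021mukul@example.com'
}

json_user  | wc -c    # 64 bytes
proto_user | wc -c    # 40 bytes
</code></pre>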

<h2 id="the-four-streaming-modes">The Four Streaming Modes</h2>

<p>This is gRPC’s biggest differentiator over REST:</p>

<pre><code class="language-protobuf">service StreamingService {
  // classic request/response
  rpc UnaryCall (Request) returns (Response);

  // server sends a stream of responses
  rpc ServerStreaming (Request) returns (stream Response);

  // client sends a stream of requests
  rpc ClientStreaming (stream Request) returns (Response);

  // both sides stream simultaneously
  rpc BidirectionalStreaming (stream Request) returns (stream Response);
}
</code></pre>

<p>Bidirectional streaming enables real-time communication patterns that would require WebSockets over REST — chat, live data feeds, collaborative editing.</p>

<h2 id="a-minimal-grpc-server-in-python">A Minimal gRPC Server in Python</h2>

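<pre><code class="language-python">from concurrent import futures

import grpc

# Modules generated by protoc from the .proto file
import user_pb2
import user_pb2_grpc


def fetch_users(limit):
    # Hypothetical stand-in for a database query; swap in real data access
    sample = [user_pb2.User(id="abc123", name="Mukul Kadel", email="mukul@example.com")]
    return sample[:limit]


class UserServicer(user_pb2_grpc.UserServiceServicer):
    def GetUser(self, request, context):
        # Unary RPC: a real service would fetch this from a database
        return user_pb2.User(
            id=request.user_id,
            name="Mukul Kadel",
            email="mukul@example.com",
        )

    def ListUsers(self, request, context):
        # Server-streaming RPC: each yield sends one message to the client
        users = fetch_users(limit=request.limit)
        for user in users:
            yield user_pb2.User(id=user.id, name=user.name, email=user.email)


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    user_pb2_grpc.add_UserServiceServicer_to_server(UserServicer(), server)
    server.add_insecure_port("[::]:50051")  # plaintext, no TLS
    server.start()
    server.wait_for_termination()


if __name__ == "__main__":
    serve()
</code></pre>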

<p>The client:</p>

<pre><code class="language-python">import grpc

import user_pb2
import user_pb2_grpc

channel = grpc.insecure_channel("localhost:50051")
stub = user_pb2_grpc.UserServiceStub(channel)

user = stub.GetUser(user_pb2.GetUserRequest(user_id="abc123"))
print(user.name)  # Mukul Kadel

for user in stub.ListUsers(user_pb2.ListUsersRequest(limit=100)):
    print(user.name)
</code></pre>

<h2 id="inspecting-grpc-without-a-client">Inspecting gRPC Without a Client</h2>

<pre><code class="language-bash"># like curl for gRPC
$ brew install grpcurl

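# list services (uses server reflection, which the server above does not enable;
# without it, add -proto user.proto to each command)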
$ grpcurl -plaintext localhost:50051 list

# call a method
$ grpcurl -plaintext \
  -d '{"user_id": "abc123"}' \
  localhost:50051 \
  UserService/GetUser
</code></pre>

<h2 id="when-to-choose-rest">When to Choose REST</h2>

<ul>
  <li><strong>Public APIs</strong> where external developers will integrate — JSON over HTTP is universally understood</li>
  <li><strong>Browser-to-server</strong> communication without a proxy layer</li>
  <li><strong>Simple CRUD</strong> with standard tooling (Swagger, Postman, curl)</li>
  <li><strong>Small teams</strong> where the overhead of <code>.proto</code> files and code generation isn’t justified</li>
</ul>

<h2 id="when-to-choose-grpc">When to Choose gRPC</h2>

<ul>
  <li><strong>Internal microservices</strong> communicating at high frequency</li>
  <li><strong>Streaming</strong> — live feeds, bidirectional channels, large file uploads chunked over a stream</li>
  <li><strong>Multi-language</strong> systems where a shared <code>.proto</code> contract prevents schema drift</li>
  <li><strong>Mobile clients</strong> where payload size and battery life matter</li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>REST is the right default for most APIs — especially anything public-facing or browser-native. gRPC pays off when you’re building a mesh of internal services where performance, strong typing, and streaming matter more than ecosystem breadth. The good news is they’re not mutually exclusive: many systems use gRPC internally between services and REST externally for the public API, with a gateway layer translating between them.</p>]]></content><author><name>Mukul Kadel</name></author><category term="wiki" /><category term="Programming" /><category term="grpc" /><category term="rest" /><category term="api" /><category term="protobuf" /><category term="networking" /><category term="microservices" /><category term="backend" /><category term="performance" /><category term="http2" /><summary type="html"><![CDATA[A practical comparison of gRPC and REST APIs covering performance, tooling, streaming, and when each protocol is the right choice for your backend.]]></summary></entry></feed>