Intro to Unix and shell

Published

Wed, 7 of May, 2025

Modified

Wed, 7 of May, 2025

Reference

Great (paid) LinkedIn Learning course by Kevin Skoglund, “Unix Essential Training”

What is Unix?

  • an OS developed by AT&T employees at Bell Labs in the late 1960s and 1970s
  • in 1973 it was rewritten in the C programming language, which made it portable. It then spread outside AT&T
  • it still powers devices all around us
  • you can use it from the command line or through a GUI

What are the kernel and the shell?

  • the kernel is the core of the OS. It allocates time and memory to processes
  • the shell is the outer layer of the OS. It sends requests to the kernel
    • you can choose from different shells (sh, bash, zsh, etc.)

Fundamental

  • Ctrl + C = Cancel/stop the current command (SIGINT)
  • Ctrl + Z = Pause current command (suspend, SIGTSTP)
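    • fg = resume the last paused command in the foreground
  • Ctrl + A = move the cursor to the start of the command line
  • Ctrl + E = move the cursor to the end of the command line
  • q = quit a pager or full-screen viewer (e.g. less, man)
  • Ctrl + L = clear the terminal screen (but not the history)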

Files and directories

pwd

ls
# long format: permissions, owner, size, modification date
ls -l
ls -lh # sizes (5th column) are human readable
ls -a # include hidden
ls -a -lh
pwd
# User home directory 
cd ~ # /Users/luisamimmi
# go back to the "last" dir you were in !
cd -
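
To see what `cd -` actually does, here is a minimal sketch that runs in a throwaway directory. The use of `mktemp -d` and the folder names `a` and `b` are assumptions made for the example, not part of the notes above.

```shell
# Throwaway playground so no real directory is touched (a/ and b/ are made-up names)
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b"

cd "$demo/a"
cd "$demo/b"
cd - > /dev/null           # jump back to the previous directory; cd - prints it, so silence that
back=$(basename "$PWD")
echo "$back"               # prints: a
```

Running `cd -` a second time would take you to `b` again, so it works like a toggle between the last two directories.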

Creating Files

cd shell
# open the nano text editor
nano
# type your text, then:
#   Ctrl+X  to exit
#   Y       to confirm saving
#   type a file name
#   Enter   to save

nano test2.text

Reading files

cd ..
ls
cat README.md

# pagination: show the file one screen at a time
more README.md

# opens a viewer for scrolling 
less README.md
# then press `f` to page forward
# or press `b` to page backward
# or press `q` to quit

Creating directories

  • -p = also create any missing parent directories
pwd
mkdir  dir 
ls -l

mkdir shell/dir
ls -l

# to create a dir inside a parent dir that does not exist yet
mkdir -p shell/newdir/dir
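
The difference that `-p` makes is easiest to see side by side. This is a small sketch in a temporary directory (created with `mktemp -d`, an assumption of the example); the directory names mirror the ones used above.

```shell
demo=$(mktemp -d)          # throwaway directory for the experiment
cd "$demo"

# Without -p, mkdir refuses because shell/newdir does not exist yet
if mkdir shell/newdir/dir 2>/dev/null; then
  plain="created"
else
  plain="failed"
fi
echo "mkdir without -p: $plain"

# With -p, every missing parent is created along the way
mkdir -p shell/newdir/dir
# Running it again is not an error: -p also accepts directories that already exist
mkdir -p shell/newdir/dir
ls -R shell
```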

Move and rename

  • -f = force overwriting without asking (the default)
  • -n = never overwrite an existing file
  • -i = interactive overwriting (ask first!)
cd shell 

# mv <source> <destination file>
mv test.txt newdir/dir/test.txt

# mv <source> <directory> (no file name) works as well
mv newdir/dir/test.txt newdir

# Rename while moving
touch second_test.txt 
# mv <source> <RENAMED destination file>
mv second_test.txt newdir/dir/2_test.txt
ls -l newdir
ls newdir/dir/
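
A quick way to convince yourself of what the overwrite flags do is to try them on two scratch files. The sketch below assumes a temporary directory and two made-up files, `report.txt` and `draft.txt`.

```shell
demo=$(mktemp -d)
cd "$demo"
echo "old content" > report.txt
echo "new content" > draft.txt

# -n: never overwrite. report.txt already exists, so the move is skipped.
# (Whether a skipped move counts as a failure differs between mv versions, hence "|| true".)
mv -n draft.txt report.txt || true
kept=$(cat report.txt)
echo "after mv -n: $kept"

# Default behaviour: the destination is replaced without asking
mv draft.txt report.txt
replaced=$(cat report.txt)
echo "after plain mv: $replaced"
```

With `-i` instead, `mv` would stop and ask before replacing `report.txt`, which is why it is left out of this non-interactive sketch.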

Copy

pwd
cd newdir
ls -l
cp test.txt dir/

cd dir
ls -l
  • -R = recursive (copy a directory together with its contents)
pwd
cd ..
cp -R dir dir2
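
As an illustration of why `-R` is needed for directories, here is a small sketch in a temporary directory. The nested `sub` folder and the `dir_copy` name are made up for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p dir/sub
echo "hello" > dir/sub/test.txt

# Without -R, cp skips the directory and reports an error
if cp dir dir_copy 2>/dev/null; then
  plain="copied"
else
  plain="refused"
fi
echo "cp without -R: $plain"

# With -R the directory and everything below it is copied.
# dir2 does not exist yet, so it becomes the copy itself;
# if dir2 already existed, the copy would land in dir2/dir instead.
cp -R dir dir2
copied=$(cat dir2/sub/test.txt)
echo "$copied"
```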

Deleting files and directories

In Unix, when you delete something with rm, it is gone immediately and for good: there is no trash folder to restore it from.

pwd
ls -l
# file 
rm test.txt
# dir 
rm -R dir2
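
Because there is no undo, it can help to look at what is about to disappear before running `rm -R`. This is only a sketch of that habit, using a temporary directory and made-up file names.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir dir2
touch dir2/test.txt keep.txt

# Plain rm refuses to remove a directory
if rm dir2 2>/dev/null; then
  plain="removed"
else
  plain="refused"
fi
echo "rm without -R: $plain"

# Look first: this lists exactly what rm -R dir2 is going to delete
find dir2

# Then delete the directory and its contents
rm -R dir2
```

Adding `-i` (`rm -Ri dir2`) makes `rm` ask for confirmation on each item, which is slower but safer when you are unsure.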

File/dir by name

find <path> <expression>

  • * = zero or more characters (glob)
  • ? = any one character
  • [] = any ONE of the characters in the brackets
cd ~/Github/_nerd_help/
find shell -name "test.txt"
find . -name "test.txt"

find shell -name "*test*"
find shell -name "??test.txt"
# [ ] matches a single character, so use -o (OR) to match either extension
find shell \( -name "*.txt" -o -name "*.qmd" \)
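
To check what each wildcard really matches, the sketch below builds a small `shell` folder in a temporary directory and counts the hits. The file names and the `count` helper are invented for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir shell
touch shell/test.txt shell/my_test.txt shell/01test.txt shell/notes.qmd shell/data.csv

# Hypothetical helper: count lines of input (tr strips the padding some wc versions add)
count() { wc -l | tr -d ' '; }

# * = zero or more characters: test.txt, my_test.txt, 01test.txt
star=$(find shell -type f -name "*test*" | count)

# ? = exactly one character each: only 01test.txt has two characters before "test.txt"
two=$(find shell -type f -name "??test.txt")

# [0-9] = one character from the set: names starting with a digit
digit=$(find shell -type f -name "[0-9]*")

# Brackets match ONE character, not a word: names ending in "d" or "v"
lastchar=$(find shell -type f -name "*[dv]" | count)

# For "either extension", combine two tests with -o (OR)
either=$(find shell -type f \( -name "*.txt" -o -name "*.qmd" \) | count)

echo "$star $two $digit $lastchar $either"
```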

grep <expression> <path>

  • r = recursive (search subdirectories too)
  • l = list only the names of the files that contain a match
  • i = case-insensitive matching
  • E = use extended regular expressions (e.g. | for OR)
# Search every file in this folder and its subfolders, and list only those that contain "Google Drive"
grep -rl "Google Drive" .

# Case-insensitive search
grep -rli "Google Drive" .

# Search for multiple words (logical OR)
grep -rEl "Google Drive|Github" .
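
The three variants can be compared on a handful of scratch files. The following sketch assumes a temporary directory and made-up file contents; `sort` is only there to make the order of the results predictable.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir sub
echo "Saved in Google Drive" > a.txt
echo "saved in google drive" > b.txt
echo "pushed to Github"      > sub/c.txt
echo "nothing to see here"   > d.txt

# Exact case: only a.txt matches
exact=$(grep -rl "Google Drive" . | sort)

# -i ignores case: a.txt and b.txt match
nocase=$(grep -rli "Google Drive" . | sort)

# -E with | means OR: a.txt and sub/c.txt match (-r descends into sub/)
either=$(grep -rEl "Google Drive|Github" . | sort)

printf 'exact:\n%s\nnocase:\n%s\neither:\n%s\n' "$exact" "$nocase" "$either"
```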

File/dir by name + content word (only text files)

cd ~/Github/_nerd_help
find . -type f -name "*.R" -exec grep -l "color" {} +

Exe in Google Drive

cd ~/Library/CloudS*/GoogleDrive-lmm76@georgetown.edu/'Il mio Drive'
ls -ls

… based on name (file|dir)

find . -type d -name "*MUSIC*"      # directory
find . -iname "*MUSIC*"              # file + directory 

… based on name (file|dir) + word (in text file)

find + grep on their own only work with plain-text files (e.g. .R or .txt)

  • find . → Start searching in the current directory
    • -type f → Only look for regular files
    • -name "*.R" → Restrict the search to files ending in .R
    • \( and \) → because parentheses must be escaped so the shell passes them on to find
    • -o → logical OR between conditions
  • -exec → For each matching file, execute the following command…
    • grep -l "library" {} + → search for the word “library” in the file ({} filename placeholder)
      • -l tells grep to only print the filename if a match is found
    • + → -exec must be terminated with + (hand many filenames to a single grep call) or \; (run grep once per file)
# Search for files containing specific content and return the list of files
find . -type f -name "*.R" -exec grep -l "library" {} + # or 
find . -type f -name "*.R" -exec grep -l "library" {} \; 

find . \( -name "*.R" -o -name "*.qmd" -o -name "*.md" \) -type f -exec grep -El "feature|color" {} +
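
The practical difference between ending `-exec` with `+` and with `\;` is how many times the command is started. One way to observe it, sketched here with three empty, made-up `.R` files in a temporary directory, is to let a tiny inner shell print how many arguments it received.

```shell
demo=$(mktemp -d)
cd "$demo"
touch one.R two.R three.R

# "$#" is the number of arguments the inner shell received
# With +, find starts the command once and appends all filenames
batched=$(find . -type f -name "*.R" -exec sh -c 'echo "$#"' sh {} +)

# With \; find starts the command once per file, one filename each time
per_file=$(find . -type f -name "*.R" -exec sh -c 'echo "$#"' sh {} \;)
calls=$(printf '%s\n' "$per_file" | wc -l | tr -d ' ')

printf 'plus form: one call with %s arguments\n' "$batched"
printf 'semicolon form: %s separate calls\n' "$calls"
```

For `grep -l` the results are the same either way; the `+` form is simply faster on large folders because far fewer processes are started.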

… based on name (file|dir) + word (in PDF file)

  • sh -c '...' → run the given string as a shell script
  • for file do ... done → loop over the script's arguments ($1, $2, …)
  • grep
    • -q → be quiet (don’t print the match, only report success or failure)
    • -i → ignore case
    • -E → use extended regex (e.g. feature|color)
  • {} + → the filenames from find, passed to the shell script as $1, $2, … (the sh just before {} becomes $0)
cd ~/Library/CloudS*/GoogleDrive-lmm76@georgetown.edu/'Il mio Drive'/SCP_mio
ls -ls

find . -type f -name "*.pdf" -exec sh -c '
  for file do
    if pdftotext "$file" - | grep -qiE "sussidiarietà"; then
      echo "$file"
    fi
  done
' sh {} +
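
The same loop can be tried without any PDF tools installed. In the sketch below, `cat` stands in for `pdftotext "$file" -` and plain `.txt` files stand in for PDFs; both substitutions, the file names, and the search word are assumptions made so the example runs anywhere.

```shell
demo=$(mktemp -d)
cd "$demo"
echo "the principle of subsidiarity" > "has word.txt"
echo "something else entirely"       > "no word.txt"

matches=$(find . -type f -name "*.txt" -exec sh -c '
  # $0 is the "sh" placed after the script; the filenames arrive as $1, $2, ...
  for file do
    # cat is a stand-in for the converter (pdftotext "$file" - in the real search)
    if cat "$file" | grep -qiE "subsidiarity"; then
      echo "$file"
    fi
  done
' sh {} +)

echo "$matches"            # prints: ./has word.txt
```

Quoting `"$file"` is what keeps file names with spaces in one piece, which matters a lot in a Google Drive folder.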

… based on name (file|dir) + word (in WORD file)

NERDY set up step

# At first, `which brew` returned:
which brew
#/usr/local/bin/brew
/opt/homebrew/bin/brew --version
# Homebrew 4.5.0 OK

# configure zsh to use the correct brew (add this line to ~/.zshrc to make it permanent)
export PATH="/opt/homebrew/bin:$PATH"
# then reload the configuration
source ~/.zshrc
which brew
#/opt/homebrew/bin/brew # OK!
brew install docx2txt
#docx2txt --help # does not work!?
#... but it is installed
which docx2txt
ls /opt/homebrew/Cellar/docx2txt/1.4/bin
# add symlink
ln -s /opt/homebrew/Cellar/docx2txt/1.4/bin/docx2txt.sh /opt/homebrew/bin/docx2txt
which docx2txt
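
Why the symlink fixes things: the shell only finds commands that sit in a directory listed in `PATH`, under exactly the name you type. The sketch below reproduces the situation with a made-up script called `demo_tool.sh` inside a fake `Cellar` folder in a temporary directory; none of these names come from Homebrew itself.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/Cellar/demo_tool/bin" "$demo/bin"

# A tiny script installed under a name (.sh) and a folder that are not on PATH
printf '#!/bin/sh\necho "hello from demo_tool"\n' > "$demo/Cellar/demo_tool/bin/demo_tool.sh"
chmod +x "$demo/Cellar/demo_tool/bin/demo_tool.sh"

# Symlink it into a bin folder under the name we want to type
ln -s "$demo/Cellar/demo_tool/bin/demo_tool.sh" "$demo/bin/demo_tool"

# Put that bin folder first on PATH (only for this shell session)
PATH="$demo/bin:$PATH"

found=$(command -v demo_tool)   # where the shell now finds the command
out=$(demo_tool)                # running it goes through the symlink
echo "$found"
echo "$out"
```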

Or to use PANDOC for word files

  • Quarto and R should use their own pandoc version (so this shouldn’t mess that up!)
brew upgrade pandoc
pandoc --version
brew link --overwrite pandoc # the linked pandoc went from version 1.x to 3.6.4

Check only 1 file’s content

cd ~/Library/CloudS*/GoogleDrive-lmm76@georgetown.edu/'Il mio Drive'/SCP_mio
ls -ls

# `docx2txt` often misses text in text boxes, shapes, footers, or other styled containers
docx2txt < ./'Appunti 2o tavolo.docx' - | grep -i "sviluppo sostenibile" # NOPE!
# `pandoc` reads the full internal structure of the .docx file, so it finds the text:
pandoc 'Appunti 2o tavolo.docx' -t plain | grep -i "sviluppo sostenibile" # YEP!

Get files list

  • each file needs to be converted to plain text first
cd ~/Library/CloudS*/GoogleDrive-lmm76@georgetown.edu/'Il mio Drive'/SCP_mio
ls -ls
# Using docx2txt (NOPE!)
# Using pandoc (YEP!)
find . -type f -name "*.docx" -exec sh -c '
  for file do
    if pandoc "$file" -t plain | grep -qiE "sviluppo sostenibile"; then
      echo "$file"
    fi
  done
' sh {} +


# suppress errors (corrupt or *.doc files - NOT handled by pandoc)
find . -type f -name "*.docx" -exec sh -c '
  for file do
    if pandoc "$file" -t plain 2>/dev/null | grep -qiE "sviluppo sostenibile"; then
      echo "$file"
    fi
  done
' sh {} +

… based on name (file|dir) + word (in doc file)

pandoc ./"Ioratti_proposta_DOCUMENTO PROGRAMMATICO - spunti per l'impostazione.doc" -t plain # NOPE
# Unknown input format doc
antiword ./"Ioratti_proposta_DOCUMENTO PROGRAMMATICO - spunti per l'impostazione.doc" | grep -i "pavia" # YEP

find . -type f -name "*.doc" -exec sh -c '
  for file do
    if antiword "$file" 2>/dev/null | grep -qiE "PAVIA "; then
      echo "$file"
    fi
  done
' sh {} +
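
The PDF, .docx and .doc searches above differ only in the converter, so they could be folded into one loop that picks the converter by extension. The following is only a sketch: `to_text` and `search_docs` are hypothetical helper names, the `cat` branch for plain-text files is an addition, and the demo at the bottom uses only text files so that it runs without pdftotext, pandoc or antiword installed.

```shell
# Hypothetical helper: print a file as plain text, choosing the converter by extension
to_text() {
  case "$1" in
    *.pdf)  pdftotext "$1" - ;;
    *.docx) pandoc "$1" -t plain ;;
    *.doc)  antiword "$1" ;;
    *.txt|*.md|*.R|*.qmd) cat "$1" ;;
    *) return 1 ;;               # unknown type: nothing to search
  esac
}

# Hypothetical helper: list files under the current directory whose text matches $1
# (reads one filename per line, so names containing newlines are not supported)
search_docs() {
  find . -type f | while IFS= read -r file; do
    if to_text "$file" 2>/dev/null | grep -qiE "$1"; then
      echo "$file"
    fi
  done
}

# Demo on scratch files
demo=$(mktemp -d)
cd "$demo"
echo "Sviluppo sostenibile, bozza" > notes.txt
echo "unrelated text"              > other.md
echo "sviluppo sostenibile"        > skipped.xyz   # unsupported extension, ignored
found=$(search_docs "sviluppo sostenibile")
echo "$found"                    # prints: ./notes.txt
```

Files with an unsupported extension, or ones a converter cannot read, simply produce no text and are skipped, which mirrors the `2>/dev/null` trick used above.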