linux

Tags: computers

Currently Linux plumbers and hobbyists occupy two different worlds
- Desktop Linux and DevOps middleware are built by paid employees, versus people who use suckless software and musl-based distros

Greybeard Qualification

https://www.youtube.com/watch?v=5osOHhBWKOQ&list=PLSIUOFhnxEiC3YTdxwqZqgEY5imVL8U8J&index=1

Process

Internal datastructure `struct task_struct {}`:

pid -> integer; wraps around and reuses unused pids
parent -> every process has a parent, and each one can have children
tty -> in the old days, every process had an associated terminal called a tty
uid -> user the process is running as
gid -> group id
A process can change its ids (setuid)
From a sysadmin pov, what can you do to a process?
- create/send signals/ps or proc/strace
- each process has an address space, every process thinks it owns the entire theoretical memory of the computer
  - why is kernel and user mapped into the same (virtual) address space?
  - because it’s “cheaper” in mode switch, and “cheaper” context switch (kernel stays mapped)
    - was actually attempted to be changed in https://lwn.net/Articles/39283/

memory

initialized and uninitialized data
brk system call is used to increase heap size
heap sits above the data and grows up; the stack grows down
the space in between is used for shared libraries
32-bit: 4 GB address space
environment variables and cmdline are at the very top
kernel puts all of its data at the top of the address space
memory regions have permissions, like files
physical memory structure (https://www.youtube.com/watch?v=pI-HRRh7-dU)
- icache vs dcache
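
The per-region permissions noted above can be inspected for a live process through /proc/<pid>/maps. A minimal sketch of parsing one such line, assuming the usual Linux field layout (address range, permissions, offset, device, inode, optional path). The sample line and the helper name `parse_maps_line` are made up for illustration.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps_line(line: str) -> Region:
    """Parse one line in the /proc/<pid>/maps layout."""
    fields = line.split(None, 5)  # the 6th field (path) may be absent
    start_s, end_s = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Region(int(start_s, 16), int(end_s, 16), fields[1], path)


# Illustrative line, not taken from a real process
sample = "7ffd1c3e0000-7ffd1c401000 rw-p 00000000 00:00 0                          [stack]"
region = parse_maps_line(sample)
print(region.perms, hex(region.end - region.start), region.path)
```

On a Linux box the same function can be fed each line of `/proc/self/maps` to list the heap, stack, and shared library mappings of the running interpreter.
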

limits
- used to be important during timeshare systems
- rlimit or ulimit for stack size
  - common one is the number of open files
process priority
- nice in shell
- static priority
- dynamic priority
- real-time priority
- all related to scheduling
- nice or getpriority() or setpriority() system calls
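
The limits and priorities above are visible from inside a process. A small sketch using Python's `resource` and `os` modules, which wrap getrlimit() and getpriority(); the helper names are illustrative only.

```python
import os
import resource


def open_file_limit() -> tuple:
    """Return the (soft, hard) limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def current_niceness() -> int:
    """Return this process's nice value (-20 to 19 on Linux)."""
    return os.getpriority(os.PRIO_PROCESS, 0)  # 0 means "the calling process"


soft, hard = open_file_limit()
stack_soft, stack_hard = resource.getrlimit(resource.RLIMIT_STACK)
print("open files:", soft, hard)
print("stack bytes:", stack_soft, stack_hard)  # RLIM_INFINITY means unlimited
print("nice:", current_niceness())
```

These are the same numbers that `ulimit -n`, `ulimit -s`, and the NI column of `ps -l` report for the shell.
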

IPC

pipes
- internal buffer data structure
- two file descriptors, connected to each end of the pipe
- similar to a channel in go
- no temp file at all, all in memory
- one limitation: they only exist between members of the same family within the same process tree
  - ls | wc works because both ls and wc are children of the shell
- FIFOs/named pipes, exist in the filesystem, acts like a regular pipe
  - used for multiple processes
- cron uses a fifo in the kernel?
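
A minimal sketch of the anonymous pipe described above: two file descriptors, one per end, shared between a parent and the child it forks. The function name `pipe_roundtrip` is illustrative; the calls are the standard os.pipe(), os.fork(), os.read() and os.write() wrappers.

```python
import os


def pipe_roundtrip(message: bytes) -> bytes:
    """Send `message` from a forked child to the parent through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end, send the data, and exit.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
        finally:
            os._exit(0)
    # Parent: keep only the read end; EOF arrives once every write end is closed.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


print(pipe_roundtrip(b"hello through the pipe"))
```

The pipe only works here because the child inherited both descriptors from its parent, which is the "same family" limitation noted above.
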

signals
- Killing a process can be done with kill -s <SIGNAL> pid: send SIGTERM to request a graceful quit or SIGKILL to force immediate termination
  - 32 original flavors
  - 32 new flavors
  - list with kill -l
  - process can send signals to another process, or kernel sends it to your process
  - common
    - 9 - kill
    - 15 - term
    - 1 - HUP - hangup (for when you have a connection), now it’s used to reload config files
    - 2 - INT - interrupt
    - 19 - STOP - suspend in background
      - fg command to resume in the foreground
    - 18 - CONT
    - 14 - ALRM - timer in the kernel; when it expires, the process is sent an alarm signal
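
A short sketch of one process signalling another, assuming a Linux system where the signal numbers match the list above. The helper `kill_child` is hypothetical; it forks an idle child, sends it a signal with os.kill(), and decodes the wait status to see which signal ended it.

```python
import os
import signal
import time


def kill_child(sig: int = signal.SIGKILL) -> int:
    """Fork a sleeping child, send it `sig`, and return the signal that ended it."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(60)  # child idles until it is signalled
        finally:
            os._exit(0)
    os.kill(pid, sig)  # equivalent to: kill -s KILL <pid>
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0


print(signal.Signals(15).name)  # symbolic name for signal number 15
print(kill_child(signal.SIGKILL))
```

SIGKILL is used in the example because it cannot be caught or ignored, so the result does not depend on any handlers the child inherited.
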

System V stuff
- semget - semaphore
  - SystemV semaphores can be manipulated as sets, and provides more control than POSIX ones
- msgget - message queue get
- shmget - shared memory
  - most efficient when one process writes the data and another reads it
- RPC - literally a remote procedure call, sent over to a different process, usually on another machine
- mmap - a way to map a file into memory, which can then be shared with another process; BSD trick
  - you map a chunk of memory onto a file, and changes propagate to other processes that map the same file; you can also have a mapping that’s not tied to any particular file
    - anon memory, not tied to a particular file, so it gets paged out to swap space
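
A sketch of the anonymous shared mapping idea, assuming a Unix system where Python's `mmap.mmap(-1, size)` creates an anonymous mapping with MAP_SHARED as its default flag. The function name is illustrative. The child writes an integer into the mapping and the parent reads it back after the child exits.

```python
import mmap
import os
import struct


def shared_memory_demo(value: int = 42) -> int:
    """Let a forked child write `value` into memory that the parent can see."""
    buf = mmap.mmap(-1, mmap.PAGESIZE)  # anonymous mapping, shared across fork
    struct.pack_into("i", buf, 0, 0)
    pid = os.fork()
    if pid == 0:
        try:
            struct.pack_into("i", buf, 0, value)  # child writes into the shared page
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    (seen,) = struct.unpack_from("i", buf, 0)
    buf.close()
    return seen


print(shared_memory_demo())
```

With a private (copy-on-write) mapping the parent would still read 0, because the child's write would land in its own copy of the page.
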

Execution, Scheduling, Processes, and Threads

https://www.youtube.com/watch?v=hgb-1f6BlGI&list=PLSIUOFhnxEiC3YTdxwqZqgEY5imVL8U8J&index=3
fork()
- fork creates a new process that’s an exact copy of the original
- copies memory, open files, basically everything
- not inherited: file locks and pending signals
- how does it work? why duplicate?
- in the parent, fork returns >0: the child’s pid
- 0 is returned in the child
- <0 is an error
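
The return-value convention can be seen in a few lines. A minimal sketch; note that Python's os.fork() raises OSError on failure rather than returning a negative number as the C call does. The function name is illustrative.

```python
import os


def fork_and_wait(exit_code: int) -> int:
    """Fork a child that exits with `exit_code`; return what the parent observes."""
    pid = os.fork()
    if pid == 0:
        # Child branch: fork returned 0.
        os._exit(exit_code)
    # Parent branch: fork returned the child's pid (> 0).
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


print(fork_and_wait(7))
```

The waitpid() call is also what prevents the exited child from lingering as a zombie.
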

file descriptor

integer that represents a kernel data structure that has an inode, read position, etc
pipes
it’s possible to redirect the inputs and outputs of the program (another great idea of linux)
now the child can do something different, including exec() a new program
- execve() runs a new program, usually fork and exec is the pattern (although fork is bad now)
  - apache web server for example forks
- bash execs two programs and
- copy on write fork()
  - since you’re just gonna exec, why bother copying the whole thing?
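
A sketch of the fork-then-exec pattern with output redirection, roughly what a shell does for a command whose output it captures. It assumes `sh` is on the PATH; the helper name `run_and_capture` is hypothetical.

```python
import os


def run_and_capture(argv: list) -> bytes:
    """Fork, point the child's stdout at a pipe, exec argv, and return its output."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)  # fd 1 (stdout) now refers to the pipe
            os.close(write_fd)
            os.execvp(argv[0], argv)  # replaces the child; returns only on error
        finally:
            os._exit(127)
    os.close(write_fd)
    out = b""
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        out += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)
    return out


print(run_and_capture(["sh", "-c", "echo hello from exec"]))
```

The redirection survives the exec because file descriptors belong to the process, not to the program image that exec replaces.
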

`daemons`?

fork a child process, then the parent exits and leaves the child with init as its parent
How do you become a daemon?
- close all open files
- become the process group leader
  - process group: by default, the processes started from a login shell are in the shell’s group
  - signals get passed to everyone in the process group, such as shells hanging up
- set the umask
  - what are the permissions I want to be sure that are turned off by default?
- change the dir to a safe space
  - change your working directory to somewhere safe, the daemon should be in a place that’s somewhere safe
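
A hedged sketch of the daemon setup steps listed above. To keep it observable, the parent here waits for the child and reads a report over a pipe, which a real daemon's parent would not do; the function name and the umask value 0o027 are illustrative choices.

```python
import os


def daemon_like_child() -> dict:
    """Fork a child that performs the daemon setup steps and reports its state."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.setsid()      # new session: child becomes session and group leader
            os.umask(0o027)  # permission bits cleared on files the daemon creates
            os.chdir("/")    # a directory that will not be unmounted underneath us
            # A real daemon would also close or redirect fds 0, 1 and 2 here.
            report = f"{os.getpid()} {os.getsid(0)} {os.getpgrp()} {os.getcwd()}"
            os.write(write_fd, report.encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    pid_s, sid_s, pgid_s, cwd = os.read(read_fd, 4096).decode().split(" ", 3)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return {"pid": int(pid_s), "sid": int(sid_s), "pgid": int(pgid_s), "cwd": cwd}


print(daemon_like_child())
```

After setsid() the child's pid, session id, and process group id are all equal, so signals sent to the login shell's process group no longer reach it.
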

scheduling

Linux is multiprocessing, but processes have to share the available CPUs
- At any time, each of the CPUs in a system can be
  1. Not associated with any process, serving a hardware interrupt (Hard IRQs)
    - Timer ticks, network cards, keyboards
    - Handlers here have to be fast, most of the time it is an ack
  2. Not associated with any process, serving a softirq or tasklet (Software interrupt context)
    - Much of the real interrupt handling work is done here
    - Tasklets are dynamically registrable softirqs, which means you can have as many as you want, and any given tasklet will only run on one CPU at any time
  3. Running in kernel space, associated with a process (user context)
  4. Running a process in user space
- The bottom two can preempt each other, but above that is a strict hierarchy: process-context kernel code cannot preempt a softirq, and a softirq cannot preempt a hardware interrupt
  - https://www.kernel.org/doc/html/v5.0/kernel-hacking/hacking.html#the-players
scheduling is how the kernel manages these running processes and lets them “take turns”
- “context switching”: save all the register values of the outgoing process, load the register values of the incoming one
  - instruction pointer
  - where is the stack? this kind of stuff
  - who gets to run next?
    - who’s ready to run?
    - every process has a task state associated with it
    - when simply running, CPU-bound, the kernel will preempt a process
    - everything is in a run queue
    - process is in the TASK_RUNNING state, has a certain time quantum to run, by default 100ms
      - the kernel checks the process’s time quantum at every tick; if the process hasn’t exited or slept by the time it runs out, it is preempted, which guarantees that a process doesn’t run forever
      - realistically it’s 1 ms now
    - preemptive multitasking
      - all TASK_RUNNING processes are part of the runqueue
        
        the average length of the runqueue over time is called the “load average” (on Linux it also counts tasks in uninterruptible sleep)
        
        the uptime command shows the load average
        
        last minute, 5 minutes, 15 minutes averages
        
        00:43:10 up 122 days, 35 min, 3 users, load average: 0.10, 0.10, 0.09
        
        vmstat also does this with r
        procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
         r  b   swpd     free    buff    cache   si   so    bi    bo   in   cs us sy id wa st
         0  0      0 65022788 1420664 60968808    0    0     0     9    0    0  3  0 97  0  0
      - high load average means you’re trying to do too much on the system probably
- states
  - TASK_RUNNING
  - TASK_INTERRUPTIBLE
  - TASK_UNINTERRUPTIBLE
  - TASK_STOPPED (control z)
  - TASK_TRACED (strace)
  - EXIT_ZOMBIE (the process has ended, but remains in the tree)
    - zombies are only leftover datastructures in the kernel
    - only reason to keep it around is so that the parent can retrieve its return value
    - uses no CPU or memory
    - usually a sign that someone (the parent) hasn’t cleaned up
    - orphaned zombies become children of init, and init will clean them up
  - EXIT_DEAD
    - process has exited, but pages still need to be cleaned up
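
The three load average numbers from the uptime line quoted earlier can be pulled out programmatically. A small sketch; `parse_uptime_load` is a hypothetical helper and the regular expression assumes the English "load average:" wording shown in these notes. os.getloadavg() asks the system for the same three values directly.

```python
import os
import re


def parse_uptime_load(line: str) -> tuple:
    """Extract the 1, 5 and 15 minute load averages from an uptime line."""
    match = re.search(r"load average: ([\d.]+), ([\d.]+), ([\d.]+)", line)
    if match is None:
        raise ValueError("no load average found")
    return tuple(float(x) for x in match.groups())


sample = "00:43:10 up 122 days, 35 min, 3 users, load average: 0.10, 0.10, 0.09"
print(parse_uptime_load(sample))

try:
    print(os.getloadavg())  # live values from the system
except OSError:
    print("load average unavailable on this system")
```

A rough rule of thumb is to compare the numbers against the CPU count: a load persistently above the number of CPUs means runnable tasks are waiting for a turn.
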

Memory Management

Virtual Memory

We’re not compressing images anymore
But still multiprocessing, so we need to maintain individual process space
Memory mapping is a little different for the 64 bit versions
Memory mapping gets changed all the time

Paging

Page sizes are traditionally 4 KB, but really if you’re a big boy we’re all huge pages now
One of the bigger issues with paging is that storing the page table and all its constituent information can get quite large. So we avoid mapping all the pages
- When a process generates a virtual address, the OS and the hardware have to make it a meaningful real address
- note that page tables are stored in memory, and hitting RAM for every translation would be prohibitively slow, so the CPU caches translations in a TLB (translation lookaside buffer)
There’s a page table, and we represent the address space of the program and how it’s mapped into the actual ram
page table entries translate logical addresses into physical ones; how they are ordered and used varies widely
you don’t have to map all the pages, but if they’re not in ram, where are they?
kernel allocates the pages and maps them when a program is loaded at exec time
shared libraries (shared among various processes, chances are .so pages are already in RAM, so they map to another process address space)
- although now static binaries might be good enough since we have tons of ram now
context switching doesn’t involve copying memory
- means switching to different address spaces and different page tables
- address spaces can sit all in memory with enough ram, just change the register values, no freeing memory
- pages are copied out to swap (AND HIT DISK)
- reason why it’s called “swap”, to swap processes between context switches
- now you just move the pages out as you need to
typically a disk partition or a file, and we can have multiple swap spaces
- mkswap(8) and swapon(8)
- swapon -s in terms of unix blocks
  - unix blocks are 512 bytes, each block is half a K, although other block sizes are used now
Possible to have memory on the swap space and in memory!
“dirty page” - means the copy in RAM has been modified and is no longer the same as the one in swap
- dirty pages need to be written out again
A “page fault” happens when you hit something that is not in RAM, which triggers a copy back from swap (or from the backing file)
happens on demand, which means paging in and out can require a very long wait
page ins are synchronous! the worst case
Linux KPTI:
- separate page tables, isolating user space from kernel space (against Meltdown), 5-30% performance degradation, causes partial TLB flushes
- pmap gets you memory mappings
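
A toy model of the translation path described in this section: split the virtual address into a page number and an offset, try the TLB first, fall back to the page table, and raise a page fault when the page is not mapped. Everything here (the class `ToyMMU`, the single-level dict used as a page table) is a simplification for illustration, not how the hardware structures are laid out.

```python
PAGE_SIZE = 4096  # 4 KB pages


class PageFault(Exception):
    """Raised when a virtual page has no mapping in the page table."""


class ToyMMU:
    def __init__(self, page_table: dict):
        self.page_table = page_table  # virtual page number -> physical frame number
        self.tlb = {}                 # small cache of recent translations
        self.tlb_hits = 0

    def translate(self, vaddr: int) -> int:
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.tlb_hits += 1
            frame = self.tlb[vpn]
        elif vpn in self.page_table:
            frame = self.page_table[vpn]  # the slow path: walk the page table
            self.tlb[vpn] = frame
        else:
            raise PageFault(hex(vaddr))   # the kernel would now page it in
        return frame * PAGE_SIZE + offset


mmu = ToyMMU({1: 7})
print(hex(mmu.translate(0x1234)))  # virtual page 1, offset 0x234, frame 7
```

In a real system the page fault handler would fetch the page from swap or the backing file, update the page table, and restart the faulting instruction.
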

ASLR
- Where is it controlled? /proc/sys/kernel/randomize_va_space

48 bit addressing
- Linux uses the most significant bit for distinguishing between kernel space and user space (https://read.seas.harvard.edu/cs161/2018/doc/memory-layout/)
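
A sketch of what the user/kernel split looks like numerically, assuming x86-64 with 4-level paging, where virtual addresses are 48 bits sign-extended to 64. The lower half belongs to user space and the upper half to the kernel; anything in between is non-canonical. The constant and function names are illustrative.

```python
USER_MAX = 0x0000_7FFF_FFFF_FFFF    # top of the lower (user) half
KERNEL_MIN = 0xFFFF_8000_0000_0000  # bottom of the upper (kernel) half
ADDR_MAX = 0xFFFF_FFFF_FFFF_FFFF


def classify(addr: int) -> str:
    """Classify a 64-bit virtual address under 48-bit canonical addressing."""
    if 0 <= addr <= USER_MAX:
        return "user"
    if KERNEL_MIN <= addr <= ADDR_MAX:
        return "kernel"
    return "non-canonical"


for addr in (0x7FFD_1C3E_0000, 0xFFFF_FFFF_8100_0000, 0x0000_8000_0000_0000):
    print(hex(addr), classify(addr))
```

The addresses in the loop are arbitrary examples chosen to land in each of the three ranges.
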

File Mapping and Demand Paging

old way: exec the process, create the VM space, copy the text, init data, then swap and copies as needed
at that point the file from /bin is also copied into swap
why not let the program on disk be its own swap space!
file mapping - original copy of the program is a disk copy of what’s in memory, and then we reload them back from the file
- file mapping is swap
then we can just map the file into memory, you don’t even need it in memory!
- big win: only bring in pages as needed
- mapping is instant, no waiting for the program to load
- overall a win, demand-paging is good for most usecases (ls for example, you don’t load all the code for all the options)
- same thing happens for shared libraries -> libc is used
- anon pages -> not backed by real files, so swap space is only needed for dynamically allocated memory
- shared libraries ld.so
- copy on write
  - when you fork a process, copy on write is a flag set on each page
- demand paging, waiting for page faults
  - normally: yes, but the alternative is on startup loading
  - turn it off by calling mlock or mlockall, which ensures pages aren’t paged out
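
A minimal sketch of file mapping from user space: map a file and read a slice out of the middle. Only the pages that are actually touched need to be brought in. The helper name `read_via_mmap` and the temporary file are illustrative.

```python
import mmap
import tempfile


def read_via_mmap(payload: bytes, offset: int, length: int) -> bytes:
    """Write payload to a temp file, map it, and read a slice through the mapping."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(payload)
        tmp.flush()
        # Length 0 maps the whole file; nothing is read from disk until touched.
        with mmap.mmap(tmp.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


payload = b"x" * 8192 + b"needle"
print(read_via_mmap(payload, 8192, 6))
```

Slicing the mapping is an ordinary memory access; if the page is not resident, the kernel services the page fault from the file, which is the demand paging described above.
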

Monitoring VM

vmstat - first line is always an average since boot (si so = swap ins and swap outs)
- block IO is usually paging
swpd - memory in swap space
free - RAM that’s free

File I/O

Solaris also does this
- Solaris has a page scanner that frees up pages, and its vmstat also shows the scan rate
- if the page scanner is actually running and desperately scanning, then we know there’s a memory problem
All file I/O in linux is done via paging
Paging isn’t necessarily bad
/proc/pid/maps
/proc/meminfo
/proc/pid/smaps
- Shows you the size of the memory segment
- RSS -> how much of this segment is physically present in memory
- how much of it is shared
- how much of it isn’t shared
- how much of it is dirty

RAM and Swap

How do you manage memory and free it up?
When you’re running low on memory
- in the old days you used a clock-hand algorithm: one hand marks pages, and a second hand following behind frees the ones that are still unreferenced
- now there’s fancier algorithms, but we still want to get the least recently used ones
kswapd -> kernel thread that frees pages
when things are dangerously low, the OOM killer comes in
- Uses an algorithm that tries to pick processes that can be killed
if you’re too low on RAM and keep paging in and out, we’re “thrashing”
- happens on windows more
in the old days
- swap >= ram
since now you only need it for anon memory
generally VM = RAM + swap
You want activity in RAM; with enough RAM, you should have zero swap activity
Kernel still keeps pages in memory if you have extra space
- I think this is the “cache”
- Everything that gets paged into memory and kept without looking at disk io

Startup and Init

Boot

x86 -> 80x86 architecture
raise the value of the RESET pin (on the processor)
- this goes through and initializes two important registers
- CS and EIP register
  - CS - code segment: the segment instructions are fetched from, plus some other flags
  - EIP - instruction pointer

BIOS

This is coreboot/oreboot/libreboot
Interrupt driven process that provides access to the hardware (DOS descended)
Not a major part of the linux operating system, only used to get the kernel loaded
Linux has its own device drivers and interrupt handlers
POST (power on self test)
ACPI (advanced config and power interfaces) - builds tables of devices with info
initializes hardware - makes sure there are no IRQ or I/O port conflicts. Tabulates PCI (Peripheral Component Interconnect) devices
Looks for something to boot
Devices
- Have an I/O port that the CPU can do IO to via port addresses
- When the device is ready to be talked to, it raises an interrupt and the linux kernel starts the driver
- ISA days
  - part of the trick was picking the devices so that they didn’t conflict with the interrupts
  - PCI helps with this by automating it so that they don’t conflict with each other
- You can now choose what to boot from, and in what order
BIOS looks for the boot sector
- copy the first sector into RAM (512 bytes per sector), starting at 0x0000 7c00
- then jumps to that address

Loaders LILO and GRUB

First sector of a disk is the Master Boot Record (MBR)
Contains a partition table and a program
For Linux, this program is a boot loader. Earlier Linuxes had their own 512-byte bootloader
LILO (linux loader) and GRUB (Grand Unified Bootloader)
GRUB is much more general purpose, has its own shell and can read file systems
You can’t fit everything into 512 byte boot sector, so multistage
First part is 0x0000 7c00, then it jumps to 0x0009 6a00, which then sets up the Real Mode stack
Second part into 0x0009 6c00, which reads the OS list and lets the user choose the OS to boot
If it’s linux, it displays loading, then loads the 512 bytes at the kernel 0x0009 0000, which then loads setup() to 0x0009 0200, and then jumps to run setup()

Real and Protected Modes
- Addressing modes in 80x86.
- Real mode: segment + offset, where the segment value is shifted left by four bits (multiplied by 16), giving a 20-bit address
- Protected mode: uses page tables
  - Now has a User/Supervisor flag
  - When set to 0, page can only be accessed in kernel mode
  - CPL (current privilege level) is the low 2 bits of the cs register
  - CPL = 3 for User mode, CPL = 0 for Kernel mode (there are 4 rings on Intel, but only two are used)
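
The real-mode calculation is simple enough to check by hand. A sketch assuming 8086-style behaviour, where the result wraps at 20 bits; the function name is illustrative.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address in real mode: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# Two segment:offset pairs that name the same boot sector load address
print(hex(real_mode_address(0x07C0, 0x0000)))
print(hex(real_mode_address(0x0000, 0x7C00)))
# The setup() location mentioned in the loader notes
print(hex(real_mode_address(0x9000, 0x0200)))
```

Because the segment is only shifted by four bits, many different segment:offset pairs alias the same physical byte.
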

Setup

For ACPI systems, build a physical memory layout table in RAM
Sets up the keyboard and video card
Initializes the disk controllers
If EDD (enhanced disk drive) services are supported, uses the BIOS to build a table of hard disks in RAM
Sets up the interrupt descriptor table (IDT)
- Different devices can raise interrupts and you want something to happen
- The table is a set of addresses pointing to the code that runs when an interrupt is raised
Resets the FPU
Switches the CPU to protected mode
Jumps to startup_32() (or startup_64 now)
- Initializes segmentation registers and a provisional stack
- Zeros out the kernel’s uninitialized data
Runs decompress_kernel()
- Kernel is decompressed and loaded into memory, and then jumps to 0x0010 0000
second startup_32()
- initializes segmentation registers and a provisional stack
- initializes kernel page tables
- stores Page Global Dir address in cr3 register and starts paging, setting the bit PG in the cr0 register
- sets up the Kernel Mode stack for process 0
  - what if there’s no task waiting for the kernel to schedule? the kernel schedules PID 0, which halts the CPU
  - every millisecond, an interrupt happens, which then rechecks the run queues
Nulls all the interrupt handlers
Copies parameters from the BIOS and the boot command line into the first page frame (in RAM)
Identifies the processor model
Initializes the interrupt descriptor table to null handlers
Jumps to start_kernel()

Kernel Startup

Nulls all the interrupt handlers
Calls sched_init() to initialize the schedulers
Initializes memory allocation
Sets up interrupt handlers with trap_init() and init_IRQ()
System date and time are set with time_init
CPU clock speed determined by calibrate_delay
Then we run kernel_thread()
Finally we are at /sbin/init!

Init and Run Levels

Uses run levels (0-6 typically)
Distinguishes between single user mode, GUIs, etc
Default values
- 0 - halt
- 1 - single user mode
  - Booting into root halfway
- 6 - reboot
- S is used when entering runlevel 1
- Change with telinit(8)
  - Unix applications know what name they’re called by, so egrep -> grep -E
You can specify the run level at the command line (GRUB or LILO)
File names
- In the unix world, .tab is usually table, and rc is “run command”
- /etc/init.d/rc
inittab entries are what starts

RC Scripts

Run command
Takes a run level as an arg
Then looks for a directory in /etc/rc<runlevel>.d
Runs each script with an action arg
Run level 0, 6 action is stop
Others are start
To run a startup script, find the right run level, and look at the different scripts, and then set it there
But these are init rc and not systemd
Stopping processes - just run pkill
init came out of SystemV, which was the original version
- originally commercialized by AT&T
- Berkeley did it with different startup scripts: rc.boot, rc, rc.local
upstart in ubuntu

Block Devices and Filesystems

Devices

Devices are special files
/dev is devices
When we need to do IO on a device, just go out and access it!
Block special files and character special files
- ls -l -> c for character file, b for block file
- major and minor numbers: the major represents the type of the device, and the minor represents the instance
  - 4k major numbers, 2^20 minor numbers
- mknod is used to make a device special file
- Nothing special about these files! You can make them anywhere
- udev is a program that sits there and creates files in the /dev directory
  - devices can register themselves or get a number assigned to them
- much effort has gone into making devices work together
- traditional device operations
  - open()
  - close()
  - read()
  - write()
  - seek()
  - ioctl
    - lets you do particular things to particular device
    - usually takes some integers (like everything in unix world)
- block devices
  - tend to do I/O in terms of blocks
  - like disk drives
  - usually more random access
  - disks for example
- character devices
  - operate in bytes
  - usually more sequential
  - mouse/audio card/printer/tape/terminal for example

Block I/O

How does block IO work?
We want to write some bytes to a disk; eventually this must be turned into blocks
Top level: Virtual file system
- Came along with SystemV (release 4)
- Pre-VFS, you had to have a separate read/write for every single file system type
- VFS unified all of this, generic read/write/open/close
- Cool thing that VFS brought: /proc file system! /cgroup file system! We can have file systems that don’t even represent a storage device
Page cache
- There used to be a buffer cache for blocks
- and page caches for pages
- now we only have one page cache
- when you make a write call to the disk, the kernel is actually taking the data to write it at a later time
- rapid return of writes is a performance gain but a reliability loss
- if you’ve written a lot of data, you might need to get it again! hence why it’s held in the page cache
  - how pages stay around? We also page cache our writes
  - NOT the pages that represent the memory map
  - two different areas of memory
  - page cache for regular files, page cache for block IO
  - swapped data go into page cache, even special files
  - shared memory trick? Memory map fake files in the page cache, and forks can read it back out
- Each page is owned by an inode
  - also has an offset for that file
- We want to periodically scan the page cache for dirty pages and flush to disk
  - bdflush in the past
- We also want to track how long pages have been dirty and flush before too long in the cache
  - kupdate in the past
- Now kernel runs varying number of pdflush threads
Mapping layer
- Most of the action
- FS takes the pointer to the data, allocates appropriate pages, and then turns it into blocks to be written
- FS knows how to talk SCSI or SATA or NVMe
- A disk is represented as a sequential list of blocks, which then we map over
- We also have a “segment”: we should try to find blocks that are adjacent, so they can be sent together
- Pages are 4kb, sectors are 512 B, blocks can range between these two sizes (although pages are huge now)
- once blocks are known, send the request to the generic block layer via submit_bh() and ll_rw_block() for multiple blocks
- arguments are direction (r/w) and buffer heads
Generic block layer
- Gets sent data from the mapping layer


Generic block layer combines contiguous blocks into segments, described by a bio (block IO unit)
calls generic_make_request() (submit_bio_noacct() in newer kernels)
Then it sends it to the device driver using the strategy() routine

file systems
- Stores data on disk
- Data and metadata
- Metadata is permissions, ownership, timestamp, size
- Efficient data access is important
- Usually metadata access is the most important
- Metadata is cached on the system
- Journaling helps (logging)
  - Like the traditional DB two-phase commit
  - First write log, then append to last
  - Then you update the actual disk with the log
- If a disk is in an inconsistent state, run fsck -> but this is SLOW
- Most journaling filesystems use it only for metadata, but some use it for all data. This is usually configurable, as in ext3
  - ext3 without journaling is just ext2
  - You can put journals on a separate disk (since journal IO isn’t competing with disk as much)

Files

inodes
- Every file has an inode (contains all the information returned by ls -l, except the name)
- device
- mode
- links
- uid and gid
- size in bytes
- size in blocks
  - Block addresses are stored too: 12 direct blocks, one indirect block address, one double-indirect block address, and one triple-indirect block address
  - 15 total slots
  - the reason there’s indirection is that only the first 12 KB of a file is reachable through the direct blocks (with 1 KB blocks)
    - MB files therefore have to use extra blocks
    - next address points to a block that is filled with addresses
    - second indirect block points to a block filled with addresses that each point to another block of addresses!
    - third, so on
    - direct: 12 KB
    - indirect: 256 blocks = 256 KB
    - double: 256^2 blocks = 64 MB
    - triple: 256^3 blocks = 16 GB
    - this goes back at least to version 7
- atime, mtime, ctime
  - atime -> access time, last time file was read
  - mtime -> last time file was modified
  - ctime -> last time inode was changed (ctime is NOT creation time, most unix systems don’t track creation time)
- Multiple directory entries that point to the same inode are hard links
- One inode per file
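
The direct/indirect capacities can be worked out from two parameters. A sketch assuming 1 KB blocks and 4-byte block addresses, which gives the 256 addresses per block used in the figures above; the function name and defaults are illustrative.

```python
def inode_capacity(block_size: int = 1024, addr_size: int = 4, direct: int = 12) -> dict:
    """Bytes addressable through each level of a classic Unix inode."""
    per_block = block_size // addr_size  # block addresses that fit in one block
    blocks = {
        "direct": direct,
        "indirect": per_block,
        "double": per_block ** 2,
        "triple": per_block ** 3,
    }
    return {level: count * block_size for level, count in blocks.items()}


sizes = inode_capacity()
for level, size in sizes.items():
    print(f"{level}: {size} bytes")
```

Larger block sizes help twice over: each block holds more data and each indirect block holds more addresses, so the triple-indirect range grows with the fourth power of the block size.
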
Unix timestamps
- Since Jan 1st 1970 in UTC
- 32 bits
- Overflows on Jan 19th, 2038 (signed 32 bits)
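
The 2038 date follows directly from the largest value a signed 32-bit counter of seconds can hold. A small sketch; the function name is illustrative.

```python
from datetime import datetime, timezone


def last_32bit_time() -> datetime:
    """The last instant representable by a signed 32-bit Unix timestamp."""
    return datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)


print(last_32bit_time().isoformat())
```

One second later a signed 32-bit counter wraps to its most negative value, which decodes as a date in December 1901.
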
Sparse files
- What if you write some bytes into the beginning, but then jump wayyy down and write later, where there’s no data in between.
- Logically, the bytes in between are filled with zeros
- Don’t allocate for stuff in between
- If you copy a sparse file, a dumb copy will write out all the zeros and fill in the holes!
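
A sketch of creating a sparse file by seeking past the end before writing. It reports the logical size, the space actually allocated (st_blocks counts 512-byte units), and a few bytes read from inside the hole. Whether the hole really stays unallocated depends on the filesystem, so the allocated figure is printed rather than assumed; the function name is illustrative.

```python
import os
import tempfile


def make_sparse(hole: int = 1 << 20) -> tuple:
    """Create a file with a hole; return (logical size, allocated bytes, hole sample)."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"start")
        tmp.seek(hole)  # jump far past the data; nothing is written in between
        tmp.write(b"end")
        tmp.flush()
        info = os.fstat(tmp.fileno())
        tmp.seek(5)
        sample = tmp.read(16)  # bytes inside the hole read back as zeros
        return info.st_size, info.st_blocks * 512, sample


size, allocated, sample = make_sparse()
print("logical size:", size)
print("allocated:", allocated)
print("hole reads as:", sample)
```

Tools that are hole-aware (for example `cp --sparse=always` in GNU coreutils) recreate the holes instead of writing the zeros out.
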
Superblock
- Information about the filesystem
- Much of what is in the df command

Networking, Configuring & Building a Kernel

OSI model -> physical, data link, network, transport, session, presentation, application - networks
What’s a socket?
- IP address and port on each end, which uniquely identifies a connection
Network layer implements IP to route packets, but this is machine to machine
- Intermediate devices bridge this gap (usually a router)
Ethernet/data link layer is what’s next, also packet switched with ethernet address 00:11:22:33:44:55 (mac address), NIC to NIC
Networking - physical layer, signal level
- CSMA/CD (carrier sense multiple access with collision detect)
  - old: 4 way stop sign, everyone tries to transmit and hope that there’s nothing there
  - old: token rings, everyone gets a turn like a traffic light
- carrier modulates an IO signal
- twisted pair cables
- a network device is a special animal
- not a character or a block device
- no major/minor numbers
- hardware interface initiates IO
- not in /dev
- when a device driver gets registered, it adds the device to the internal list
- interface is also different!
  - open
  - stop
  - hard_start_xmit() - start transmit packets
  - rebuild_header - using ARP (address resolution protocol)
  - tx_timeout
  - net_device_status - when netstat gets called
  - set_config
- also ioctl
  - SIOCSIFADDR
  - SIOCSIFFLAGS
  - “socket IO control set InterFace Flags”
Packet transmission
- Data is placed on a socket buffer
- call hard_start_xmit
- OS checks to make sure you’re doing less than the Maximum Transmission Unit (MTU)
- Places in transmission queue
- Driver processes queue and sends to hardware
Packet Reception
- Two ways
  - Interrupt driven (most drivers use this one)
  - polled
- Receives incoming data, places it in socket buffer
- If IO rate is too high, use polling
Address Resolution Protocol
- Associates IP with a MAC address
- Broadcasts an ARP request on the network: “who has 10.1.2.3”?
- Saves entries into the ARP cache
- Manipulate with arp
Configure Interface
- always boils down to ifconfig(8)
Linux Packet Filter
- Basically the “firewall”, now managed by iptables or ufw
  - ipchains and ipfwadm
- Sets a list of rules, accept a match and execute an action
- Three basic chains: input, forward, and output
  - Each packet only follows one chain!
  - there’s a default rule for every chain
- Connection tracking?
  - ip_conntrack.o module
- List all the rules with iptables -L -v
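
A toy model of how a packet walks one chain: rules are checked in order, the first match decides the action, and the chain's default policy applies when nothing matches. The `Rule` class, the packet dictionaries, and the example rules are all hypothetical; real netfilter matching has far more fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    proto: Optional[str]  # None matches any protocol
    dport: Optional[int]  # None matches any destination port
    action: str           # for example "ACCEPT" or "DROP"

    def matches(self, packet: dict) -> bool:
        return (self.proto in (None, packet["proto"])
                and self.dport in (None, packet["dport"]))


def run_chain(rules: list, packet: dict, default_policy: str = "DROP") -> str:
    """Return the action of the first matching rule, else the default policy."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default_policy


input_chain = [
    Rule("tcp", 22, "ACCEPT"),   # allow ssh
    Rule("tcp", None, "DROP"),   # drop all other tcp
    Rule("udp", 53, "ACCEPT"),   # allow dns
]
print(run_chain(input_chain, {"proto": "tcp", "dport": 22}))
print(run_chain(input_chain, {"proto": "tcp", "dport": 80}))
print(run_chain(input_chain, {"proto": "icmp", "dport": None}, default_policy="ACCEPT"))
```

Rule order matters here: swapping the first two rules would drop ssh traffic before the accept rule is ever consulted.
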
Router table
- Kernel has a routing function?
- Usually one of the multiple interfaces, so router table
- The most specific match wins (longest prefix), falling back to the default route
- netstat -nr also shows you the router table (-n for hosts as IPs instead of having it do DNS)
- route command for checking the router table
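
A sketch of a routing lookup that picks the most specific matching entry (longest prefix) and falls back to the default route. It uses Python's `ipaddress` module; the table contents and gateway addresses are made up for the example.

```python
import ipaddress


def pick_route(table: list, destination: str):
    """Return the (network, gateway) entry with the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    candidates = [(net, gw) for net, gw in table if dest in net]
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry[0].prefixlen)


# Hypothetical routing table: (destination network, gateway)
table = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.1.2.1"),     # default route
    (ipaddress.ip_network("10.1.2.0/24"), "on-link"),    # directly attached subnet
    (ipaddress.ip_network("10.0.0.0/8"), "10.1.2.254"),  # internal router
]
for dst in ("10.1.2.3", "10.9.9.9", "8.8.8.8"):
    print(dst, "->", pick_route(table, dst)[1])
```

The default route matches everything with a prefix length of zero, so it only wins when no more specific entry applies.
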
Kernel Build
- go to kernel.org
- need bunzip2 to decompress
- can build as anyone, only need root to install
- why?
  - add a new feature/driver, or your driver was left out
  - or to leave out specific drivers

Crashes

Breakpad and Crashpad from google allow for dump tracing
https://www.instapaper.com/read/1524786035
Linux reports crashes through signals such as SIGABRT/SIGBUS/etc
By default, signal handlers are called on the same stack where the crash occurred, which is fine for most cases except a stack overflow
Apple computers use Mach exceptions: https://flylib.com/books/en/3.126.1.109/1/
https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/KernelProgramming/Mach/Mach.html
https://docs.darlinghq.org/internals/macos-specifics/mach-ports.html
Core dumps can be enabled with ulimit -c unlimited on a temporary basis. Otherwise configure /etc/security/limits.conf
gcore -o filename PID can also be used to generate a core dump

Exceptions

Exceptions can happen because of invalid access or when a runtime check fails. This throws a SIGABRT

Tracing

Debian

`perf` performance issues

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911815
- when not linked against libbfd, perf calls out to addr2line for every address lookup, which is infeasible
- possible solution: https://michcioperz.com/post/slow-perf-script/
- patches: https://lore.kernel.org/linux-perf-users/[email protected]/

Linux Audio

Pipewire: http://pipewire.org
- provides low latency, graph based processing for audio

LD_PRELOAD and LD_LIBRARY_PATH

Different usages on os x, uses DYLD_LIBRARY_PATH and DYLD_FALLBACK_LIBRARY_PATH
- See https://stackoverflow.com/questions/3146274/is-it-ok-to-use-dyld-library-path-on-mac-os-x-and-whats-the-dynamic-library-s
- https://github.com/Jackett/Jackett/issues/5589

Linux Kernel Headers

https://lwn.net/ml/linux-kernel/[email protected]/

Cellular Modems

io_uring

https://news.ycombinator.com/item?id=36353845

Resources

Huge Pages

- https://www.evanjones.ca/hugepages-are-a-good-idea.html

Monitoring Tools

Guide: https://www.brendangregg.com/blog/2015-07-08/choosing-a-linux-tracer.html

Perf

https://perf.wiki.kernel.org/index.php/Tutorial#Live_analysis_with_perf_top

Netlink

https://docs.kernel.org/userspace-api/netlink/intro.html
Evolved protocol that talks over sockets instead of fixed format C structures

Mounting

https://dbohdan.com/clean-mount-lists

cgroups and namespaces

Foundations of containers for kubernetes (k8s)

Memory

Objects

objdump to look at elf files

`readelf`

Allows you to read a binary and check which parts mapped to where
readelf -l userProgram

Malloc Behaviors

Malloc can use the data segment (the heap)
Or use an anonymous mmap()
Cares about the M_MMAP_THRESHOLD parameter
Actually very similar to the erlang binary heap vs heap memory
Heaps are not really a single region of memory

Syscalls (vsyscalls)

vsyscalls maps a fixed memory region into user space
- some non-dangerous functions, such as gettimeofday(), are mapped into the user space
- turns out bad with ASLR, and then fixed sizes, so deprecated in favor of vDSO
vDSO
- kernel maps a small shared library into user space
- supports ASLR
- more syscalls space
vvar
- read only variables for VDSO, specific heap for it

Stack

how big is the stack? ulimit settings
on x86, the stack always grows downwards
- since the stack is always a hardware and software feature
- the CPU is aware of the stack
- some arches have the stack grow upwards
envp strings sit behind the argv strings
- int main(int argc, char *argv[], char *envp[]) {
- auxiliary vector sits behind the envp
