What can we do with all these (and tons of other) tools?
- Record their output
- Get the error status
- Knit them together to build “pipelines”
- Run them over batches of files
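A minimal sketch of the "error status" item: the shell keeps the exit status of the last command in `$?`, where 0 means success and anything else means failure. The file name and the patterns below are made up for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf 'chr1\t100\nchr2\t200\n' > "$workdir/regions.txt"

# grep exits with 0 when it finds a match ...
status_found=0
grep -q "chr1" "$workdir/regions.txt" || status_found=$?

# ... and with 1 when it does not.
status_missing=0
grep -q "chrX" "$workdir/regions.txt" || status_missing=$?

echo "found: $status_found, missing: $status_missing"

# The status can also drive a decision directly.
if grep -q "chr2" "$workdir/regions.txt"; then
    verdict="chr2 present"
else
    verdict="chr2 absent"
fi
echo "$verdict"

rm -r "$workdir"
```

Capturing the status with `|| status=$?` keeps the sketch working even in scripts that abort on the first failing command.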
POSIX* has 3 standard data streams:
STDIN
: Standard Input, can send data to a program
STDOUT
: Standard Output, output data from a program
STDERR
: Standard Error, errors/warnings from a program

*Portable Operating System Interface: API in Unix
Streams can be redirected using arrows
> filename
Redirects STDOUT (>> appends)
< filename
Redirects STDIN
2> filename
Redirects STDERR
2>&1
Sends STDERR to STDOUT, resulting in one output stream
> /dev/null
Sends output to the data Nirvana

$ ls -1 *.fastq.gz > fastq_list.txt
$ cat fastq_list.txt
sample_1.fastq.gz
sample_2.fastq.gz
[...]
$ program sample_1.fastq.gz > result.txt 2> error.txt
$ program sample_1.fastq.gz > result.txt 2>&1
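The redirections above can be tried without a real tool. In this sketch, `program` is a hypothetical shell function that writes one line to STDOUT and one to STDERR; the file names are also made up.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for "program": one line on STDOUT, one on STDERR.
program() {
    echo "result for $1"
    echo "warning for $1" >&2
}

# Separate files for the two streams.
program sample_1.fastq.gz > "$workdir/result.txt" 2> "$workdir/error.txt"

# One combined file. Order matters: redirect STDOUT first, then 2>&1.
program sample_1.fastq.gz > "$workdir/combined.txt" 2>&1

# >> appends instead of overwriting; STDERR is discarded here.
program sample_2.fastq.gz >> "$workdir/result.txt" 2> /dev/null

result_lines=$(wc -l < "$workdir/result.txt" | tr -d ' ')
combined_lines=$(wc -l < "$workdir/combined.txt" | tr -d ' ')
error_text=$(cat "$workdir/error.txt")
echo "result.txt: $result_lines lines, combined.txt: $combined_lines lines"

rm -r "$workdir"
```

Writing `2>&1` before `> file` would point STDERR at the terminal instead, because STDOUT has not been redirected yet at that point.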
This is the Unix philosophy:
$ cat file.txt | grep -v "^$" | sort -k2,2n
$ grep ">" seq.fasta | tr -d ">" > out.txt
$ cat seq.fasta | rev | tr "ACGT" "TGCA" > rev_comp.fasta
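A small sketch of the first two pipelines on made-up input, so the effect of each stage is visible. The table contents and FASTA records are invented for illustration.

```shell
workdir=$(mktemp -d)

# Made-up table: a name and a numeric score, with one empty line in between.
printf 'geneB 20\n\ngeneA 3\ngeneC 100\n' > "$workdir/file.txt"

# Drop empty lines, then sort numerically on the second column.
sorted=$(grep -v "^$" "$workdir/file.txt" | sort -k2,2n)
echo "$sorted"

# Made-up FASTA file with two records.
printf '>seq1\nAACG\n>seq2\nGATTA\n' > "$workdir/seq.fasta"

# Keep only the header lines and strip the leading ">".
names=$(grep ">" "$workdir/seq.fasta" | tr -d ">")
echo "$names"

rm -r "$workdir"
```

Each program does one small job; the pipe passes the STDOUT of one stage to the STDIN of the next without temporary files.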
Named pipes, also known as FIFOs:
mkfifo
creates the FIFO
Two separate processes can access the pipe by name: one process can open it as a reader, and the other as a writer.
$ mkfifo R1
$ mkfifo R2
$ yara_mapper -e 3 -t 4 -f bam y_idx reads_R1.fq reads_R2.fq | \
    samtools view -@ 4 -h -F 4 -b1 | \
    tee R1 R2 > /dev/null &
$ samtools view -@ 2 -h -f 0x40 -b1 R1 > mapped_1.bam &
$ samtools view -@ 2 -h -f 0x80 -b1 R2 > mapped_2.bam &
$ wait
$ rm -f R1 R2
tee reads standard input and writes it to both standard output and one or more files, effectively duplicating its input.
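The same pattern can be sketched with standard tools only. Here `seq` stands in for the mapper and two `grep` filters stand in for the `samtools view` calls; the FIFO names `odd` and `even` are made up.

```shell
workdir=$(mktemp -d)
mkfifo "$workdir/odd" "$workdir/even"

# Two readers in the background; each blocks until a writer opens its FIFO.
grep '[13579]$' "$workdir/odd"  > "$workdir/odd.txt"  &
grep '[02468]$' "$workdir/even" > "$workdir/even.txt" &

# One writer: tee copies the stream into both FIFOs; its STDOUT copy is discarded.
seq 1 10 | tee "$workdir/odd" "$workdir/even" > /dev/null

# Wait for both readers to finish before looking at the results.
wait

odd_count=$(wc -l < "$workdir/odd.txt" | tr -d ' ')
even_count=$(wc -l < "$workdir/even.txt" | tr -d ' ')
echo "odd: $odd_count, even: $even_count"

rm -r "$workdir"
```

The input is produced once and consumed twice in parallel, without writing an intermediate file to disk.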
ps
ps -e
$USER
in full format as tree: ps -efxu
ps -afxu
top, atop, htop, or glances
du, du -h -d <N>, du -sh
spacereporter (only on CePH)
<CTRL>+<c>
<CTRL>+<z>
bg [jobspec]
fg [jobspec]
jobs
samtools sort -o sorted.bam unsorted.bam > sort.log 2>&1 &
nohup samtools sort -o sorted.bam unsorted.bam > sort.log 2>&1 &
disown <PID> if you forgot nohup
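A minimal sketch of running a job in the background from a script. The subshell with `sleep 1` is a made-up stand-in for a long-running tool such as the sort above.

```shell
workdir=$(mktemp -d)

# Stand-in for a long-running tool; STDOUT and STDERR both go to the log.
( sleep 1; echo "done" ) > "$workdir/job.log" 2>&1 &
job_pid=$!   # $! holds the PID of the most recent background job

echo "started background job with PID $job_pid"

# The shell stays usable while the job runs. wait blocks until the job
# finishes and returns its exit status.
job_status=0
wait "$job_pid" || job_status=$?

log_text=$(cat "$workdir/job.log")
echo "job exited with status $job_status and logged: $log_text"

rm -r "$workdir"
```

Redirecting both streams to a log file matters for background jobs, since their output would otherwise be mixed into whatever you type next.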
kill [-signal] <PID>
-TERM, -KILL, -STOP, -CONT
k key in top or atop to send a signal to a selected process
<F9> to send a signal to selected processes in htop
killall <processname>
killall -i <processname>
Note:
A user may only kill processes they own; only root can kill all processes.
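Signals can be practised safely on a process you start yourself. This sketch uses a `sleep 60` as a harmless target; the variable names are made up.

```shell
# Start a harmless long-running process to practise on.
sleep 60 &
victim=$!

# Signal 0 sends nothing; it only checks that the process exists and is ours.
alive_before=no
kill -0 "$victim" 2>/dev/null && alive_before=yes

# Ask politely first; -KILL is the last resort because it cannot be caught.
kill -TERM "$victim"

# wait reports 128 + signal number for a process ended by a signal (TERM is 15).
exit_code=0
wait "$victim" 2>/dev/null || exit_code=$?

alive_after=no
kill -0 "$victim" 2>/dev/null && alive_after=yes

echo "before: $alive_before, exit code: $exit_code, after: $alive_after"
```

`-TERM` gives a program the chance to clean up temporary files before it exits, which `-KILL` does not.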
for value in {5,10,20,50}
do
    run_simulation --iterations="$value" > "${value}_iterations.log" 2>&1
done

for value in {10..100}
do
    run_simulation --iterations="$value" > "${value}_iterations.log" 2>&1
done

for file in *txt
do
    echo "$file"
    grep .sam "$file" | wc -l
done
If you need to use loops, consider switching straight to nextflow instead (later lecture!)
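A runnable sketch of both loop types on made-up data. `run_simulation` is a hypothetical stand-in function, and the values are written as a plain list rather than a brace expansion so the sketch also runs in shells other than bash.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for run_simulation: it only reports its argument.
run_simulation() { echo "called with $1"; }

# Loop over a list of values, one log file per run.
for value in 5 10 20 50
do
    run_simulation --iterations="$value" > "$workdir/${value}_iterations.log" 2>&1
done
log_count=$(ls "$workdir"/*_iterations.log | wc -l | tr -d ' ')

# Made-up input files listing alignment file names.
printf 'a.sam\nb.bam\nc.sam\n' > "$workdir/run1.txt"
printf 'd.bam\n' > "$workdir/run2.txt"

# Loop over files; quoting "$file" keeps names with spaces intact.
summary=""
for file in "$workdir"/*.txt
do
    # If nothing matches, the pattern stays literal; skip it.
    [ -e "$file" ] || continue
    # grep -c counts matching lines; the escaped dot matches a literal ".".
    count=$(grep -c '\.sam$' "$file" || true)
    summary="$summary$(basename "$file"):$count "
done
echo "logs: $log_count, counts: $summary"

rm -r "$workdir"
```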
tmux is a terminal multiplexer
tmux session
With tmux it is easy to switch between multiple programs in one terminal
tmux sessions are persistent
All commands in tmux start with a prefix, which by default is <CTRL>+b.
Basic tmux commands
$ tmux new -s <session_name>
<CTRL>+b c
Create a new window (with shell)
<CTRL>+b w
Choose window from a list
<CTRL>+b ,
Rename the current window
<CTRL>+b %
Split current pane horizontally into two panes (left and right)
<CTRL>+b "
Split current pane vertically into two panes (top and bottom)
<CTRL>+b o
Go to the next pane
<CTRL>+b ;
Toggle between the current and previous pane
<CTRL>+b x
Close the current pane
List and attach tmux sessions
$ tmux ls
$ tmux a -t <session_name>
Using the public key the server can check that a request comes from your computer.
# generate a key pair
ssh-keygen
# copy the public key to the server
ssh-copy-id user@zeus