Nextflow - Recap

  • Main features

    • Domain specific language (DSL) for pipeline development
    • Reproducible workflows
    • Isolation of dependencies (conda, containers)
    • Portable - execution abstraction (local, SGE, AWS, …)
    • Parallelization is implicit
    • Use your scripting skills + DSL (based on Groovy)
  • Nextflow pipelines consist of

    • Channels (asynchronous FIFO queues)
    • Processes
    • Config
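
As a rough illustration of the Config part, a minimal nextflow.config might look like the sketch below. The container image and the profile name are placeholders chosen for this example, not values from the slides.

```groovy
// nextflow.config (illustrative sketch; image and profile names are assumptions)

// Defaults applied to every process
process {
    executor  = 'local'          // run tasks on the local machine
    container = 'ubuntu:22.04'   // hypothetical image, replace with your own
}

// Run each task inside the container set above
docker.enabled = true

// Alternative settings, selected with: nextflow run main.nf -profile cluster
profiles {
    cluster {
        process.executor = 'sge' // submit tasks to an SGE scheduler instead
    }
}
```

The pipeline script itself stays unchanged when switching between the two setups, which is the execution abstraction listed under the main features.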

Nextflow DSL2

Nextflow DSL2 is a major evolution of the Nextflow language

  • New features
    • Functions
    • Modules
    • Sub-workflows
    • Reuse of workflow components
    • Reuse of channels (one channel can feed several processes)

To enable DSL 2 use the following declaration at the top of the script:

nextflow.enable.dsl=2

Nextflow DSL2 - functions

def <function name> ( arg1, arg2, ... ) {
    <function body>
}

For example:

def foo() {
    'Hello world'
}

def bar(alpha, omega) {
    alpha + omega
}

Functions implicitly return the result of the last evaluated statement.
An explicit return is also possible: return <val>
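
The two return styles can be compared side by side in a small sketch. The function names greet and sampleId are hypothetical helpers invented for this example.

```groovy
// Implicit return: the value of the last expression is the result.
def greet(name) {
    "Hello ${name}".toString()
}

// Explicit return: handy for early exits.
// sampleId is a hypothetical helper that keeps the text before the first dot.
def sampleId(fileName) {
    if( !fileName ) {
        return 'unknown'
    }
    return fileName.tokenize('.')[0]
}
```

With these definitions, sampleId('sample1.fastq.gz') evaluates to 'sample1', while an empty file name takes the early return and yields 'unknown'.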

Nextflow DSL2 - modules

A module is a Nextflow script containing one or more process definitions that can be imported from another Nextflow script.


Difference to legacy syntax


In DSL2 a process is not bound to specific input and output channels.

The from and into channel declarations must be omitted.

The new DSL separates the definition of a process from its invocation.
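
For contrast, a legacy (DSL1) process declared its channels inline, roughly as in the sketch below. The channel names data_ch and result_ch are made up for illustration, and the snippet only runs on Nextflow versions that still support DSL1.

```groovy
// Legacy DSL1 style, shown for comparison only (not valid in DSL2).
// data_ch and result_ch are hypothetical channel names.
data_ch = Channel.fromPath('/some/path/*.txt')

process bar {
    input:
      file x from data_ch            // input channel is fixed in the definition
    output:
      file 'bar.txt' into result_ch  // output channel is fixed as well
    script:
      """
      your_command $x > bar.txt
      """
}
```

In DSL2 the same process drops the from and into clauses and receives its input only when it is called inside a workflow block.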

Nextflow DSL2 - modules

A process can be invoked as a function in the workflow scope, passing the expected input channels as parameters as if it were a custom function.

process bar {
    input:
      path x
    output:
      path 'bar.txt'
    script:
      """
      your_command $x > bar.txt
      """
}

workflow {
    data = channel.fromPath('/some/path/*.txt')
    bar(data)
}

Nextflow DSL2 - modules

Module components can be imported using the include keyword.

include { foo } from './some/module'

workflow {
    data = channel.fromPath('/some/data/*.txt')
    foo(data)
}

Multiple inclusion

include { foo; bar } from './some/module'

workflow {
    data = channel.fromPath('/some/data/*.txt')
    foo(data)
    bar(data)
}

Nextflow DSL2 - modules

Process outputs can either be assigned to a variable or accessed using the implicit .out attribute

include { INDEX; FASTQC; QUANT; MULTIQC } from './module/script.nf'

read_pairs_ch = channel.fromFilePairs( params.reads )

workflow {
  INDEX( params.transcriptome )
  FASTQC( read_pairs_ch )
  QUANT( INDEX.out, read_pairs_ch )
  // multiqc_file is not defined in this snippet; it must be set elsewhere in the script
  MULTIQC( QUANT.out.mix(FASTQC.out).collect(), multiqc_file )
}

Note: channels may now be used as inputs multiple times without the need to duplicate them
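
A minimal sketch of this channel reuse follows. The process names COUNT_LINES and FIRST_LINE and their shell commands are assumptions made for the example; only the pattern of passing the same channel twice matters.

```groovy
nextflow.enable.dsl=2

// Two illustrative processes (names and commands are assumptions for this sketch)
process COUNT_LINES {
    input:
      path x
    output:
      stdout
    script:
      """
      wc -l < $x
      """
}

process FIRST_LINE {
    input:
      path x
    output:
      stdout
    script:
      """
      head -n 1 $x
      """
}

workflow {
    data = channel.fromPath('/some/data/*.txt')
    COUNT_LINES(data)   // first consumer
    FIRST_LINE(data)    // same channel used again, no copy needed
    COUNT_LINES.out.view()
    FIRST_LINE.out.view()
}
```

In DSL1 the channel would first have had to be split, for example with the into operator, before two processes could read from it.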

Nextflow DSL2 - named output

Use the emit option to give an output a name that can be used to reference the channel from the external scope.

process foo {
  output:
    path '*.bam', emit: samples_bam

  '''
  your_command --here
  '''
}

workflow {
    foo()
    foo.out.samples_bam.view()
}
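
When a process declares several outputs, each one can carry its own name. The sketch below extends the example with a second output; the '*.bai' pattern and the name samples_bai are assumptions added for illustration.

```groovy
process foo {
  output:
    path '*.bam', emit: samples_bam
    path '*.bai', emit: samples_bai   // hypothetical second output

  '''
  your_command --here
  '''
}

workflow {
    foo()
    foo.out.samples_bam.view()   // access by name
    foo.out[1].view()            // access by position: the second declared output
}
```

Named access tends to be more robust than positional access, since it keeps working if the order of the output declarations changes.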

Nextflow DSL2 - module aliases

Aliases allow the inclusion and invocation of components with the same name from different modules

include { foo } from './some/module'
include { foo as bar } from './other/module'

workflow {
    foo(some_data)
    bar(other_data)
}

Aliases also allow the inclusion and invocation of the same component multiple times

include { foo as foo_a; foo as foo_b } from './some/module'

workflow {
    foo_a(some_data)
    foo_b(other_data)
}

Nextflow DSL2 - parameters

Define one or more parameters in the module

params.foo = 'Hello'
params.bar = 'world!'

def sayHello() {
    println "$params.foo $params.bar"
}

Parameters are inherited from the including context: values set before the include statement override the module defaults

params.foo = 'Hola'
params.bar = 'Mundo'

include { sayHello } from './some/module'

workflow {
    sayHello()
}

Nextflow DSL2 - sub-workflows

In DSL2, libraries of sub-workflows can be defined.

  • Sub-workflows can be used in the same way as processes
  • Include and reuse multi-step workflows as part of larger workflows

  • Requirements:
    • workflow name
    • inputs using the new take keyword
    • outputs using the new emit keyword

workflow subworkflow_name {
    take: data
    main:
        foo(data)
        bar(foo.out)
    emit:
        bar.out
}
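
A sub-workflow output can also be given a name in the emit block and then read through .out by the caller. The sketch below assumes that foo and bar are processes provided by './some/module', as on the earlier slides; the output name result is chosen for this example.

```groovy
// foo and bar are assumed to be processes provided by a module
include { foo; bar } from './some/module'

workflow subworkflow_name {
    take:
      data
    main:
      foo(data)
      bar(foo.out)
    emit:
      result = bar.out   // named output (the name "result" is an assumption)
}

workflow {
    subworkflow_name(channel.fromPath('/some/data/*.txt'))
    subworkflow_name.out.result.view()
}
```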

Nextflow DSL2 - workflow

Workflows defined in your script or imported from a module can be invoked and composed like any other process in your application.

workflow flow1 {
    take: data
    main:
        foo(data)
        bar(foo.out)
    emit:
        bar.out
}

workflow flow2 {
    take: data
    main:
        foo(data)
        baz(foo.out)
    emit:
        baz.out
}

workflow {
    data = channel.fromPath('/some/data/*.txt')
    flow1(data)
    flow2(flow1.out)
}

Nextflow - Links