Named Function

A function is a group of statements that is executed when called. It returns a value or multiple values. In DolphinDB, system built-in functions are not allowed to be overridden.

Both built-in functions and user-defined functions support default parameter values. A parameter with a default value cannot be mutable, and the default value must be a constant. Every parameter that follows a parameter with a default value must also have a default value.

Syntax

def <functionName> ([parameters]) {statements}

or

def <functionName> ([parameters]): statement (can only have one statement here)

We will discuss user-defined aggregate functions at the end of this section. Their syntax is the same as the named functions except that their definitions start with “defg” instead of “def”.

Function parameters

  • Function parameters are always passed by reference.

  • An input parameter can be modified within the function body if and only if it is qualified by “mutable”.

Examples

Define a named function:

$ def f(a){return a+1};
$ f(2);
3

$ def f(a):a+1;
$ f(3);
4

$ def f(a=1, b=2){return a + b}
$ f(,3)
4
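
As a further sketch of the default-value rules described earlier, the function below places a parameter without a default first and a defaulted parameter after it. The name myScale and its arguments are illustrative and do not come from the original examples. Under those rules, the first call should fall back to b=10, while the second supplies both arguments explicitly:

$ def myScale(a, b=10){return a*b}
$ myScale(3)
$ myScale(3, 2)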

Assign a function or an array of functions to variables:

$ g=sin;
$ g(3.1415926);
5.358979e-008

$ g=[sin, cos, log];
$ g(1 2);

sin      cos       log
0.841471 0.540302  0
0.909297 -0.416147 0.693147

If a function and a variable have the same name, we can use the address operator (&) to indicate a function.

$ sum=15;
$ g=sum;
$ g;
15

$ g=&sum;
$ g;
sum

$ g(1..4);
10

Immutable parameters cannot be modified within the function:

$ def f(a){a+=1; return a};
Syntax Error: [line #1] Constant variable [a] can't be modified.

Mutable parameters can be modified within the function:

$ def f(mutable a){a+=1; return a};
$ x=1;
$ f(x);
2

$ f(x);
3

$ x;
3

$ x=1..6$3:2;
$ x;

#0 #1
1  4
2  5
3  6

$ def f(mutable a){if(a.rows()>=2 && a.cols()>=2){a[1,1]=0}; return a};
$ f(x);

#0 #1
1  4
2  0
3  6

Declare a function first then define it later:

$ def f2(a,b)
$ def f1(a,b){return f2(a,b)}
$ def f2(a,b){a pow b};
// Do not put ";" between the lines above; otherwise the system cannot tell that f2 is a declaration whose definition follows.
$ f1(2,3);
8

Return multiple values:

$ def summary(a){return sum(a), avg(a), std(a)};
$ x=1..11;
$ summary(x);
[66, 6, 3.316625]

A more complicated example:

Let’s write a function to calculate the covariance between input vectors of equal length.

(1) If neither a nor b has NULL elements, calculate the covariance between them.

(2) If either a or b contains NULL elements, first extract the elements at the positions where both a and b are non-NULL, and then calculate the covariance of the resulting subvectors.

$ def calcovar(a, b){
$     aNull=hasNull a;                          // true if vector a contains NULL values; otherwise false
$     bNull=hasNull b;
$     if(!aNull && !bNull){                     // neither a nor b contains NULL values
$         am=avg a;                             // mean of a
$         bm=avg b;                             // mean of b
$         ab=a ** b;                            // inner product of a and b
$         n=count a;                            // number of non-NULL values
$         return (ab-n*am*bm) \ (n-1);          // sample covariance
$     }
$     else{                                     // build a Boolean mask of the positions that are non-NULL in both a and b
$         if(!aNull)                            // a has no NULL values, so only b needs checking
$             index=!isNull b;                  // true where b is non-NULL
$         else {
$             if(!bNull)                        // b has no NULL values, so only a needs checking
$                 index=!isNull a;              // true where a is non-NULL
$             else
$                 index=!((isNull a) || (isNull b));   // true where both a and b are non-NULL
$         }
$         c=a[index];                           // keep only the selected positions
$         d=b[index];
$         am=avg c;
$         bm=avg d;
$         ab=c ** d;
$         n=count c;
$         return (ab-n*am*bm) \ (n-1);
$     }
$ }
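
A short usage sketch for calcovar follows; the input vectors are illustrative and not part of the original example. For the first call, the formula gives (60-4*2.5*5)\3, that is, 10/3. For the second call, the third position is dropped, leaving 1 2 4 and 2 4 8, so the formula gives (42-3*(7/3)*(14/3))\2, that is, 14/3. The last line assumes that the built-in function covar computes the sample covariance with the same n-1 denominator, in which case it should agree with the first call:

$ calcovar(1 2 3 4, 2 4 6 8);
$ calcovar(1 2 NULL 4, 2 4 6 8);
$ covar(1 2 3 4, 2 4 6 8);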

User-defined Aggregate Functions

A user-defined aggregate function is a function whose output is restricted to a scalar. Use one when you need to guarantee that a function returns a scalar.

User-defined aggregate functions have the same syntax as the named functions except that their definitions start with “defg” instead of “def”.

$ defg f(x, y){
$     a = sum(abs(x+y))
$     b=sum(abs(x))+sum(abs(y))
$     return a\b
$ };
$ x=-5..5; y=0..10;
$ f(x,y);
0.858824
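
Because an aggregate function returns one scalar per input, a natural place to use one is a SQL query with "group by", where it is evaluated once per group. The following is a minimal sketch under that assumption; the function myWavg, the table t, and its columns are hypothetical names introduced only for illustration:

$ defg myWavg(price, qty){
$     return sum(price*qty) \ sum(qty)
$ };
$ t=table(`A`A`B`B as sym, 10.0 20.0 30.0 40.0 as price, 1 3 2 2 as qty);
$ select myWavg(price, qty) as vwap from t group by sym;

If the query runs as assumed, each group yields a single quantity-weighted average price: (10*1+20*3)\4 for A and (30*2+40*2)\4 for B.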