Metaprogramming

Metaprogramming is a programming technique in which computer programs are treated as data. It means that a program can read, generate, analyze or transform other programs, or even modify itself while running.

DolphinDB supports metaprogramming for dynamic expression generation and delayed evaluation. With metaprogramming, users can generate SQL statements and evaluate them dynamically.

Metacode is objects or expressions within “<” and “>”.

Related functions

1. Function expr generates metacode from objects, operators, or other metacode.

Syntax: expr(X1, X2, ……) where X’s are objects, operators, or other metacode.

$ expr(6,<,8);
< 6 < 8 >

$ expr(sum, 1 2 3);
< sum [1,2,3] >

$ a=6;
$ expr(a,+,1);
< 6 + 1 >

$ expr(<a>,+,1);
< a + 1 >

$ expr(<a>,+,<b>);
< a + b >

$ expr(a+7,*,8);
< 13 * 8 >

$ expr(<a+7>,*,8);
< (a + 7) * 8 >

$ expr(not, < a >);
< ! a >

2. Function eval evaluates the given metacode.

Syntax: eval(<X>) where X is metacode.

$ eval(<1+2>);
3

$ eval(<1+2+3=10>);
0

$ eval(expr(6,<,8));
1

$ eval(expr(sum, 1 2 3));
6

$ a=6; b=9;
$ eval(expr(<a>,+,<b>));
15

3. Function sqlCol converts column names into metaexpressions.

Syntax: sqlCol(colNames), where colNames can be one column name or a vector of column names. Each of these column names need to be quoted.

$ sqlCol(`PRC);
< PRC >

$ sqlCol(["PRC", "RET", "DATE", "TICKER"]);
(< PRC >,< RET >,< DATE >,< TICKER >)

4. Function sqlColAlias uses metacode and an optional alias name to define a column. It is often used for calculated columns.

Syntax: sqlColAlias(<metacode>, [aliasName])

$ sqlColAlias(<x>, `y);
< x as y >

$ sqlColAlias(<avg(PRC)>, `avgPRC);
< avg(PRC) as avgPRC >

$ sqlColAlias(<avg(PRC)>);
< avg(PRC) as avg_PRC >

5. Function sql creates a SQL statement dynamically.

Syntax: sql(select, from, [where], [groupBy], [groupFlag], [csort], [ascSort], [having], [orderBy], [ascOrder], [limit], [hint])

$ symbol = take(`GE,6) join take(`MSFT,6) join take(`F,6)
$ date=take(take(2017.01.03,2) join take(2017.01.04,4), 18)
$ price=31.82 31.69 31.92 31.8  31.75 31.76 63.12 62.58 63.12 62.77 61.86 62.3 12.46 12.59 13.24 13.41 13.36 13.17
$ volume=2300 3500 3700 2100 1200 4600 1800 3800 6400 4200 2300 6800 4200 5600 8900 2300 6300 9600
$ t1 = table(symbol, date, price, volume);

$ t1;

symbol

date

price

volume

GE

2017.01.03

31.82

2300

GE

2017.01.03

31.69

3500

GE

2017.01.04

31.92

3700

GE

2017.01.04

31.8

2100

GE

2017.01.04

31.75

1200

GE

2017.01.04

31.76

4600

MSFT

2017.01.03

63.12

1800

MSFT

2017.01.03

62.58

3800

MSFT

2017.01.04

63.12

6400

MSFT

2017.01.04

62.77

4200

MSFT

2017.01.04

61.86

2300

MSFT

2017.01.04

62.3

6800

F

2017.01.03

12.46

4200

F

2017.01.03

12.59

5600

F

2017.01.04

13.24

8900

F

2017.01.04

13.41

2300

F

2017.01.04

13.36

6300

F

2017.01.04

13.17

9600

$ x=5000
$ whereConditions = [<symbol=`MSFT>,<volume>x>]
$ havingCondition = <sum(volume)>200>;

$ sql(sqlCol("*"), t1);
< select * from t1 >

$ sql(sqlCol("*"), t1, whereConditions);
< select * from t1 where symbol == "MSFT",volume > x >

$ sql(select=sqlColAlias(<avg(price)>), from=t1, where=whereConditions, groupBy=sqlCol(`date));
< select avg(price) as avg_price from t1 where symbol == "MSFT",volume > x group by date >

$ sql(select=sqlColAlias(<avg(price)>), from=t1, groupBy=[sqlCol(`date),sqlCol(`symbol)]);
< select avg(price) as avg_price from t1 group by date,symbol >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0);
< select symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, csort=sqlCol(`volume), ascSort=0);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date csort volume desc >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, having=havingCondition);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date having sum(volume) > 200 >

$ sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0);
< select * from t1 where symbol == "MSFT",volume > x order by date desc >

$ sql(select=sqlCol("*"), from=t1, limit=1);
< select top 1 * from t1 >

$ sql(select=sqlCol("*"), from=t1, groupBy=sqlCol(`symbol), groupFlag=0, limit=1);
< select top 1 * from t1 context by symbol >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0, hint=HINT_KEEPORDER);
< select [128] symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

A convenient and flexible way to generate complicated queries dynamically is to define a function that calls function sql.

$ def f1(t, sym, x){
$    whereConditions=[<symbol=sym>,<volume>x>]
$    return sql(sqlCol("*"),t,whereConditions).eval()
$ };

$ f1(t1, `MSFT, 5000);

symbol

date

price

volume

MSFT

2017.01.04

63.12

6400

MSFT

2017.01.04

62.3

6800

$ f1(t1, `F, 9000);

symbol

date

price

volume

F

2017.01.04

13.17

9600

$ def f2(t, sym, colNames, filterColumn, filterValue){
$    whereConditions=[<symbol=sym>,expr(sqlCol(filterColumn),>,filterValue)]
$    return sql(sqlCol(colNames),t,whereConditions).eval()
$ };

$ f2(t1,`GE, `symbol`date`volume, `volume, 3000);

symbol

date

volume

GE

2017.01.03

3500

GE

2017.01.04

3700

GE

2017.01.04

4600

$ f2(t1,`F, `symbol`date`volume,`price,13.2);

symbol

date

volume

F

2017.01.04

8900

F

2017.01.04

2300

F

2017.01.04

6300

6. Function partial creates a partial application.

Syntax: partial(func, args…)

$ partial(add,1)(2);
3

$ def f(a,b):a pow b
$ g=partial(f, 2)
$ g(3);
8

7. Function makeCall calls a function with the specified parameters to generate a piece of script. The difference between template functions call and makecall is that makecall doesn’t execute the script.

In the following example, we create a function generateReport that reports selected columns from a given table with specified format.

$ def generateReport(tbl, colNames, colFormat): sql(sqlColAlias(each(makeCall{format},sqlCol(colNames),colFormat),colNames), tbl).eval()
$ t = table(1..100 as id, (1..100 + 2018.01.01) as date, rand(100.0, 100) as price, rand(10000, 100) as qty, rand(`A`B`C`D`E`F`G, 100) as city);
$ t;

id

date

price

qty

city

1

2018.01.02

99.662328

6002

D

2

2018.01.03

24.965461

5836

B

3

2018.01.04

77.006418

5

G

4

2018.01.05

93.930603

5141

G

5

2018.01.06

70.994914

5778

C

6

2018.01.07

11.221769

9403

G

7

2018.01.08

59.387621

4632

G

8

2018.01.09

86.830752

4574

C

9

2018.01.10

53.928317

8397

G

10

2018.01.11

21.123212

6615

B

$ generateReport(t, ["id", "date","price","qty"], ["0", "MM/dd/yyyy", "0.00", "#,###"]);

id

date

price

qty

1

01/02/2018

16.32

9,972

2

01/03/2018

63.10

7,785

3

01/04/2018

30.43

3,629

4

01/05/2018

33.57

2,178

5

01/06/2018

61.12

9,406

6

01/07/2018

43.29

6,955

7

01/08/2018

54.97

9,914

8

01/09/2018

86.20

6,696

9

01/10/2018

90.82

6,141

10

01/11/2018

4.29

3,774

The equivalent SQL script for the previous execution is:

$ def generateReportSQL(tbl, colNames, colFormat): sql(sqlColAlias(each(makeCall{format},sqlCol(colNames),colFormat),colNames), tbl)
$ generateReportSQL(t, ["id", "date","price","qty"], ["0", "MM/dd/yyyy", "0.00", "#,###"]);
< select format(id, "0") as id,format(date, "MM/dd/yyyy") as date,format(price, "0.00") as price,format(qty, "#,###") as qty from t >

3 Ways to construct a function to execute a SQL statement:

Example 1:

$ def f1(t, sym, x): select * from t where name=sym, vol>x;
$ f1(t1, `MSFT, 7000);

name

date

PRC

vol

MSFT

2017.01.03

63.12

8800

MSFT

2017.01.03

62.58

7800

Alternatively, we can use the function eval with a block of metacode.

Example 2:

$ def f2(t, sym, x): eval(<select * from t where name=`MSFT, vol>x >);
$ f2(t1,`MSFT, 7000);

name

date

PRC

vol

MSFT

2017.01.03

63.12

8800

MSFT

2017.01.03

62.58

7800

We can also define a function to generate a SQL statement with function sql. The SQL statement will be executed when the function is called. For more complicated query generation, this is much more convenient and flexible.

Example 3

$ def f3(t, sym, x){
$ whereConditions=[<name=sym>,<vol>x>]
$ return sql(sqlCol("*"),t,whereConditions).eval()
$ };

$ f3(t1, `MSFT, 7000);

name

date

PRC

vol

MSFT

2017.01.03

63.12

8800

MSFT

2017.01.03

62.58

7800

$ f3(t1, `F, 9000);

name

date

PRC

vol

F

2017.01.04

13.17

Among the 3 methods, function sql is the most flexible. It can be used dynamically to construct SQL statements.

$ def f4(t, sym, colNames, filterColumn, filterValue){
$ whereConditions=[<name=sym>,expr(sqlCol(filterColumn),>,filterValue)]
$ return sql(sqlCol(colNames),t,whereConditions).eval()
$ };
$ f4(t1,`MSFT, `name`date`vol, `vol, 7000);

name

date

vol

MSFT

2017.01.03

8800

MSFT

2017.01.03

7800

$ f4(t1,`F, `name`date`vol,`PRC,13.2);

name

date

vol

F

2017.01.04

8900

F

2017.01.04

2300

F

2017.01.04

6300

8. Function binaryExpr generates metacode of binary expression.

Syntax: binaryExpr(X, Y, optr)

$ t = table(reshape(rand(1.0, 200), 20:10));
$ n=10;
$ weights = array(ANY, n).fill!(0..(n-1), 1..n)\n;
$ names = t.colNames();
$ tp = binaryExpr(sqlCol(names), weights, *);
$ tp;
(< col0 * 0.1 >,< col1 * 0.2 >,< col2 * 0.3 >,< col3 * 0.4 >,< col4 * 0.5 >,< col5 * 0.6 >,< col6 * 0.7 >,< col7 * 0.8 >,< col8 * 0.9 >,< col9 * 1 >)

9. Function unifiedExpr generates metacode of multivariate expression.

Syntax: unifiedExpr(objs, optrs)

The following example generates metacode of SQL select statement and calculates the weighted sum of each column.

$ selects = sqlColAlias(unifiedExpr(binaryExpr(sqlCol(names), weights, *), take(+, n-1)), "weightedSum");
$ re = sql(selects, t);
$ re;
< select (col0 * 0.1) + (col1 * 0.2) + (col2 * 0.3) + (col3 * 0.4) + (col4 * 0.5) + (col5 * 0.6) + (col6 * 0.7) + (col7 * 0.8) + (col8 * 0.9) + (col9 * 1) as weightedSum from tc0ccbd6a00000000 >
$ re.eval()

weightedSum

2.1239

3.5725

3.5009

3.3487

2.8268

1.9753

2.4287

2.9222

3.0469

2.9751

2.8848

2.993

2.7185

2.2578

2.6231

2.3871

3.0311

2.0183

2.897

2.5982

10. Function makeUnifiedCall generates metacode for function call. Different from higher-order functions unifiedCall and makeUnifiedCall , makeUnifiedCall doesn’t execute the metacode.

Syntax: makeUnifiedCall(func, args)

The following example generates metacode equivalent to that of Example 9:

$ selects = sqlColAlias(makeCall(flatten, makeCall(dot, makeUnifiedCall(matrix, sqlCol(names)), 1..n\n)), "weightedSum");
$ re = sql(selects, t);
$ re;
< select flatten(dot(matrix(col0, col1, col2, col3, col4, col5, col6, col7, col8, col9), [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1])) as weightedSum from tc0ccbd6a00000000 >
$ re.eval();

weightedSum

2.1239

3.5725

3.5009

3.3487

2.8268

1.9753

2.4287

2.9222

3.0469

2.9751

2.8848

2.993

2.7185

2.2578

2.6231

2.3871

3.0311

2.0183

2.897

2.5982

Use Metaprogramming in reporting

We can use function format for reporting. It has 2 arguments: a table column and the corresponding format string. The format string follows number and datetime formatting in Java. We can use 6 symbols (0/#/,/./%/E) to represent the format of a number and 9 symbols (y/M/d/H/h/m/s/S/n) to represent the format of a temporal object. For examples:

Input

Format

Output

2012.12.05T15:30:59.125

yyyy-MM-dd HH:mm

2012-12-05 15:30

2012.12.05

MMMddyyyy

DEC052012

0.1356

0.0%

13.6%

1234567.42

#,###.0

1,234,567.4

1234567.42

0.00####E0

1.234567E6

The example below generates a report dynamically given a table, a list of column names, and a list of column formats. Metaprogramming functions sqlCol, sqlColAlias, sql, and makeCall are used to generate a SQL statement and then function eval is called to generate a report. makeCall produces a piece of code that makes a function call. The first parameter is a function and the other parameters are required arguments.

$ def generateReport(tbl, colNames, colFormat){
$    colCount = colNames.size()
$    colDefs = array(ANY, colCount)
$    for(i in 0:colCount){
$       if(colFormat[i] == "")
$          colDefs[i] = sqlCol(colNames[i])
$       else
$          colDefs[i] = sqlCol(colNames[i], format{,colFormat[i]})
$    }
$    return sql(colDefs, tbl).eval()
$ }

Generate a sample table with 100 rows and 4 columns (id, date, price, and qty) and then call function generateReport to generate a report.

$ t = table(1..100 as id, (1..100 + 2018.01.01) as date, rand(100.0, 100) as price, rand(10000, 100) as qty);
$ t;

id

date

price

qty

1

2018.01.02

61.483376

8733

2

2018.01.03

21.254616

8365

3

2018.01.04

37.408301

444

4

2018.01.05

70.608384

9944

5

2018.01.06

80.361811

7185

$ generateReport(t, ["id", "date","price","qty"], ["0", "MM/dd/yyyy", "0.00", "#,###"]);

id

date

price

qty

1

01/02/2018

61.48

8,733

2

01/03/2018

21.25

8,365

3

01/04/2018

37.41

444

4

01/05/2018

70.61

9,944

5

01/06/2018

80.36

7,185

We can use metaprogramming to generate a filtering condition (i.e., where clause in a SQL statement).

$ def generateReportWithFilter(tbl, colNames, colFormat, filter){
$    colCount = colNames.size()
$    colDefs = array(ANY, colCount)
$    for(i in 0:colCount){
$       if(colFormat[i] == "")
$          colDefs[i] = sqlCol(colNames[i])
$       else
$          colDefs[i] = sqlCol(colNames[i], format{,colFormat[i]})
$    }
$    return sql(colDefs, tbl, filter).eval()
$ }
$ generateReportWithFilter(t, ["id","date","price","qty"], ["","MM/dd/yyyy", "00.00", "#,###"], < id>10 and price<20 >);

id

date

price

qty

11

01/12/2018

11.21

7,033

18

01/19/2018

09.97

6,136

29

01/30/2018

06.33

3,834

31

02/01/2018

05.52

6,851

38

02/08/2018

14.55

7,482