﻿ Example 5. Pivot: convenient for data manipulation

# Example 5. Pivot: convenient for data manipulation

DolphinDB offers a template function pivot, a function unpivot, and a SQL clause pivot by to change the dimensions of the data.

We will use the following table t1 to illustrate the usage of pivot, unpivot, and pivot by.

syms=`BIDU`MSFT`ORCL\$symbol

sym=syms[0 0 0 0 0 0 0 1 1 1 1 1 1 1 2 2 2 2 2 2 2]

price=172.12 170.32 172.25 172.55 175.1 174.85 174.5 36.45 36.15 36.3 35.9 36.5 37.15 36.9 40.1 40.2 40.25 40.15 40.1 40.05 39.95

qty=100* 10 3 7 8 25 6 10 4 5 1 2 8 6 10 2 2 5 5 4 4 3

trade_time=09:40:00+1 30 65 90 130 185 195 10 40 90 140 160 190 200 5 45 80 140 170 190 210

t1;

---- ------ ---- ----------

BIDU 172.12 1000 09:40:01

BIDU 170.32 300  09:40:30

BIDU 172.25 700  09:41:05

BIDU 172.55 800  09:41:30

BIDU 175.1  2500 09:42:10

BIDU 174.85 600  09:43:05

BIDU 174.5  1000 09:43:15

MSFT 36.45  400  09:40:10

MSFT 36.15  500  09:40:40

MSFT 36.3   100  09:41:30

MSFT 35.9   200  09:42:20

MSFT 36.5   800  09:42:40

MSFT 37.15  600  09:43:10

MSFT 36.9   1000 09:43:20

ORCL 40.1   200  09:40:05

ORCL 40.2   200  09:40:45

ORCL 40.25  500  09:41:20

ORCL 40.15  500  09:42:20

ORCL 40.1   400  09:42:50

ORCL 40.05  400  09:43:10

ORCL 39.95  300  09:43:30

To rearrange a table column as a matrix based on two other columns in the same table. we can use the template function pivot.

To count the number of observations within each minute for each stock:

BIDU MSFT ORCL

---- ---- ----

09:40m|2    2    2

09:41m|2    1    1

09:42m|1    2    2

09:43m|2    2    2

To display the last observation within each minute for each stock:

BIDU   MSFT  ORCL

------ ----- -----

09:40m|170.32 36.15 40.2

09:41m|172.55 36.3  40.25

09:42m|175.1  36.5  40.1

09:43m|174.5  36.9  39.95

We can also use the SQL clause pivot by for the example above.

select last(price) from t1 pivot by minute(trade_time), sym;

----------------- ------ ----- -----

09:40m            170.32 36.15 40.2

09:41m            172.55 36.3  40.25

09:42m            175.1  36.5  40.1

09:43m            174.5  36.9  39.95

The function unpivot can be useful when we would like to rearrange 2 columns in a table as 1 column. Please note that in the following example, a new column is created for the column names of col1 and col2.

t=table(1..3 as id, 4..6 as col1, 7..9 as col2);

t;

id col1 col2

-- ---- ----

1  4    7

2  5    8

3  6    9

t.unpivot(`id, `col1`col2);

id valueType value

-- --------- -----

1  col1      4

2  col1      5

3  col1      6

1  col2      7

2  col2      8

3  col2      9

Finally, we give an example of calculating average pairwise correlation of daily stock returns for 4 stocks with 5 days of observations with the help of the template function pivot.

x=2017.03.27..2017.03.31

date=take(x,20)

symbol=take(`A`B`C`D, 20).sort()

returns=0.0121 0.0059 -0.0215 -0.004 0.0312 0.0019 0.0004 -0.0321 -0.0098 -0.0001 0.0003 0.0078 -0.0276 -0.0021 0.0219 0.0087 -0.0003 -0.0127 -0.0022 0.0186

t = table(date, symbol, returns);

t;

date       symbol returns

---------- ------ -------

2017.03.27 A      0.0121

2017.03.28 A      0.0059

2017.03.29 A      -0.0215

2017.03.30 A      -0.004

2017.03.31 A      0.0312

2017.03.27 B      0.0019

2017.03.28 B      0.0004

2017.03.29 B      -0.0321

2017.03.30 B      -0.0098

2017.03.31 B      -0.0001

2017.03.27 C      0.0003

2017.03.28 C      0.0078

2017.03.29 C      -0.0276

2017.03.30 C      -0.0021

2017.03.31 C      0.0219

2017.03.27 D      0.0087

2017.03.28 D      -0.0003

2017.03.29 D      -0.0127

2017.03.30 D      -0.0022

2017.03.31 D      0.0186

We can use pivot to transform the column returns to a matrix with each row for a date and each column for a stock

retMat = pivot(last, t.returns, t.date, t.symbol);

retMat;

A       B       C       D

------- ------- ------- -------

2017.03.27|0.0121  0.0019  0.0003  0.0087

2017.03.28|0.0059  0.0004  0.0078  -0.0003

2017.03.29|-0.0215 -0.0321 -0.0276 -0.0127

2017.03.30|-0.004  -0.0098 -0.0021 -0.0022

2017.03.31|0.0312  -0.0001 0.0219  0.0186

Next, with the help of the template cross (:C), calculate the correlation between any of the two stocks. The result is a square matrix.

corrs = retMat corr :C retMat;

corrs;

A        B        C        D

-------- -------- -------- --------

A|1        0.844171 0.945889 0.98557

B|0.844171 1        0.882214 0.800108

C|0.945889 0.882214 1        0.891941

D|0.98557  0.800108 0.891941 1

Setting all diagonal elements of the correlation matrix to be NULL:

rfor(i in 0:ows(corrs)) corrs[i,i]=NULL;

corrs;

A        B        C        D

-------- -------- -------- --------

A|         0.844171 0.945889 0.98557

B|0.844171          0.882214 0.800108

C|0.945889 0.882214          0.891941

D|0.98557  0.800108 0.891941

Finally, calculate the average pairwise correlation. We need to rearrange the matrix corrs to a vector first.

corrs.flatten().avg();

0.891649