In-memory Computing

DolphinDB offers a built-in lightweight in-memory computing engine.

It is very suitable for the large concurrency computing tasks with high performance requirements. DolphinDB provides a rich data structure for in-memory computing. For details, please refer to Chapter 2: Data Types and Data Forms. This section mainly introduces the application of in-memory computing in various tables.

session

The session is a container. It has an unique ID and stores many defined objects, such as local variables, shared variables, and so on. There are many ways to create a new session, such as launching in command window, XDB connection, GUI connection or Web URL connection. All variables in a session are invisible to other sessions, unless use “share” to explicitly share variables between sessions. Currently DolphinDB only supports sharing table.

When the session is no longer in use, it needs to be closed to release the system resources. If you need to close the XDB session, use command close or set the connection variable to NULL; if you need to close the GUI or Web session, close the GUI window or browser window.

In-memory Table

The data in in-memory table is saved in memory. Query and writing into in-memory table is very fast, but the data will be lost after closing the node. It is used to save the temporary data.

There are 2 types of in-memory tables: unpartitioned and partitioned in-memory tables.

All tables created by function table are unpartitioned in-memory tables.

$ sym = `C`MS`MS`MS`IBM`IBM`C`C`C$symbol
$ price= 49.6 29.46 29.52 30.02 174.97 175.23 50.76 50.32 51.29
$ qty = 2200 1900 2100 3200 6800 5400 1300 2500 8800
$ timestamp = [09:34:07,09:36:42,09:36:51,09:36:59,09:32:47,09:35:26,09:34:16,09:34:26,09:38:12]
$ t1 = table(timestamp, sym, qty, price);

For more details about unpartitioned tables, please refer to Table.

Working with partitioned in-memory tables can utilize the parallel computing capacities of multi-core CPUs. We can create a partitioned in-memory table by specifying a partitioned in-memory database. Use function database and specify an empty string for database path.

$ n=10000
$ month=take(2000.01M..2000.12M, n)
$ x=rand(1.0, n)
$ t=table(month, x)

$ db=database("", VALUE, 2000.01M..2000.12M)
$ pt = db.createPartitionedTable(t, `pt, `month)
$ pt.append!(t)
$ select * from pt;

Use command undef to undefine an in-memory table. For example, to undefine table pt in the example above:

$ undef(`pt);

A major feature of DolphinDB in-memory computing is that SQL statements can not only manipulate tables, but also other data forms including scalars, vectors, sets, matrices and dictionaries. It significantly increases the flexibility of the scripting language.

In the following example, stock prices are stored in table quotes and stock weights are stored in the dictionary weights. A SQL statement can directly work on the table and the dictionary to calculate the value of the portfolio.

$ Symbol=take(`AAPL, 6) join take(`FB, 5)
$ Time=2019.02.27T09:45:01.000000000+[146, 278, 412, 445, 496, 789, 212, 556, 598, 712, 989]
$ Price=173.27 173.26 173.24 173.25 173.26 173.27 161.51 161.50 161.49 161.50 161.51
$ quotes=table(Symbol, Time, Price)
$ weights=dict(`AAPL`FB, 0.6 0.4)
$ ETF = select Symbol, Time, Price*weights[Symbol] as weightedPrice from quotes
$ select last(weightedPrice) from ETF pivot by Time, Symbol;

Time

AAPL

FB

2019.02.27T09:45:01.000000146

103.962

2019.02.27T09:45:01.000000212

64.604

2019.02.27T09:45:01.000000278

103.956

2019.02.27T09:45:01.000000412

103.944

2019.02.27T09:45:01.000000445

103.95

2019.02.27T09:45:01.000000496

103.956

2019.02.27T09:45:01.000000556

64.6

2019.02.27T09:45:01.000000598

64.596

2019.02.27T09:45:01.000000712

64.6

2019.02.27T09:45:01.000000789

103.962

2019.02.27T09:45:01.000000989

64.604

Shared Table

A variable in a session is invisible to other sessions unless it is shared among sessions with statement share . Currently only tables can be shared in DolphinDB. The rows or columns of a shared table cannot be deleted or revised. A shared table can be appended with new data.

$ share t1 as sharedT1;

In the example above, we share table t1 as sharedT1. In other sessions we can access t1 by visiting sharedT1.

Use command undef to undefine a shared in-memory table:

$ undef(sharedT1,SHARED)

MVCC (Multi Version Concurrency Control) Table

If we need to frequently conduct concurrent read and write on an in-memory table, but rarely need to update or delete records, DolphinDB provides a special type of in-memory table: MVCC table. An MVCC table records each operation in log files to achieve isolation of read and write. MVCC tables can be persisted to disk and loaded into memory later.

Use function mvccTable to create a multi-version concurrency control table.

$ n=5
$ syms=`IBM`C`MS`MSFT`JPM`ORCL`FB`GE
$ timestamp=09:30:00+rand(18000,n)
$ sym=rand(syms,n)
$ qty=100*(1+rand(100,n))
$ price=5.0+rand(100.0,n)
$ temp=table(timestamp,sym,qty,price)
$ t1= mvccTable(1:0,`timestamp`sym`qty`price,[TIMESTAMP,SYMBOL,INT,DOUBLE],"/home/DolphinDB/Data","t1")
$ t1.append!(temp);

In the example above, the multi-version concurrency control table t1 will be persisted to the disk under path “/home/DolphinDB/Data/t1 “.

Use function loadMvccTable to load the mvcc table into memory.

$ loadMvccTable("/home/DolphinDB/Data",t1);

timestamp

sym

qty

price

1970.01.01T00:00:39.091

MSFT

4500

99.808702

1970.01.01T00:00:35.293

FB

3600

26.644715

1970.01.01T00:00:36.334

MSFT

3800

66.754334

1970.01.01T00:00:40.362

ORCL

4800

15.480288

1970.01.01T00:00:35.565

MSFT

1700

23.107408