How to add a database in AppScale

This post explains how to add a datastore to AppScale (the terms "datastore" and "database" are used interchangeably). The developer must automate three procedures: installing, starting, and stopping the datastore. Installation is done with shell scripts. Starting and stopping must be written in Ruby, the language of the AppController. In addition, the AppScale DB interface must be implemented in Python.

Reference Code
Nine datastores are currently implemented in AppScale, and any of them can serve as an example of how to integrate your own. Some datastores, however, cannot do range queries or return an entire table. For these you must use the dhash interface, which works around the limitation by sharding the key space among 16 special keys within the datastore. These datastores do not scale as well, because each put must access these special keys.
Datastores which use the dhash interface:
  • MemcacheDB (master/slave, written in C)
  • Voldemort (peer to peer, Java)
  • SimpleDB
  • Scalaris
Datastores which use the regular DB interface:
  • Cassandra (peer to peer, Java)
  • HBase (master/slave, Java)
  • Hypertable (master/slave, C++)
  • MongoDB (master/slave, C++)
  • MySQL (peer to peer, C++)
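
To make the dhash idea above concrete, here is a minimal sketch of how a key space can be sharded among 16 special keys so that a store offering only get and put can still return a whole table. This is not the code in dhash_datastore.py. The class name ShardedKeyIndex, the "__index__/" key layout, the CRC32 hash, and the plain dict standing in for the datastore are all assumptions made for illustration.

```python
import zlib

NUM_SHARDS = 16  # the post states the key space is sharded among 16 special keys


def index_key(table_name, shard):
    """Name of one special key (hypothetical layout, not AppScale's)."""
    return "__index__/%s/%d" % (table_name, shard)


def shard_for(row_key):
    """Map a row key to one of the 16 shards (CRC32 is an assumed hash choice)."""
    return (zlib.crc32(row_key.encode("utf-8")) & 0xffffffff) % NUM_SHARDS


class ShardedKeyIndex(object):
    """Emulates get_table on a store that only supports get and put."""

    def __init__(self):
        self.store = {}  # stand-in for a key-value datastore without range queries

    def put(self, table_name, row_key, value):
        # Write the row itself.
        self.store["%s/%s" % (table_name, row_key)] = value
        # Every put also touches a special key, which is the scaling cost
        # the post mentions.
        key = index_key(table_name, shard_for(row_key))
        known = self.store.get(key, [])
        if row_key not in known:
            self.store[key] = known + [row_key]

    def get_table(self, table_name):
        # Read all 16 special keys to recover the row keys, then fetch each row.
        rows = {}
        for shard in range(NUM_SHARDS):
            for row_key in self.store.get(index_key(table_name, shard), []):
                rows[row_key] = self.store["%s/%s" % (table_name, row_key)]
        return rows
```

In this sketch a full-table read costs 16 index lookups plus one get per row, and every put does an extra read and write on one special key, which is why such datastores scale less well.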

Code Locations
Starting, Stopping, and AppDB Interface paths:
appscale/AppDB/
appscale/AppDB/dbinterface.py
appscale/AppDB/dhash_datastore.py
appscale/AppDB/dbname/
appscale/AppDB/dbname/py_dbname.py
appscale/AppDB/dbname/dbname_helper.rb
appscale/AppDB/dbname/prime_dbname.py
appscale/AppDB/datastore_tester.py
appscale/AppDB/dbname/templates/
appscale/AppDB/dbname/patches/

Installation paths:
appscale/debian/appscale_install_functions.sh
appscale/debian/appscale_install.sh
appscale/debian/control.all
appscale/debian/makedeb_all.sh
appscale/debian/rules.dbname

Tools:
appscale-tools/bin/appscale-run-instances

Installing the Datastore
The scripts that install the datastore go in appscale/debian/, which holds the shell scripts that automate installation. Grep the code in this folder for an existing database to use as a reference.

Initializing and Stopping the Datastore
Your datastore may need configuration files generated each time it is spawned. All configuration files, or templates for them, must go into appscale/AppDB/dbname/templates. The setup_db_config_files function in dbname_helper.rb should use these templates. It is passed the master IP, the slave IPs, and credentials (a dictionary of additional arguments). See a reference helper file for the functions that must be implemented.

AppScale DB Interface
The interface is a template for the following functions:
get_entity(table_name, row_key, column_names)
put_entity(table_name, row_key, column_names, cell_values)
get_table(table_name, column_names)
delete_entity(table_name, row_key)
get_schema(table_name)
delete_table(table_name)

The interface is very particular about what is expected of each template function. Make sure you fully understand one of the reference implementations before writing a new one.
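
As a rough illustration of the shape of these six functions, the sketch below backs them with an in-memory dictionary. Only the function names and parameter lists come from the list above. The class name InMemoryDatastore, the storage layout, and every return value are assumptions, and the real interface has its own conventions for return values and errors that must be taken from a reference implementation.

```python
class InMemoryDatastore(object):
    """Toy backend; return conventions here are assumed, not AppScale's."""

    def __init__(self):
        # table_name -> {row_key: {column_name: cell_value}}
        self.tables = {}

    def get_entity(self, table_name, row_key, column_names):
        row = self.tables.get(table_name, {}).get(row_key)
        if row is None:
            return None  # assumed "not found" signal
        return [row.get(name) for name in column_names]

    def put_entity(self, table_name, row_key, column_names, cell_values):
        if len(column_names) != len(cell_values):
            raise ValueError("column_names and cell_values differ in length")
        row = self.tables.setdefault(table_name, {}).setdefault(row_key, {})
        row.update(zip(column_names, cell_values))

    def get_table(self, table_name, column_names):
        # One list of cell values per row, ordered by row key.
        table = self.tables.get(table_name, {})
        return [[row.get(name) for name in column_names]
                for _, row in sorted(table.items())]

    def delete_entity(self, table_name, row_key):
        # True if a row was actually removed.
        return self.tables.get(table_name, {}).pop(row_key, None) is not None

    def get_schema(self, table_name):
        # Union of the column names seen in the table's rows.
        columns = set()
        for row in self.tables.get(table_name, {}).values():
            columns.update(row)
        return sorted(columns)

    def delete_table(self, table_name):
        return self.tables.pop(table_name, None) is not None
```

A quick exercise of the sketch:

```python
db = InMemoryDatastore()
db.put_entity("users", "alice", ["email", "age"], ["a@example.com", "30"])
print(db.get_entity("users", "alice", ["email"]))
print(db.get_schema("users"))
```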

AppScale Tools
Add the new database name to the run instances script (appscale-tools/bin/appscale-run-instances).

Testing
Beyond trying out multiple applications and checking that they behave correctly, you can also use datastore_tester.py in appscale/AppDB/.
Run it with the arguments: -t <dbname>
This checks that the peculiarities of the interface are correctly implemented.