==============================================================================
This is the README for Type-ARQuE

                           Copyright (C) 2010 Sami Kiminki / Aalto University
==============================================================================


Type-ARQuE is an experimental SPARQL-to-SQL query compiler. It is written
mainly to demonstrate the use of an intermediate language (AQL) as a part of
the translation process, and the related benefits such as the possibility to
perform low-level optimizations (e.g. type inference, common subexpression
elimination (CSE), low-level SQL access optimizations, ...) which are hard or
impossible to perform at the SPARQL and/or SPARQL algebra level.

As such, Type-ARQuE is intended as a scientific toy tool only. In other
words, it would probably be crazy to even think about production use of any
kind. However, it is the author's hope that some of the features could provide
new insights, and eventually, help the industry to make better products.

For an academic description of the tool's internals, see our paper:
  Kiminki, Knuuttila, Hirvisalo. SPARQL to SQL Translation Based on an
  Intermediate Query Language. Submitted.


Compiling Type-ARQuE
====================

0. Prerequisites
Type-ARQuE has been written for Linux. Systems known to work:
- Gentoo 2010.0 (x86 / amd64)
- Ubuntu 10.04 LTS (x86-64)
- Mac OS X 10.6 (Snow Leopard, tested for Type-ARQuE 0.1)
Compilation should work with little or no porting effort on other modern
distributions, too. However, a recent version of the GNU C++ compiler might
be required, and GNU Make is probably a must. The software was developed
using GCC 4.4.4 (Gentoo 4.4.4-r1).

1. Dependencies
The dependencies:
- Raptor (tested on 1.4.20)
- MySQL client-side libraries (optional, tested on 5.0.90)
- PostgreSQL client-side libraries (optional, tested on 8.4.4)
However, you need at least one SQL backend for the translator to work.

2. Configuration
Run `./configure.sh'. This checks dependencies and writes out
`Makefile.config'. Run `./configure.sh --help' for a list of command-line
switches.

The script configure.sh looks for mysql_config, pg_config and
raptor-config to determine the required compile and link flags. Use
environment variables MYSQL_CONFIG, PG_CONFIG and RAPTOR_CONFIG to force
specific locations of the configuration helpers. For example:
  PG_CONFIG=/usr/local/bin/pg_config ./configure.sh
would force the location of pg_config.
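
Before running configure.sh, it can help to see which helper each variable
would resolve to. The following is a minimal sketch, assuming configure.sh
falls back to searching PATH when a variable is unset (an assumption, not
something stated above). report_helper is a hypothetical function and not
part of Type-ARQuE.

```shell
# Report where each configuration helper would be picked up from.
# The MYSQL_CONFIG / PG_CONFIG / RAPTOR_CONFIG variables come from the text
# above; report_helper itself is only an illustration.
report_helper() {
    # $1 = value of the override variable (may be empty)
    # $2 = default program name
    if [ -n "$1" ]; then
        echo "$2: forced to $1"
    elif command -v "$2" >/dev/null 2>&1; then
        echo "$2: found at $(command -v "$2")"
    else
        echo "$2: not found"
    fi
}

report_helper "${MYSQL_CONFIG:-}" mysql_config
report_helper "${PG_CONFIG:-}" pg_config
report_helper "${RAPTOR_CONFIG:-}" raptor-config
```

A "not found" line for an optional backend is harmless as long as at least
one SQL backend is available.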


3. Compilation
Run `make' or alternatively `make -j<N>' for parallel make, where <N>
should be replaced with the number of parallel jobs. Run `make help'
for a list of makefile targets.
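
As a sketch of picking <N> automatically, the snippet below asks the system
for the number of online processors and falls back to a serial build if that
fails. detect_jobs is a hypothetical helper; the check for Makefile.config
relies only on the fact that configure.sh writes that file.

```shell
# Hypothetical helper: print the number of online processors, or 1 if unknown.
detect_jobs() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # not a plain number: fall back to a serial build
    esac
    echo "$n"
}

# Build only once configure.sh has written Makefile.config.
if [ -f Makefile.config ]; then
    make -j"$(detect_jobs)"
else
    echo "Makefile.config not found; run ./configure.sh first" >&2
fi
```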


Quick usage / PostgreSQL
========================

1. Create user and database for testing, e.g.:
postgres=# CREATE DATABASE aql;
postgres=# CREATE USER aql;

2. Obtain CREATE-statements for SQL schema
./typearque --sqllayout=simple-indexed --sqlbackend=postgres --sqlcreate

3. Run CREATE-statements in PostgreSQL

psql -Uaql
aql=>  <copy-paste create statements here>
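
As an alternative to copy-pasting, steps 2 and 3 can probably be combined by
saving the statements to a file. This is a sketch only: it assumes that
--sqlcreate prints nothing but SQL statements to standard output, which is
not confirmed here, so review the file before running it. write_schema, the
TYPEARQUE variable and the file name are hypothetical.

```shell
# Hypothetical helper: write the CREATE statements for one backend to a file.
# TYPEARQUE is an illustrative override for the location of the binary.
write_schema() {
    backend=$1
    outfile=$2
    "${TYPEARQUE:-./typearque}" --sqllayout=simple-indexed \
        --sqlbackend="$backend" --sqlcreate > "$outfile"
}

# Usage, shown as comments so that nothing touches a database by accident:
#   write_schema postgres schema-postgres.sql
#   psql -Uaql -f schema-postgres.sql
```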

4. Load data
./typearque --conn='user=postgres dbname=aql' --sqlbackend=postgres \
  --sqllayout=simple-indexed --load test/ssws2010/test-data-for-queries.nt

5. Run a test
./typearque --conn='user=postgres dbname=aql' --sqlbackend=postgres \
  --parser=sparql --sqllayout=simple-indexed \
  test/ssws2010/variable-alternatives.rq


Quick usage / MySQL
===================

1. Create user and database for testing, e.g.:
mysql> CREATE DATABASE aql;
mysql> GRANT ALL ON aql.* TO 'aql'@'localhost' identified by 'aql';

2. Obtain CREATE-statements for SQL schema
./typearque --sqllayout=simple-indexed --sqlbackend=mysql --sqlcreate

3. Run CREATE-statements in MySQL

mysql -uaql -paql aql
mysql> <copy-paste create statements here>

Note that you might need to manually remove PRIMARY KEY statements from some
CREATE statements, as MySQL does not like TEXT datatypes on keys, at least
not MySQL 5.0.90. This issue probably concerns only the simple-inline layout.

4. Load data
./typearque --conn='user=aql,pass=aql,db=aql' --sqlbackend=mysql \
  --sqllayout=simple-indexed --load test/ssws2010/test-data-for-queries.nt

5. Run a test
./typearque --conn='user=aql,pass=aql,db=aql' --sqlbackend=mysql \
  --parser=sparql --sqllayout=simple-indexed \
  test/ssws2010/variable-alternatives.rq


Known limitations
=================
Query compiler:
- Does not compile every valid SPARQL 1.0 construct. In fact, only the SELECT
  query form is supported.
- UNION graph patterns are unsupported.
- Many functions, type check keywords (isURI, ...), and LANGTAGs are
  unsupported.

Test data loader:
- The triple loader (--load switch) does not check for duplicate triples, and
  bad things will happen if the test data contains duplicates.
- The triple loader assumes that no data is already loaded into the tables.
- Even moderate-sized test data sets probably exceed transaction or other
  limits.
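
Since the loader does not check for duplicates, one workaround is to remove
them from the N-Triples file before loading. The sketch below is a
hypothetical helper, not part of Type-ARQuE. It only catches lines that are
byte-for-byte identical, so triples that differ in whitespace or escaping
are still treated as distinct.

```shell
# Hypothetical helper: copy an N-Triples file, dropping textually identical
# lines. N-Triples holds one triple per line, so line order does not matter.
dedup_ntriples() {
    infile=$1
    outfile=$2
    # LC_ALL=C makes sort compare raw bytes, independent of the locale.
    LC_ALL=C sort -u "$infile" > "$outfile"
}

# Usage (file names are placeholders):
#   dedup_ntriples test-data.nt test-data-unique.nt
# and then pass test-data-unique.nt to the --load switch.
```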

Performance:
- Database interaction is unoptimized, especially the result fetching. Don't
  use this program to benchmark SPARQL queries.
