Uploaded code to Codeberg
Sqlite has support for virtual tables which can be used for external datasources. There is an example csv data source that allows you to query a csv file as if it was a regular table.
.load ./csv
CREATE VIRTUAL TABLE temp.csv USING csv(filename=FILENAME);
SELECT * FROM csv;
I do not really want to write C for this, but I found the rusqlite project which allows writing a loadable_extension that provides virtual tables. Using their vtablog example (slightly modified) I was able to test from sqlite with a rust module.
.load target/debug/libsqlite_test.dylib
CREATE VIRTUAL TABLE temp.log USING vtablog(
schema='CREATE TABLE x(a,b,c)',
rows=5
);
SELECT * FROM log;
VTabLog::create(tab=1, args=[[118, 116, 97, 98, 108, 111, 103], [116, 101, 109, 112], [108, 111, 103], [115, 99, 104, 101, 109, 97, 61, 39, 67, 82, 69, 65, 84, 69, 32, 84, 65, 66, 76, 69, 32, 120, 40, 97, 44, 98, 44, 99, 41, 39], [114, 111, 119, 115, 61, 53]]):
VTabLog::best_index(1)
VTabLog::open(tab=1, cursor=1)
VTabLogCursor::filter(tab=1, cursor=1)
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=0): a0
VTabLogCursor::column(tab=1, cursor=1, i=1): b0
VTabLogCursor::column(tab=1, cursor=1, i=2): c0
VTabLogCursor::next(tab=1, cursor=1): rowid 0 -> 1
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=0): a1
VTabLogCursor::column(tab=1, cursor=1, i=1): b1
VTabLogCursor::column(tab=1, cursor=1, i=2): c1
VTabLogCursor::next(tab=1, cursor=1): rowid 1 -> 2
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=0): a2
VTabLogCursor::column(tab=1, cursor=1, i=1): b2
VTabLogCursor::column(tab=1, cursor=1, i=2): c2
VTabLogCursor::next(tab=1, cursor=1): rowid 2 -> 3
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=0): a3
VTabLogCursor::column(tab=1, cursor=1, i=1): b3
VTabLogCursor::column(tab=1, cursor=1, i=2): c3
VTabLogCursor::next(tab=1, cursor=1): rowid 3 -> 4
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=0): a4
VTabLogCursor::column(tab=1, cursor=1, i=1): b4
VTabLogCursor::column(tab=1, cursor=1, i=2): c4
VTabLogCursor::next(tab=1, cursor=1): rowid 4 -> 5
VTabLogCursor::eof(tab=1, cursor=1): true
VTabLogCursor::drop(tab=1, cursor=1)
a b c
-- -- --
a0 b0 c0
a1 b1 c1
a2 b2 c2
a3 b3 c3
a4 b4 c4
VTabLog::drop(1)
With the extension verified, the next step is to test the Django side of things. I am basing this off the sqlite driver, but I need to add a bit of additional code, so my first step is to update the constructor.
from django.db.backends.sqlite3.base import DatabaseWrapper as SqliteWrapper
from django.utils.asyncio import async_unsafe
from sqlite3 import Connection
class DatabaseWrapper(SqliteWrapper):
vendor = "customdb"
display_name = "Custom Database"
@async_unsafe
def get_new_connection(self, conn_params):
conn: Connection = super().get_new_connection(conn_params)
conn.enable_load_extension(True)
conn.load_extension("target/debug/libsqlite_git_table.dylib")
return conn
We could put additional code in our get_new_connection
method, but I decided to use the connection_created
signal to place code for our tables.
My first version is very naive and lacks any kind of check, but I add code here to do the table creation.
from django.db.backends.signals import connection_created
from django.dispatch import receiver
from django.apps import apps
@receiver(connection_created)
def define_tables(connection, **kwargs):
for app_model in apps.get_models():
names = []
for field in app_model._meta.fields:
names.append(field.name)
schema = ",".join(names)
connection.cursor().execute(f"""
CREATE VIRTUAL TABLE IF NOT EXISTS temp.{app_model._meta.db_table}
USING vtablog(
schema='CREATE TABLE x({schema})',
rows=5
);""")
In the future I would do more checks, but this creates the table for each of our models. We then create modules as expected.
from django.db import models
class TestModel(models.Model):
a = models.CharField(primary_key=True)
b = models.CharField()
c = models.CharField()
class Meta:
db_table = "log"
managed = False
class FooModel(models.Model):
d = models.CharField(primary_key=True)
e = models.CharField()
f = models.CharField()
class Meta:
db_table = "bar"
managed = False
With all this in place, we can now redo our test from above, but with django models.
uv run dbtest dumpdata
VTabLog::create(tab=1, args=[[118, 116, 97, 98, 108, 111, 103], [116, 101, 109, 112], [108, 111, 103], [115, 99, 104, 101, 109, 97, 61, 39, 67, 82, 69, 65, 84, 69, 32, 84, 65, 66, 76, 69, 32, 120, 40, 97, 44, 98, 44, 99, 41, 39], [114, 111, 119, 115, 61, 53]]):
VTabLog::create(tab=2, args=[[118, 116, 97, 98, 108, 111, 103], [116, 101, 109, 112], [98, 97, 114], [115, 99, 104, 101, 109, 97, 61, 39, 67, 82, 69, 65, 84, 69, 32, 84, 65, 66, 76, 69, 32, 120, 40, 100, 44, 101, 44, 102, 41, 39], [114, 111, 119, 115, 61, 53]]):
VTabLog::best_index(1)
VTabLog::open(tab=1, cursor=1)
VTabLogCursor::filter(tab=1, cursor=1)
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=1): b0
VTabLogCursor::column(tab=1, cursor=1, i=2): c0
VTabLogCursor::column(tab=1, cursor=1, i=0): a0
VTabLogCursor::next(tab=1, cursor=1): rowid 0 -> 1
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=1): b1
VTabLogCursor::column(tab=1, cursor=1, i=2): c1
VTabLogCursor::column(tab=1, cursor=1, i=0): a1
VTabLogCursor::next(tab=1, cursor=1): rowid 1 -> 2
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=1): b2
VTabLogCursor::column(tab=1, cursor=1, i=2): c2
VTabLogCursor::column(tab=1, cursor=1, i=0): a2
VTabLogCursor::next(tab=1, cursor=1): rowid 2 -> 3
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=1): b3
VTabLogCursor::column(tab=1, cursor=1, i=2): c3
VTabLogCursor::column(tab=1, cursor=1, i=0): a3
VTabLogCursor::next(tab=1, cursor=1): rowid 3 -> 4
VTabLogCursor::eof(tab=1, cursor=1): false
VTabLogCursor::column(tab=1, cursor=1, i=1): b4
VTabLogCursor::column(tab=1, cursor=1, i=2): c4
VTabLogCursor::column(tab=1, cursor=1, i=0): a4
VTabLogCursor::next(tab=1, cursor=1): rowid 4 -> 5
VTabLogCursor::eof(tab=1, cursor=1): true
VTabLogCursor::drop(tab=1, cursor=1)
VTabLog::best_index(2)
VTabLog::open(tab=2, cursor=1)
VTabLogCursor::filter(tab=2, cursor=1)
VTabLogCursor::eof(tab=2, cursor=1): false
VTabLogCursor::column(tab=2, cursor=1, i=1): b0
VTabLogCursor::column(tab=2, cursor=1, i=2): c0
VTabLogCursor::column(tab=2, cursor=1, i=0): a0
VTabLogCursor::next(tab=2, cursor=1): rowid 0 -> 1
VTabLogCursor::eof(tab=2, cursor=1): false
VTabLogCursor::column(tab=2, cursor=1, i=1): b1
VTabLogCursor::column(tab=2, cursor=1, i=2): c1
VTabLogCursor::column(tab=2, cursor=1, i=0): a1
VTabLogCursor::next(tab=2, cursor=1): rowid 1 -> 2
VTabLogCursor::eof(tab=2, cursor=1): false
VTabLogCursor::column(tab=2, cursor=1, i=1): b2
VTabLogCursor::column(tab=2, cursor=1, i=2): c2
VTabLogCursor::column(tab=2, cursor=1, i=0): a2
VTabLogCursor::next(tab=2, cursor=1): rowid 2 -> 3
VTabLogCursor::eof(tab=2, cursor=1): false
VTabLogCursor::column(tab=2, cursor=1, i=1): b3
VTabLogCursor::column(tab=2, cursor=1, i=2): c3
VTabLogCursor::column(tab=2, cursor=1, i=0): a3
VTabLogCursor::next(tab=2, cursor=1): rowid 3 -> 4
VTabLogCursor::eof(tab=2, cursor=1): false
VTabLogCursor::column(tab=2, cursor=1, i=1): b4
VTabLogCursor::column(tab=2, cursor=1, i=2): c4
VTabLogCursor::column(tab=2, cursor=1, i=0): a4
VTabLogCursor::next(tab=2, cursor=1): rowid 4 -> 5
VTabLogCursor::eof(tab=2, cursor=1): true
VTabLogCursor::drop(tab=2, cursor=1)
VTabLog::drop(2)
VTabLog::drop(1)
[
{"model": "dbtest.testmodel", "pk": "a0", "fields": {"b": "b0", "c": "c0"}},
{"model": "dbtest.testmodel", "pk": "a1", "fields": {"b": "b1", "c": "c1"}},
{"model": "dbtest.testmodel", "pk": "a2", "fields": {"b": "b2", "c": "c2"}},
{"model": "dbtest.testmodel", "pk": "a3", "fields": {"b": "b3", "c": "c3"}},
{"model": "dbtest.testmodel", "pk": "a4", "fields": {"b": "b4", "c": "c4"}},
{"model": "dbtest.foomodel", "pk": "a0", "fields": {"e": "b0", "f": "c0"}},
{"model": "dbtest.foomodel", "pk": "a1", "fields": {"e": "b1", "f": "c1"}},
{"model": "dbtest.foomodel", "pk": "a2", "fields": {"e": "b2", "f": "c2"}},
{"model": "dbtest.foomodel", "pk": "a3", "fields": {"e": "b3", "f": "c3"}},
{"model": "dbtest.foomodel", "pk": "a4", "fields": {"e": "b4", "f": "c4"}}
]%
Most of this code is temporary, and all the values are random, but at least the test shows so far, that we can create a custom virtual table implementation in rust, that django can call without a lot of additional complexity.
Now that I know this works, I can start planning out my actual goal. I would like to be able to query a directory of markdown files, as if they were a database table. This would allow me to do things like query an Obsidian vault or hugo site as if they were database tables, and maybe help me migrate a few more projects forward.