I have been continuing my earlier virtual table prototype and it is slowly coming along. I uploaded a version to codeberg while I was debugging python packaging. I do not have it quite working right, but I think I will be using setuptools-rust for now (pr ).
This post is partly to help document what I have discovered over time so will not be complete code examples in every case. In the future I will be uploading the project to Codeberg and later to PyPi.
Most of my rust code is likely not great, so would appreciate a mentor!
Django Updates
While digging into some of the Django classes a bit more, I found some methods under the schema editor I could use. This results in a more full table definition generated by Django itself that we can use.
@receiver(connection_created)
def define_tables(connection: DatabaseWrapper, **kwargs):
with connection.schema_editor() as schema_editor:
for app_model in apps.get_models():
# We only want to process models that require our specific vendor
# otherwise we assume it's a regular sqlite model
if app_model._meta.required_db_vendor == connection.vendor:
# We can use django to generate a sqlite ready create table statement
sql, _params = schema_editor.table_sql(app_model)
# We use our base directory setting plus the models label to
# as the search path for our markdown files
model_path = settings.CONTENT_DIR / app_model._meta.label
create_table = f"""
CREATE VIRTUAL TABLE IF NOT EXISTS temp.{app_model._meta.db_table}
USING {connection.vendor}(schema='{sql}', path='{model_path}');
""".strip()
schema_editor.execute(create_table)
class MyModel(models.Model):
...
class Meta:
required_db_vendor = "markdowndb"
managed = False
In general, there is not much that Django needs to know about our specialized database.
Implementing SELECT in Rust
xConnect
One of the first things we need to implement, is the xConnect
method.
This is where we parse our schema
and path
arguments
Internally, most things use numeric IDs, but our driver will be using column names.
We can use the sqlite3-parser
crate to parse out the schema and give us a mapping that.
Currently I’m using a Vec<String>
but I may do a different structure later that maintains more type information.
#[derive(Debug, Clone)]
pub struct Schema(Vec<String>);
impl Schema {
/// Given a schema declration, return a mapping of columns
pub fn parse(schema: &str) -> Result<Self> {
...
}
xOpen, xFilter, xNext
xOpen
is called for many of our queries.
This is where we create an instance of our cursor object to track where we are in a SELECT
statement.
xFilter
is part of the start of our SELECT
operation.
This is where we would figure out what our cursor needs to point to and do filtering.
In our case, we create a walker object that will search our path
for all *.markdown
files that we can find.
xNext will then be called each time we’re ready to read the next row.
Our basic operation looks like this now.
xOpen
to create an instance of our CursorxFilter
to get our list of files and open itxNext
each time we advance and open the next file
xColumn and reading data
Instead of returning a row at a time, the virtual table driver returns a column at a time.
Our xFilter
and xNext
are used to point us at the current markdown file.
Within the xColumn
method, we need to interpret the request for a numeric column, to a name field that we understand.
fn column(&self, ctx: &mut Context, i: c_int) -> Result<()> {
// Given our schema, convert the numeric column index to a
// key that we can look up
let key = self.vtab().schema.index(i as usize);
// Then use this key to look up the specific field in our open
// file handle
let value = self.handle()?.get(key);
// Then lastly we set this value in a sqlite context that will be
// Returned to the client
ctx.set_result(value);
}
Looking up a value in frontmatter
I am currently using the grey_matter crate to parse a files markdown frontmatter and return it in a format that we can use.
Most of the time, any of our key lookups will be for the markdown frontmatter itself, but there are also a few internal keys that I am implementing.
impl Post {
pub fn new(path: PathBuf) -> Self {
let contents = fs::read_to_string(&path).expect("Read File");
let mut matter = Matter::<YAML>::new();
matter.excerpt_delimiter = Some("<!--more-->".to_string());
Self {
path,
parsed: matter.parse(&contents),
}
}
pub fn get(&self, key: &String) -> Option<Pod> {
// First we check to see if we have a valid value in our parsed ata
match &self.parsed.data {
Some(Pod::Hash(hash)) => hash.get(key).cloned(),
_ => None,
}
// Otherwise, we look for a builtin value
// I am using inline functions here for readability
.or(match key.as_str() {
// Full Path to the markdown file
"path" => self.pod_path(),
// Relative directory path to our base directory
"dir" => self.pod_dir(),
// date of the file in case there is no date in frontmatter
"date" => self.pod_date(),
// This is the post body of document
"content" => self.pod_content(),
// A useful slug which is the filename without extension
"slug" => self.pod_slug(),
// This is anything before a <!--more--> tag
"excerpt" => self.pod_excerpt(),
// This returns the full frontmatter data
"metadata" => self.parsed.data.clone(),
// Returns the inode of a file as a potentially useful primary key
"inode" => self.pod_inode(),
_ => None,
})
}
}
The grey_matter crate uses a custom Pod
type to represent a dynamic structure (similar to a Value
type in a json crate), so in our xColumn
we need to convert this to a type that sqlite can handle.
match value {
// Send this back as a json list, but we might do a separated
// list or something else later
Some(Pod::Array(array)) => {
let json_str: Value = Pod::Array(array).deserialize().unwrap();
ctx.set_result::<String>(&json_str.to_string())
}
// Hash is easiest to send back as json
Some(Pod::Hash(hash)) => {
let json_str: Value = Pod::Hash(hash).deserialize().unwrap();
ctx.set_result::<String>(&json_str.to_string())
}
// Anything else we can typically use regular types
Some(Pod::Boolean(bool)) => ctx.set_result(&bool),
Some(Pod::Float(float)) => ctx.set_result(&float),
Some(Pod::Integer(integer)) => ctx.set_result(&integer),
Some(Pod::Null) => Ok(()),
Some(Pod::String(string)) => ctx.set_result(&string),
None => Ok(()),
}
Back in Django
Now that I have most of the initial bits implemented in my driver, I can now make a model like this in Django.
from django.db import models
class FrontmatterModel(models.Model):
inode = models.PositiveIntegerField(primary_key=True, editable=False)
path = models.FilePathField(unique=True, editable=False)
slug = models.SlugField()
# Full text body. Might use a MarkdownField in the future
content = models.TextField()
# Anything before the <!--more-->
excerpt = models.TextField(blank=True)
metadata = models.JSONField()
# I could also rename this if I wanted
# frontmatter = models.JSONField(db_column='metadata')
# as implemented in my Post::get in rust, any lookups will
# check the frontmatter first, and will fallback to other
# internal values if they exist.
class Meta:
managed = False
required_db_vendor = "markdowndb"
Future: Implementing DELETE, INSERT, DELETE
I am planning on adding in the rest of these later, and while I have made some initial progress, I will likely document those in a future post.
Similar to my demo project, I will likely upload my work-in-progress version to Codeberg in the near future, though it will likely be more time until I’m ready to upload something to PyPi.