Khumbu Icefall

Overview

What We're Building

As tradition goes, we'll build an API for a To-do app, but we'll go much further than the concept of tasks. We'll have users, clients, projects, tasks and comments, all the core models to build a complete API for a project management tool. We'll also have a codename for this secret project, "Khumbu Icefall"; you get extra points if you can figure out why I chose the name.

Unlike most tutorials that just explain a few basic concepts with a toy example, this one will take you through all the steps of building a real-world, complex API that is ready to go into production. You've probably heard (marketing) phrases like "build a production ready API in 5 minutes", tried it and found out it's ... marketing: a production-ready toy, not a real system. This tutorial is not going to take 5 minutes, but it's also not going to take days to complete (or even months, because usually that is the timeframe you are looking at when building a comparable API in other stacks). We are looking at 1-2 hours of your time. If you are in a hurry, or would like to see why you should spend two hours on this, check out the git repo.

When presented with systems like subZero, the conversation sometimes goes like "Yes, I guess it's an excellent tool for prototyping, but my API is not a 1:1 map of my data models and it has a lot of business logic, so I cannot use it in production". In this tutorial, we'll show how a subZero-based API is not coupled to your tables and how you can implement your custom business logic and even integrate/interface with 3rd party systems (and not just by calling webhooks). For most of the tutorial we'll be using the REST interface to interact with the system because it's easier to copy/paste curl lines, but while you are growing your REST API, GraphQL is right next to it.

Installation

Follow the installation instructions and familiarize yourself a bit with the project structure. Shut down the stack in case you already started it.

docker-compose down

Edit the .env file and give the project a distinct name

COMPOSE_PROJECT_NAME=khumbuicefall

Bring up the system to check everything works

docker-compose up -d
curl http://localhost:8080/rest/todos?select=id

The result should look something like this

[{"id":"dG9kbzox"},
 {"id":"dG9kbzoz"},
 {"id":"dG9kbzo2"}]
Bring up the subzero dashboard

subzero dashboard

Now that we have the API and the subzero dashboard up and running, let's see what we can do.

Run this request again and see it reflected on the PostgreSQL logs tab in the subzero dashboard

curl http://localhost:8080/rest/todos?select=id

Now let's make an authorized request

export JWT_TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyX2lkIjoxLCJyb2xlIjoid2VidXNlciJ9.uSsS2cukBlM6QXe4Y0H90fsdkJSGcle9b7p_kMV1Ymk
curl -H "Authorization: Bearer $JWT_TOKEN" http://localhost:8080/rest/todos?select=id,todo
[{"id":"dG9kbzox","todo":"item_1"},
 {"id":"dG9kbzoy","todo":"item_2"},
 {"id":"dG9kbzoz","todo":"item_3"},
 {"id":"dG9kbzo2","todo":"item_6"}]

Notice the first item in the list, {"id":"dG9kbzox","todo":"item_1"}. Open the file db/src/sample_data/data.sql, change item_1 to updated and save the file.

Run the last request again

curl -H "Authorization: Bearer $JWT_TOKEN" http://localhost:8080/rest/todos?select=id,todo
[{"id":"dG9kbzox","todo":"updated"},
 {"id":"dG9kbzoy","todo":"item_2"},
 {"id":"dG9kbzoz","todo":"item_3"},
 {"id":"dG9kbzo2","todo":"item_6"}]

This process works for any *.sql, *.lua or *.conf file: you just make the change, save the file and run your next request, assuming your changes were valid (when working with SQL, check the PostgreSQL logs tab).

API Core

Data Schema

We'll start by creating all the tables that will hold the data and place them in a separate schema called data. There is no particular requirement to have the tables in the data schema, nor do they all have to be in a single schema; this is just a convention we think works out well, and it helps illustrate how your models are decoupled from the API you expose.

Usually, you would create one file per table, but to make it easier to copy/paste the code, we'll add it all in a single file db/src/data/tables.sql

create table client (
  id           serial primary key,
  name         text not null,
  address      text,
  user_id      int not null references "user"(id),
  created_on   timestamptz not null default now(),
  updated_on   timestamptz
);
create index client_user_id_index on client(user_id);

create table project (
  id           serial primary key,
  name         text not null,
  client_id    int not null references client(id),
  user_id      int not null references "user"(id),
  created_on   timestamptz not null default now(),
  updated_on   timestamptz
);
create index project_user_id_index on project(user_id);
create index project_client_id_index on project(client_id);

create table task (
  id           serial primary key,
  name         text not null,
  completed    bool not null default false,
  project_id   int not null references project(id),
  user_id      int not null references "user"(id),
  created_on   timestamptz not null default now(),
  updated_on   timestamptz
);
create index task_user_id_index on task(user_id);
create index task_project_id_index on task(project_id);

create table project_comment (
  id           serial primary key,
  body         text not null,
  project_id   int not null references project(id),
  user_id      int not null references "user"(id),
  created_on   timestamptz not null default now(),
  updated_on   timestamptz
);
create index project_comment_user_id_index on project_comment(user_id);
create index project_comment_project_id_index on project_comment(project_id);

create table task_comment (
  id           serial primary key,
  body         text not null,
  task_id      int not null references task(id),
  user_id      int not null references "user"(id),
  created_on   timestamptz not null default now(),
  updated_on   timestamptz
);
create index task_comment_user_id_index on task_comment(user_id);
create index task_comment_task_id_index on task_comment(task_id);

We added a user_id column to each table because it will help us later enforce access rights for each row with a simple rule (instead of complicated joins). Having two separate tables for comments (task_comment, project_comment) is not the best schema design and is a bit forced, but it will be useful in this tutorial to showcase how the API can be decoupled from the underlying tables.

Change the last lines in db/src/data/schema.sql to look like this (the ... are meant to signify there is something above)

-- ...
-- import our application models
\ir user.sql
\ir todo.sql
\ir todo_relay_id.sql
\ir tables.sql -- add this line
Note: the todo.sql definition came with the starter kit; we'll remove it later

Api Schema

This schema (api) will be the schema that we will be exposing for REST and GraphQL. It will contain only views and stored procedures.

Once again, we'll place all definitions in a single file called db/src/api/views_and_procedures.sql

create or replace view clients as
select id, name, address, created_on, updated_on from data.client;

create or replace view projects as
select id, name, client_id, created_on, updated_on from data.project;

create or replace view tasks as
select id, name, completed, project_id, created_on, updated_on from data.task;

create or replace view project_comments as
select id, body, project_id, created_on, updated_on from data.project_comment;

create or replace view task_comments as
select id, body, task_id, created_on, updated_on from data.task_comment;

create or replace view comments as
select 
  id, body, 'project'::text as parent_type, project_id as parent_id, 
  project_id, null as task_id, created_on, updated_on
from data.project_comment
union
select id, body, 'task'::text as parent_type, task_id as parent_id,
  null as project_id, task_id, created_on, updated_on
from data.task_comment;

Edit the last lines of db/src/api/schema.sql to look like this.

-- ...
-- our endpoints
create type customer as (id int, name text, email text, role text);
\ir login.sql
\ir refresh_token.sql
\ir signup.sql
\ir me.sql
\ir todos.sql
\ir search_todos.sql
\ir views_and_procedures.sql -- add this line

Notice how we were able to expose only the fields and tables we wanted. We could also have renamed columns: maybe your front end devs turn their faces in disgust when they see something like first_name; no worries, you can make them happy by renaming that column in the view to firstName and everything will keep working just fine. We also inherited a data schema with two separate tables holding the comments, but in the API we were able to expose them as a single entity.

This is how you can decouple your implementation details (the underlying source tables) from the API you expose, using views or, in more complex cases, stored procedures.
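For example, if the front end wanted camelCase names, a variant of the clients view could simply alias the columns. This is a hypothetical illustration, not something we'll use in the rest of the tutorial:

create or replace view clients_camel_case as
select
  id,
  name,
  address,
  created_on as "createdOn",
  updated_on as "updatedOn"
from data.client;

The underlying data.client table is untouched; only the shape exposed by the API changes.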

Sample Data

It's no fun testing with empty tables, let's add some sample data.

Info

If you have a lot of models/tables it can get tedious to generate a meaningful dataset to test your API with, especially when you want to test performance with a couple of million rows in the tables. You can try the datafiller utility to generate some data.

Add these statements at the end of db/src/sample_data/data.sql

-- ...
set search_path = data, public;
INSERT INTO client (id,name,address,user_id) VALUES
(1,'Apple','1 Infinite Loop Cupertino, CA 95014',1),
(2,'Microsoft','One Microsoft Way Redmond, WA 98052-6399',1),
(3,'Amazon','PO Box 81226 Seattle, WA 98108-1226',2)
;

INSERT INTO project (id,name,client_id,user_id) VALUES
(1,'MacOS',1,1),
(2,'Windows',2,1),
(3,'IOS',1,1),
(4,'Office',2,1)
;

INSERT INTO task (id,name,completed,project_id,user_id) VALUES
(1,'Design a nice UI',TRUE,1,1),
(2,'Write some OS code',FALSE,1,1),
(3,'Start aggressive marketing',TRUE,2,1),
(4,'Get everybody to love it',TRUE,3,1),
(5,'Move everything to cloud',TRUE,4,1)
;

INSERT INTO project_comment (id,body,project_id,user_id) VALUES
(1,'This is going to be awesome',1,1),
(2,'We still have the marketshare, we should keep it that way',2,1)
;

INSERT INTO task_comment (id,body,task_id,user_id) VALUES
(1,'Aren''t we awesome?',1,1),
(2,'People are going to love the free automated install when they see it in the morning',3,1)
;

While we are at it, let's also change the db/src/sample_data/reset.sql file that is used when running tests

BEGIN;
\set QUIET on
\set ON_ERROR_STOP on
set client_min_messages to warning;
set search_path = data, public;
truncate data.todo restart identity cascade;
truncate data.user restart identity cascade;
truncate data.client restart identity cascade;
truncate data.project restart identity cascade;
truncate data.task restart identity cascade;
truncate data.project_comment restart identity cascade;
truncate data.task_comment restart identity cascade;
\ir data.sql
COMMIT;

Since we have some data in the tables, let's see how far we've gotten.

curl -H "Authorization: Bearer $JWT_TOKEN" http://localhost:8080/rest/clients?select=id,name
{"hint":null,"details":null,"code":"42501","message":"permission denied for relation clients"}

Believe it or not, this is a good thing :). We added a bunch of tables and views, but that does not mean API users can just access them; we need to explicitly state who can access which resource.

Securing your API

Access rights to API entities

Let's start by giving the authenticated API users the ability to access clients/projects/tasks endpoints. This means we need to grant them rights to the views in the api schema.

Add the following statements at the end of db/src/authorization/privileges.sql file

-- ...
grant select, insert, update, delete 
on api.clients, api.projects, api.tasks, api.project_comments, api.task_comments, api.comments
to webuser;

Let's try that request again

curl -H "Authorization: Bearer $JWT_TOKEN" http://localhost:8080/rest/clients?select=id,name
[{"id":1,"name":"Apple"},
 {"id":2,"name":"Microsoft"},
 {"id":3,"name":"Amazon"}]
Look at that :)!

So we have our API, nice and decoupled, but as things stand now, anyone who can log in is able to modify the existing data and see all of it (Amazon is not a client of our currently logged-in user, alice). I would not want to use a system where everyone can see all my clients and projects. We will fix that, but before we get to it, an interlude.

The magic "webuser" role and the mystery JWT token

You are probably wondering: "What's up with that? Where does that role come from, and why does this hard-coded $JWT_TOKEN just work on my system? It does not seem very secure to me."

Well, there is nothing magic about it: the webuser role is created in roles.sql. For this tutorial, the default suits us just fine; we only need to differentiate between logged-in users (webuser) and the rest (anonymous). Although you can use specific roles to mean specific users (alice, bob), you'll usually use database roles to mean groups of users (admin, employee, customer). Notice how in the user table we have a user_type column whose value is webuser; this is the value that will also be reflected in the JWT payload under the role key.

Now about the mystery JWT token. I was only able to generate a valid token (that does not expire) because I knew the secret with which PostgREST is started (which is reallyreallyreallyreallyverysafe and is defined in .env). Head over to the jwt.io site and paste in the token

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyX2lkIjoxLCJyb2xlIjoid2VidXNlciJ9.uSsS2cukBlM6QXe4Y0H90fsdkJSGcle9b7p_kMV1Ymk
You will see its payload decoded

{
  "user_id": 1,
  "role": "webuser"
}

We use the information in the JWT payload to see who is making the request. It's important to understand that anyone can look at the JWT payload, it's just a base64 encoded string (so don't put sensitive data inside it), but they cannot alter it without knowing the secret.
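For example, the payload is just the middle, dot-separated segment of the token; you can inspect it with nothing more than base64 (on some platforms the decode flag is -D instead of -d):

echo 'eyJ1c2VyX2lkIjoxLCJyb2xlIjoid2VidXNlciJ9' | base64 -d
{"user_id":1,"role":"webuser"}

Changing anything in that payload invalidates the signature (the third segment), so the server will reject a tampered token.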

Row Level Security

We left things in the state of every logged-in user having access to all the data rows in the main tables (through views).

So let's fix that. For this, we'll use a feature introduced in PostgreSQL 9.5 called "Row Level Security", RLS for short.

Before we start defining policies for each of the tables, since users access the data through views in the api schema, we need to explicitly specify the owner of each view. This is needed because when someone accesses a table with policies through a view, current_user switches from the (database) logged-in role to the role of the view owner.

Add these lines at the end of db/src/api/views_and_procedures.sql

-- ...
alter view clients owner to api;
alter view projects owner to api;
alter view tasks owner to api;
alter view project_comments owner to api;
alter view task_comments owner to api;
alter view comments owner to api;

Just as with the webuser, api is a simple role defined here

Now we get to the interesting part: defining the policies for each table. We'll use the information contained in the JWT token to restrict the rows a user can access.

Let's open the db/src/authorization/privileges.sql file and add the following policies

-- ...
set search_path = data, public;

alter table client enable row level security;
grant select, insert, update, delete on client to api;
create policy access_own_rows on client to api
using ( request.user_role() = 'webuser' and request.user_id() = user_id );


alter table project enable row level security;
grant select, insert, update, delete on project to api;
create policy access_own_rows on project to api
using ( request.user_role() = 'webuser' and request.user_id() = user_id );


alter table task enable row level security;
grant select, insert, update, delete on task to api;
create policy access_own_rows on task to api
using ( request.user_role() = 'webuser' and request.user_id() = user_id );


alter table project_comment enable row level security;
grant select, insert, update, delete on project_comment to api;
create policy access_own_rows on project_comment to api
using ( request.user_role() = 'webuser' and request.user_id() = user_id );

alter table task_comment enable row level security;
grant select, insert, update, delete on task_comment to api;
create policy access_own_rows on task_comment to api
using ( request.user_role() = 'webuser' and request.user_id() = user_id );
Now that we have the policies set, let's try the last request again

curl -H "Authorization: Bearer $JWT_TOKEN" http://localhost:8080/rest/clients?select=id,name
And the result is

[{"id":1,"name":"Apple"},
 {"id":2,"name":"Microsoft"}]
Much better!

Now, for this to work when new data is inserted, we need to give a default value for the user_id column in each table.

Make the user_id column definition in every table in the data schema look like this

user_id      int not null references "user"(id) default request.user_id(),
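For an already-deployed database, where you would apply this as a migration instead of editing tables.sql, a sketch of the equivalent change looks like this:

alter table data.client          alter column user_id set default request.user_id();
alter table data.project         alter column user_id set default request.user_id();
alter table data.task            alter column user_id set default request.user_id();
alter table data.project_comment alter column user_id set default request.user_id();
alter table data.task_comment    alter column user_id set default request.user_id();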

Authentication

Ok, so now everything is set up, everyone can get to his data and change it, but how do they get their foot in the door, how do they sign up, how do they log in? This is where stored procedures come into play, and by that I mean any stored procedure defined in the api schema. Whenever you have to implement something that cannot be expressed using a single query, i.e. a sequential process, you would use a stored procedure like this. There is always a temptation to design your whole API as a set of stored procedures; it seems you have greater/more targeted control, but you will be painting yourself into a corner and you will have to write a new function for each new feature. Try sticking to views whenever possible and make them as general as possible, then let the clients select from them with additional filters. If you can, write those functions in PL/pgSQL (but you can also use other languages); it does not have a shiny syntax and it can be weird in places, but because of its close relation to the database and SQL, it can express some problems very neatly and concisely. Even if you've never looked at this language before, you'll get its meaning easily.
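To make the idea concrete, here is a small, hypothetical example (not part of the starter kit) of the kind of multi-step operation that fits naturally in a function placed in the api schema. It goes through the api views, so the grants and row level security rules we set up earlier still apply:

create or replace function api.complete_project(prj_id int) returns void as $$
begin
    -- mark every task of the project as done ...
    update api.tasks set completed = true where project_id = prj_id;
    -- ... and touch the project itself, all in one round trip
    update api.projects set updated_on = now() where id = prj_id;
end
$$ language plpgsql;
-- depending on your setup you may also need:
-- grant execute on function api.complete_project(int) to webuser;

PostgREST would expose such a function at /rest/rpc/complete_project, callable with a POST just like the login function below.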

Check out the definition of the login function. Our api schema is already including it, so let's try and call it.

curl -i \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-d '{"email":"alice@email.com","password":"pass"}' \
http://localhost:8080/rest/rpc/login
HTTP/1.1 200 OK
Server: openresty
Date: Tue, 10 Sep 2019 08:21:10 GMT
Content-Type: application/vnd.pgrst.object+json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Access-Control-Allow-Methods: GET, POST, PATCH, DELETE, OPTIONS
Access-Control-Allow-Credentials: true
Content-Range: 0-0/*
Set-Cookie: SESSIONID=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJyb2xlIiA6ICJ3ZWJ1c2VyIiwgInVzZXJfaWQiIDogMSwgImV4cCIgOiAxNTY4MTA3MjcxfQ.cCpDjmMfPkGrajdRTofAj67zRDRgxf4RmtdmCbGoS7g; Expires=Tue, 10 Sep 2019 09:21:10 GMT; Max-Age=3600; Path=/; HttpOnly
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Content-Location: /rest
Cache-Engine: "nginx"

{"id":1,"name":"alice","email":"alice@email.com","role":"webuser"}

Input Validation

Back in the old days, the DB was the gatekeeper: it controlled who had access to what and, most importantly, it did not let any bad data get in (assuming the DBA knew what he was doing). It did not matter how or from what language you accessed the database, you could not write asdf in a field that was supposed to hold a phone number. Well, we are going to bring that back again, we are going to make databases great again :), and we are going to do that by using constraints. No more validators scattered all over the place that check the user input (and that you always forget to implement in your shiny new client that accesses the database). By placing a constraint on a field, no one can bypass it, even if he really wants to.

You can read more about constraints here, but for our case I'll just paste the additional lines you need to add to the table definitions; notice how nice the syntax is. You can use any valid SQL expression that returns a bool in a constraint definition, even custom functions you define yourself, or match against a regular expression, for example:

create table some_table (
  ...
  check (email ~* '^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+[.][A-Za-z]+$')
);

Here are a few examples of constraints one might choose to set up. Add these changes to db/src/data/tables.sql

create table client (
  ...
  check (length(name)>2 and length(name)<100),
  check (updated_on is null or updated_on > created_on)
);

create table project (
  ...
  check (length(name)>2),
  check (updated_on is null or updated_on > created_on)
);

create table task (
  ...
  check (length(name)>2),
  check (updated_on is null or updated_on > created_on)
);

create table task_comment (
  ...
  check (length(body)>2),
  check (updated_on is null or updated_on > created_on)
);

create table project_comment (
  ...
  check (length(body)>2),
  check (updated_on is null or updated_on > created_on)
);

Let's check if the rules are applied

curl \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-H 'Prefer: return=representation' \
-d '{"name":"A"}'  \
http://localhost:8080/rest/clients
{"hint":null,"details":null,"code":"42501","message":"permission denied for sequence client_id_seq"}

Whoops :), again a good thing :); the database is very specific and strict about access rights.

Let's fix that in db/src/authorization/privileges.sql by adding these lines.

-- ...
grant usage on sequence data.client_id_seq to webuser;
grant usage on sequence data.project_id_seq to webuser;
grant usage on sequence data.task_id_seq to webuser;
grant usage on sequence data.task_comment_id_seq to webuser;
grant usage on sequence data.project_comment_id_seq to webuser;

Now the request again

curl \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-H 'Prefer: return=representation' \
-d '{"name":"A"}'  \
http://localhost:8080/rest/clients
{"hint":null,"details":null,"code":"23514","message":"new row for relation \"client\" violates check constraint \"client_name_check\""}

curl \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-H 'Prefer: return=representation' \
-d '{"name":"Uber"}'  \
http://localhost:8080/rest/clients?select=id,created_on
{"id":4,"created_on":"2017-07-18T12:13:47.202074+00:00"}

Mutations on complex views

A lot of the time your views will be mostly a "mirror" of your underlying tables; maybe you'll rename or exclude a few columns or add some computed ones. In such cases they are automatically "updatable views", meaning PostgreSQL knows how to handle an insert/update/delete on the view. However, there are cases when you create views that are a bit more complex, like we did with comments. In those cases you have to help the database out and tell it how to handle mutations on the view.

First, we'll create our own function that knows how to route the data to the appropriate table. Since this function has to go in some schema, let's add it to the util schema. We don't want to put it in the api schema since it would look like an endpoint that clients could call, and we also don't want it in the data schema because it's a good idea to keep only table definitions there.

Add the function to db/src/libs/util/schema.sql

-- ...
create or replace function util.mutation_comments_trigger() returns trigger as $$
declare
    c record;
    parent_type text;
begin
    if (tg_op = 'DELETE') then
        if old.parent_type = 'task' then
            delete from data.task_comment where id = old.id;
            if not found then return null; end if;
        elsif old.parent_type = 'project' then
            delete from data.project_comment where id = old.id;
            if not found then return null; end if;
        end if;
        return old;
    elsif (tg_op = 'UPDATE') then
        if (new.parent_type = 'task' or old.parent_type = 'task') then
            update data.task_comment 
            set 
                body = coalesce(new.body, old.body),
                task_id = coalesce(new.task_id, old.task_id)
            where id = old.id
            returning * into c;
            if not found then return null; end if;
            return (c.id, c.body, 'task'::text, c.task_id, null::int, c.task_id, c.created_on, c.updated_on);
        elsif (new.parent_type = 'project' or old.parent_type = 'project') then
            update data.project_comment 
            set 
                body = coalesce(new.body, old.body),
                project_id = coalesce(new.project_id, old.project_id)
            where id = old.id
            returning * into c;
            if not found then return null; end if;
            return (c.id, c.body, 'project'::text, c.project_id, c.project_id, null::int, c.created_on, c.updated_on);
        end if;
    elsif (tg_op = 'INSERT') then
        if new.parent_type = 'task' then
            insert into data.task_comment (body, task_id)
            values(new.body, new.task_id)
            returning * into c;
            return (c.id, c.body, 'task'::text, c.task_id, null::int, c.task_id, c.created_on, c.updated_on);
        elsif new.parent_type = 'project' then
            insert into data.project_comment (body, project_id)
            values(new.body, new.project_id)
            returning * into c;
            return (c.id, c.body, 'project'::text, c.project_id, c.project_id, null::int, c.created_on, c.updated_on);
        end if;

    end if;
    return null;
end;
$$ security definer language plpgsql;

The last step is to attach the function to the view as a trigger in db/src/api/views_and_procedures.sql

-- ...
create trigger comments_mutation
instead of insert or update or delete on comments
for each row execute procedure util.mutation_comments_trigger();

Now the view looks just like another table with which we can work as usual. A different way to handle this would be to expose distinct stored procedures like insert_comment and update_comment, or even more specific ones like add_task_comment.
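A minimal sketch of that alternative (hypothetical; we'll stick with the trigger-backed view in this tutorial). It inserts through the api view, so the usual grants and row level security still apply:

create or replace function api.add_task_comment(_body text, _task_id int) returns int as $$
    -- parameter names are prefixed only to avoid any ambiguity with column names
    insert into api.task_comments (body, task_id) values (_body, _task_id) returning id;
$$ language sql;

PostgREST would expose it at /rest/rpc/add_task_comment.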

Let's try an insert

curl -s -X POST \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-H 'Prefer: return=representation' \
-d '{"body": "Hi there!","parent_type": "task","task_id":1}'  \
http://localhost:8080/rest/comments | \
python -mjson.tool

the output will look like

{
  "id":3,
  "body":"Hi there!",
  "parent_type":"task",
  "parent_id":1,
  "project_id":null,
  "task_id":1,
  "created_on":"2017-08-29T02:04:29.35094+00:00",
  "updated_on":null
}

and an update now

curl -s -X PATCH \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-H 'Prefer: return=representation' \
-d '{"body":"This is going to be awesome!"}'  \
"http://localhost:8080/rest/comments?id=eq.1&parent_type=eq.project" | \
python -mjson.tool

the output:

{
  "id":1,
  "body":"This is going to be awesome!",
  "parent_type":"project",
  "parent_id":1,
  "project_id":1,
  "task_id":null,
  "created_on":"2017-07-18T11:31:12+00:00",
  "updated_on":null
}

Progress

Let's stop for a second and see how much we've accomplished so far.

curl -s -G \
-H "Authorization: Bearer $JWT_TOKEN" \
http://localhost:8080/rest/clients \
--data-urlencode select="id,name,projects(id,name,comments:project_comments(id,body),tasks(id,name,comments:task_comments(id,body)))" | \
python -mjson.tool
What you see above is a request that will return all the clients and their projects, and for each project we also ask for its tasks and comments. All that in a single request. At the same time, access to those entities is checked and only the right rows are returned.

Let's see how much code we had to write for that.

cloc --include-lang=SQL db/src/api/views_and_procedures.sql db/src/data/tables.sql db/src/authorization/privileges.sql db/src/libs/util/
        4 text files.
       4 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 1.74  T=0.03 s (138.2 files/s, 10021.8 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SQL                              4             43             27            220
-------------------------------------------------------------------------------
SUM:                             4             43             27            220
-------------------------------------------------------------------------------

In just about 200 LOC, out of which about half are table definitions, we implemented a REST API. We have an authorization mechanism and we can make sure each user sees only what he is supposed to see (his data). We've decoupled our API from our underlying data model by using a separate schema that contains only views and stored procedures. We do our user input validation using constraints, and for any process that is more complex and takes a few steps to complete (login/signup) we can use stored procedures in any of the languages supported by PostgreSQL. To do all this we did not have to write a single line of imperative logic; we only defined our data and the rules controlling access to it.

And that is all; we have the core of our API ready, the part that deals strictly with the data inside our database. It's time to take a break from all that coding, click around a bit and check out the API.

Using the API

REST

At the heart of subZero sits PostgREST, and as such the REST interface is the same as the one exposed by PostgREST. What subZero does is put a proxy in front of PostgREST in order to control the URL prefix of the API, cache responses and provide a mechanism to restrict which columns are available for filtering, plus a lot of other useful features that are not the focus of PostgREST. Other than that, anything described in the PostgREST documentation is available in the subZero REST interface. Rather than maintaining duplicate documentation here, you can consult the excellent docs that PostgREST provides. The only thing to remember is that whenever you see a URL like /people, its equivalent in a default subZero installation is /rest/people (but you can change that if you like).

Let's just check a few examples:

Get the projects of a specific client and all the active tasks for them

curl -s -G -X GET \
-H "Authorization: Bearer $JWT_TOKEN" \
http://localhost:8080/rest/projects \
--data-urlencode select="id,name,tasks(id,name)" \
--data-urlencode client_id="eq.1" \
--data-urlencode tasks.completed="eq.false"  | \
python -mjson.tool

Add a client and return its new id and created_on field

curl -s -X POST \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-H 'Prefer: return=representation' \
-d '{"name":"Google","address":"Mountain View, California, United States"}'  \
http://localhost:8080/rest/clients?select=id,created_on | \
python -mjson.tool

Update a project's name

curl -s -X PATCH \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-H 'Prefer: return=representation' \
-d '{"name":"Updated name"}'  \
"http://localhost:8080/rest/projects?select=id,name,created_on,updated_on&id=eq.1" | \
python -mjson.tool
You probably just noticed that after the operation above, the updated_on field was still null. We forgot to give it a value. Easy fix: add this to the db/src/data/tables.sql file

-- ...
create or replace function set_updated_on() returns trigger as $$
begin
  new.updated_on = now();
  return new;
end
$$ language plpgsql;

create trigger client_set_updated_on
before update on "client"
for each row execute procedure set_updated_on();


create trigger project_set_updated_on
before update on "project"
for each row execute procedure set_updated_on();

create trigger task_set_updated_on
before update on "task"
for each row execute procedure set_updated_on();

create trigger task_comment_set_updated_on
before update on "task_comment"
for each row execute procedure set_updated_on();

create trigger project_comment_set_updated_on
before update on "project_comment"
for each row execute procedure set_updated_on();
If you try that update again, you will see the property change each time.

Delete a task

curl -s -X DELETE \
-H "Authorization: Bearer $JWT_TOKEN" \
-H 'Prefer: return=representation' \
"http://localhost:8080/rest/tasks?select=id,created_on&id=eq.2" | \
python -mjson.tool

Info

If you try to delete task #1, you will get an error telling you:

Key is still referenced from table "task_comment"

This is another example of a safety net the database gives you when you leverage its features: it will not let you corrupt your data by accident. We will have to specify, using ON DELETE, what happens to related comments when a task is deleted. It might seem annoying at first, but this strictness will save you a lot of headache in the long run.
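For example, if we decide comments should go away together with their task, the foreign key in db/src/data/tables.sql could be declared with a cascade (a design choice, not something this tutorial requires):

create table task_comment (
  ...
  task_id      int not null references task(id) on delete cascade,
  ...
);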

GraphQL

Some of you might have come here for GraphQL specifically and all we've shown you so far was SQL and REST. So without any further ado, open this address in your browser http://localhost:8080/explore/graphql.html

This is a nice IDE, available in development mode, for interacting with your API. Use the auto-complete and the automatically generated documentation on the right to see what you can do with this API. By the way, the GraphQL endpoint is http://localhost:8080/graphql/simple/ or http://localhost:8080/graphql/relay/, depending on the flavour you prefer.

Let's try and run a few queries:

Log in; the browser will get back a session cookie that allows us to execute the other requests in an authenticated state

mutation {
  login(email: "alice@email.com", password: "pass"){
    id name email role
  }
}

Request a specific object by key

{
  task(id: 1){
    name
  }
}

Filtering results by columns on each level

{
  projects(where: {name: {ilike: "*o*"}}) {
    id
    name
    tasks(where: {completed: {eq: true}}){
      id
      name
      completed
    }
  }
}

Requesting multiple levels of data

{
  clients{
    id 
    name
    projects {
      id
      name
      tasks{
        id
        name
        task_comments {
          body
          created_on
        }
      }
    }
  }
}

Insert one client

mutation {
  insert {
    client(input: {name: "Apple", address: "Cupertino, California, United States"}) {
      id
      created_on
    }
  }
}

Insert multiple projects at the same time

mutation {
  insert {
    projects(input: [
      {name: "Project 1", client_id: 1},
      {name: "Project 2", client_id: 1}
    ]){
      id
      created_on
    }
  }
}

Info

Having the input parameter directly in the query is a bit ugly; GraphQL supports variables, so you can separate the input from the query in your frontend code

Update a client

mutation {
  update {
    client(id: 1, input: {name: "Updated name"}){
      id
      name
      updated_on
    }
  }
}

Info

If you are familiar with the GraphQL ecosystem, you've probably figured out by now that this structure for the API will not work with Relay. subZero is capable of generating a compatible API, but for the purposes of this tutorial we used the simple flavour to better see the relation between the GraphQL types and the entities in the api schema, without the mental overhead of concepts such as viewer, connections and edges.

Beyond Data

Overview

Most of us are used to building backend systems in a specific way. No matter the stack used, you usually have a web server in front that besides serving static content, routes all the requests to a collection of scripts that implement all the logic in the system (probably using some kind of framework to organize them). At the bottom of the stack sits the database that is only responsible for storing the data. This of course works but is not the only way to build systems and not necessarily the best way. The edges of this system (web server/database) are very powerful tools with a ton of functionality embedded in them and when properly leveraged, can result in a dramatic reduction of code complexity, performance gains and an increase in iteration speed. Just because your logic is in a single language/place does not mean the system is simpler. This starter kit takes a holistic approach when implementing an API and tries to leverage all features of the underlying components to achieve a particular goal.

Try to think of these components, not as separate processes (or even worse, distinct hardware) within your system but rather think of them as modules in your code. Just like in a traditional system control flows through a bunch of modules, an API request flows through these components and at each step you can take certain actions based on the current request. Just like in a traditional stack you would not implement caching in your auth module, or your "active record" module, it's the same here, you do not implement caching in the database but in front of it, even before the request hits the database. You do not send emails from within a stored procedure in the database, you send the email with a separate system that is only triggered by a database event.

Restricting filtering capabilities

By default, subZero allows the client to filter by any column using any of the available operators. When your tables are small and you do not have a lot of users, this is not a problem, but it can become one if you allow filtering on a text column with the like operator for a table that has hundreds of thousands of rows, especially if you do not have an index on that column. As a general rule, you should not allow filtering by a column if that column is not part of an index. Let's see how we can accomplish this seemingly complicated task.

Open the openresty/lualib/user_code/hooks.lua file and add a function that will be called before each request (add it at the top of the file)

local function check_filters()
   print 'I will be making sure no one does anything fishy'
end
-- ...

and make sure to call it in on_rest_request function.

In your dev console, switch to the OpenResty tab and observe the result of the following request

curl -s -G -X GET \
-H "Authorization: Bearer $JWT_TOKEN" \
"http://localhost:8080/rest/projects?name=like.*OS" \
| python -mjson.tool
You should see a few lines similar to these
2017/07/19 09:04:23 [notice] 53#53: *109 [lua] hooks.lua:2: check_filters(): I will be making sure no one does anything fishy, client: 172.18.0.1, server: _, request: "GET  /rest/projects?name=like.*OS HTTP/1.1", host: "localhost:8080"
2017/07/19 09:04:23 [info] 53#53: *109 client 172.18.0.1 closed keepalive connection
[19/Jul/2017:09:04:23 +0000] "GET /rest/projects?name=like.*OS HTTP/1.1" 200 204 "0.046 ms"

Now that we know our function is being called, let's see how we can accomplish our task. The best resource for writing Lua scripts in the OpenResty context is the GitHub README of the lua-nginx-module.

Change the function check_filters to this definition

local function check_filters()
    local blacklist = {
        projects_name_like = true
    }
    local table = ngx.var.uri:gsub('/', '')
    local args = ngx.req.get_uri_args()
    for key, val in pairs(args) do
        local column = key;
        local operator, value = val:match("([^.]+)%.(.*)")
        if operator and blacklist[ table .. '_' .. column .. '_' .. operator ] then
            ngx.status = ngx.HTTP_BAD_REQUEST 
            ngx.say('filtering by ' .. column .. '/' .. operator .. ' is not allowed')
            ngx.exit(ngx.HTTP_OK)
        end
    end
end

If we run this request now

curl -i -s -G -X GET \
-H "Authorization: Bearer $JWT_TOKEN" \
"http://localhost:8080/rest/projects?name=like.*OS"

The result will be

HTTP/1.1 400 Bad Request
Server: openresty
Date: Wed, 19 Jul 2017 09:43:47 GMT
Content-Type: application/json
Transfer-Encoding: chunked
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Content-Location: /rest
Request-Time: 0.000

filtering by name/like is not allowed

This function is not perfect yet (the pattern matching could use a little more care), but you get the idea of how a few lines of Lua in the OpenResty context can be a powerful thing when you want to inspect or alter the incoming request before it reaches the database.

Caching responses

All the components in a subZero system are fast, and in most cases you don't need to worry about the load on your system or about caching, especially if you were careful about restricting filtering capabilities as in the section above. Don't go fixing problems you do not have. There are, however, cases when you have entities for which requests can and should be cached. Usually they will be views or stored procedures that implement a complex report that takes some time to generate. For this, subZero provides a scriptable caching module. All you need to do is define two functions (useful samples are provided) and you have full control of the caching subsystem. The functions are get_cache_key and get_invalid_cache_tags, defined in cache.lua; the first one lets you say "I want this response cached" and the second one lets you say "I want cached responses with these tags to be invalidated". Let's suppose we have our clients view and it does not change that much, and we were asked to introduce a property for each client called profitability, which we implemented as a computed column backed by a time-consuming computation. We would like to cache the responses for the clients endpoint to reduce the number of times that complex profitability function is called.
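For context, such a computed column can be a function in the api schema that takes the view's row type. Here is a hypothetical sketch of what profitability might look like (it stands in for a much more expensive computation and is not part of the starter kit):

-- hypothetical computed column, selectable as "profitability" on the clients endpoint
create or replace function api.profitability(c api.clients) returns numeric as $$
    select coalesce(
        (count(*) filter (where t.completed))::numeric / nullif(count(*), 0),
        0
    )
    from api.projects p
    join api.tasks t on t.project_id = p.id
    where p.client_id = c.id;
$$ stable language sql;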

The first thing to do is to enable the cache subsystem

-- .env
...
ENABLE_CACHE=1
...

Let's see how the functions might look (note that the sample in the starter kit is more general).

-- lua/user_module.lua
local function get_cache_key()
    local endpoint = get_endpoint() -- this is a helper function defined in cache.lua
    if endpoint == 'clients' then
        local sess_id = ngx.var.http_authorization
        local key_parts = {ngx.var.uri, ngx.var.args or '', sess_id}
        local key = table.concat(key_parts,':')
        local ttl = 60 -- seconds
        local tags = {sess_id ..'_clients'}
        return key, ttl, tags
    end
end

local function get_invalid_cache_tags(ngx_vars, uri_args, headers, get_ast)
    local sess_id = ngx_vars.http_authorization
    local endpoint = get_endpoint()
    return {sess_id .. '_' ..endpoint}
end

Whenever there is a request to the clients endpoint, we compute a cache key for it. This key can be any string that uniquely identifies this type of request. Notice we included the value of the Authorization header, which identifies the current user, so that we do not serve a response cached for one user to a different one. We also attach a tag to this cache entry so that whenever a new client is inserted or a client is updated, we can invalidate it.

Try running this request a few times and see the cache working

curl -i \
-H "Authorization: Bearer $JWT_TOKEN" \
http://localhost:8080/rest/clients

First request

...
Cache-Engine: "nginx"
Cache-Status: MISS
Cache-Key: 6481abcee926439946f1e6f47e45a1b5
Cache-TTL: 60
Method: GET
Request-Time: 0.009
...

Second request

...
Cache-Engine: "nginx"
Cache-Status: HIT
Cache-Key: 6481abcee926439946f1e6f47e45a1b5
Cache-TTL: 60
Method: GET
Request-Time: 0.000
...

Note

In development mode, unless you are working specifically on the caching logic, you want to have the cache disabled so that each request goes through.
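Assuming the same flag from above controls it, switching it back in .env should be enough:

-- .env
...
ENABLE_CACHE=0
...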

Reacting to database events

The dirty secret: We’re all just building CRUD apps.

@iamdevloper

Whether we admit it or not, most of the time 90% of what an API does is CRUD, enforcing access rights and validating input, and that is OK :)! Up to this point we've shown how you can do all that. When it mostly concerns the actual data, PostgreSQL is very flexible and lets you interact with it easily, PostgREST provides the bridge from HTTP to SQL, and OpenResty acts as a router in front of everything, giving you the ability to inspect and alter in-flight HTTP requests and responses.

However, even the simplest APIs usually have a couple of "paths" that need to interact with some other system and not only the specific data stored in the database. A classic example is sending an email on signup. The first thought is usually that you will have to do it inside a database trigger, and you can actually do that: send an actual email. This shows just how versatile Postgres is. But you shouldn't; talking to remote systems is slow and you will be blocking a database connection/process for a good amount of time. Unless you are building a system with only a handful of users, avoid talking to remote systems from functions running in the database. Another possible way to implement this is by leveraging OpenResty and the ability to run custom logic at different stages of an HTTP request. In this specific case, before we send back the response to the client when he accesses the signup endpoint, we could look at the response the database sent (and PostgREST forwarded) and, if it was a success, run our custom function that sends the "welcome" email.

Info

It's important to remember that when in the Lua land running inside OpenResty/Nginx, you have complete freedom, you can do whatever you could have done in Node/PHP/Ruby, you can talk to other systems, run additional queries, make requests to other APIs.

If you only have a handful of these types of cases, then doing it in OpenResty, before or after the request reaches the database, is the path you should choose. It's just fewer moving parts.

But there is a third way! This method allows for greater flexibility, decoupling and separation of concerns. When you have to react to all kinds of database events and implement custom logic that talks to other systems, or maybe execute some long-running computation, you can use this pattern to decouple the main task from the additional work that needs to happen. The trick here is to first recognize that the expensive computation you need to run (send an email, generate a report, talk to a remote system) is something that can be executed asynchronously from the main request; you don't need to actually send the email to the user before you reply to the request with "signup success", you can send it a fraction of a second later. If you look carefully, you will notice that this pattern comes up often: a lot of the tasks you need to perform do not necessarily have to be executed before you respond to the API request. This is exactly where a message queue is useful. Whenever an event occurs in your system (a user just signed up for your service), you send a message to this queue and let a completely separate system (usually a simple script) take care of performing the additional tasks.

While this sounds complicated and fragile, it's actually neither. For most systems, this pattern can easily be implemented using an events table: your triggers insert rows in this table whenever an event is triggered by an API request, and another script reads rows one by one from the table and does the additional work. Here is an article explaining how to reliably use a table as a queue. You might hear that using the database as a queue is bad or does not scale, but this exact method is used by Stripe, it works well for loads of up to 100 events/s, and most of the time that's all you need.

For simple things like sending emails, the table-as-a-queue approach works great and does not require additional components; if you can get away with it, don't complicate things, use it.
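A minimal sketch of what that could look like (table and function names here are illustrative, not part of the starter kit):

-- a bare-bones "table as a queue"
create table data.event_queue (
  id          bigserial primary key,
  event_type  text not null,
  payload     jsonb not null,
  created_on  timestamptz not null default now()
);

create or replace function data.queue_signup_event() returns trigger as $$
begin
  insert into data.event_queue (event_type, payload)
  values ('user_signup', row_to_json(new)::jsonb);
  return new;
end
$$ language plpgsql;

create trigger user_signup_event
after insert on data."user"
for each row execute procedure data.queue_signup_event();

A worker script would then poll the table, picking rows with something like select ... for update skip locked and deleting them once the email is sent.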

But we all know that everyone wants live updates these days. It is next to impossible to do this with an events table. It would mean that every mutation in your database would have to be mirrored in this table. This is where the final piece of the puzzle comes in.

I am going to do a magic trick now and pull a RabbitMQ out of my hat. Just like the other components in our stack (OpenResty/PostgREST/PostgreSQL), it's an immensely powerful, flexible and battle tested piece of tech. It will give us yet another integration point and among other things will completely solve the "real-time updates" problem for us.

Before we get into that, we have not actually sent that "welcome" email yet. Let's do that.

The starter kit already has everything in place to make sending messages from PostgreSQL to RabbitMQ easy. All we need to do now is attach triggers to the tables whose events/changes we want to monitor.

The user table already has such a trigger attached

So whenever something happens to a row, this function will take the row, turn it into JSON and generate a new message that will enter RabbitMQ. Before we look at where exactly that message ends up, we need to take a look at the routing_key since this is a very important part of what happens to the message when it is received by RMQ.

Info

If you are completely new to the concept of message brokers and how they work, you can check out RabbitMQ Tutorials

When a new signup occurs, this will generate an event that goes through the message broker, and every interested party can react to it. The message will look something like

row_change.table-user.event-INSERT.user-0
{
    "id":1,
    "firstname":"John",
    "lastname":"Smith",
    "email":"john.smith@gmail.com",
    "password":"encryptedpasshere",
    "user_type":"webuser",
    "created_on":"2017-02-20 08:56:44+00",
    "updated_on":null
}

The routing key tells us exactly what happened (a row got inserted in the user table) and the message payload contains the actual row. This message will be sent on the events channel, which by default is linked to the amq.topic exchange in RMQ. The message proxying between PostgreSQL and RabbitMQ is done by the pg-amqp-bridge utility, which is configured here.

As you can imagine, all kinds of events will flow through that exchange. When we write a script that wants to send emails on signup, it does not mean that this script will receive every event and need logic to ignore the ones that do not matter to it. Your script (message consumer) can tell the message broker what type of messages it is interested in, and the broker will send only the ones that match your specification. In this particular case, we are interested only in rows being added to the user table, so we bind to the exchange using the following routing key

row_change.table-user.event-INSERT.#

If we wanted to monitor all events that happen on the user table, the binding key can be

#.table-user.#

or maybe we want to inspect all the events that were triggered by a specific user; in this case, the binding key would be

#.user-10

Info

The format of the routing key is not something hard coded. You are free to change it to suit your needs. You are also free to send the events to any exchange you like, just remember to configure pg-amqp-bridge to forward them.

So at this stage, we have a system in place so that whenever someone signs up, this will trigger an event that will result in a message entering RabbitMQ. Let's see how we can react to those messages.

To be able to write these "worker" scripts faster, we have a docker image called subzerocloud/amqptools; its job is to connect to RabbitMQ and invoke your script on each message. The image is defined here and is nothing more than a wrapper around the amqpspawn command line utility. For each task, we only need to implement an on_message.sh script that processes a single message. You can implement this script in any language you like, you only need to make it executable, but to show how simple this can be, we'll do it using a shell script, just because we can. Let's create our worker script.

mkdir -p workers
touch workers/send_signup_email.sh
chmod +x workers/send_signup_email.sh
Here are the contents (remember to input your actual Gmail credentials)

#!/bin/bash

MESSAGE_ROUTING_KEY=$2
MESSAGE_FILE=$3
MESSAGE=$(cat $MESSAGE_FILE)

# cleanup
rm $MESSAGE_FILE

# The message is a JSON object, so we convert all its keys to environment variables like this
while read k v; do export $k=$v; done < <(echo  $MESSAGE | jq -c -r 'to_entries[] | [.key, .value] | @tsv')

# work starts here


printf "Sending signup email to ${name} <${email}>... "

# sendEmail -o tls=yes \
# -xu $GMAIL_USER \
# -xp $GMAIL_PASSWORD \
# -s smtp.gmail.com:587 \
# -f "$GMAIL_FROM" \
# -t "${name} <${email}>" \
# -u "Welcome to our service (Khumbu Icefall)" \
# -m "We hope you enjoy your stay."

You can start the docker image with this script as a parameter like this:

docker run --rm -it \
-v $(pwd)/workers/send_signup_email.sh:/send_signup_email.sh \
-e PROGRAM=/send_signup_email.sh \
-e AMQP_HOST=localhost \
-e AMQP_PORT=5672 \
-e AMQP_USER=admin \
-e AMQP_PASSWORD=adminpass \
-e ROUTING_KEY=row_change.table-user.event-INSERT.# \
-e AMQP_QUEUE=create.user \
-e AMQP_QUEUE_DURABLE=true \
--network container:khumbuicefall_rabbitmq_1 \
subzerocloud/amqptools

Now let's make the signup call

curl \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.pgrst.object+json" \
-d '{"name":"John Smith","email":"john@smith.com","password":"pass"}' \
http://localhost:8080/rest/rpc/signup

In the window where the worker is running, you'll see

...
Sending signup email to John Smith <john@smith.com>...

The actual line that sends the email is commented out because the subzerocloud/amqptools image does not contain the sendEmail utility, but you can easily extend the image and install it.

Did we just create a microservice :)?

This script can run on completely separate hardware, decoupled from your main system. Another nice benefit is that, because we specified -e AMQP_QUEUE_DURABLE=true, even if the script is stopped or the hardware crashes, RabbitMQ will hold on to all the signup events that happened in the meantime, so when we restart the script it will process them retroactively.

subZero maintains a fork of amqptools and provides the subzerocloud/amqptools docker image, which you can use to simplify writing workers. With this image, all you need to do to implement a worker is write the on_message.sh script. You can add the worker to your docker-compose.yml file like this.

signup_email_worker:
    image: subzerocloud/amqptools
    restart: unless-stopped
    volumes:
      - "./workers/send_signup_email.sh:/send_signup_email.sh"
    links:
      - rabbitmq
    environment:
      - AMQP_HOST=rabbitmq
      # - AMQP_PORT
      # - AMQP_VHOST
      - AMQP_USER=${RABBITMQ_DEFAULT_USER}
      - AMQP_PASSWORD=${RABBITMQ_DEFAULT_PASS}
      # - EXCHANGE="amq.topic"
      - ROUTING_KEY=row_change.table-user.#
      - AMQP_QUEUE=create.customer
      - AMQP_QUEUE_DURABLE=true
      # - AMQP_QUEUE_PASSIVE
      # - AMQP_QUEUE_EXCLUSIVE
      # - AMQP_QUEUE_DURABLE
      - PROGRAM=/send_signup_email.sh
      # - MAX_DELAY=30

When all else fails, take full control

subZero and this starter kit will take you 95% of the way when building your API. You will get there in no time and with an almost trivial amount of code. However, there are situations when it's hard to implement some use cases within the constraints of a subZero-based project. This is a general problem with systems that try to automate API creation: there are cases when it feels like you are fighting the tool instead of solving your problem. You don't have to make that sacrifice here. It is possible to take full control of the flow.

An example of when an automated API creation tool does not quite provide the flexibility needed is integrating a payment processor. Usually these types of systems provide their own SDKs that you have to use, and they involve quite a lot of back and forth with a remote system. Some services embed integrations with specific well-known 3rd party systems (Slack, Stripe, etc.) in their core code, but you don't have the freedom to integrate with anything you want; you are limited to maybe triggering a few webhooks here and there.

Let's get back to our example. We've built our API (and frontend) and users are happy with it; it's time to get paid for our hard work. We'll charge our customers a one-time fee of $99 and we'll use Stripe for that. Our frontend uses GraphQL (or REST) to communicate with the backend; however, when integrating with Stripe it is convenient to use its recommended flow, and that basically means you have to process a form submission.

Your frontend will display a page with a button as described in the Stripe docs. Once the user goes through the process of entering his credit card details, Stripe hands control over to our system where, based on a token, we create the customer and charge his credit card.

Now comes the interesting part: we hijack this URL and handle the request ourselves instead of letting it reach PostgREST and the database.

-- openresty/nginx/conf/includes/http/server/locations/charge_customer.conf
location /rest/rpc/charge_customer {
    content_by_lua_file '../lualib/user_code/charge_customer.lua';
}

Tip

I will be using Lua directly to implement this custom endpoint, but you are free to route this URL to any stack you choose. You can use your traditional stack alongside PostgREST to handle a few tricky bits, and to your API users it will be indistinguishable.

Info

To have these types of endpoints also appear in our GraphQL API, create a dummy stored procedure in your api schema like this

create or replace function charge_customer() returns text as $$
    select 'dummy function'::text;
$$ stable language sql;

Here is a script (which could use some error handling) that will accomplish the task. Since we are taking over, we are responsible for checking that the user is authenticated before doing any additional work.

-- openresty/lualib/user_code/charge_customer.lua
local jwt = require 'resty.jwt'
local cjson = require 'cjson'
local jwt_secret = os.getenv('JWT_SECRET')
local stripe_key = 'sk_test_xxxxxxxxxxxxxxxx'
local stripe_endpoint = 'https://api.stripe.com/v1'
local stripe_auth_header = 'Basic ' .. ngx.encode_base64(stripe_key ..':')

-- check authorisation
local authorization = ngx.var.http_authorization
local token = authorization:gsub('Bearer ', '')
local jwt_obj = jwt:verify(jwt_secret, token)
if not (jwt_obj and jwt_obj.valid) then
    ngx.status = ngx.HTTP_UNAUTHORIZED
    ngx.exit(ngx.HTTP_UNAUTHORIZED)
end
local user_id = jwt_obj.payload.user_id

-- main script logic
ngx.req.read_body()
local args = ngx.req.get_post_args()
local stripe_token = args.stripeToken

-- interact with stripe
-- we could have used a nicer Lua lib https://github.com/leafo/lua-payments but we can also do this at a low level
local http = require "resty.http"
local http_client = http.new()

-- save the customer in stripe
local res = http_client:request_uri(stripe_endpoint .. '/customers', {
    method = "POST",
    headers = {
        ['Authorization'] = stripe_auth_header,
    },
    body = ngx.encode_args({
        ['metadata[user_id]'] = user_id,
        source = stripe_token
    })
})
local stripe_customer_id = cjson.decode(res.body)['id']

-- charge the client
local res = http_client:request_uri(stripe_endpoint .. '/charges', {
    method = "POST",
    headers = {
        ['Authorization'] = stripe_auth_header,
    },
    body = ngx.encode_args({
        amount = 9900,
        currency = 'usd',
        description = 'Khumbu Icefall Service',
        customer = stripe_customer_id
    })
})

ngx.say('true')
-- we could also redirect the user here

You don't have to fight subZero and jump through hoops to accomplish complicated flows; you can just sidestep it completely in the few specific cases that need it and write your own custom code.