Excel gone wild - fetch data from PostgreSQL via Python

Excel gone wild - fetch data from PostgreSQL via Python

I am benchmarking quite a lot these days. My latest toy is a homemade HTAP workload based on TPC-C and TPC-H. Written in Python and PL/pgSQL. The output looks like this:

Screen Shot 2020-10-22 at 10.59.38.png

There are also statistics in a secondary PostgreSQL database. Being a sometimes-big fan of Excel, I want to turn these statistics into nice graphs. To get the data into excel, I would execute:

psql -U postgres -h localhost -d htap_stats -c "\
    copy (select * from oltp_stats \
    where id = '21abf906-6569-4fbb-bab4-412d53243afa' \
    order by ts asc) to stdout csv" | pbcopy

And paste the clipboard content into Excel to update the spreadsheet. This workflow works like a charm. And it is unsatisfactory at the same time.

The next thing I remember is having to spend an hour (or maybe three?) with ODBC. And then I went back to copy-paste.

Until today in the morning, when I remembered: one can query URLs within Excel with almost no effort. So, let's have a minimalistic REST API with Python!

#!/usr/bin/env python3

from io import StringIO
from http.server import HTTPServer, CGIHTTPRequestHandler
import psycopg2

PORT = 8000
DSN = 'postgresql://postgres@localhost/htap_stats'

class Handler(CGIHTTPRequestHandler):
    def do_GET(self):
        what, args = self.path[1:].split('?')
        uuid = args.split('=')[1]

        result = StringIO()
        conn = psycopg2.connect(DSN)
        with conn.cursor() as cur:
            cur.copy_expert(f'''
                COPY (
                  SELECT *
                  FROM {what}_stats
                  WHERE id = '{uuid}' ORDER BY ts ASC
                ) TO STDOUT CSV DELIMITER $$\t$$''', result)
        conn.close()

        result = result.getvalue().encode()

        self.send_response(200)
        self.send_header("Content-type", "text/csv")
        self.send_header("Content-length", len(result))
        self.end_headers()
        self.wfile.write(result)

def run():
    with HTTPServer(("", PORT), Handler) as httpd:
        print("serving at port", PORT)
        httpd.serve_forever()

if __name__ == '__main__':
    try:
        run()
    except KeyboardInterrupt:
        pass

This Python snippet opens an HTTP server on port 8000. It accepts GET requests and splits them into what (can be: oltp or olap) and the UUID for querying the table.

To make use of this in Excel, create a .iqy file to define how the URL above should be queried:

WEB
1
http://localhost:8000/["Source", "Use oltp or olap"]?uuid=["UUID", "Enter a UUID to use"]

Selection=1
Formatting=None
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

One can add this to the spreadsheet by calling "Run Web Query" from the menu. This will show a dialogue where you need to point it to the file above. Then, you can fill it out and, for instance, point the UUID to a cell. Followed by: watch how the data loads.

Now, I can click "Refresh All" and this fulfills both: works like a charm and is satisfactory.

Photo by energepic.com from Pexels