Caches are cheap, build a lot of them for the frontend

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

There was a stretch, maybe from 2005 to 2015, that will be remembered as the era of simple Ajax uses. The era started when Sam Stephenson released the Prototype library, but it really got going once jQuery was released. It was an era of JavaScript fetching data via Ajax and relying on the backend code to decide what data was sent in response to each request.

That era is evolving into something different. Frontends are now complicated enough that the frontend coders need to have control over what their queries fetch. GraphQL was invented to satisfy this need. Maybe GraphQL is the big wave of the future. I don’t know. For now, I favor a simpler approach.

I wrote a small Translator app. It is less than 1,000 lines of Clojure. I hope to eventually clean it up and release it as open source, but you can recreate it easily on your own. The idea is to pull data from Postgres and denormalize it into ElasticSearch. The crucial point is that the SQL, and the reformatting of the data, are controlled by a simple configuration file, which the frontenders can change at will. Whenever they need a new data structure in ElasticSearch, they just add a few more lines to the configuration file; the Translator app picks up the change and immediately starts building the new data structure in ElasticSearch. The Translator app is also forever polling Postgres, so whenever new data appears in Postgres, it is quickly copied over to ElasticSearch.
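The core loop can be sketched in a few lines. This is a hypothetical Python skeleton, not the actual Clojure implementation; `fetch_rows` and `upsert` are stand-ins for real Postgres and ElasticSearch clients, and the config shape mirrors the example later in this post:

```python
import json
import time

def load_config(path):
    """Re-read the frontenders' config file on every pass, so their
    edits take effect without restarting the Translator."""
    with open(path) as f:
        return json.load(f)

def run_once(config, fetch_rows, upsert):
    """One polling pass: run each configured query against Postgres and
    push every row into its target ElasticSearch collection, keyed on
    the configured denormalization key."""
    for entry in config["denormalization"]:
        key = entry["key_to_denormalize_upon"]
        for row in fetch_rows(entry["sql"]):
            upsert(entry["collection_name"], row[key], row)

def run_forever(config_path, fetch_rows, upsert, interval_seconds=5):
    """Poll forever; new rows in Postgres show up in ElasticSearch
    within roughly one polling interval."""
    while True:
        run_once(load_config(config_path), fetch_rows, upsert)
        time.sleep(interval_seconds)
```

Because the Postgres and ElasticSearch clients are injected, the polling logic stays a few lines long and the config file carries all the per-page knowledge.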

In the early days of Ajax, it was common to offer RESTful interfaces that exactly matched the tables in a SQL database. So if your MySQL or Postgres database had tables for Users, Businesses, and Messages, then the RESTful interface offered endpoints for Users, Businesses, and Messages. And the Ajax interfaces were very busy, making a lot of calls. I can recall early Angular websites where a single page made 20 or 30 or even 40 Ajax calls to fill itself in.

You should denormalize your SQL data into a form that is convenient for the frontend. Your frontend should only have to make 1 or 2 or maybe 3 Ajax calls to fill in a web page.
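As an illustration, a single cached document backing a company page might look something like this, with the rows from several tables already joined together so one Ajax call returns everything. All field names here are invented for the example:

```json
{
  "profile_id": 42,
  "profile_name": "Acme Widgets",
  "headquarters_city": "Chicago",
  "headquarters_phone": "555-0100",
  "board_members": [
    { "member_name": "Ada" },
    { "member_name": "Grace" }
  ],
  "recent_investments": [
    { "investor_name": "Example Capital", "round": "Series A" }
  ]
}
```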

Because the Translator app is easy to use, the frontenders can create different kinds of documents for every page on the website. They can have some documents for the Profile page, different documents for the Rising Trends page, and different documents for the Most Commented page. You can have one cache for the Rising Trends page when logged out and another for when logged in. If a given page shows data from 8 different SQL database tables, then you fetch those 8 tables and combine the data into one JSON document, so the frontend can get everything it needs with 1 Ajax call. You can and should have as many cache collections as possible. The only limit is the speed of writes to ElasticSearch. On a B2B site you could potentially have a cache for every user; on a B2C site you would probably have too many users to do so. But in the interest of convenience for the frontend, you should certainly push the limit of how many caches you can have before the writes and updates slow down to unacceptable levels.
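Combining those 8 tables can be as simple as folding every row into a per-key dictionary. A minimal Python sketch, with made-up rows standing in for the SQL result sets:

```python
def denormalize(result_sets, key):
    """Fold rows from several SQL queries into one document per key,
    so the frontend can fetch everything with a single Ajax call."""
    docs = {}
    for rows in result_sets:
        for row in rows:
            # All rows sharing a key value land in the same document.
            doc = docs.setdefault(row[key], {})
            doc.update(row)
    return docs

# Hypothetical rows from two of a page's source tables:
profiles = [{"profile_id": 7, "profile_name": "Acme"}]
phones = [{"profile_id": 7, "headquarters_phone": "555-0100"}]

merged = denormalize([profiles, phones], "profile_id")
# merged[7] now holds fields from both tables in one document.
```

Each document in `merged` is what gets written to the page's cache collection in ElasticSearch.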

When I first created my Translator app, it was at a company that had a PHP website that made straight calls to MySQL. They were trying to develop an API, so they could have some separation between the backend code and the frontend code. I built the Translator app to make it easy for frontenders to pull data from MySQL and get it into ElasticSearch. Here is an example of the config the frontenders could use to create 2 document collections:

{
    "denormalization" : [

        {
            "sql" : "SELECT id AS profile_id, name AS profile_name, headquarters_addr1, headquarters_city, headquarters_state_code, headquarters_country_code, headquarters_phone, page_views_rank, clean_name, clean_name2, visibility_name_search_auto, dtc FROM company_profile WHERE name IS NOT NULL AND name != ''",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "recent_investments_page"
        },

        {
            "sql" : "SELECT id AS profile_id, name AS profile_name, headquarters_addr1, headquarters_city, headquarters_state_code, headquarters_country_code, headquarters_phone, clean_name, clean_name2, dtc FROM advisor_profile WHERE name IS NOT NULL AND name != ''",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "recent_investments_page"
        },

        {
            "sql" : "SELECT id AS profile_id, name AS profile_name, headquarters_addr1, headquarters_city, headquarters_state_code, headquarters_country_code, headquarters_phone, clean_name, clean_name2, dtc FROM investor_profile WHERE name IS NOT NULL AND name != ''",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "recent_investments_page"
        },

        {
            "sql" : "SELECT id AS profile_id, name AS profile_name, headquarters_addr1, headquarters_city, headquarters_state_code, headquarters_country_code, headquarters_phone, page_views_rank, clean_name, clean_name2, visibility_name_search_auto, dtc FROM company_profile WHERE name IS NOT NULL AND name != ''",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "company_details_page"
        },

        {
            "sql" : "SELECT * FROM company_websites",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "company_details_page"
        },

        {
            "sql" : "SELECT advisor_profile_id AS profile_id, name AS advisor_former_name_name FROM advisor_former_name",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "company_details_page"
        },

        {
            "sql" : "SELECT * FROM company_board_members",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "company_details_page"
        },

        {
            "sql" : "SELECT * FROM company_investors",
            "key_to_denormalize_upon" : "profile_id",
            "collection_name" : "company_details_page"
        }

    ]
}

This creates 2 document collections:

recent_investments_page

company_details_page

(I used the word collection because I was originally going to use MongoDB, not ElasticSearch.)

This allows the frontenders to write whatever SQL they need, have the data combined around some key, and have it live in denormalized form in ElasticSearch. I believe this gives all the flexibility that frontenders seek from GraphQL, yet setting this up is easier than setting up a GraphQL system.
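One detail the config leaves open is what happens when a child table such as company_board_members has several rows for the same profile_id. One plausible rule, shown here as an assumption rather than a description of the actual Translator app, is to accumulate repeated child rows into an array on the parent document:

```python
def merge_child_rows(parent_docs, child_rows, key, field_name):
    """Attach child-table rows (e.g. from company_board_members) to
    their parent document as a list, keyed on the shared profile_id."""
    for row in child_rows:
        # Create the parent document if this key hasn't been seen yet.
        parent = parent_docs.setdefault(row[key], {key: row[key]})
        # Append everything except the join key to the list field.
        parent.setdefault(field_name, []).append(
            {k: v for k, v in row.items() if k != key})
    return parent_docs

# Hypothetical parent doc and board-member rows:
docs = {7: {"profile_id": 7, "profile_name": "Acme"}}
board = [
    {"profile_id": 7, "member_name": "Ada"},
    {"profile_id": 7, "member_name": "Grace"},
]
merge_child_rows(docs, board, "profile_id", "board_members")
# docs[7]["board_members"] is now a list of two member records.
```

With a rule like this, the company_details_page document ends up with one scalar field per profile column and one array per child table, which is a convenient shape for the frontend to render directly.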

.
.
.

On a related subject, you might want to read Why Don’t Software Developers Understand Kafka and Immutable Logs?
