#5 - Indexer: cached_post.py and feed_cache.py

in #steem, 4 years ago


What I am learning about Hivemind's design, as I am working on Native Ads

This post gives a technical overview of how these two modules operate within Hivemind. They are found in the hive/indexer subfolder of the codebase.

cached_post.py

This module hosts the CachedPost class, whose methods manipulate the hive_posts_cache table in the database and keep Hivemind's data consistent with the blockchain. For brevity, here are just some of the operations its methods handle:

  • insert, update and delete posts
  • update vote counts
  • do child recounts

An _sql() method generates the various SQL edit statements that will be used to update the database. The SQL statements are generated according to the "update level" triggered by the individual calls. For example, new post additions (inserts) are handled differently from upvote updates. Extra logic handles recounts that also affect parent posts.
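To make the "update level" idea concrete, here is a minimal sketch of how a statement could be shaped by the level that triggered it. This is not the actual _sql() implementation: the UpdateLevel names, the build_statement() helper and the choice of vote-related columns are all assumptions made for illustration.

```python
from enum import Enum


class UpdateLevel(Enum):
    """Hypothetical update levels; the real names in cached_post.py may differ."""
    INSERT = "insert"   # brand-new post: write a full row
    UPDATE = "update"   # edited post: rewrite the supplied columns
    UPVOTE = "upvote"   # vote only: touch vote-related columns


# Assumed vote-related columns, for illustration only.
VOTE_COLUMNS = {"rshares", "votes"}


def build_statement(post_id, level, values):
    """Return (sql, params) for one cache row, shaped by the update level."""
    if level is UpdateLevel.UPVOTE:
        # An upvote should not rewrite unrelated columns such as the title.
        values = {k: v for k, v in values.items() if k in VOTE_COLUMNS}
    if not values:
        raise ValueError("nothing to write for this update level")

    columns = sorted(values)
    if level is UpdateLevel.INSERT:
        names = ", ".join(["post_id"] + columns)
        marks = ", ".join(":" + c for c in ["post_id"] + columns)
        sql = "INSERT INTO hive_posts_cache (%s) VALUES (%s)" % (names, marks)
    else:
        assignments = ", ".join("%s = :%s" % (c, c) for c in columns)
        sql = "UPDATE hive_posts_cache SET %s WHERE post_id = :post_id" % assignments

    # Named parameters keep the values out of the SQL string itself.
    return sql, dict(values, post_id=post_id)
```

Calling build_statement(7, UpdateLevel.UPVOTE, {"rshares": 100, "title": "x"}) would yield an UPDATE that sets only rshares, while the same call with UpdateLevel.INSERT would produce a full INSERT.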

Methods in this module are called from a number of modules, under various circumstances.

  • CachedPost.vote() is called from blocks.py when vote operations are detected within raw blocks
  • CachedPost.update_promoted_amount() is called from payments.py to record a new "promoted" balance in the database for particular posts
  • CachedPost.vote() is also called from payments.py to trigger an update in vote stats
  • Various methods such as insert(), recount(), undelete(), delete() and update() are called from posts.py, which handles core post operations and data. For more info, refer to the post I wrote on this module: #4 - Indexer: posts.py
  • sync.py also makes various calls to methods in the CachedPost class. They concern the recovery of missing posts and payout backlogs.

feed_cache.py

This module hosts the FeedCache class, whose methods help maintain the hive_feed_cache table in the database. This table effectively creates "views" of all accounts' feeds on a rolling basis and provides a quick, resource-friendly way to query feeds.

It has 3 simple class methods:

  • insert(): inserts posts and reblogs as entries associated with an account ID
  • delete(): removes a post entry from the table and therefore from the feed cache
  • rebuild(): rebuilds the feed cache when initial sync is complete. It is called from the initial() method in hive/indexer/sync.py, which handles the initial sync
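The three operations above can be sketched against an in-memory SQLite table. This is only an illustration under stated assumptions: the column names (post_id, account_id, created_at), the FeedCacheSketch class and its feed() helper are hypothetical, and the sketch uses instance methods with a connection argument rather than Hivemind's class methods and database layer.

```python
import sqlite3


class FeedCacheSketch:
    """Toy stand-in for FeedCache; schema and signatures are assumed."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS hive_feed_cache ("
            " post_id INTEGER NOT NULL,"
            " account_id INTEGER NOT NULL,"
            " created_at TEXT NOT NULL,"
            " PRIMARY KEY (post_id, account_id))"
        )

    def insert(self, post_id, account_id, created_at):
        # A post by its author and a reblog by someone else are both just
        # (post, account) entries; duplicates are ignored.
        self.conn.execute(
            "INSERT OR IGNORE INTO hive_feed_cache VALUES (?, ?, ?)",
            (post_id, account_id, created_at),
        )

    def delete(self, post_id, account_id=None):
        # Without an account, drop the post from every feed it appears in.
        if account_id is None:
            self.conn.execute(
                "DELETE FROM hive_feed_cache WHERE post_id = ?", (post_id,))
        else:
            self.conn.execute(
                "DELETE FROM hive_feed_cache WHERE post_id = ? AND account_id = ?",
                (post_id, account_id),
            )

    def rebuild(self, posts, reblogs):
        # Start from an empty table and re-add every post and reblog entry.
        self.conn.execute("DELETE FROM hive_feed_cache")
        for post_id, account_id, created_at in list(posts) + list(reblogs):
            self.insert(post_id, account_id, created_at)

    def feed(self, account_id):
        # Hypothetical query helper: newest entries first.
        rows = self.conn.execute(
            "SELECT post_id FROM hive_feed_cache"
            " WHERE account_id = ? ORDER BY created_at DESC",
            (account_id,),
        )
        return [row[0] for row in rows]
```

Because every feed entry is a single row keyed by post and account, reading an account's feed is one indexed lookup instead of a join across posts and reblogs, which is the kind of saving the cache is there to provide.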

What have I learned?

When implementing Native Ads, the hive_posts_cache table will hold JSON data that contains ad properties within every ad post's json_metadata field. Reading from this cache should make retrieving an ad's properties easy and efficient.
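As a rough sketch of what reading those properties could look like, the helper below pulls an ad object out of a cached json_metadata string. The "native_ad" key and the property names in the usage note are placeholders I made up for illustration; the actual Native Ads schema is not defined in this post.

```python
import json


def extract_ad_properties(json_metadata):
    """Return the ad properties dict from a json_metadata string, or None.

    The "native_ad" key is a hypothetical name used only for this sketch.
    """
    try:
        meta = json.loads(json_metadata) if json_metadata else {}
    except (TypeError, ValueError):
        # Malformed or non-string metadata: treat the post as a non-ad post.
        return None
    if not isinstance(meta, dict):
        return None
    ad = meta.get("native_ad")
    return ad if isinstance(ad, dict) else None
```

For example, extract_ad_properties('{"native_ad": {"type": "native_post"}}') would return the inner dict, while ordinary posts and malformed metadata both come back as None.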

The feed cache, on the other hand, might be relevant to Native Ads when one considers how ads could be integrated into feeds. I am still thinking about this aspect and have not reached any definitive conclusions. There are other considerations too, such as whether legacy post promotions should be handled the same way, and how this move could affect the flexibility of the user experience.

An alternative is to let different user interfaces (front-ends) decide how and where they place the ads, as well as their frequency and timing. Again, I am still weighing the pros and cons.

Studying these modules also led me to look at payments.py and how it is designed to track payments to @null with post URLs in the memos for legacy post promotion. Some modifications will need to be made there to accommodate ad payments made to community accounts or to @null, but that's a discussion for another post.
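To illustrate the idea of reading a post URL out of a transfer memo, here is a small sketch. It is not the logic in payments.py: the parse_promotion() helper, the regular expression and the assumption that the recipient appears as the bare account name "null" are all mine.

```python
import re

# Matches a trailing "@author/permlink" reference, whether the memo is a
# full URL or just the short form. The pattern is an assumption.
_POST_REF = re.compile(r"@([a-z0-9.\-]+)/([a-z0-9\-]+)\s*$")


def parse_promotion(to_account, memo):
    """Return (author, permlink) for a promotion transfer, or None."""
    if to_account != "null":
        # Only transfers to @null count as legacy promotions in this sketch.
        return None
    match = _POST_REF.search(memo.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Accommodating ad payments to community accounts would presumably mean relaxing the recipient check, which is the kind of modification mentioned above.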

Posts in this series

#1 - Overview and opportunities
#2 - Indexer: blocks.py
#3 - Indexer: accounts.py
#4 - Indexer: posts.py
#5 - Indexer: cached_post.py and feed_cache.py



I am currently working on a new feature called Native Ads, which may be added to Hivemind Communities in a future update.

For an overview of the Native Ads feature and how it will work, read this doc.

If you would like to take a look at the code, check out my fork of Hivemind on GitHub. The project is still in pre-alpha and is dependent on the full release of Hivemind Communities, so I haven't started publishing code yet.