Perhaps the way to go is to cluster and categorise posts using traditional NLP before passing them to an LLM. Communities also offer a natural way to split the data, e.g. HIVE Gaming has:
- Reviews
- Stream Replays
- Lists of games
- Narratives of people playing through games
- Videos
- Screenshot compilations
With clusters and a certainty score for each post against each cluster, the output is simpler and more human-readable. Author tags may also help with categorisation.
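
As a rough illustration of the idea, here is a minimal sketch in plain Python: TF-IDF vectors, a small spherical k-means, and a softmax over cosine similarities to get one certainty score per cluster for each post. Everything in it is an assumption made for illustration: the sample posts, the function names, the choice of seeding clusters from specific posts, and the `temperature` value. A real pipeline would more likely use an established NLP library and far more data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def normalise(vec):
    """Scale a sparse vector (dict) to unit length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else dict(vec)


def tfidf(posts):
    """Return one unit-length sparse TF-IDF vector per post."""
    counts = [Counter(tokenize(p)) for p in posts]
    df = Counter(term for c in counts for term in c)
    n = len(posts)
    return [
        normalise({t: tf * (math.log((1 + n) / (1 + df[t])) + 1)
                   for t, tf in c.items()})
        for c in counts
    ]


def cosine(a, b):
    """Dot product of two sparse unit vectors."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def mean_vector(vectors):
    """Average direction of a group of sparse vectors."""
    total = Counter()
    for vec in vectors:
        for t, v in vec.items():
            total[t] += v
    return normalise(dict(total))


def cluster(vectors, seed_indices, iterations=10):
    """Spherical k-means; centroids start at the posts in seed_indices."""
    centroids = [vectors[i] for i in seed_indices]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for vec in vectors:
            sims = [cosine(vec, c) for c in centroids]
            groups[sims.index(max(sims))].append(vec)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean_vector(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids


def certainty_scores(vec, centroids, temperature=0.1):
    """Softmax over cosine similarities: one score per cluster, summing to 1."""
    sims = [cosine(vec, c) / temperature for c in centroids]
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sample posts, not real HIVE data.
posts = [
    "Review: the new racing game has great graphics and a solid score",
    "Stream replay of last night's racing session",
    "My review of the puzzle game, graphics and score breakdown",
    "Replay of the weekend stream, full session",
]
vectors = tfidf(posts)
centroids = cluster(vectors, seed_indices=[0, 1])
scores = [certainty_scores(v, centroids) for v in vectors]
for post, row in zip(posts, scores):
    print([round(x, 2) for x in row], post)
```

Each printed row is readable on its own: one number per cluster, with the largest marking the most likely category. Author tags could be folded in by appending them to the post text before vectorising, or by using tagged posts as the cluster seeds.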