RE: Text analytics reveal thirty two percent of comments on hive are not unique and at least ten percent add no value to discussion

in Hive Statistics, 5 months ago

PBI is useful, but can be a royal pain to use. I think for this, I'll have a data pipeline something like...

Each week, grab the comments table for the most recent period. (sql)
Append it to the existing data. (python)
Complete my feature engineering and analysis. (pbi)
Join it to any other relevant data (python)
Identify any new edge cases. (my brain)
Run the analytical compute. (probably python)
Update the table, refresh the dashboard, and publish the interesting bits on Hive. (not sure what tech I'll use)
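
For what it's worth, the two python steps (append, then join) could look roughly like the sketch below. It uses plain lists of dicts so it runs without any extra packages. The column names (`comment_id`, `author`, `body`, `reputation`), the helper names and the sample rows are all made up for illustration; the real comments table will look different.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def append_weekly(existing: List[Row], weekly: Iterable[Row],
                  key: str = "comment_id") -> List[Row]:
    """Append the weekly extract to the existing rows, skipping keys already present."""
    seen = {row[key] for row in existing}
    combined = list(existing)
    for row in weekly:
        if row[key] not in seen:
            seen.add(row[key])
            combined.append(row)
    return combined


def left_join(rows: List[Row], lookup: Dict[str, Row], on: str) -> List[Row]:
    """Left-join extra columns onto each row; unmatched rows are kept unchanged."""
    return [{**row, **lookup.get(row[on], {})} for row in rows]


# Hypothetical sample data, not real Hive comments.
existing = [{"comment_id": "c1", "author": "alice", "body": "Nice post"}]
weekly = [
    # Overlaps with the previous pull, so it should not be appended twice.
    {"comment_id": "c1", "author": "alice", "body": "Nice post"},
    {"comment_id": "c2", "author": "bob", "body": "Thanks for sharing"},
]
author_info = {"bob": {"reputation": "52"}}  # assumed side table keyed by author

combined = append_weekly(existing, weekly)
joined = left_join(combined, author_info, on="author")
```

Skipping rows by key on append means an overlapping weekly pull won't double count comments, which matters if the extract window isn't perfectly aligned week to week.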