Store all posts from an author in markdown files

in #steemdev, 5 years ago (edited)

I would find it very convenient to have all my written posts in one folder on my hard drive. The content of each file should be identical to the blockchain data. I did not find a tool for this task, so I wrote a Python script.

The script does the following:

  • it reads the blog section of the given author (limited to the newest 500 posts)
  • it skips resteemed posts
  • it extracts the title, timestamp and permlink and stores them as YAML front matter at the top of the md file
  • it saves the content as a markdown file
#!/usr/bin/python
from beem import Steem
from beem.comment import Comment
from beem.account import Account
import os
import io
import argparse


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("author")
    parser.add_argument("path")
    args = parser.parse_args()
    author = args.author
    path = args.path
    stm = Steem(node="https://api.steemit.com")
    account = Account(author, steem_instance=stm)
    for comment in account.get_blog(limit=500):
        if comment["author"] != author:
            continue
        markdown_content = comment.body
        title = comment.title
        timestamp = comment.json()["created"]
        author = comment["author"]
        permlink = comment["permlink"]
        yaml_prefix = '---\n'
        yaml_prefix += 'title: %s\n' % title
        yaml_prefix += 'date: %s\n' % timestamp
        yaml_prefix += 'permlink: %s\n' % permlink
        yaml_prefix += 'author: %s\n---\n' % author
        filename = os.path.join(path, timestamp.split('T')[0] + '_' + permlink + ".md")
        
        with io.open(filename, "w", encoding="utf-8") as f:
            f.write(yaml_prefix + markdown_content)

Save this file as save_posts_as_md.py. beem needs to be installed. The script takes two parameters:

python save_posts_as_md.py <author> <path>

The posts from the author are stored in the given path. The filename consists of the date and the permlink.
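
The front matter and filename logic can also be sketched as two small helpers that run without beem. This is only a sketch: build_front_matter and build_filename are hypothetical names, not part of the script or of beem, and quoting the values with json.dumps is my assumption (YAML 1.2 is designed as a superset of JSON, so a JSON string should be read as a double-quoted YAML scalar). The quoting keeps a title that contains ':' or '#' parseable.

```python
import json
import os


def build_front_matter(title, created, permlink, author):
    """Return a YAML front matter block for one post (hypothetical helper).

    Each value is quoted with json.dumps so that characters such as ':'
    or '#' in a title do not break the YAML.
    """
    fields = [
        ("title", title),
        ("date", created),
        ("permlink", permlink),
        ("author", author),
    ]
    lines = ["---"]
    for key, value in fields:
        lines.append("%s: %s" % (key, json.dumps(str(value), ensure_ascii=False)))
    lines.append("---")
    return "\n".join(lines) + "\n"


def build_filename(path, created, permlink):
    """Return <path>/<YYYY-MM-DD>_<permlink>.md, the naming used by the script."""
    return os.path.join(path, created.split("T")[0] + "_" + permlink + ".md")
```

In the script, the value of build_front_matter(...) would take the place of yaml_prefix and build_filename(...) the place of the os.path.join line.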

Result for holger80

python save_posts_as_md.py holger80 .

Viewing the markdown files works best with the great Typora editor.

Storing one post as a markdown file

#!/usr/bin/python
from beem import Steem
from beem.comment import Comment
from beem.account import Account
import io
import argparse


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("authorperm")
    parser.add_argument("filename")
    args = parser.parse_args()
    authorperm = args.authorperm
    filename = args.filename
    stm = Steem(node="https://api.steemit.com")
    comment = Comment(authorperm, steem_instance=stm)
    markdown_content = comment.body
    title = comment.title
    timestamp = comment["created"]
    author = comment["author"]
    yaml_prefix = '---\n'
    yaml_prefix += 'title: %s\n' % title
    yaml_prefix += 'date: %s\n' % str(timestamp)
    yaml_prefix += 'author: %s\n---\n' % author
    
    
    with io.open(filename, "w", encoding="utf-8") as f:
        f.write(yaml_prefix + markdown_content)

Save this script as save_post_as_md.py. It takes an authorperm and a filename:

python save_post_as_md.py @holger80/how-to-post-using-typora-and-beempy how-to-post-using-typora-and-beempy.md

I like this idea, @holger80!

What would be a nice addition is maybe for the script to attempt to save local copies of all images used in the blog as well. So basically saving a backup of all markdown and images from a given Steem blog.

I've lost count of the number of image links I have used and last thing I would want to happen is the respective hosts serving up those images shutting down and then my blog being a graveyard of 'X' icons.

And, even if that does happen, with locally saved copies, we would be able to edit a good link back in no problem. :)

What do you think?

Good idea, I will work on this.
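
A possible starting point for the image backup suggested above, as a rough sketch only. It collects image URLs from a post body and derives a local filename for each one. The helper names find_image_urls and local_image_name are hypothetical, and the regular expressions only cover the Markdown `![alt](url)` form and HTML img tags, not bare image links.

```python
import os
import re
from urllib.parse import urlparse

# Markdown image syntax ![alt](url) and HTML <img src="url"> tags.
MD_IMAGE = re.compile(r'!\[[^\]]*\]\(\s*([^)\s]+)')
HTML_IMAGE = re.compile(r'<img[^>]+src=["\']([^"\']+)["\']', re.IGNORECASE)


def find_image_urls(markdown_text):
    """Return image URLs without duplicates (Markdown images first, then HTML)."""
    urls = MD_IMAGE.findall(markdown_text) + HTML_IMAGE.findall(markdown_text)
    unique = []
    for url in urls:
        if url not in unique:
            unique.append(url)
    return unique


def local_image_name(url, index):
    """Build a local filename from the URL path, prefixed with a running index."""
    name = os.path.basename(urlparse(url).path) or "image"
    return "%03d_%s" % (index, name)
```

Each URL could then be fetched with urllib.request.urlretrieve(url, target) from the standard library and stored next to the markdown file.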

FYI:

Because of the default max_feed_size setting, get_blog can return at most 500 items for an author. If an author has more than 500 articles, the earliest ones cannot be obtained with this method.

The history call would allow returning all posts from an account, but would take some more time to be finished.
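
A sketch of how the account history could be reduced to the author's own posts. It assumes every history entry is a dict with the keys type, author, permlink and parent_author, which is what I would expect from beem's Account.history(only_ops=["comment"]), but treat that layout as an assumption and check it against real output. own_post_permlinks is a hypothetical helper name.

```python
def own_post_permlinks(ops, author):
    """Return permlinks of top-level posts by `author`, in the order supplied.

    `ops` is any iterable of operation dicts (assumed layout, see above).
    """
    permlinks = []
    seen = set()
    for op in ops:
        if op.get("type") != "comment":
            continue
        # Replies carry a non-empty parent_author; top-level posts do not.
        if op.get("parent_author"):
            continue
        if op.get("author") != author:
            continue
        permlink = op["permlink"]
        # An edited post appears once per edit, so keep only the first hit.
        if permlink not in seen:
            seen.add(permlink)
            permlinks.append(permlink)
    return permlinks
```

Each permlink could then be loaded with Comment("@%s/%s" % (author, permlink), steem_instance=stm), as in the single-post script, so that the stored body is the current version of the post and not an edit operation.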

Cool, I got all my posts saved in 3s.

it reads the blog section of the given author (limited to the newest 500 posts)

But I guess real bloggers would love to save their complete history (idea for v2.0 ;-) )

I will work on an account history version.

Thanks, I will try this.

I have one 0.18$ upvote for comment writer :)
$rewarding bounty 100% 2days

Congratulations to the following winner(s) of the bounty (The upvote value is distributed to the winner(s) by setting beneficiaries for this comment):

The bounty is set. When the post is 2.00 days old, a comment is created and upvoted with a 100.00% vote from holger80. The beneficiaries of this comment are distributed among all top-level comment authors of this post. The comments are weighted by the upvotes they receive from the creator and other readers. When no comment is created or no comment is upvoted, no comment from rewarding is created.

I will try to do as you suggest in your post for my blog, as I wanted to have a digest of my blog stored.
I just don’t have a clue how this markdown thing works. I can’t code, let alone read code.
Wish me luck!

lovely, another great use case for your libs :)

@holger80 I get the following when I try to run (I did pip3 -U beem):
randy@apollo:~/beemcode$ python3 save_post_as_md.py mytechtrail .
Traceback (most recent call last):
  File "save_post_as_md.py", line 17, in <module>
    comment = Comment(authorperm, steem_instance=stm)
  File "/home/randy/.local/lib/python3.6/site-packages/beem/comment.py", line 58, in __init__
    [author, permlink] = resolve_authorperm(authorperm)
  File "/home/randy/.local/lib/python3.6/site-packages/beem/utils.py", line 159, in resolve_authorperm
    raise ValueError("Invalid identifier")
ValueError: Invalid identifier

Any ideas?
Is this the correct place to ask for help?

You mixed up both scripts.
The script save_post_as_md.py is for saving one post. You have to copy the first one in my post.
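
For illustration, the single-post script expects an identifier of the form @author/permlink, so a bare account name such as mytechtrail is rejected. The check below is a hypothetical stand-in written for this example, not beem's resolve_authorperm, and the character set allowed by the pattern is my assumption.

```python
import re

# '@' is optional; author and permlink are separated by a single '/'.
AUTHORPERM = re.compile(r"^@?([a-z0-9.\-]+)/([a-z0-9\-]+)$")


def split_authorperm(identifier):
    """Split '@author/permlink' into (author, permlink) or raise ValueError."""
    match = AUTHORPERM.match(identifier)
    if match is None:
        raise ValueError("expected '@author/permlink', got %r" % identifier)
    return match.group(1), match.group(2)
```

Calling split_authorperm("mytechtrail") raises ValueError because the permlink part is missing, which mirrors the traceback above.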

Oops, thanks for the quick response and catching my error.

Do you have a github repo for all these awesome python programs you are writing?

Now this is a really useful script! Perhaps next you could write an expansion to convert them to another file format for easy reading. There are tons of python libraries for file formats, even doc files and such.
