sperea.es
Creating a bot to autopost RSS feeds on Bluesky with GNU/Linux

Following our previous article, "Creating a bot to autopost RSS feeds on Mastodon with GNU/Linux", we're taking the next step: creating the same bot for Bluesky. Ready to dive in? Let's go!

Given the growing popularity of Bluesky, many are interested in leveraging its platform for various purposes. In this guide, I'll walk you through the process of creating a Python script that automatically publishes posts from an RSS feed to Bluesky.

Pre-requisites:

  • A Bluesky account.
  • An RSS feed URL from which you'd like to fetch and post content.
  • An app password for your Bluesky account, which you can create as follows:
    • Open your account settings on Bluesky.
    • Navigate to the section named "App Passwords".
    • Click on "Create App Password".
    • Once presented with the app password, save it securely. If lost, you can always generate a new one.
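
One way to keep the app password out of the script itself is to read it from an environment variable. The sketch below assumes a variable named BLUESKY_APP_PASSWORD, a name I chose for this example. It defines the get_app_password() helper that the master function later in this guide calls.

```python
import os


def get_app_password(env_var="BLUESKY_APP_PASSWORD"):
    """Return the app password stored in an environment variable.

    The variable name is an assumption made for this example.
    """
    password = os.environ.get(env_var, "").strip()
    if not password:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return password
```

On GNU/Linux you could export the variable in the shell that launches the bot, so the password never lands in version control.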

Bluesky assigns every user a DID (Decentralized Identifier) for identification purposes. The following function resolves your handle to its DID:

import urllib3
import json

def get_did():
    http = urllib3.PoolManager()
    HANDLE = "your_bluesky_handle_here"
    DID_URL = "https://bsky.social/xrpc/com.atproto.identity.resolveHandle"
    did_resolve = http.request("GET", DID_URL, fields={"handle": HANDLE})
    return json.loads(did_resolve.data)["did"]
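
The function above assumes the lookup succeeds. A slightly more defensive sketch separates the parsing step so that a failed lookup produces a clear error. Here parse_did_response is a hypothetical helper, and the check that a DID starts with "did:" reflects the general DID syntax rather than anything specific to Bluesky.

```python
import json


def parse_did_response(status, body):
    """Extract the DID from a resolveHandle response, or raise on failure."""
    payload = json.loads(body)
    did = payload.get("did", "")
    if status != 200 or not did.startswith("did:"):
        raise RuntimeError(f"Handle lookup failed ({status}): {payload}")
    return did
```

Inside get_did() it could be called as parse_did_response(did_resolve.status, did_resolve.data).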

Obtaining an API Key:

Every action on Bluesky that requires user authentication needs an API key. In Bluesky's API, this key is the access token (accessJwt) returned when a session is created. It is unique and temporary, and it authenticates the bot (or user) and authorizes it to perform specific actions, such as creating posts.

It is, essentially, the bridge between your bot and Bluesky, ensuring that your bot's actions are secure and tied to your specific Bluesky account.

Why is it necessary?

Security is paramount in online interactions, especially when it comes to automated processes like bots. By requiring bots to use an API key, Bluesky ensures that:

  • Authentication: The bot is recognized as a legitimate entity with the rights to interact with the platform on the user's behalf.
  • Authorization: The bot can only perform actions it's allowed to, preventing potential malicious or unintended operations.
  • Session Management: API keys often have a lifespan. Once expired, they need to be renewed. This provides an extra layer of security by limiting the time window a key is valid.

How to get the API key?

The API key is obtained by combining two essential pieces of information:

  • DID (Decentralized Identifier): A unique identifier assigned to each Bluesky user.
  • App Password: A password generated from the Bluesky settings, specifically for application-level interactions.

These pieces of information are sent to Bluesky's server, which then returns an API key for the bot to use.


import urllib3
import json

def get_api_key(did, app_password):
    http = urllib3.PoolManager()  # Initializes a pool manager for HTTP requests
    API_KEY_URL = "https://bsky.social/xrpc/com.atproto.server.createSession"  # The endpoint to request the API key

    # Data to be sent to the server
    post_data = {
        "identifier": did,  # The user's DID
        "password": app_password  # The app password generated earlier
    }

    headers = {
        "Content-Type": "application/json"  # Specifies the format of the data being sent
    }

    # Send a POST request with the required data to obtain the API key
    api_key_response = http.request("POST", API_KEY_URL, headers=headers, body=json.dumps(post_data))

    # Parse the response to extract the API key
    return json.loads(api_key_response.data)["accessJwt"]
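
Since the key has a limited lifespan, a long-running bot may want to know when it is about to expire. The sketch below assumes the accessJwt is a standard JSON Web Token that carries an "exp" claim, as its name suggests. It only decodes the payload and does not verify the signature, and both function names are hypothetical.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Return the 'exp' claim (Unix seconds) from a JWT, without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]


def needs_refresh(token, margin_seconds=60, now=None):
    """Return True if the token expires within margin_seconds."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now <= margin_seconds
```

For a bot that runs once per invocation, simply calling get_api_key() at the start of every run is usually enough.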

Fetching Content from the RSS Feed:

Now that our Bluesky bot is authenticated, it needs content to share. Ideally, this content should be current, relevant, and engaging for your audience. One way to source such content is through RSS feeds, which are a chronologically ordered stream of content from blogs, news websites, and other online publishers.

Understanding RSS Feeds:

RSS, or "Really Simple Syndication," is a format for delivering regularly changing web content. Many content publishers provide an RSS feed to allow people to subscribe to it easily.

Each item within an RSS feed can contain a full or summarized text, plus metadata, like publishing date, author's name, etc.
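
To make that structure concrete, here is a small sketch that reads a made-up RSS 2.0 document with Python's standard library. The sample feed and the list_items helper are invented for illustration; the rest of this guide uses feedparser instead.

```python
import xml.etree.ElementTree as ET

SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first-post</link>
      <pubDate>Mon, 02 Oct 2023 10:00:00 GMT</pubDate>
      <description>A short summary.</description>
    </item>
  </channel>
</rss>"""


def list_items(rss_text):
    """Return (title, link, pubDate) for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title"), item.findtext("link"), item.findtext("pubDate"))
        for item in root.iter("item")
    ]
```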

Why Use RSS Feeds?

  • Automation and Efficiency: Instead of manually seeking out blog posts or articles to share, your bot can automatically fetch this content from a predetermined RSS feed. This not only saves time but ensures regular activity on your Bluesky account.

  • Fresh, Relevant Content: RSS feeds are typically updated whenever the publisher releases something new, ensuring your bot shares the most recent posts. This relevance is vital for maintaining audience engagement.

  • Customization: You can select RSS feeds that align with your interests or your audience's preferences, ensuring that shared content resonates with your followers.

Fetching the RSS Feed with Python:

To retrieve content from an RSS feed, we'll use the third-party 'feedparser' library for Python (installable with pip install feedparser), an excellent tool for parsing RSS feeds:


import feedparser

def get_rss_content():
    # The URL of the RSS feed you want to connect to
    feed_url = "your_rss_feed_url_here"

    # Parse the RSS feed
    feed = feedparser.parse(feed_url)

    # If you plan to post the latest content, it's usually the first entry in the feed
    if not feed.entries:
        raise RuntimeError(f"No entries found in feed: {feed_url}")
    latest_post_title = feed.entries[0].title
    latest_post_link = feed.entries[0].link

    # You can further expand this by including other details like the post's published date,
    # author, summary, etc., depending on your needs.

    return latest_post_title, latest_post_link

This function fetches and parses the RSS feed, retrieving the latest post's title and link. Depending on your bot's purpose, you might want to share additional information or choose a different criterion for selecting which post to share.

Handling Multiple Feeds and Posts:

For a more advanced bot, you might want to fetch multiple posts or subscribe to several RSS feeds. In such cases, you'd iterate over the entries in the 'feed.entries' list or over a list of RSS feed URLs, collecting the content you want to share on Bluesky. Be mindful of potential rate limits or content volume to ensure your bot shares content at an appropriate and respectful pace.
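
A bot that handles several entries also needs to remember what it has already posted, or it will repeat itself on every run. A minimal sketch, assuming each entry exposes a "link" key (feedparser entries behave like dictionaries) and that a small JSON file is an acceptable place to keep state. The helper names and the limit of three posts per run are my own choices.

```python
import json
from pathlib import Path


def load_seen(path):
    """Load the set of already-posted links from a JSON file."""
    file = Path(path)
    return set(json.loads(file.read_text())) if file.exists() else set()


def save_seen(path, seen):
    """Store the set of posted links as a sorted JSON list."""
    Path(path).write_text(json.dumps(sorted(seen)))


def select_new_entries(entries, seen, limit=3):
    """Return up to `limit` entries whose link has not been posted yet.

    Feeds usually list the newest entry first, so the selection is reversed
    to publish the oldest of the chosen entries first.
    """
    fresh = [e for e in entries if e.get("link") and e["link"] not in seen]
    return list(reversed(fresh[:limit]))
```

After each successful post, add the entry's link to the set and call save_seen() so the next run skips it.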

By automating content curation, your Bluesky bot becomes a channel for consistent, interesting content that can captivate your audience. With the basics of RSS feed parsing in place, the bot is one step closer to functioning autonomously, providing valuable engagement on your Bluesky account.


Publishing the Content on Bluesky:

Once we've fetched the desired content from the RSS feed, the next step is to structure and publish this content on Bluesky. With our established authentication and understanding of how Bluesky's API functions, this process becomes a seamless extension of our bot's tasks.

Structuring the Content for Bluesky:

Bluesky, like many social platforms, expects content in a specific structure. This means we need to take the raw content we fetched from the RSS feed and format it in a way that Bluesky's API can accept and publish.
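
Part of that structure is a length limit: at the time of writing, Bluesky's post schema allows up to 300 graphemes of text. The sketch below shortens a long title so the whole post fits. It counts Python characters rather than true graphemes, which is an approximation, and fit_post_text is a hypothetical helper.

```python
MAX_POST_LENGTH = 300  # Assumed limit; counted here in characters, not graphemes


def fit_post_text(title, link, limit=MAX_POST_LENGTH):
    """Build the post text, shortening the title if the result would be too long."""
    suffix = f"\n\nRead more: {link}"
    room = limit - len(suffix)
    if room <= 0:
        raise ValueError("The link alone does not fit in a post")
    if len(title) > room:
        # Leave one character for the ellipsis that marks the cut
        title = title[: room - 1].rstrip() + "…"
    return title + suffix
```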

A Simple Post on Bluesky:

Given the rich media capabilities of Bluesky, a typical post might include a title, link, and a brief description or excerpt from the content. Let's consider a function that prepares the post:

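from datetime import datetime, timezone

def prepare_post_for_bluesky(title, link):
    """Convert the RSS content into a format suitable for Bluesky."""

    # The post's body text
    post_text = f"{title}\n\nRead more: {link}"

    # The post structure for Bluesky
    post_structure = {
        "$type": "app.bsky.feed.post",
        "text": post_text,
        # Post records need a creation timestamp in ISO 8601 format (UTC)
        "createdAt": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
        "embed": {
            "$type": "app.bsky.embed.external",   # Bluesky's format for an external link card
            "external": {
                "uri": link,
                "title": title,
                "description": ""
            }
        }
    }

    return post_structure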

This function takes the title and link from our RSS content and creates a text post with an embedded link.
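
One detail worth knowing: Bluesky does not turn a URL in the post text into a clickable link on its own. The text needs a "facet" that marks the byte range of the link. Here is a sketch, assuming the app.bsky.richtext.facet#link feature type and UTF-8 byte offsets; link_facet is a hypothetical helper, not part of any library.

```python
def link_facet(post_text, link):
    """Return a facet marking where `link` appears in `post_text`, or None."""
    start_char = post_text.find(link)
    if start_char == -1:
        return None
    # Facet offsets are counted in UTF-8 bytes, not in characters.
    byte_start = len(post_text[:start_char].encode("utf-8"))
    byte_end = byte_start + len(link.encode("utf-8"))
    return {
        "index": {"byteStart": byte_start, "byteEnd": byte_end},
        "features": [{"$type": "app.bsky.richtext.facet#link", "uri": link}],
    }
```

If the helper returns a facet, it can be attached to the post with post_structure["facets"] = [facet].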

Publishing on Bluesky:

Now that we have the post structured correctly, it's time to send it to Bluesky:

def publish_on_bluesky(post_structure, did, key):
    """Publish the structured post on Bluesky."""

    http = urllib3.PoolManager()   # Initializes a pool manager for HTTP requests
    post_feed_url = "https://bsky.social/xrpc/com.atproto.repo.createRecord"  # The endpoint to post on Bluesky

    # The complete record for the Bluesky post, including our structured content
    post_record = {
        "collection": "app.bsky.feed.post",
        "repo": did,    # The unique DID of our account
        "record": post_structure
    }

    headers = {
        "Content-Type": "application/json",       # Specifies the format of the data being sent
        "Authorization": f"Bearer {key}"          # The API key for authenticated posting
    }

    # Send a POST request to publish the post on Bluesky
    post_request = http.request("POST", post_feed_url, body=json.dumps(post_record), headers=headers)

    # Parse the response for any necessary information, such as post ID or confirmation status
    response = json.loads(post_request.data)

    return response

With these two functions, we've taken content from an RSS feed, structured it, and then published it on Bluesky.
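
A successful createRecord response contains the new record's "uri" (an at:// address) and "cid". For logging, it can be handy to turn that into a link you can open in a browser. The sketch below assumes the bsky.app URL layout of /profile/<did>/post/<record key>, and post_web_url is a hypothetical helper.

```python
def post_web_url(response):
    """Turn the 'uri' from a createRecord response into a bsky.app link."""
    uri = response.get("uri", "")
    if not uri.startswith("at://"):
        # Error responses carry "error" and "message" instead of "uri"
        raise RuntimeError(f"Unexpected response: {response}")
    repo, _collection, record_key = uri[len("at://"):].split("/")
    return f"https://bsky.app/profile/{repo}/post/{record_key}"
```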

Things to Keep in Mind:

  • Rate Limiting: Ensure that your bot respects any rate limits imposed by Bluesky to avoid being flagged or banned.
  • Engagement: While automation is efficient, ensure the content remains relevant and engaging to your audience. Consider adding functionality to curate or filter the fetched content before posting.
  • Error Handling: Ensure robust error handling. For instance, if there's an issue with the RSS feed or with Bluesky's API, your bot should be able to identify the problem and retry or fail gracefully.
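
As a starting point for the error-handling advice above, here is a small retry helper with exponential backoff. It is a generic sketch: with_retries is a name I made up, and the number of attempts and delays are arbitrary defaults you should tune.

```python
import time


def with_retries(action, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call action(); on failure wait base_delay, 2*base_delay, ... and retry."""
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)
```

For example, with_retries(get_rss_content) would retry a flaky feed fetch twice before giving up.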

Bringing it All Together:

The objective is to create a Bluesky bot that fetches the latest content from an RSS feed and then publishes it on the platform. We've established different components, each responsible for a specific task. Let's now stitch them together into a cohesive workflow.

  1. Initialization:

Start by importing the necessary libraries and setting global constants:

import feedparser
import urllib3
import json

# Constants
FEED_URL = "your_rss_feed_url_here"
BLUESKY_API_ENDPOINT = "https://bsky.social/xrpc/com.atproto.repo.createRecord"

  2. The Master Function:

This function will govern the bot's actions, calling upon the other functions to fetch content from the RSS feed, structure it for Bluesky, and publish it:

def bluesky_rss_bot():
    # Authenticate and obtain necessary credentials
    app_password = get_app_password()
    did = get_did()
    key = get_api_key(did, app_password)

    # Fetch content from the RSS feed
    post_title, post_link = get_rss_content()

    # Prepare the fetched content for Bluesky
    post_structure = prepare_post_for_bluesky(post_title, post_link)

    # Publish the content on Bluesky
    response = publish_on_bluesky(post_structure, did, key)

    # Optional: Return the response or post ID for logging or further actions
    return response

  3. Scheduling the Bot:

To make the bot run periodically, consider using a scheduler. If you're deploying on AWS Lambda, you can set a CloudWatch Events (now Amazon EventBridge) trigger to run your function at specified intervals. Alternatively, you can use the third-party schedule library for Python or a cron job to execute the bot periodically.
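
If you would rather not depend on an external scheduler, a plain loop with a pause between runs also works. This is only a sketch: run_periodically is a hypothetical helper, the 30-minute default is arbitrary, and the max_runs parameter exists mainly so the loop can be tested.

```python
import time


def run_periodically(job, interval_seconds=1800, max_runs=None, sleep=time.sleep):
    """Run job() every interval_seconds; max_runs=None means run without limit."""
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Skip the pause after the final run
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

Calling run_periodically(bluesky_rss_bot) keeps the bot posting every half hour until the process is stopped.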

  4. Monitoring and Logging:

Ensure that the bot is performing as expected by implementing logging. Any issues, such as failed posts or RSS feed fetch errors, should be logged and, if possible, notified to the administrator. This helps in maintaining the bot's health and ensuring consistent content posting.
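
Python's standard logging module covers the basics. The wrapper below is a minimal sketch; the logger name and the run_with_logging helper are my own choices, and notifying an administrator is left out.

```python
import logging

logger = logging.getLogger("bluesky_rss_bot")


def run_with_logging(job):
    """Run job(), logging the outcome; return its result or None on failure."""
    try:
        result = job()
    except Exception:
        # logger.exception records the message together with the traceback
        logger.exception("Bot run failed")
        return None
    logger.info("Bot run succeeded: %s", result)
    return result
```

Call logging.basicConfig(level=logging.INFO, filename="bot.log") once at startup to send these messages to a file, then run the bot with run_with_logging(bluesky_rss_bot).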

  5. Further Enhancements:

    1. Content Variety: Fetch more content types, like images or videos, from the RSS feed and adjust the Bluesky post structure accordingly.
    2. Interactive Features: Implement features like replying to comments or using hashtags for better discoverability on Bluesky.
    3. Error Recovery: If a post fails, store the content and retry later or queue it for the next scheduled post.
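
Finally, a main() entry point ties the functions from this guide together:

def main():
    app_password = get_app_password()
    did = get_did()
    key = get_api_key(did, app_password)
    title, link = get_rss_content()
    post_structure = prepare_post_for_bluesky(title, link)
    response = publish_on_bluesky(post_structure, did, key)
    return response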

This script will fetch the latest post from your RSS feed and publish it to Bluesky. Run the main() function whenever you want to autopost the latest content!

And there you have it: a simple bot that autopublishes posts from an RSS feed to Bluesky! If you have any questions or need further assistance, feel free to reach out.