Scrape Reddit Posts Using PMAW & Python

Python
Import and Export

This snippet uses the PMAW library (a multithreaded wrapper for the Pushshift API) to retrieve up to 60,000 posts from the technology subreddit submitted between 1 October 2019 and 1 October 2021. The results are loaded into a pandas dataframe, which is then written to a CSV file.

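import datetime as dt
import os.path as path

import pandas as pd
from pmaw import PushshiftAPI

api = PushshiftAPI()

# Date range as Unix timestamps (naive datetimes are interpreted in local time)
before = int(dt.datetime(2021, 10, 1, 0, 0).timestamp())
after = int(dt.datetime(2019, 10, 1, 0, 0).timestamp())

subreddit = "technology"
limit = 60000  # maximum number of posts to retrieve

# Fetch the submissions and load them into a dataframe
posts = api.search_submissions(subreddit=subreddit, limit=limit, before=before, after=after)
posts_df = pd.DataFrame(posts)

# Write to data/raw/ one level above this script's folder (the folder must already exist)
filepath = path.abspath(path.join(path.dirname(__file__), '..', 'data/raw/technology_posts.csv'))
posts_df.to_csv(filepath, index=False)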
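
The snippet above builds its timestamps from naive datetimes, so the exact range depends on the machine's local timezone, and the to_csv call fails if the data/raw folder does not exist yet. The following is a minimal sketch of two small helpers that could address both points, using only the standard library. The names utc_epoch and ensure_parent_dir are hypothetical and not part of PMAW or pandas, and the temporary output path is used purely for illustration.

```python
import datetime as dt
import os
import tempfile


def utc_epoch(year, month, day):
    """Return the Unix timestamp for midnight UTC on the given date."""
    moment = dt.datetime(year, month, day, tzinfo=dt.timezone.utc)
    return int(moment.timestamp())


def ensure_parent_dir(filepath):
    """Create the parent folder of filepath if needed and return filepath."""
    parent = os.path.dirname(filepath)
    if parent:  # an empty string means the current directory, nothing to create
        os.makedirs(parent, exist_ok=True)
    return filepath


# Timezone-independent equivalents of the before/after values
before = utc_epoch(2021, 10, 1)
after = utc_epoch(2019, 10, 1)

# Illustrative output location in a temporary folder (an assumption for this sketch)
out_path = ensure_parent_dir(
    os.path.join(tempfile.mkdtemp(), "data", "raw", "technology_posts.csv")
)
```

With these in place, before and after can be passed to api.search_submissions unchanged, and the dataframe can be saved with posts_df.to_csv(ensure_parent_dir(filepath), index=False).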

By GregHe1979 - Last Updated Nov. 18, 2021, 6:39 p.m.
