Remove Usernames & HTTP Links From Tweet Data

Here we have tweet data in a dataframe column. We declare a function that uses regex to remove any words that start with '@' (usernames) or 'http' (links). We then use Pandas apply to pass each tweet in the dataframe to the function, which returns the cleaned text.

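import re

def remove_usernames_links(tweet):
    # Remove usernames: '@' followed by any run of non-whitespace characters
    tweet = re.sub(r'@\S+', '', tweet)
    # Remove links: 'http' (or 'https') followed by non-whitespace characters
    tweet = re.sub(r'http\S+', '', tweet)
    return tweet

# Apply the function to every tweet in the column
df['tweet'] = df['tweet'].apply(remove_usernames_links)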
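
The function above leaves extra spaces where a username or link used to be. As a rough sketch of one way to tidy that up, the variant below merges both patterns into a single compiled regex and then collapses the leftover whitespace. The names clean_tweet, USER_OR_LINK and the sample tweets are hypothetical and not part of the original snippet; it runs on a plain list so it can be tried without a dataframe.

```python
import re

# Assumption: one compiled pattern for both cases, '@' or 'http'
# followed by any run of non-whitespace characters.
USER_OR_LINK = re.compile(r'(?:@|http)\S+')

def clean_tweet(tweet):
    """Hypothetical helper: strip usernames and links, then tidy whitespace."""
    tweet = USER_OR_LINK.sub('', tweet)
    # Removing tokens leaves doubled or leading spaces; collapse and trim them.
    return re.sub(r'\s+', ' ', tweet).strip()

# Made-up sample tweets for illustration only.
samples = [
    "@alice check this out http://example.com/page now",
    "no handles here",
]
cleaned = [clean_tweet(t) for t in samples]
print(cleaned)  # ['check this out now', 'no handles here']
```

The same function could be passed to df['tweet'].apply in place of remove_usernames_links. Note that, like the original patterns, it matches '@' anywhere in the text, so an email address such as me@example.com would be cut down to "me".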

By detro - Last Updated Jan. 17, 2021, 8:42 p.m.
