Remove Usernames & HTTP Links From Tweet Data


Here we have tweet data in a DataFrame column. We declare a function that uses regex to remove any words that start with '@' (usernames) or 'http' (links). We then use the pandas apply method to pass each tweet in the column through the function.

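    import re

    def remove_usernames_links(tweet):
        # Remove usernames: '@' followed by any run of non-whitespace characters
        tweet = re.sub(r'@\S+', '', tweet)
        # Remove links: 'http' followed by any run of non-whitespace characters
        tweet = re.sub(r'http\S+', '', tweet)
        return tweet

    # Apply the function to every tweet in the column
    df['tweet'] = df['tweet'].apply(remove_usernames_links)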
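
To see what the function does to individual strings, here is a minimal sketch that runs it on two made-up tweets without needing a dataframe. The sample tweets and the `tidy_whitespace` helper are assumptions added for illustration and are not part of the original snippet. The helper is there because removing a username or link leaves the surrounding spaces behind.

```python
import re

def remove_usernames_links(tweet):
    # Same logic as the snippet: drop '@...' and 'http...' runs
    tweet = re.sub(r'@\S+', '', tweet)
    tweet = re.sub(r'http\S+', '', tweet)
    return tweet

def tidy_whitespace(text):
    # Hypothetical helper: collapse runs of whitespace and trim both ends
    return re.sub(r'\s+', ' ', text).strip()

# Made-up sample tweets for illustration
samples = [
    "@alice check this out http://example.com/page now",
    "no handles or links here",
]

cleaned = [tidy_whitespace(remove_usernames_links(t)) for t in samples]
print(cleaned)
# → ['check this out now', 'no handles or links here']
```

Note that the patterns match '@' and 'http' anywhere in the text, not only at the start of a word, so an email address such as bob@example.com would be cut down to "bob". If that matters for your data, a stricter pattern may be needed.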