
How to parse a tweet text from Twitter using Ruby to parse-out ‘@’ and ‘#’

Well, a lot of us love @twitter and also Ruby, and sometimes work with both :)

I often had to do the following with a tweet:

Take out the ‘@’ (i.e. @replies) and ‘#’ (i.e. hashtags) from a tweet and separate them from the text part.

For example, we have a tweet:

@myfriend1 @myfriend2 this is a sample text #link #text

Now I want this tweet to be separated into the following arrays:

['myfriend1','myfriend2']

['link','text']

and the text only – "this is a sample text"

So first I had to build a RegEx, and then, using the ever-useful .gsub method of Ruby, I created the following:

parsed_text = tweet.text.gsub(/ ?(@\w+)| ?(#\w+)/) { |a| ((a.include?('#')) ? tags : replies) << a.strip.gsub(/#|@/, ''); '' }

So parsed_text holds the final text only. tags is an Array which will contain the hashtags, and replies is an Array which will contain the @replies. Both arrays have to exist (for example, initialized to []) before the gsub call runs.
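For reference, here is a minimal, self-contained sketch of the one-liner, assuming the tweet is a plain String and that tags and replies start out as empty arrays. The name tweet_text is only an illustrative stand-in for tweet.text. Note that the result keeps the space that preceded the text, so a .strip on parsed_text may be wanted.

```ruby
# Hypothetical stand-in for tweet.text, using the sample tweet from above.
tweet_text = '@myfriend1 @myfriend2 this is a sample text #link #text'

# Both arrays must exist before gsub runs, because the block appends to them.
tags    = []
replies = []

parsed_text = tweet_text.gsub(/ ?(@\w+)| ?(#\w+)/) do |match|
  # Pick the target array by the symbol found, then store the bare name.
  (match.include?('#') ? tags : replies) << match.strip.gsub(/#|@/, '')
  '' # the block's return value replaces the match, i.e. removes it
end

p replies      # => ["myfriend1", "myfriend2"]
p tags         # => ["link", "text"]
p parsed_text  # => " this is a sample text"
```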

The RegEx / ?(@\w+)| ?(#\w+)/ matches the hashtags and the @replies (each with an optional leading space), and the block then places them in two separate arrays.

The inner call .gsub(/#|@/, '') only removes the ‘@’ and ‘#’ symbols from the extracted elements.
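To see what the two expressions do on their own, the same pattern can be run through String#scan. This is only an illustrative sketch on the sample tweet, not part of the original snippet, and the variable names are made up for the example. Each match yields one entry per capture group, and the group that did not take part in the match is nil.

```ruby
sample = '@myfriend1 @myfriend2 this is a sample text #link #text'

# scan returns one [reply_group, hashtag_group] pair per match.
pairs = sample.scan(/ ?(@\w+)| ?(#\w+)/)
p pairs
# => [["@myfriend1", nil], ["@myfriend2", nil], [nil, "#link"], [nil, "#text"]]

# The second expression just strips the leading symbol from one element.
bare_reply = '@myfriend1'.gsub(/#|@/, '')
bare_tag   = '#link'.gsub(/#|@/, '')
p bare_reply  # => "myfriend1"
p bare_tag    # => "link"
```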

And you can download it from Gist here http://gist.github.com/78498

Also, while working on the above regular expressions, I found an interesting RegEx testing site called www.rubular.com, which helps you write and test regular expressions very easily.

