In this project, I used Python to analyze the text of tweets. The project has three parts. First, I used a vocabulary of sentiment words to score each tweet. Second, I used those tweet scores to enrich the vocabulary. Third, I compared tweets’ sentiment scores across the locations they were sent from. In my sample, tweets sent from Switzerland tended to be more positive than tweets sent from other places.
For the first step, I fetched data through the Twitter API and extracted the text of each tweet with Python. I then loaded the tweets and the sentiment vocabulary into dictionaries and used the dictionary keys to look up the score of each word in a tweet. The final score for each tweet was the average of its words’ scores.
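The scoring step could be sketched roughly as follows. This is an illustration, not the original code: the vocabulary entries, the `tokenize` and `score_tweet` names, and the choice to count unknown words as 0 in the average are all assumptions.

```python
import re

# Hypothetical sentiment vocabulary: word -> score (values invented for illustration).
SENTIMENT = {"good": 3, "great": 4, "bad": -3, "terrible": -4}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def score_tweet(text, vocabulary):
    """Average the vocabulary scores over all words in the tweet.

    Assumption: words missing from the vocabulary contribute a score of 0.
    """
    words = tokenize(text)
    if not words:
        return 0.0
    return sum(vocabulary.get(word, 0) for word in words) / len(words)


if __name__ == "__main__":
    print(score_tweet("Good game, great goal", SENTIMENT))
```

Averaging only over the words found in the vocabulary is an equally plausible reading. It would make scores less sensitive to tweet length.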
For the second step, I used the tweets’ sentiment scores to enlarge the sentiment vocabulary. For example, “soccer” appeared in many tweets but was not in the vocabulary. I selected the tweets containing “soccer” and counted whether more of them had positive or negative scores. I then took the average score of the majority group as the score for the new word and added it to the vocabulary.
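A minimal sketch of this majority rule, assuming each tweet is already available as a `(text, score)` pair. The function name `derive_word_score`, the whitespace-based word matching, and returning `None` when there is no clear majority are assumptions made for illustration.

```python
def derive_word_score(word, scored_tweets):
    """Estimate a score for a word that is missing from the vocabulary.

    scored_tweets: list of (text, score) pairs.
    Returns the mean score of the majority group (positive or negative)
    among tweets containing the word, or None when there is no majority.
    """
    scores = [
        score for text, score in scored_tweets
        if word in text.lower().split()  # simple match; ignores punctuation
    ]
    positive = [s for s in scores if s > 0]
    negative = [s for s in scores if s < 0]
    if len(positive) == len(negative):
        # Assumption: a tie (including no matching tweets) yields no score.
        return None
    majority = positive if len(positive) > len(negative) else negative
    return sum(majority) / len(majority)


if __name__ == "__main__":
    tweets = [
        ("soccer is great", 1.0),
        ("love soccer", 2.0),
        ("soccer was bad", -1.0),
        ("nothing here", 3.0),
    ]
    vocabulary = {}
    new_score = derive_word_score("soccer", tweets)
    if new_score is not None:
        vocabulary["soccer"] = new_score
    print(vocabulary)
```

Here two of the three tweets mentioning the word are positive, so the new entry is the mean of those two positive scores.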
For the third step, I used the Twitter API to extract the location of each tweet and calculated the average score for each location. Because the Twitter API delivers data as a continuous stream, I loaded about 2,000 tweets for the analysis.
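The per-location averaging might look like the sketch below. It assumes the stream has already been reduced to `(location, score)` pairs; how the location is read from the API response is not shown here, and the function name, the `limit` parameter, and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import islice


def average_score_by_location(records, limit=2000):
    """Average tweet scores per location over the first `limit` records.

    records: iterable of (location, score) pairs, possibly an endless stream.
    Records without a location are skipped (an assumption).
    """
    totals = defaultdict(lambda: [0.0, 0])  # location -> [score sum, count]
    for location, score in islice(records, limit):
        if not location:
            continue
        totals[location][0] += score
        totals[location][1] += 1
    return {loc: total / count for loc, (total, count) in totals.items()}


if __name__ == "__main__":
    sample = [
        ("Switzerland", 2.0),
        ("Switzerland", 1.0),
        ("France", -1.0),
        (None, 5.0),
    ]
    print(average_score_by_location(sample))
```

Using `islice` caps the number of records consumed, which matters when the input is a stream that never ends on its own.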
Data Source: Twitter API.
Code: click here.