Week 10

In the final week of the program, I ran the coreference-resolution code from https://github.com/shon-otmazgin/lingmess-coref. It returns a list of lists of index pairs, where the first number in each pair is the start index and the second is the end index. Each pair spans a noun, pronoun, or noun phrase, and all spans in the same inner list refer to the same entity. The following is an example (after matching the indices to the actual tokens):
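As a rough illustration of that index-to-token matching (not the library's actual output, and with a made-up sentence and cluster), a minimal sketch might look like the following. It assumes the indices are token positions and that the end index is inclusive; both are assumptions, and `clusters_to_text` is a hypothetical helper name.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token positions; end assumed inclusive


def clusters_to_text(tokens: List[str], clusters: List[List[Span]]) -> List[List[str]]:
    """Replace every (start, end) span with the text it covers."""
    return [
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in clusters
    ]


# Made-up sentence and cluster, for illustration only.
tokens = "The Texans won because they defended well .".split()
clusters = [[(0, 1), (4, 4)]]  # one entity mentioned twice

mentions = clusters_to_text(tokens, clusters)
print(mentions)
```

If the indices turn out to be character offsets or the end index is exclusive, only the slicing expression would need to change.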

Week 9

This week I tested how temperature affects GPT annotation performance. I varied the temperature from 0.1 to 1.0. The following figure plots the results (the score is Cohen’s kappa), from which I observe that no single temperature works best for GPT annotation. [Figure: Cohen’s kappa score versus temperature]
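For reference, Cohen’s kappa between two label sequences can be computed as in the sketch below. The labels are made up, and which two sequences were compared at each temperature (two GPT runs, or a GPT run against the gold labels) is not stated here, so this only shows the statistic itself.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both label sequences must be non-empty and equally long")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's own label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators always use the same single label
    return (observed - expected) / (1 - expected)


# Made-up labels from two hypothetical annotation runs.
run_1 = ["in", "in", "out", "out"]
run_2 = ["in", "out", "out", "out"]
kappa = cohen_kappa(run_1, run_2)
print(kappa)
```

Repeating this for each temperature setting would give one score per temperature, which is the kind of series the figure plots.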

Week 8

I set up another experimental task that allows only “in” and “out” options. This time the human and GPT annotations agree 78% of the time. Meanwhile, the Cohen’s kappa score among multiple runs of GPT annotation is 0.58. Since we are masking and predicting ingroup/outgroup/third-party entities, I calculated the number of usable comments in the dataset. To provide enough context, we keep only comments with at least five words. There are 5019538 comments in total from paired posts. Among them, there are 190963 third-party mentions, 225591 ingroup mentions, and 201075 outgroup mentions. 124230 comments contain only third-party entities, 177301 comments contain only ingroup entities, and 162211 comments contain only outgroup entities. 17606 comments have ingroup and outgroup entities only, 12832 comments have ingroup and third-party entities only, and 10384 comments have outgroup and third-party entities only. There are 1888 comments that include all three types of entities. In total, 506452 comments have at least one type of masked entity (the seven categories above are mutually exclusive and sum to this number). These statistics show that, even when using only team names as entities, there are enough comments for us to work with.
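The counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this entry; the variable names are my own.

```python
# Comment counts by which entity types they contain (from the entry above).
only_third, only_in, only_out = 124230, 177301, 162211
in_and_out, in_and_third, out_and_third = 17606, 12832, 10384
all_three = 1888

# Comments with at least one masked entity: sum of the seven categories.
total_masked = (only_third + only_in + only_out
                + in_and_out + in_and_third + out_and_third + all_three)

total_comments = 5019538
share = total_masked / total_comments

print(total_masked)      # matches the reported 506452
print(round(share, 3))   # roughly one comment in ten is usable
```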

Week 7

This week I calculated some metadata statistics, now that the data for all 32 NFL teams have been scraped:

Week 6

This week I finally finished scraping all the football data for the past two seasons. I therefore now know the proportion of comments that contain a team name (which, by our hypothesis, can be used as gold labels for computing accuracy and for model training). Take the Texans’ statistics as an example: 51861 comments were scraped in total, and 5791 of them (about 11.2%) mention at least one team name. Moreover, since we want to know how fans of competing teams respond to the same game, we pair the game posts of competing teams. There are 527 game day pairs, 461 post game pairs, and 27 pregame pairs.
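One way such pairing could be done is sketched below. The record fields (`team`, `opponent`, `date`, `kind`) and the sample posts are hypothetical; the actual scraped data may identify a game differently.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Post = Dict[str, str]


def pair_posts(posts: List[Post]) -> List[Tuple[Post, Post]]:
    """Pair posts from the two competing teams about the same game and thread kind."""
    groups = defaultdict(list)
    for post in posts:
        # The two team names as an unordered set, so A-vs-B equals B-vs-A.
        matchup = frozenset((post["team"], post["opponent"]))
        groups[(matchup, post["date"], post["kind"])].append(post)

    pairs = []
    for group in groups.values():
        teams = {p["team"] for p in group}
        # Keep a group only if exactly one post from each side exists.
        if len(group) == 2 and len(teams) == 2:
            pairs.append((group[0], group[1]))
    return pairs


# Hypothetical sample: the game day thread is paired, the lone post game thread is not.
posts = [
    {"team": "Texans", "opponent": "Colts", "date": "2023-09-17", "kind": "game day"},
    {"team": "Colts", "opponent": "Texans", "date": "2023-09-17", "kind": "game day"},
    {"team": "Texans", "opponent": "Colts", "date": "2023-09-17", "kind": "post game"},
]
pairs = pair_posts(posts)
print(len(pairs))
```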

Week 5

This past week, we experimented with human annotation. We took the same comments we had previously given to GPT-4 and set up a task on MTurk. This experiment had two issues: 1) comments contained more than one masked entity; 2) the chosen comments were not balanced between ingroup and outgroup masked entities. Therefore, we sampled a second group of comments and set up another pilot experiment that allowed the answers ingroup, outgroup, and unknown. Two members completed the task: worker1 guessed 53.33% of the labels correctly and worker2 guessed 60%. Their answers agreed on 53.33% of the items. On the same masked-entity annotation task, GPT-4 achieved an accuracy of 46%. One potential bias in this experiment is that humans designed the task, so we know that no item actually has the unknown label, whereas GPT-4 does not.

Week 4

I sampled 30 comments from the Panthers’ dataset, masked/highlighted the team names (as team names can be used as gold labels), and fed them to GPT-4 for model annotation. To instruct GPT-4 on the task, I formulated prompts. The following is an example for the masked task.

Week 3

For the remaining 20 teams whose game posts are not handled by nfl_gdt_bot, I used keywords such as “game day thread”, “post game thread”, and “pregame thread”, identified the users who post these threads, and located the channels to scrape the posts from. Scraping is still in progress. Meanwhile, I have also started writing code that interacts with GPT-4 to do the annotation task.
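The keyword matching on post titles could be sketched as follows. Only the three keywords mentioned above are used; the function name and the sample titles are made up, and real titles may need extra spelling variants.

```python
from typing import Optional

# Thread kind -> keyword to look for in a lowercased post title.
KEYWORDS = {
    "game day": "game day thread",
    "post game": "post game thread",
    "pregame": "pregame thread",
}


def thread_kind(title: str) -> Optional[str]:
    """Return the thread kind whose keyword appears in the title, or None."""
    lowered = title.lower()
    for kind, keyword in KEYWORDS.items():
        if keyword in lowered:
            return kind
    return None


print(thread_kind("Game Day Thread: Texans vs Colts"))
print(thread_kind("Trade rumors megathread"))
```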

Week 2

This week I successfully gathered data for 8 of the 12 teams whose game posts are submitted by nfl_gdt_bot. Each data point contains the post id, comment id, parent id, raw comment, timestamp, subreddit name, poster username, flair, and upvote/downvote score. I also looked into code that masks the entities and converts the raw text into HTML files for online annotation. The upcoming task is to use GPT to annotate the anonymized entities.
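A minimal sketch of the masking step, assuming entities are matched by a plain list of team names; the `[MASK]` token, the function name, and the sample comment are illustrative and not taken from the existing code.

```python
import re
from typing import Iterable


def mask_team_names(comment: str, team_names: Iterable[str], mask: str = "[MASK]") -> str:
    """Replace whole-word, case-insensitive occurrences of team names with a mask token."""
    # Longest names first so a multi-word name wins over a shorter name inside it.
    names = sorted(team_names, key=len, reverse=True)
    if not names:
        return comment
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(mask, comment)


masked = mask_team_names(
    "The Panthers should beat the falcons today",
    ["Panthers", "Falcons"],
)
print(masked)
```

Keeping the original name alongside the masked comment preserves the gold label needed for later scoring.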

Week 1

Over the last week, I studied the pmaw code used previously for scraping the Reddit data. However, due to Reddit’s regulations, it no longer works. I am developing scripts that use the praw module to scrape comments from the subreddits.
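A sketch of how PRAW comment objects might be flattened into records. The attribute names (`link_id`, `parent_id`, `body`, `created_utc`, `author_flair_text`, `score`) are PRAW comment attributes, but the choice of fields and the dictionary keys are my assumption, and the stand-in objects below exist only so the sketch runs without network access or credentials.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def comment_to_record(comment: Any) -> Dict[str, Any]:
    """Flatten one PRAW-style comment into a plain dictionary."""
    return {
        "post_id": comment.link_id,
        "comment_id": comment.id,
        "parent_id": comment.parent_id,
        "body": comment.body,
        "created_utc": comment.created_utc,
        "subreddit": str(comment.subreddit),
        # author is None when the account has been deleted.
        "author": str(comment.author) if comment.author else None,
        "flair": comment.author_flair_text,
        "score": comment.score,
    }


def collect_comments(submission: Any) -> List[Dict[str, Any]]:
    """Expand the full comment tree of a submission and flatten every comment."""
    submission.comments.replace_more(limit=None)  # resolve "load more comments" stubs
    return [comment_to_record(c) for c in submission.comments.list()]


# Stand-in for a real submission; with PRAW installed one would instead pass
# praw.Reddit(client_id=..., client_secret=..., user_agent=...).submission(id=...).
fake_comment = SimpleNamespace(
    link_id="t3_abc", id="c1", parent_id="t3_abc", body="Go Texans",
    created_utc=1700000000.0, subreddit="Texans", author=None,
    author_flair_text=None, score=3,
)
fake_forest = SimpleNamespace(replace_more=lambda limit=None: [], list=lambda: [fake_comment])
rows = collect_comments(SimpleNamespace(comments=fake_forest))
print(rows[0]["comment_id"], rows[0]["score"])
```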