I set up another experimental task allowing only “in” and “out” options. This time human and gpt agree on annotation 78% of the time. Meanwhile, the cohen kappa score among multiple runs of gpt annotation is 0.58. Since we are masking and prediting ingroup/outgroup/third-party entities, I calculate the number of usable comments in the dataset. To provide enough context, we keep comments with at least five words. There are in total 5019538 comments from paired posts. Among them, there are 190963 third-party mentions, 225591 ingroup mentions, and 201075 outgroup mentions. 124230 comments contain only third-party entity, 177301 comments contain only ingroup entity, 162211 comments contain only outgroup entity. 17606 comments have both ingroup and outgroup entities, 12832 comments have ingroup and third-party entities, 10384 comments have outgroup and third-party entities. There are 1888 comments that include all three types of entities. Finally, there are 506452 comments that have at least one type of masked entities. Those statistics demonstrate that using only the team name, there are sufficiently many comments that we can take advantage of.