Predicting Raid Success: Forecasting Raid Damage with Player Analytics

Introduction

In this project, I will explore a dataset consisting of attacks on 496 bosses from r/kickopenthedoor. The scope of this project includes data cleaning, an initial exploratory analysis, and a focused exploration of the primary research question.

The central question guiding this project is “What is the impact of players and teams on boss raids?”

KickOpenTheDoor

The subreddit r/kickopenthedoor (KOTD) is Reddit’s largest boss fight arena. Players participate in boss fights by commenting their “attacks” on the subreddit’s boss posts. These comments are processed by the KOTD Bot (u/KickOpenTheDoorBot), which acts like a game engine and updates the boss and player data accordingly.

Six primary teams compete for boss kills. These teams are divided and regrouped approximately every three months to form alliances. These alliances then perform “raids” on dying bosses to secure the kill for their group.

The Dataset

I collected the raw data by scraping the subreddit with Python and PRAW (the Python Reddit API Wrapper). The data was then stored in a SQL database with columns for boss_id, comment_id, user, unix, flair, and the body of the comment. After this initial collection, I used a Python script to process and refine the data into the raw DataFrame used for general analysis.
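
As a rough sketch of the storage step, the table could be laid out as follows. The choice of SQLite, the table name `comments`, the column types, and the sample row's body text are all assumptions made for illustration; the write-up only names the columns.

```python
import sqlite3

# Assumed schema: the write-up names these columns but not their types,
# the table name, or the database engine (SQLite is used here for brevity).
SCHEMA = """
CREATE TABLE IF NOT EXISTS comments (
    boss_id    TEXT NOT NULL,
    comment_id TEXT PRIMARY KEY,
    "user"     TEXT,
    unix       INTEGER,
    flair      TEXT,
    body       TEXT
)
"""


def store_comments(conn, rows):
    """Insert (boss_id, comment_id, user, unix, flair, body) tuples, skipping duplicates."""
    conn.execute(SCHEMA)
    conn.executemany("INSERT OR IGNORE INTO comments VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
# The body below is a placeholder, not a real bot reply.
sample = [("1dw63pr", "lbvjbxr", "bookseer", 1720261981, "Gnome", "placeholder body")]
store_comments(conn, sample)
store_comments(conn, sample)  # a re-scraped comment is ignored, not duplicated
row_count = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
```

Keying the table on comment_id means the scraper can be re-run over the same posts without creating duplicate rows.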

The initial raw DataFrame consists of 79,268 rows, each representing a reply from the KOTD Bot, and includes 33 columns. For this data analysis and prediction model, I will concentrate on a select number of key columns.

| Column | Description |
| --- | --- |
| boss_id | Unique identifier for the Reddit post associated with the data |
| comment_id | Unique identifier for the Reddit comment associated with the data |
| user | Username of the person who triggered the bot command |
| unix | Unix timestamp indicating when the user commented |
| flair | KOTD team Reddit flair of the user |
| total_dmg | Total damage dealt in an attack |
| bossHP | Remaining health of the boss after the attack |
| base_roll | Initial damage roll determined by a six-sided die roll |
| type_dmg | Damage dealt based on user’s level in the attack type |
| element | Element of the attack |
| element_dmg | Bonus damage based on the weapon’s element and the boss’s element weaknesses or resistances |
| bloodlust_mod | Damage modifier applied from having the bloodlust effect |
| weakened_mod | Damage modifier when the boss is weakened for an attack |
| parried_mod | Damage modifier when the boss parries an attack |

Data Cleaning

There are six main teams that compete for boss kills: Demon, Dragon, Gnome, Kobold, Plant, Undead.

The Slime team includes new players who have not been assigned to a specific team yet, while the Troll team represents the game administrators. Since only administrators can have distinctive flairs, I cleaned the flair column to ensure that users in the admin group are accurately placed into the Troll team.
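
One way this flair cleanup could be expressed is sketched below. It assumes that any non-missing flair outside the standard team names is a custom administrator flair; the helper name `clean_flair` and the example flair text are made up for illustration.

```python
import pandas as pd

# Assumption: any non-missing flair outside this set is a custom admin flair.
STANDARD_FLAIRS = {"Demon", "Dragon", "Gnome", "Kobold", "Plant", "Undead", "Slime", "Troll"}


def clean_flair(flair: pd.Series) -> pd.Series:
    """Map distinctive (non-standard) flairs to the Troll team; keep missing values."""
    is_custom = flair.notna() & ~flair.isin(STANDARD_FLAIRS)
    return flair.mask(is_custom, "Troll")


flairs = pd.Series(["Gnome", "Supreme Overlord", None, "Plant"])  # made-up flairs
cleaned = clean_flair(flairs)
```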

I am interested in analyzing time from kill. The column unix_from_kill will be calculated by subtracting the earliest kill time from the comment’s time, unix, so that comments posted before the kill have negative values. I will use an inner merge to ensure that the DataFrame only includes bosses with a recorded kill comment.

For the purposes of this analysis, I will classify a comment as part of a raid if it was posted within 30 seconds of the kill. A boolean column raid will indicate whether a comment is included in the raid.
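
A minimal sketch of the merge and the raid flag is shown here on toy data. The boolean encoding of `kill`, the helper column `kill_unix`, and treating the 30-second window as symmetric around the kill are assumptions; the sign convention follows the unix_from_kill values in the cleaned table, which are negative before the kill.

```python
import pandas as pd

# Toy comments; `kill` marks the bot reply that recorded the kill (assumed encoding).
comments = pd.DataFrame({
    "boss_id": ["a", "a", "a", "b"],
    "unix":    [1000, 1980, 2000, 500],
    "kill":    [False, False, True, False],
})

# Earliest kill time per boss; boss "b" has no kill comment.
kill_times = (
    comments[comments["kill"]]
    .groupby("boss_id", as_index=False)["unix"].min()
    .rename(columns={"unix": "kill_unix"})
)

# The inner merge drops every comment on a boss without a recorded kill.
merged = comments.merge(kill_times, on="boss_id", how="inner")

# Negative values are comments posted before the kill.
merged["unix_from_kill"] = merged["unix"] - merged["kill_unix"]

# Raid window: within 30 seconds of the kill (symmetric window is an assumption).
merged["raid"] = merged["unix_from_kill"].abs() <= 30
```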

I created a column isAttack by determining whether a comment has a base_roll. By the bot’s programming, any comment without a base_roll is not an attack.

I combine the basic damage columns (base_roll, type_dmg) into one column bare_dmg.

I combine the weapon damage columns (weapon_dmg, weapon_type_bonus, element_dmg) into one column total_weapon_dmg.

Since the comments only reflect the boss’s HP after the attack, I added the attack’s total_dmg back to the remaining bossHP to determine the boss’s HP prior to the attack. This value is stored in the column bossHP_before.
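
The derived columns described above can be sketched as below. The toy values are made up. Treating a missing weapon component as 0 for an attack is an assumption, suggested by rows in the cleaned table where element_dmg is missing but total_weapon_dmg is not. Any rounding of bossHP_before is not shown.

```python
import numpy as np
import pandas as pd

# Toy bot replies; the second row is a non-attack (no base_roll).
df = pd.DataFrame({
    "base_roll":         [4.0, np.nan],
    "type_dmg":          [8.0, np.nan],
    "weapon_dmg":        [5.0, np.nan],
    "weapon_type_bonus": [1.0, np.nan],
    "element_dmg":       [2.4, np.nan],
    "total_dmg":         [20.4, np.nan],
    "bossHP":            [287.0, 287.0],
})

# A comment is an attack only if the bot rolled a base_roll for it.
df["isAttack"] = df["base_roll"].notna()

# Basic damage: the roll plus the attack-type level bonus.
df["bare_dmg"] = df["base_roll"] + df["type_dmg"]

# Weapon damage: sum() skips missing components (assumed to mean 0),
# and where() restores NaN for rows that are not attacks at all.
weapon_cols = ["weapon_dmg", "weapon_type_bonus", "element_dmg"]
df["total_weapon_dmg"] = df[weapon_cols].sum(axis=1).where(df["isAttack"])

# HP before the attack = HP after the attack + damage dealt.
df["bossHP_before"] = df["bossHP"] + df["total_dmg"]
```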

Then, I only kept relevant columns from the DataFrame. I also only kept comments that were within ±1 hour of the kill.

Below is the head of the cleaned DataFrame.

| boss_id | comment_id | bossHP_before | total_dmg | bare_dmg | total_weapon_dmg | element | element_dmg | bloodlust_mod | weakened_mod | parried_mod | kill | user | flair | unix | unix_from_kill | isAttack | raid |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1dw63pr | lbvjbxr | 307 | 20.4 | 12 | 8.4 | cursed | 2.4 | nan | nan | nan | nan | bookseer | Gnome | 1720261981 | -2767 | True | False |
| 1dw63pr | lbvk5a5 | 287 | 18.9 | 18 | 0.9 | moon | nan | nan | nan | nan | nan | SeeMyDarkness | Plant | 1720262539 | -2209 | True | False |
| 1dw63pr | lbvkryz | 268 | 16.7 | 15 | 1.7 | cursed | 0.6 | nan | nan | nan | nan | linden_slam | Troll | 1720262964 | -1784 | True | False |
| 1dw63pr | lbvls7c | 251 | 14.9 | 14 | 0.9 | earth | nan | nan | nan | nan | nan | CoffeeWanderer | Plant | 1720263615 | -1133 | True | False |
| 1dw63pr | lbvmqri | 236 | 12 | 12 | 0 | nan | nan | nan | nan | nan | nan | Fralien2610 | Kobold | 1720264239 | -509 | True | False |

Univariate Analysis

I am interested in analyzing how the number of comments varies according to the time from kill.

The figure indicates a sharp increase in the number of comments as the Seconds from Kill approaches 0. This increase is likely due to player behavior: players stop attacking bosses to prepare for raids, and players coordinate during raids to concentrate their attacks in order to prevent other teams from killing the boss. These patterns of attack provide insight into the behaviors that players can use to influence raids.


I want to explore the distribution of teams and their comment frequencies. I will use a pie chart to visually represent which teams contribute the most to the total number of comments.

The figure shows that Plants have the highest proportion of comments at 23.2%, reflecting a higher level of activity compared to the other teams. This suggests that Plants are likely the dominant force within this dataset, with players on this team having a higher impact in raids.

Bivariate Analysis

I want to explore the relationship between Total Damage and Boss HP, two metrics that impact the overall outcome of a raid. I expect damage to increase as Boss HP decreases, since users will use more powerful weapons to secure a kill for their team. Due to the large number of comments in the dataset, I will use a heatmap for analysis.

The figure shows that the highest total damage is dealt when the boss has low HP. This is likely a “raiding” zone, where players use stronger weapons to quickly kill the boss. Players who deal more damage in raids are likely to have a greater impact on the outcome of the raids.


I aim to analyze the contributions of individual users to their team’s performance during raids, using the raid column to identify those who frequently participate. I limited the figure to only show the top 15 users as there are over 100 unique users in the dataset.

The figure shows that SpaghetGaming has attended the most raids, just over 300, indicating a high level of engagement. Additionally, players from the Kobold, Plant, Dragon, and Demon Teams make up a majority of the top 15, suggesting that these teams are more dominant in the dataset due to having more members that frequently participate in raids.
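
The ranking behind this figure could be computed roughly as follows. The sketch assumes that “raids attended” means the number of distinct bosses a user attacked inside the raid window; the usernames and boss ids are made up.

```python
import pandas as pd

# Toy comments; usernames and boss ids are made up.
comments = pd.DataFrame({
    "boss_id": ["b1", "b1", "b2", "b2", "b3"],
    "user":    ["ann", "ann", "ann", "bob", "cat"],
    "flair":   ["Plant", "Plant", "Plant", "Gnome", "Kobold"],
    "raid":    [True, True, True, True, False],
})

# Assumption: one raid attended = one distinct boss hit inside the raid window.
raids_attended = (
    comments[comments["raid"]]
    .groupby(["user", "flair"])["boss_id"]
    .nunique()
    .nlargest(15)  # keep the 15 most frequent raiders
    .reset_index(name="raids_attended")
)
```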

Interesting Aggregates

I want to analyze the statistics of attacks within raid range (30 seconds from the kill) for each team. I grouped by flair and then aggregated to get the number of unique users, the sum of kill, and the mean and standard deviation of both bare_dmg and total_weapon_dmg.

| flair | Number of Unique Users | Total Kills | Avg. Barehand Damage | Barehand Damage St. Dev | Avg. Total Weapon Damage | Total Weapon Damage St. Dev |
| --- | --- | --- | --- | --- | --- | --- |
| Demon | 12 | 61 | 14.5804 | 3.00232 | 19.4161 | 9.10537 |
| Dragon | 15 | 111 | 15.0994 | 3.19147 | 18.9948 | 7.6664 |
| Gnome | 19 | 43 | 13.2328 | 2.99049 | 17.7944 | 8.17878 |
| Kobold | 17 | 112 | 14.603 | 3.26737 | 17.905 | 7.28729 |
| Plant | 18 | 131 | 13.942 | 3.44775 | 20.3111 | 8.28258 |
| Slime | 1 | 0 | 5 | nan | 0 | nan |
| Troll | 8 | 3 | 12.9298 | 2.66498 | 19.6035 | 9.67927 |
| Undead | 11 | 35 | 13.9954 | 3.40246 | 17.8954 | 8.19568 |
| [deleted] | 1 | 1 | 15.2 | 5.84808 | 9.9 | 9.26876 |

The aggregate highlights key metrics such as unique users, total kills, and average damage (for barehand and weapon). We see that the main teams have similar numbers of unique raiders, or users. Plants appear to be the strongest team, having the highest mean total weapon damage and number of kills.
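
A sketch of this kind of per-team aggregate, using pandas named aggregation on toy data, is given below. The output column names and the encoding of `kill` (1.0 for the killing blow, missing otherwise) are assumptions.

```python
import pandas as pd

nan = float("nan")

# Toy raid-window attacks; `kill` is 1.0 on the killing blow and missing otherwise (assumed).
raid_attacks = pd.DataFrame({
    "flair":            ["Plant", "Plant", "Plant", "Gnome", "Gnome"],
    "user":             ["ann", "ann", "bob", "cat", "dan"],
    "kill":             [1.0, nan, nan, nan, 1.0],
    "bare_dmg":         [12.0, 14.0, 16.0, 10.0, 14.0],
    "total_weapon_dmg": [8.0, 10.0, 12.0, 0.0, 4.0],
})

# One row per team: unique raiders, kills, and mean / standard deviation of each damage column.
summary = raid_attacks.groupby("flair").agg(
    unique_users=("user", "nunique"),
    total_kills=("kill", "sum"),
    bare_mean=("bare_dmg", "mean"),
    bare_std=("bare_dmg", "std"),
    weapon_mean=("total_weapon_dmg", "mean"),
    weapon_std=("total_weapon_dmg", "std"),
)
```

pandas computes the sample standard deviation by default, so a team with a single attack gets a missing value for it, as in the Slime row above.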


In KOTD, players are assigned to one of the main teams after completing the tutorial. It is well known within the community that each team has a few significant players who influence their team’s strength at different hours of the day. I am interested in exploring the impact of the time of day on the activity of various teams.

To analyze this, I grabbed all attacks from the cleaned dataset. Then, I grouped by boss_id, hour, and flair and calculated the count of flair for each boss and hour. Then for each hour, I calculated the average count of flair found. The resulting hour variable in the chart is in the UTC timezone.
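
The two-stage grouping described above might look like this on toy data. The names `n_attacks` and `hourly_avg` are made up for the sketch.

```python
import pandas as pd

# Toy attacks on two bosses; unix is in seconds.
attacks = pd.DataFrame({
    "boss_id": ["a", "a", "a", "b", "b"],
    "flair":   ["Plant", "Plant", "Gnome", "Plant", "Plant"],
    "unix":    [0, 60, 120, 86400, 90060],
})

# Unix seconds -> hour of day in UTC.
attacks["hour"] = pd.to_datetime(attacks["unix"], unit="s", utc=True).dt.hour

# Stage 1: number of attacks per boss, hour, and team.
per_boss = (
    attacks.groupby(["boss_id", "hour", "flair"])
    .size()
    .reset_index(name="n_attacks")
)

# Stage 2: average those counts over bosses, one column per team.
hourly_avg = per_boss.groupby(["hour", "flair"])["n_attacks"].mean().unstack("flair")
```

Under this grouping, a boss that a team never attacked in a given hour is left out of that team's average rather than counted as zero, so the averages describe activity on the bosses a team actually attacked.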

The figure shows the average number of attacks (within 1 hour of the kill) on a boss for each team, broken down by the hour of the day. Notably, the Plant team exhibits consistently higher activity levels than the other teams, particularly between 10:00 and 00:00 UTC. In contrast, the other five teams display fluctuating levels of activity. For example, the Gnome team experiences peak activity between 08:00 and 22:00 UTC, with a noticeable decline from 23:00 to 04:00 UTC. This suggests that time may have an impact on raids, as specific teams will be weaker or stronger at different times.

Imputation

For the columns involving data related to the attack (bossHP_before, bare_dmg, total_weapon_dmg, dmg_bonus, dmg_mod, attack_type, element), I chose not to impute any missing values. This decision was based on the fact that rows with missing values in these columns indicate non-attacks. Imputing values for these rows would be illogical and could skew the results. I kept these rows to ensure that failed attack attempts were considered in my analysis, providing a more comprehensive understanding of the data.

Framing a Prediction Problem

In the previous section, I found that players (and by extension, their team) have a significant influence on the outcome of raids. As a player, I am keen on predicting our team’s damage to ensure confidence when participating in a raid. By leveraging certain variables available before a raid, would it be possible to predict the total damage inflicted during the raid?

The prediction problem involves predicting the dmg_dealt variable, which represents the damage dealt during a raid. This is a regression problem, so I will be using a regression model to solve the problem.

The response variable predicted is dmg_dealt. I opted for this variable because teams need to know the expected damage dealt before starting a raid. If the predicted damage dealt is less than the boss’s HP, the teams will avoid raiding to prevent a likely failure in killing the boss. Ensuring an accurate prediction helps in decision making about when to proceed with the raid.

To evaluate the performance of the regression model, I will use Mean Squared Error (MSE). MSE is chosen because it measures the average squared difference between the predicted values and the actual values. It is a good metric for regression problems as it penalizes larger errors more significantly, providing a clear indication of the model’s accuracy.
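
For reference, the standard definition of MSE, written for this problem, is given below. The symbols are introduced here for illustration: n is the number of bosses evaluated, y_i is the observed dmg_dealt for boss i, and \hat{y}_i is the model's prediction for that boss.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

Because the errors are squared, MSE is expressed in squared damage units; its square root is on the same scale as dmg_dealt.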

When coordinating a raid, players generally know the boss’s element weaknesses and resistances; parry and weakened status; and HP. Additionally, players know approximately how many players will be raiding. To reflect this, I reorganized the dataset so that each row captures this information for each boss. The head of the processed DataFrame is shown below.

| boss_id | dmg_dealt | num_bloodlust | W | P | Demon_nunique | Dragon_nunique | Gnome_nunique | Kobold_nunique | Plant_nunique | Troll_nunique | Undead_nunique | element_air | element_blessed | element_cursed | element_earth | element_fire | element_moon | element_organic | element_sun | element_synthetic | element_water |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1dw63pr | 239.7 | 0 | False | False | 3 | 0 | 0 | 0 | 2 | 0 | 2 | neutral | neutral | neutral | neutral | weak | neutral | neutral | neutral | neutral | neutral |
| 1dw8okb | 251.7 | 0 | False | False | 0 | 0 | 5 | 2 | 0 | 0 | 0 | weak | neutral | neutral | neutral | neutral | weak | neutral | neutral | neutral | neutral |
| 1dwdnif | 253.8 | 2 | False | False | 2 | 0 | 0 | 0 | 2 | 0 | 2 | neutral | neutral | neutral | weak | resist | neutral | neutral | neutral | neutral | weak |
| 1dwgenp | 199.1 | 0 | False | False | 0 | 3 | 0 | 3 | 0 | 0 | 0 | neutral | weak | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral |
| 1dwgg79 | 37 | 0 | False | False | 0 | 0 | 1 | 0 | 0 | 0 | 0 | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral | weak |
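
The reshaping into one row per boss could be sketched as follows. It assumes that dmg_dealt is the sum of total_dmg over the raid-window attacks on a boss, which the write-up does not state explicitly; the toy values are made up.

```python
import pandas as pd

# Toy raid-window attacks on two bosses.
raid = pd.DataFrame({
    "boss_id":   ["a", "a", "a", "b"],
    "user":      ["ann", "ann", "bob", "cat"],
    "flair":     ["Plant", "Plant", "Demon", "Gnome"],
    "total_dmg": [20.0, 15.5, 10.0, 37.0],
})

# Assumption: raid damage is the sum of total_dmg inside the raid window.
dmg = raid.groupby("boss_id")["total_dmg"].sum().rename("dmg_dealt")

# Unique raiders per team, one column per team, 0 where a team did not raid.
raiders = (
    raid.groupby(["boss_id", "flair"])["user"]
    .nunique()
    .unstack("flair", fill_value=0)
    .add_suffix("_nunique")
)

per_boss = pd.concat([dmg, raiders], axis=1).reset_index()
```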

The elements (air, blessed, cursed, earth, fire, moon, organic, sun, synthetic, water) will be represented by 1 if the boss is weak to the element, -1 if the boss resists the element, and 0 otherwise.

W represents if the boss is weakened, meaning that there is a chance for players to deal double damage.

P represents if the boss parries, meaning that there is a chance for players to deal half their usual damage.
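
As an illustration of the element encoding, the labels shown in the table (weak, neutral, resist) can be mapped to 1, 0, and -1. The two-column toy frame and the derived `num_weak` count are assumptions for the sketch.

```python
import pandas as pd

# Encoding described in the text: weak -> 1, resist -> -1, otherwise 0.
EFFECT_CODES = {"weak": 1, "neutral": 0, "resist": -1}

# Toy bosses with two of the ten element columns.
bosses = pd.DataFrame({
    "element_fire":  ["weak", "neutral", "resist"],
    "element_water": ["neutral", "neutral", "weak"],
})

element_cols = [c for c in bosses.columns if c.startswith("element_")]
encoded = bosses[element_cols].apply(lambda col: col.map(EFFECT_CODES))

# Number of elements each boss is weak to.
num_weak = (encoded == 1).sum(axis=1)
```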

Baseline Model

The baseline model I will be using is a Linear Regression model. The objective of the model is to predict the total damage dealt in a raid, given the number of unique users participating in the raid for each team.

There will be 6 quantitative features in the model. As no categorical features were used, no encoding was needed.

Features in the Baseline Model

| Feature | Description |
| --- | --- |
| Demon_nunique | Integer variable representing the number of unique users participating in the raid from the Demon team. |
| Dragon_nunique | Integer variable representing the number of unique users participating in the raid from the Dragon team. |
| Gnome_nunique | Integer variable representing the number of unique users participating in the raid from the Gnome team. |
| Kobold_nunique | Integer variable representing the number of unique users participating in the raid from the Kobold team. |
| Plant_nunique | Integer variable representing the number of unique users participating in the raid from the Plant team. |
| Undead_nunique | Integer variable representing the number of unique users participating in the raid from the Undead team. |

I performed a train-test split (3:1) on the dataset. I will use MSE to evaluate the model’s performance.

Training Set Performance: The MSE for the training set is approximately 3585.

Testing Set Performance: The MSE for the testing set is approximately 6759.

The significant difference between the training MSE and the testing MSE suggests that the model may be overfitting the training data and is failing to generalize to unseen data. A good model would show more consistent performance across different folds for both training and testing data. Given that the current baseline model struggles to generalize to the testing data, I do not believe that the current model is “good”.
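
A baseline of this shape can be sketched with scikit-learn as below. The data here is synthetic, so the resulting MSE values are not the ones reported above; the random seeds and the made-up relationship between raiders and damage are assumptions. A 3:1 split corresponds to test_size=0.25.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

TEAMS = ["Demon", "Dragon", "Gnome", "Kobold", "Plant", "Undead"]
TEAM_COLS = [f"{team}_nunique" for team in TEAMS]

# Synthetic stand-in for the per-boss table (not the real data).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.integers(0, 6, size=(200, 6)), columns=TEAM_COLS)
y = 40 * X.sum(axis=1) + rng.normal(0, 50, size=200)

# 3:1 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

baseline = LinearRegression().fit(X_train, y_train)

# Compare the error on data the model has seen with the error on held-out data.
train_mse = mean_squared_error(y_train, baseline.predict(X_train))
test_mse = mean_squared_error(y_test, baseline.predict(X_test))
```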

Final Model

Before I started working on the final prediction model, I wanted to explore the relationship between the various features and our result, dmg_dealt.

Here, the figure shows that (in most cases) the number of users with bloodlust correlates with higher damage.

As a result of imputing “neutral” for missing effectiveness values, the “neutral” category appears to have a wide range of data. However, it is clear that bosses that are weak to certain elements are dealt more damage compared to those that are resistant. It may be worth investigating the impact of the number of weak elements on the total damage dealt.

Here, it appears that (in general) as bosses have more weak elements, more damage is dealt to a boss.

According to the figure, it seems that with more unique users in a raid, more damage is dealt. However, the damage dealt does not appear to vary much by team. Like with elements, perhaps it’s best to look at the total number of users in a raid rather than at each team.

Like with the elements, as the number of unique users participating in a raid increases, so does the damage dealt.

Here, the figure shows that an active modifier (Weakened or Parry) affects the damage dealt. Weakened bosses take slightly more damage, while parrying bosses take less damage.


Features Added in the Final Model

In addition to the features present in the baseline model, the final model includes the following features:

| Feature | Description |
| --- | --- |
| num_bloodlust | Integer variable representing the number of users attacking with bloodlust. |
| W | Boolean variable representing if a boss has the weakened status. |
| P | Boolean variable representing if a boss has the parry status. |
| element_air | Nominal variable representing the effectiveness of air weapons against the boss. |
| element_blessed | Nominal variable representing the effectiveness of blessed weapons against the boss. |
| element_cursed | Nominal variable representing the effectiveness of cursed weapons against the boss. |
| element_earth | Nominal variable representing the effectiveness of earth weapons against the boss. |
| element_fire | Nominal variable representing the effectiveness of fire weapons against the boss. |
| element_moon | Nominal variable representing the effectiveness of moon weapons against the boss. |
| element_organic | Nominal variable representing the effectiveness of organic weapons against the boss. |
| element_sun | Nominal variable representing the effectiveness of sun weapons against the boss. |
| element_synthetic | Nominal variable representing the effectiveness of synthetic weapons against the boss. |
| element_water | Nominal variable representing the effectiveness of water weapons against the boss. |

Justification for Added Features

The feature num_bloodlust captures the presence of a significant damage-enhancing status. Players with bloodlust contribute disproportionately to the total damage because their individual damage output is doubled.

The features W and P capture the boss’s status, which directly affects how much damage the boss takes. Weakened bosses have a chance to take 2.0x damage from players, while Parry bosses have a chance to take 0.5x damage instead. These statuses are known at the start of a raid and can be critical for predicting the potential damage dealt.

The set of elemental effectiveness features indicates which weapons would be most effective against the boss. Bosses with weaknesses to certain elements are more vulnerable to attacks using those elements, which leads to higher damage dealt. Additionally, bosses with more weaknesses provide a larger pool of effective weapons.

By incorporating these features, the model is better equipped to capture the factors that make up a raid, leading to a more accurate prediction of the total damage dealt.

Modeling Algorithm and Hyperparameters

The final modeling algorithm chosen is Linear Regression. Linear regression is a straightforward and interpretable regression method that fits a linear model to minimize the residual sum of squares between the observed and predicted values.

I created a weakness pipeline to determine the number of weak elements on a boss and create a polynomial feature from the result. I also created a raiders pipeline to determine the number of unique raiders on a boss and used the result to create a polynomial feature. I then used the features num_bloodlust, W, P to create a final polynomial feature.

These three features are then pipelined into the final LinearRegression model.
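
One way such a pipeline could be assembled in scikit-learn is sketched below. The helper functions, the step names, the polynomial degree of 2, the reduced column lists, and the synthetic data are all assumptions for illustration, not the author's actual code.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import FunctionTransformer, PolynomialFeatures

ELEMENT_COLS = ["element_fire", "element_water"]  # the full table has ten
TEAM_COLS = ["Plant_nunique", "Gnome_nunique"]    # the baseline uses six
STATUS_COLS = ["num_bloodlust", "W", "P"]


def count_weak(elements):
    """Number of elements a boss is weak to, as one float column."""
    counts = pd.DataFrame(elements).eq("weak").sum(axis=1)
    return counts.to_numpy(dtype=float).reshape(-1, 1)


def count_raiders(teams):
    """Total unique raiders across teams, as one float column."""
    totals = pd.DataFrame(teams).sum(axis=1)
    return totals.to_numpy(dtype=float).reshape(-1, 1)


def as_float(status):
    """Cast the integer and boolean status columns to floats."""
    return pd.DataFrame(status).astype(float).to_numpy()


def poly(degree=2):  # degree 2 is an assumed default
    return PolynomialFeatures(degree=degree, include_bias=False)


features = ColumnTransformer([
    ("weakness", make_pipeline(FunctionTransformer(count_weak), poly()), ELEMENT_COLS),
    ("raiders", make_pipeline(FunctionTransformer(count_raiders), poly()), TEAM_COLS),
    ("status", make_pipeline(FunctionTransformer(as_float), poly()), STATUS_COLS),
])
model = Pipeline([("features", features), ("regressor", LinearRegression())])

# Synthetic stand-in data (not the real raids).
rng = np.random.default_rng(0)
n = 40
X = pd.DataFrame({
    "element_fire": rng.choice(["weak", "neutral", "resist"], size=n),
    "element_water": rng.choice(["weak", "neutral", "resist"], size=n),
    "Plant_nunique": rng.integers(0, 5, size=n),
    "Gnome_nunique": rng.integers(0, 5, size=n),
    "num_bloodlust": rng.integers(0, 3, size=n),
    "W": rng.random(n) < 0.2,
    "P": rng.random(n) < 0.2,
})
y = 40.0 * (X["Plant_nunique"] + X["Gnome_nunique"]) + rng.normal(0, 10, size=n)

model.fit(X, y)
predictions = model.predict(X)
```

Collapsing the ten element columns and the per-team counts into single counts keeps the number of polynomial terms small, which matters when there are only a few hundred bosses to learn from.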

Hyperparameters

I used GridSearchCV to select the best hyperparameters. The algorithm performs a search over a specified parameter grid, evaluating each combination through 10-fold cross-validation. This selects the combination with the best cross-validated score among the candidates in the grid.
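
A minimal sketch of such a search is shown below. Tuning the polynomial degree, the candidate values, the scoring choice, and the synthetic data are assumptions; the grid actually used is not listed in this write-up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in: one numeric feature with a made-up quadratic response.
rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(120, 1)).astype(float)
y = 5.0 * X[:, 0] ** 2 + rng.normal(0, 5, size=120)

pipe = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("regressor", LinearRegression()),
])

# Hypothetical grid: the polynomial degree is the only tuned hyperparameter here.
search = GridSearchCV(
    pipe,
    param_grid={"poly__degree": [1, 2, 3]},
    scoring="neg_mean_squared_error",  # scikit-learn maximizes scores, so MSE is negated
    cv=10,                             # 10-fold cross-validation
)
search.fit(X, y)

best_degree = search.best_params_["poly__degree"]
cv_mse = -search.best_score_  # flip the sign back to a plain MSE
```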

The hyperparameters that ended up performing the best, as identified by GridSearchCV, are:

Final Model Performance Improvement

Baseline Model Performance

For reference, the baseline model achieved a training MSE of approximately 3585 and a testing MSE of approximately 6759.

These results suggested that the baseline model was overfitting to the training data, as indicated by the substantial difference between the training and testing MSE values.

Final Model Performance

After tuning the hyperparameters with GridSearchCV, the final model achieved:

Improvement Analysis

The significant reduction in both the training and testing MSE in the final model demonstrates an overall improvement in the model’s performance.

The final model has a much lower training MSE and testing MSE than the baseline model, indicating that it fits the training data better and generalizes better to unseen data. Additionally, the closer values of training and testing MSE suggest that the model is less prone to overfitting the data.