Predicting Raid Success: Forecasting Raid Damage with Player Analytics

Introduction

In this project, I will explore a dataset consisting of attacks on 496 bosses from r/kickopenthedoor. The scope of this project includes data cleaning, an initial exploratory analysis, and a focused exploration of the primary research question.

The central question guiding this project is: “What is the impact of players and teams on boss raids?”

KickOpenTheDoor

The subreddit r/kickopenthedoor (KOTD) is Reddit’s largest boss fight arena. Players participate in boss fights by commenting their “attacks” on the subreddit’s boss posts. These comments are processed by the KOTD Bot (u/KickOpenTheDoorBot), which acts like a game engine and updates the boss and player data accordingly.

Six primary teams compete for boss kills. These teams are divided and regrouped approximately every three months to form alliances. These alliances then perform “raids” on dying bosses to secure the kill for their group.

The Dataset

I collected the raw data by scraping the subreddit with Python and PRAW (the Python Reddit API Wrapper). The data was then stored in a SQL database with columns for boss_id, comment_id, user, unix, flair, and the body of the comment. After this initial collection, I used a Python script to process and refine the data into the raw DataFrame used for general analysis.
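The storage step could look roughly like the sketch below. It assumes SQLite through Python's built-in sqlite3 module and a hypothetical table name `comments`; the write-up only says "a SQL database", so the engine, table name, and column types are assumptions. The PRAW scraping itself is left out because it requires API credentials, and the comment body in the sample row is a placeholder.

```python
import sqlite3

# Assumption: SQLite and the table name "comments" are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS comments (
        boss_id    TEXT NOT NULL,
        comment_id TEXT PRIMARY KEY,
        user       TEXT,
        unix       INTEGER,
        flair      TEXT,
        body       TEXT
    )
    """
)


def save_comment(conn, row):
    """Insert one scraped comment; re-inserting the same comment_id is a no-op."""
    conn.execute("INSERT OR IGNORE INTO comments VALUES (?, ?, ?, ?, ?, ?)", row)
    conn.commit()


# Identifiers come from the sample rows in this write-up; the body is a placeholder.
row = ("1dw63pr", "lbvjbxr", "bookseer", 1720261981, "Gnome", "<comment text>")
save_comment(conn, row)
save_comment(conn, row)  # scraping the same comment twice does not duplicate it

stored = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
```

Using comment_id as the primary key is a design choice for this sketch: it makes repeated scrapes of the same post idempotent.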

The initial raw DataFrame consists of 79,268 rows, each representing a reply from the KOTD Bot, and includes 33 columns. For this data analysis and prediction model, I will concentrate on a select number of key columns.

Column	Description
`boss_id`	Unique identifier for the Reddit post associated with the data
`comment_id`	Unique identifier for the Reddit comment associated with the data
`user`	Username of the person who triggered the bot command
`unix`	Unix timestamp indicating when the user commented
`flair`	KOTD Team Reddit flair of the user
`total_dmg`	Total damage dealt in an attack
`bossHP`	Remaining Health of the boss after the attack
`base_roll`	Initial damage roll determined by a six-sided die roll
`type_dmg`	Damage dealt based on user’s level in the attack type
`element`	Element of the attack
`element_dmg`	Bonus damage based on the weapon’s element and the boss’s element weaknesses or resistances
`bloodlust_mod`	Damage modifier applied from having the bloodlust effect
`weakened_mod`	Damage modifier when the boss is weakened for an attack
`parried_mod`	Damage modifier when the boss parries an attack

Data Cleaning

There are six main teams that compete for boss kills: Demon, Dragon, Gnome, Kobold, Plant, Undead.

The Slime team includes new players who have not been assigned to a specific team yet, while the Troll team represents the game administrators. Since only administrators can have distinctive flairs, I cleaned the flair column to ensure that users in the admin group are accurately placed into the Troll team.

I am interested in analyzing each comment's time from the kill. The column unix_from_kill is calculated by subtracting the boss's earliest kill time from the comment's timestamp, unix, so comments posted before the kill have negative values. I use an inner merge to ensure that the DataFrame only includes bosses with a recorded kill comment.
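A minimal pandas sketch of this step is shown below. The column is named unix_from_kill to match the cleaned DataFrame shown later. The toy rows reuse timestamps from that table; the kill comment id, the `no_kill` boss, and the assumption that `kill` is non-null only on kill comments are illustrative.

```python
import pandas as pd

# Toy comments: one boss with a kill comment and one boss without.
comments = pd.DataFrame({
    "boss_id": ["1dw63pr", "1dw63pr", "1dw63pr", "no_kill"],
    "comment_id": ["lbvjbxr", "lbvmqri", "kill_01", "other_01"],
    "unix": [1720261981, 1720264239, 1720264748, 1720270000],
    "kill": [float("nan"), float("nan"), 1.0, float("nan")],
})

# Earliest kill comment per boss (assumption: `kill` is non-null on kill comments).
kill_times = (
    comments[comments["kill"].notna()]
    .groupby("boss_id", as_index=False)["unix"]
    .min()
    .rename(columns={"unix": "kill_unix"})
)

# The inner merge drops bosses that have no recorded kill comment.
merged = comments.merge(kill_times, on="boss_id", how="inner")

# Negative values are before the kill; 0 is the kill comment itself.
merged["unix_from_kill"] = merged["unix"] - merged["kill_unix"]
```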

For the purposes of this analysis, I will classify a comment as part of a raid if it was posted within 30 seconds of the kill. A boolean column raid will indicate whether a comment is included in the raid.
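The raid flag can be sketched as a simple threshold on the time-from-kill column. The example assumes that "within 30 seconds" is symmetric around the kill (comments up to 30 seconds before or after); if only pre-kill comments should count, the condition would need a sign check as well.

```python
import pandas as pd

RAID_WINDOW_SECONDS = 30

# Toy time-from-kill values in seconds (negative = before the kill).
df = pd.DataFrame({"unix_from_kill": [-2767, -31, -30, -5, 0, 12, 45]})

# Assumption: the window is symmetric around the kill.
df["raid"] = df["unix_from_kill"].abs() <= RAID_WINDOW_SECONDS
```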

I created a column isAttack by determining whether a comment has a base_roll. By the bot’s programming, any comment without a base_roll is not an attack.

I combine the basic damage columns (base_roll, type_dmg) into one column bare_dmg.

I combine the weapon damage columns (weapon_dmg, weapon_type_bonus, element_dmg) into one column total_weapon_dmg.

Since the comments only reflect the boss’s HP after the attack, I used the attack’s total_dmg amount to determine the boss’s HP prior to the attack. This value is stored in the column bossHP_before.
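The derived columns described above could be computed along these lines. The input values are toy numbers, not real bot output, and the handling of missing values (non-attacks stay NaN, attacks without a weapon get 0 weapon damage) is an assumption chosen to match the cleaned table.

```python
import numpy as np
import pandas as pd

# Toy bot replies: two attacks and one non-attack (no base_roll).
df = pd.DataFrame({
    "base_roll": [4.0, 6.0, np.nan],
    "type_dmg": [8.0, 12.0, np.nan],
    "weapon_dmg": [6.0, np.nan, np.nan],
    "weapon_type_bonus": [0.0, np.nan, np.nan],
    "element_dmg": [2.4, np.nan, np.nan],
    "total_dmg": [20.4, 18.0, np.nan],
    "bossHP": [286.6, 268.6, 268.6],
})

# A comment is an attack only if the bot reported a base roll.
df["isAttack"] = df["base_roll"].notna()

# min_count=1 keeps the sum NaN when every component is missing (a non-attack).
df["bare_dmg"] = df[["base_roll", "type_dmg"]].sum(axis=1, min_count=1)

# Assumption: an attack without a weapon contributes 0 weapon damage.
weapon_cols = ["weapon_dmg", "weapon_type_bonus", "element_dmg"]
df["total_weapon_dmg"] = df[weapon_cols].sum(axis=1).where(df["isAttack"])

# The bot reports HP after the attack, so add the damage back.
df["bossHP_before"] = df["bossHP"] + df["total_dmg"]
```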

Then, I only kept relevant columns from the DataFrame. I also only kept comments that were within ±1 hour of the kill.

Below is the head of the cleaned DataFrame.

boss_id	comment_id	bossHP_before	total_dmg	bare_dmg	total_weapon_dmg	element	element_dmg	bloodlust_mod	weakened_mod	parried_mod	kill	user	flair	unix	unix_from_kill	isAttack	raid
1dw63pr	lbvjbxr	307	20.4	12	8.4	cursed	2.4	nan	nan	nan	nan	bookseer	Gnome	1720261981	-2767	True	False
1dw63pr	lbvk5a5	287	18.9	18	0.9	moon	nan	nan	nan	nan	nan	SeeMyDarkness	Plant	1720262539	-2209	True	False
1dw63pr	lbvkryz	268	16.7	15	1.7	cursed	0.6	nan	nan	nan	nan	linden_slam	Troll	1720262964	-1784	True	False
1dw63pr	lbvls7c	251	14.9	14	0.9	earth	nan	nan	nan	nan	nan	CoffeeWanderer	Plant	1720263615	-1133	True	False
1dw63pr	lbvmqri	236	12	12	0	nan	nan	nan	nan	nan	nan	Fralien2610	Kobold	1720264239	-509	True	False

Univariate Analysis

I am interested in analyzing how the number of comments varies according to the time from kill.

The figure indicates a sharp increase in the number of comments as the Seconds from Kill approaches 0. This increase is likely due to player behavior: players stop attacking bosses to prepare for raids, then coordinate during raids to concentrate their attacks and prevent other teams from killing the boss. These attack patterns provide insight into the behaviors that players can use to influence raids.

I want to explore the distribution of teams and their comment frequencies. I will use a pie chart to visually represent which teams contribute the most to the total number of comments.

The figure shows that Plants have the highest proportion of comments at 23.2%, reflecting a higher level of activity compared to the other teams. This suggests that Plants are likely the dominant force within this dataset, with players on this team having a higher impact in raids.

Bivariate Analysis

I want to explore the relationship between Total Damage and Boss HP, two metrics that impact the overall outcome of a raid. I expect damage to increase as Boss HP gets lower, since users will use more powerful weapons to secure a kill for their team. Due to the large number of comments in the dataset, I will use a heatmap for analysis.

The figure shows that the most total damage dealt occurs when the boss has low HP. This is likely a “raiding” zone, where players use stronger weapons to quickly kill the boss. Players who deal more damage in raids are likely to have a greater impact on the outcome of the raids.

I aim to analyze the contributions of individual users to their team’s performance during raids, using the raid column to identify those who frequently participate. I limited the figure to only show the top 15 users as there are over 100 unique users in the dataset.

The figure shows that SpaghetGaming has attended the most raids, just over 300, indicating a high level of engagement. Additionally, players from the Kobold, Plant, Dragon, and Demon Teams make up a majority of the top 15, suggesting that these teams are more dominant in the dataset due to having more members that frequently participate in raids.

Interesting Aggregates

I want to analyze the statistics of attacks within raid range (30 seconds from the kill) for each team. I grouped by flair and then aggregated to get the number of unique users, the sum of kill, and the mean and standard deviation of both bare_dmg and total_weapon_dmg.

flair	Number of Unique Users	Total Kills	Avg. Barehand Damage	Barehand Damage St. Dev	Avg. Total Weapon Damage	Total Weapon Damage St. Dev
Demon	12	61	14.5804	3.00232	19.4161	9.10537
Dragon	15	111	15.0994	3.19147	18.9948	7.6664
Gnome	19	43	13.2328	2.99049	17.7944	8.17878
Kobold	17	112	14.603	3.26737	17.905	7.28729
Plant	18	131	13.942	3.44775	20.3111	8.28258
Slime	1	0	5	nan	0	nan
Troll	8	3	12.9298	2.66498	19.6035	9.67927
Undead	11	35	13.9954	3.40246	17.8954	8.19568
[deleted]	1	1	15.2	5.84808	9.9	9.26876

The aggregate highlights key metrics such as unique users, total kills, and average damage (for barehand and weapon). We see that the main teams have similar numbers of unique raiders, or users. Plants appear to be the strongest team, having the highest mean total weapon damage and number of kills.
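The per-team aggregate can be sketched with pandas named aggregation. The data below is a toy stand-in, and the output column names are hypothetical. Note that pandas computes the sample standard deviation (ddof=1), so a team with a single row gets NaN, which is consistent with the Slime row in the table.

```python
import numpy as np
import pandas as pd

# Toy raid-range attacks; values are illustrative only.
raid_attacks = pd.DataFrame({
    "flair": ["Plant", "Plant", "Plant", "Gnome", "Gnome", "Slime"],
    "user": ["a", "b", "a", "c", "c", "d"],
    "kill": [np.nan, 1.0, np.nan, np.nan, np.nan, np.nan],
    "bare_dmg": [12.0, 14.0, 16.0, 10.0, 20.0, 5.0],
    "total_weapon_dmg": [8.0, 20.0, 14.0, 17.0, 19.0, 0.0],
})

summary = raid_attacks.groupby("flair").agg(
    unique_users=("user", "nunique"),
    total_kills=("kill", "sum"),          # NaN (non-kill) rows are skipped
    avg_bare_dmg=("bare_dmg", "mean"),
    std_bare_dmg=("bare_dmg", "std"),     # sample std; NaN for one-row groups
    avg_weapon_dmg=("total_weapon_dmg", "mean"),
    std_weapon_dmg=("total_weapon_dmg", "std"),
)
```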

In KOTD, players are assigned to one of the main teams after completing the tutorial. It is well known within the community that each team has a few significant players who influence their team’s strength at different hours of the day. I am interested in exploring the impact of the time of day on the activity of various teams.

To analyze this, I grabbed all attacks from the cleaned dataset. Then, I grouped by boss_id, hour, and flair and calculated the count of flair for each boss and hour. Then for each hour, I calculated the average count of flair found. The resulting hour variable in the chart is in the UTC timezone.
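A sketch of this two-stage grouping is shown below, using toy timestamps. One assumption to flag: a boss with no attacks from a team in a given hour contributes nothing to that team's average here, rather than a zero, so the result is the average among bosses the team actually attacked in that hour.

```python
import pandas as pd

# Toy attacks on two bosses; all timestamps fall in the 10:00 UTC hour.
attacks = pd.DataFrame({
    "boss_id": ["b1", "b1", "b1", "b2", "b2", "b2"],
    "flair": ["Plant", "Plant", "Gnome", "Plant", "Gnome", "Gnome"],
    "unix": [1720261981, 1720262539, 1720262964,
             1720348381, 1720348939, 1720349364],
})

# Unix timestamps are seconds since the epoch, so the derived hour is in UTC.
attacks["hour"] = pd.to_datetime(attacks["unix"], unit="s", utc=True).dt.hour

# Stage 1: number of attacks per boss, hour, and team.
per_boss = (
    attacks.groupby(["boss_id", "hour", "flair"])
    .size()
    .rename("n_attacks")
    .reset_index()
)

# Stage 2: average that count over bosses for each hour and team.
hourly_avg = (
    per_boss.groupby(["hour", "flair"])["n_attacks"].mean().unstack("flair")
)
```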

The figure shows the average number of attacks (within 1 hour of the kill) on a boss for each team, broken down by hour of the day. Notably, the Plant team exhibits consistently higher activity than the other teams, particularly between 10:00 and 00:00 UTC. In contrast, the other five teams display fluctuating levels of activity. For example, the Gnome team's activity peaks between 08:00 and 22:00 UTC, with a noticeable decline from 23:00 to 04:00 UTC. This suggests that the time of day may affect raids, as specific teams are weaker or stronger at different times.

Imputation

For the columns involving data related to the attack (bossHP_before, bare_dmg, total_weapon_dmg, dmg_bonus, dmg_mod, attack_type, element), I chose not to impute any missing values. This decision was based on the fact that rows with missing values in these columns indicate non-attacks. Imputing values for these rows would be illogical and could skew the results. I kept these rows to ensure that failed attack attempts were considered in my analysis, providing a more comprehensive understanding of the data.

Framing a Prediction Problem

In the previous section, I found that players (and by extension, their team) have a significant influence on the outcome of raids. As a player, I am keen on predicting our team’s damage to ensure confidence when participating in a raid. By leveraging certain variables available before a raid, would it be possible to predict the total damage inflicted during the raid?

The prediction problem involves predicting the dmg_dealt variable, which represents the damage dealt during a raid. This is a regression problem, so I will be using a regression model to solve the problem.

The response variable predicted is dmg_dealt. I opted for this variable because teams need to know the expected damage dealt before starting a raid. If the predicted damage dealt is less than the boss’s HP, the teams will avoid raiding to prevent a likely failure in killing the boss. Ensuring an accurate prediction helps in decision making about when to proceed with the raid.

To evaluate the performance of the regression model, I will use Mean Squared Error (MSE). MSE is chosen because it measures the average squared difference between the predicted values and the actual values. It is a good metric for regression problems as it penalizes larger errors more significantly, providing a clear indication of the model’s accuracy.
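As a small worked example of the metric, the sketch below computes MSE by hand on three made-up raids. The numbers are illustrative only; the point is that the single 60-damage miss dominates the result.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Made-up raid damage values: errors of -10, +10, and -60.
actual = [200.0, 250.0, 40.0]
predicted = [210.0, 240.0, 100.0]

mse = mean_squared_error(actual, predicted)  # (100 + 100 + 3600) / 3
```

Because the errors are squared, MSE is expressed in squared damage units; taking the square root gives an error on the same scale as the damage itself.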

When coordinating a raid, players generally know the boss’s element weaknesses and resistances; parry and weakened status; and HP. Additionally, players know approximately how many players will be raiding. To reflect this, I reorganized the dataset so that each row captures this information for each boss. The head of the processed DataFrame is shown below.

boss_id	dmg_dealt	num_bloodlust	W	P	Demon_nunique	Dragon_nunique	Gnome_nunique	Kobold_nunique	Plant_nunique	Undead_nunique	element_air	element_blessed	element_cursed	element_earth	element_fire	element_moon	element_organic	element_sun	element_synthetic	element_water
1dw63pr	239.7	0	False	False	3	0	0	0	2	2	neutral	neutral	neutral	neutral	weak	neutral	neutral	neutral	neutral	neutral
1dw8okb	251.7	0	False	False	0	0	5	2	0	0	weak	neutral	neutral	neutral	neutral	weak	neutral	neutral	neutral	neutral
1dwdnif	253.8	2	False	False	2	0	0	0	2	2	neutral	neutral	neutral	weak	resist	neutral	neutral	neutral	neutral	weak
1dwgenp	199.1	0	False	False	0	3	0	3	0	0	neutral	weak	neutral	neutral	neutral	neutral	neutral	neutral	neutral	neutral
1dwgg79	37	0	False	False	0	0	1	0	0	0	neutral	neutral	neutral	neutral	neutral	neutral	neutral	neutral	neutral	weak

The elements (air, blessed, cursed, earth, fire, moon, organic, sun, synthetic, water) will be represented by 1 if the boss is weak to the element, -1 if the boss resists the element, and 0 otherwise.
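The encoding described above can be sketched as a dictionary lookup. The two boss rows reuse effectiveness values from the table above, with only three element columns kept for brevity; the `num_weak` column name is hypothetical and simply counts the elements a boss is weak to.

```python
import pandas as pd

EFFECT_TO_INT = {"weak": 1, "neutral": 0, "resist": -1}

# Two bosses from the processed table, restricted to three element columns.
bosses = pd.DataFrame({
    "boss_id": ["1dw63pr", "1dwdnif"],
    "element_earth": ["neutral", "weak"],
    "element_fire": ["weak", "resist"],
    "element_water": ["neutral", "weak"],
})

element_cols = [c for c in bosses.columns if c.startswith("element_")]

encoded = bosses.copy()
# Map each effectiveness label to its integer code, column by column.
encoded[element_cols] = bosses[element_cols].apply(
    lambda col: col.map(EFFECT_TO_INT)
)

# Number of elements each boss is weak to (hypothetical helper column).
encoded["num_weak"] = (encoded[element_cols] == 1).sum(axis=1)
```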

W indicates whether the boss is weakened, meaning that there is a chance for players to deal double damage.

P indicates whether the boss parries, meaning that there is a chance for players to deal half their usual damage.
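Returning to the per-boss table, the unique-raider counts and the damage total could be built roughly as follows. This is a sketch on toy data: it assumes dmg_dealt is the sum of total_dmg over a boss's raid comments, which the write-up does not state explicitly, and it omits the bloodlust, status, and element columns.

```python
import pandas as pd

TEAMS = ["Demon", "Dragon", "Gnome", "Kobold", "Plant", "Undead"]

# Toy raid comments for two bosses.
raid = pd.DataFrame({
    "boss_id": ["b1", "b1", "b1", "b1", "b2", "b2"],
    "user": ["u1", "u1", "u2", "u3", "u4", "u5"],
    "flair": ["Demon", "Demon", "Demon", "Plant", "Gnome", "Gnome"],
    "total_dmg": [20.0, 15.0, 18.0, 22.0, 19.0, 18.0],
})

# Unique raiders per boss and team, one column per main team.
raiders = (
    raid.groupby(["boss_id", "flair"])["user"]
    .nunique()
    .unstack("flair", fill_value=0)
    .reindex(columns=TEAMS, fill_value=0)  # teams absent from the data get 0
    .add_suffix("_nunique")
)

# Assumption: dmg_dealt is the total damage over the boss's raid comments.
dmg = raid.groupby("boss_id")["total_dmg"].sum().rename("dmg_dealt")

boss_df = dmg.to_frame().join(raiders).reset_index()
```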

Baseline Model

The baseline model I will be using is a Linear Regression model. The objective of the model is to predict the total damage dealt in a raid, given the number of unique users participating in the raid for each team.

There will be 6 quantitative features in the model. As no categorical features were used, no encoding was needed.

Features in the Baseline Model

Feature	Description
`Demon_nunique`	Integer variable representing the number of unique users participating in the raid from the Demon team.
`Dragon_nunique`	Integer variable representing the number of unique users participating in the raid from the Dragon team.
`Gnome_nunique`	Integer variable representing the number of unique users participating in the raid from the Gnome team.
`Kobold_nunique`	Integer variable representing the number of unique users participating in the raid from the Kobold team.
`Plant_nunique`	Integer variable representing the number of unique users participating in the raid from the Plant team.
`Undead_nunique`	Integer variable representing the number of unique users participating in the raid from the Undead team.

I performed a train-test split (3:1) on the dataset. I will use MSE to evaluate the model’s performance.

Training Set Performance: The MSE for the training set is approximately 3585.

Testing Set Performance: The MSE for the testing set is approximately 6759.
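For readers who want to reproduce the setup, a minimal scikit-learn sketch of the baseline is shown below. It runs on synthetic stand-in data, not the real dataset, so its MSE values will not match the figures reported here; the random seed and the data-generating formula are arbitrary assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

TEAM_COLS = [
    f"{team}_nunique"
    for team in ["Demon", "Dragon", "Gnome", "Kobold", "Plant", "Undead"]
]

# Synthetic stand-in data (NOT the real dataset): raider counts and noisy damage.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.integers(0, 6, size=(200, 6)), columns=TEAM_COLS)
y = 35 * X.sum(axis=1) + rng.normal(0, 40, size=200)

# A 3:1 train-test split corresponds to test_size=0.25.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

baseline = LinearRegression().fit(X_train, y_train)

train_mse = mean_squared_error(y_train, baseline.predict(X_train))
test_mse = mean_squared_error(y_test, baseline.predict(X_test))
```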

The large gap between the training MSE and the testing MSE suggests that the model may be overfitting the training data and failing to generalize to unseen data. A good model would perform similarly on the training and testing sets. Given that the baseline struggles to generalize to the testing data, I do not consider it a “good” model.

Final Model

Before I started working on the final prediction model, I wanted to explore the relationship between the various features and our result, dmg_dealt.

Here, the figure shows that (in most cases) a higher number of users with bloodlust correlates with higher damage.

As a result of imputing “neutral” for missing effectiveness values, the “neutral” category appears to have a wide range of data. However, it is clear that bosses that are weak to certain elements are dealt more damage compared to those that are resistant. It may be worth investigating the impact of the number of weak elements on the total damage dealt.

Here, it appears that (in general) as bosses have more weak elements, more damage is dealt to a boss.

According to the figure, more unique users in a raid generally means more damage dealt. However, the damage does not appear to vary much from team to team. As with the elements, it may be better to look at the total number of users in a raid rather than at each team separately.

Like with the elements, as the number of unique users participating in a raid increases, so does the damage dealt.

Here, the figure shows that the modifier (Weakened or Parry) provides an effect on the damage dealt if the modifier is active. Weakened bosses take slightly more damage while parry bosses take less damage.

Features Added in the Final Model

In addition to the features present in the baseline model, the final model includes the following features:

Feature	Description
`num_bloodlust`	Integer variable representing the number of users attacking with bloodlust.
`W`	Boolean variable representing if a Boss has the weakened status.
`P`	Boolean variable representing if a Boss has the parry status.
`element_air`	Nominal variable representing the effectiveness of air weapons against the boss.
`element_blessed`	Nominal variable representing the effectiveness of blessed weapons against the boss.
`element_cursed`	Nominal variable representing the effectiveness of cursed weapons against the boss.
`element_earth`	Nominal variable representing the effectiveness of earth weapons against the boss.
`element_fire`	Nominal variable representing the effectiveness of fire weapons against the boss.
`element_moon`	Nominal variable representing the effectiveness of moon weapons against the boss.
`element_organic`	Nominal variable representing the effectiveness of organic weapons against the boss.
`element_sun`	Nominal variable representing the effectiveness of sun weapons against the boss.
`element_synthetic`	Nominal variable representing the effectiveness of synthetic weapons against the boss.
`element_water`	Nominal variable representing the effectiveness of water weapons against the boss.

Justification for Added Features

The feature num_bloodlust captures the presence of a significant damage-enhancing status. Players with bloodlust contribute disproportionately to the total damage because their individual damage output is doubled.

The features W and P capture the boss’s status, which directly affects how much damage it takes. Weakened bosses have a chance to take 2.0x damage from players, while Parry bosses have a chance to take 0.5x damage instead. These statuses are known at the start of a raid and can be critical for predicting the potential damage dealt.

The set of elemental effectiveness features indicates which weapons would be most effective against the boss. Bosses with weaknesses to certain elements are more vulnerable to attacks using those elements, which leads to higher damage dealt. Additionally, bosses with more weaknesses provide a larger pool of effective weapons.

By incorporating these features, the model is better equipped to capture the factors that make up a raid, leading to a more accurate prediction of the total damage dealt.

Modeling Algorithm and Hyperparameters

The final modeling algorithm chosen is Linear Regression. Linear regression is a straightforward and interpretable regression method that fits a linear model to minimize the residual sum of squares between the observed and predicted values.

I created a weakness pipeline to determine the number of weak elements on a boss and create a polynomial feature from the result. I also created a raiders pipeline to determine the number of unique raiders on a boss and used the result to create a polynomial feature. I then used the features num_bloodlust, W, P to create a final polynomial feature.

These three features are then pipelined into the final LinearRegression model.
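One way to assemble such a pipeline in scikit-learn is sketched below. This is not the original code: the helper names (`count_weak`, `count_raiders`, `make_branch`), the shortened element list, and the synthetic data are all assumptions. The polynomial degrees and the fit_intercept setting are the values reported in the Hyperparameters section, and W and P are represented as 0/1 integers.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, PolynomialFeatures

TEAM_COLS = [
    f"{team}_nunique"
    for team in ["Demon", "Dragon", "Gnome", "Kobold", "Plant", "Undead"]
]
ELEMENT_COLS = ["element_fire", "element_water"]  # shortened for brevity
STATUS_COLS = ["num_bloodlust", "W", "P"]


def count_weak(X):
    """Number of elements the boss is weak to (weak is encoded as 1)."""
    return (np.asarray(X) == 1).sum(axis=1).reshape(-1, 1)


def count_raiders(X):
    """Total number of unique raiders across all teams."""
    return np.asarray(X).sum(axis=1).reshape(-1, 1)


def make_branch(func, degree):
    """Collapse several columns into one count, then expand it polynomially."""
    return Pipeline([
        ("count", FunctionTransformer(func)),
        ("poly", PolynomialFeatures(degree=degree)),
    ])


preprocess = ColumnTransformer([
    ("weakness", make_branch(count_weak, 2), ELEMENT_COLS),
    ("raiders", make_branch(count_raiders, 3), TEAM_COLS),
    ("status", PolynomialFeatures(degree=2), STATUS_COLS),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("reg", LinearRegression(fit_intercept=False)),
])

# Synthetic stand-in data (NOT the real dataset).
rng = np.random.default_rng(0)
n = 60
X = pd.DataFrame(rng.integers(0, 4, size=(n, 6)), columns=TEAM_COLS)
for col in ELEMENT_COLS:
    X[col] = rng.integers(-1, 2, size=n)  # -1 resist, 0 neutral, 1 weak
X["num_bloodlust"] = rng.integers(0, 3, size=n)
X["W"] = rng.integers(0, 2, size=n)
X["P"] = rng.integers(0, 2, size=n)
y = 30 * X[TEAM_COLS].sum(axis=1) + rng.normal(0, 20, size=n)

model.fit(X, y)
n_features = model.named_steps["preprocess"].transform(X).shape[1]
```

With fit_intercept=False, the constant column that PolynomialFeatures adds by default plays the role of the intercept.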

Hyperparameters

I used GridSearchCV to select the hyperparameters. It searches over a specified parameter grid, evaluating each combination with 10-fold cross-validation, and keeps the combination with the best average validation score.

The hyperparameters that ended up performing the best, as identified by GridSearchCV, are:

Num Weaknesses PolynomialFeatures Degree: 2
Num Raiders PolynomialFeatures Degree: 3
Bloodlust/Weakened/Parry PolynomialFeatures Degree: 2
Linear Regression Fit Intercept: False
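The search itself can be illustrated on a reduced, single-branch pipeline. This sketch uses synthetic data and hypothetical step names (`poly`, `reg`); it also assumes negative MSE as the scoring function, since the write-up does not say which score the search optimized. Grid keys follow scikit-learn's `step__parameter` naming convention.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (NOT the real dataset): damage grows with raiders.
rng = np.random.default_rng(0)
num_raiders = rng.integers(1, 10, size=(120, 1)).astype(float)
dmg = 5 * num_raiders[:, 0] ** 2 + rng.normal(0, 10, size=120)

pipe = Pipeline([
    ("poly", PolynomialFeatures()),
    ("reg", LinearRegression()),
])

# Keys are "<step name>__<parameter name>".
param_grid = {
    "poly__degree": [1, 2, 3],
    "reg__fit_intercept": [True, False],
}

# Assumption: score by negative MSE so that "higher is better" holds.
search = GridSearchCV(pipe, param_grid, cv=10, scoring="neg_mean_squared_error")
search.fit(num_raiders, dmg)

best_params = search.best_params_
best_cv_mse = -search.best_score_
```

In the full model, the same convention extends through nested steps, so each branch's polynomial degree gets its own grid key.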

Final Model Performance Improvement

Baseline Model Performance

For reference, the baseline model achieved:

Training Set MSE: 3584.90
Testing Set MSE: 6758.79

These results suggested that the baseline model was overfitting to the training data, as indicated by the substantial difference between the training and testing MSE values.

Final Model Performance

After tuning the hyperparameters with GridSearchCV, the final model achieved:

Training Set MSE: 1583.34
Testing Set MSE: 3413.23

Improvement Analysis

The significant reduction in both the training and testing MSE in the final model demonstrates an overall improvement in the model’s performance.

The final model roughly halves the testing MSE (from 6758.79 to 3413.23) and also lowers the training MSE (from 3584.90 to 1583.34), so it predicts more accurately than the baseline on both seen and unseen data. The absolute gap between training and testing MSE narrows from about 3174 to about 1830. However, the testing MSE is still roughly twice the training MSE, so the final model still overfits to some degree.