A Two-Stage Spatio-Geometrical Clustering of Football Team Shape for Post-Match Review

Research Article

  • Ali Zare Zardiny *
  • Zahra Bahramian

School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran.

*Corresponding Author: Ali Zare Zardiny, School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran.

Citation: Zardiny. A. Z, Bahramian. Z. (2025). A Two-Stage Spatio-Geometrical Clustering of Football Team Shape for Post-Match Review, Academic Journal of Clinical Research and Reports, BioRes Scientia Publishers. 1(1):1-14. DOI: 10.59657/brs.25.ajcrr.006

Copyright: © 2025 Ali Zare Zardiny, this is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Received: October 21, 2024 | Accepted: December 31, 2024 | Published: January 08, 2025

Abstract

The large volume of spatio-temporal data collected in a football match creates good potential for post-match review and team analysis. Such analyses can focus on the whole team or on individual players. The purpose of this paper is to analyze the general behavior of the team from a spatio-geometrical point of view. The process starts by defining a convex hull as the team shape in each time frame. Then, a set of spatial, geometric, zone-based, and event-based parameters is introduced and extracted to describe the team shape. These descriptors are the basis of a two-stage spatio-geometrical clustering of the team during the match. Clustering helps identify similar patterns in the team shape in both in-possession and out-of-possession situations. Evaluating these clusters in the post-match review, against periods of good and mediocre team performance during the match, can reveal the team's technical strategies. The distinguishing points of this article are that the shape does not need to be transferred to image space, no image processing techniques are needed for the analysis, a new geometric descriptor is introduced, and clustering is performed in two stages for a better and more meaningful interpretation of the team shape.


Keywords: data; football; post-match review; team shape; clustering; shape descriptors

Introduction

With the development of technologies in recent years, a wide range of data is collected in every football match. A significant part of these data, including the location of the players on the pitch at each frame of the match, the ball location, and the event data of the match, has a spatio-temporal nature. The huge volume of these data creates good potential for various technical analyses by football analysts. These analyses can give coaches deep insight into the spatial and temporal behavior patterns of teams and the movement patterns of players during matches. Various methods have been presented to analyze the performance of a football team in a match. In a general classification, these analyses can be examined at two levels, micro and macro (Araújo et al., 2015). Micro-level analyses evaluate team performance by focusing on the activity of individual players. Running distance, player speed, the amount of participation in different events, and the quality of play without the ball are among these analyses. Macro-level analyses, in contrast, examine the entire team as a single entity (Zhang, 2022). Examples include the overall arrangement of the team, the team's coverage of the pitch, and the overall movement of the team. In macro-level analyses, the geometric shape of the team has generally received less attention, and the main focus is usually on quantitative parameters such as the area, perimeter, length, and width of the team. In addition, part of the research has analyzed the geometric shape of the team independently of scale and direction, although the area covered by the team and its orientation are important parameters in the analysis. The main objective of this paper is to conduct a post-match review by evaluating the overall behavior of the team from a spatial perspective. This evaluation is performed at the macro level on the spatio-temporal data collected during the match. Instead of a point-wise analysis of the data (for each player), the whole team is defined as a single geometry.

For the post-match review, the general shape of the team is first defined in each time frame. Then, the different characteristics of the team's shape are extracted and analyzed using a set of spatial, geometric, zone-based, and event-based descriptors. In the next step, based on the event descriptors, all frames are partitioned into two states, in-possession and out-of-possession. Finally, by clustering the shapes of the team during the match based on the descriptors, similar behavior patterns are identified. This clustering is done in two stages. In the first stage, the main clusters are determined according to the location of the team and the overlap of the team's shape with different pitch zones. In the second stage, the shapes placed in each main cluster are clustered based on the geometric descriptors, and sub-clusters are formed. Considering the good or mediocre play of the team at different times of the match, indicators are presented to check the effect of the identified clusters on the quality of play. These indicators include the general strategy of the team in the periods leading up to the goals of the match, the relationship of the team's shape with the number of passes and with the number of challenges won and lost, and the general behavior of the team in managing time and space during the match. The volume and variety of data in football have provided good potential for analysis to data science experts and football analysts, and numerous studies have therefore been conducted on the analysis of individual and collective behavior (Liu et al., 2015; Carling, Williams, and Reilly, 2005; Kotzbek and Kainz, 2015; Kotzbek and Kainz, 2018; Caetano et al., 2021).
The contributions of this paper are a clustering process based solely on vector data without the need for image processing techniques, the use of more meaningful parameters to describe the team shape, the introduction of a new geometric descriptor, the inclusion of match events, clustering in two stages, and a set of indicators for checking the effects of the clusters on the quality of play. Together, these make it possible to provide a more detailed and comprehensive analysis of the team's overall behavior during the post-match review. This paper is organized in five sections. The second section reviews previous studies in the field of team shape modeling and analysis. The third section presents the proposed method for describing the team shape and for spatio-geometrical clustering. The fourth section explains the implementation of the proposed method, and presents and interprets its results from the spatio-geometrical point of view. Finally, the last section deals with conclusions and suggestions for future studies.

Background

Post-match reviews that analyze the performance of the entire team or of individual players can give coaches deep insight into the game (Carling, Williams, and Reilly, 2005). This information can help coaches in areas such as performance analysis, tactical adjustments, player development, injury prevention and recovery, and team cohesion (Sarmento et al., 2014; Teixeira et al., 2021; Querido et al., 2022). In recent years, this topic has attracted the attention of researchers. Given the wide range of subjects, only studies that analyzed the behavior of football teams from a geometrical and spatial point of view are examined here. Zhang (2022) evaluated the technical behavior of the team in four phases (attack, defense, transition from attack to defense, and transition from defense to attack) based on the players' positional data. In that research, the convex hull is used to define the team shape, and the status of the team is determined from the match event data. Based on the team shape, the geometric center, length, width, and area of the team and the displacement of the team in the four phases are then calculated. Statistical analysis of these data over time showed that the tactical behavior of the team in the ball possession situation differs significantly from the other phases of the match, and that the team covers a larger area of the field. When the phase changes from out-of-possession to in-possession, the width of the team increases more than its length. Also, when the game transitions from the attack phase to the defense phase, more width is observed compared to the time when the ball is lost.

Based on the location of the players, Shaw and Glickman (2020) proposed a method to determine the defensive and offensive arrangement of the team by using agglomerative hierarchical clustering of the players. In this research, data were collected and analyzed in fixed time periods, and tactical summaries of each match were produced. Finally, a Bayesian model selection algorithm was used to estimate the probability of placing players in the identified patterns. An example of the resulting clusters is shown in Figure 1.

Figure 1: A view of the hierarchical clustering results performed in (Shaw and Glickman, 2020).

Bueno et al. (2021) used the Multiscale Fractal Curve (MFC) method, a region-based shape analysis method, to describe the team shape. In this method, the shape is defined by a convex hull and then converted into a raster image. The clustering is then performed with the K-Means method based on the pixel characteristics of the images. The main focus of that research is on the geometrical dimension, and the locational parameters of the team are not included. Also, because MFC is independent of scale, rotation, and displacement, the method cannot provide an analysis of the quality of the game and is limited to the geometric clustering of the team shape. Figure 2 shows a view of the clustering performed in that research.

Figure 2: A view of the clustering results in (Bueno et al., 2021).

Because the arrangement of a team can change during a match, Whitmore and Seidl (2021) investigated the team in the two states of in-possession and out-of-possession and identified the most probable team shape in each. For this purpose, the average location of the players was calculated in different time intervals, and the clustering was performed on these averages. Based on the clustering results, the most probable team shapes were chosen from the patterns presented in Figure 3.

Figure 3: General shapes of the team: a. In Possession shapes, b. Out of possession of the ball (Whitmore and Seidl, 2021)

Goes et al. (2020), in a review of research on the use of players' location data in support of tactical performance analysis, pointed out the importance of cumulative spatial functions. While modeling team behavior at the macro level, these functions reduce the complexity of the analyses to an interpretable level. The parameters mentioned in that review include the point geometric center, the linear geometric center, the average distance of players to the center of the team, the area covered by the team shape, and the length and width of the team shape. Narizuka and Yamazaki (2019) used a hierarchical method and time series analysis to determine the team shape according to the changes over time in the players' arrangement during attack and defense. Their analysis is based on the location of the players and the area under the control of each of them, extracted with the Voronoi diagram, and on a number of matches played by the same team.

For the spatio-temporal analysis of the team's collective behavior, Goncalves et al. (2019) defined the effective playing space of a team as a convex hull. They then investigated characteristics such as ball location, distance traveled, the length and width of the team, possession of the ball, and the distance between the most advanced player and the goal line. By comparing these parameters, they evaluated the relative position of the team compared to the top and bottom teams of the ranking table, as well as the effect of the different parameters on the results. According to the obtained results, ball possession had the greatest impact on the team's collective behavior patterns. Bialkowski et al. (2014) used the spatio-temporal data of players to analyze football matches on a large scale. For this purpose, K-Means clustering was performed on the data, and the role of the players was also considered. The output of this process identified the arrangement of players during the match and in different situations. Figure 4 shows a view of the output of this clustering according to the offensive and defensive situations of the team.

Figure 4: Clustering of players in different situations (Bialkowski et al., 2014)

In most of these studies, the macro-level analyses were based mainly on quantitative parameters such as the geometric position, area, perimeter, length, and width of the team, and the geometrical aspects of the shape received less attention. In addition, most of the studies that did investigate the geometrical aspects treated them independently of scale and direction, even though both the area covered by the team and the direction of the team are important parameters in football analysis. Considering these challenges, this paper introduces another geometrical descriptor and also examines zone-based spatial parameters to analyze the overall shape of the team. These parameters are listed in the next section.

Methodology

A post-match review of a team from a geometric and spatial point of view requires spatio-temporal data for that match. Generally, these data include three parts: the players' locations on the pitch, the ball location at each time frame, and the event data (Liu, 2022). With the growth of technology, these data can be collected with different methods, including satellite tracking systems such as GPS (Zhang, 2022), optical tracking systems (with the help of image and video processing) (Csanalosi et al., 2020), and radio tracking systems (by sending electrical signals) (Seidl, 2016). Before the analysis starts, the data must be cleaned and preprocessed. A single geometry is then constructed for the team in each time frame, and a set of parameters describing this shape is extracted. Based on these descriptors, the data are partitioned into two states, in-possession and out-of-possession, and two stages of spatial-geometric clustering are performed in each partition. At the end, similar patterns of team behavior are identified and interpreted. Figure 5 shows this general procedure. Each of the steps is explained separately below.

Figure 5: General process of the proposed method

Data Preprocessing

In the pre-processing step, the data located outside the boundaries of the pitch and the data outside the regular playing time of the match are removed. Also, because some players are substituted during the match, the locations of the substituted players must be replaced with those of the new players. Another process required for data integrity is transferring the data of the two halves of the match to one side of the field. For this, it is enough to mirror the data of the second half of the match with respect to the middle line of the pitch. Once these processes are complete, the shape construction phase begins.
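
The out-of-pitch filtering and second-half mirroring described above can be sketched as follows. This is a minimal illustration, not the Oracle-based procedure used in the paper. It assumes that coordinates are normalized so that the pitch spans [0, 1] in both directions and that the middle line is at x = 0.5; the function name `preprocess_frames` and the array layout are hypothetical.

```python
import numpy as np

def preprocess_frames(x, y, period):
    """Clean player coordinates frame by frame.

    x, y   : arrays of shape (n_frames, n_players), normalized pitch coordinates
             (assumption: the pitch spans [0, 1] in both directions).
    period : array of shape (n_frames,), 1 for the first half and 2 for the second.
    """
    x = np.asarray(x, dtype=float).copy()
    y = np.asarray(y, dtype=float).copy()
    period = np.asarray(period)

    # Samples outside the pitch boundaries are marked as missing.
    outside = (x < 0) | (x > 1) | (y < 0) | (y > 1)
    x[outside] = np.nan
    y[outside] = np.nan

    # Second-half data are mirrored about the middle line x = 0.5.
    second_half = period == 2
    x[second_half] = 1.0 - x[second_half]
    return x, y

x_clean, y_clean = preprocess_frames(
    [[0.2, 1.3], [0.2, 0.6]], [[0.5, 0.5], [0.5, 0.5]], [1, 2]
)
```

Handling of substitutions and of frames outside regular playing time is left out of this sketch because it depends on the layout of the source files.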

Shape Construction in Each Time Frame

Spatial data related to players consists of a set of points. Therefore, to create the team shape, the locations of all players in each time frame must be combined into a single geometry. Various methods have been presented to define the geometric team shape. The most common of these methods are described below.

Usual formation of players in a match: The most well-known pattern used to express the team shape is the starting arrangement of the players (Shaw and Glickman, 2020). From a geometric point of view, the shape is defined here as a multipoint in each time frame (Figure 6.a). Although this method is common, the arrangement does not visually convey how much of the pitch the players dominate.

Buffer zone: If the scope of each player's activity is defined by a circle with a certain radius and these boundaries are merged into an integrated geometry for the whole team, a general shape for the team can be defined in each frame (Liu, 2022) (Figure 6.b). The challenge with this geometry is that holes appear in the team shape as players move, which complicates the calculations.

Voronoi diagram of team players: One of the geometries used to define the area covered by the players of a team is the Voronoi diagram (Fujimura and Sugihara, 2005) (Figure 6.c). Because the diagram partitions the entire pitch into cells, the overall shape of the team simply reproduces the pitch area.

Convex hull: This geometric shape defines the region that the entire team can cover in each frame (Liu, 2022), and Narizuka and Yamazaki (2019) introduce this region as the effective area of the game. The convex hull is defined as a function of time and of the location of all players except the goalkeeper (Figure 6.d). This shape is integrated and convex, and it gives visual insight into the team's coverage over the match. For these reasons, the convex hull is used in this paper to define the team shape.

Figure 6: Different shape of the team: a. Initial formation, b. Buffer zone, c. Voronoi diagram, d. convex hull
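
As an illustration of the convex hull option, the sketch below builds the hull of the ten outfield players in one frame with Andrew's monotone chain algorithm. This is a standard textbook algorithm chosen here for self-containment; the paper does not state which hull algorithm was used, and the sample coordinates are made up.

```python
def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive if the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]

# Ten hypothetical outfield player positions (goalkeeper excluded).
outfield = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1),
            (1, 2), (3, 2), (2, 2), (1, 1), (3, 1)]
hull = convex_hull(outfield)
```

In this example only the four corner players lie on the hull, which shows why the team shape can have fewer than ten vertices.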

Extraction of Team Shape Descriptors in Each Time Frame

In this paper, the overall behavior of a team during the match is analyzed by examining the geometric changes of the team shape and its recurring geometric patterns. As stated in the previous section, the geometry used to define the shape of the team is a convex hull. To analyze the shape, a set of parameters describing it must first be extracted. In this paper, these descriptors are divided into four groups: spatial descriptors, zone-based descriptors, geometric descriptors, and event-based descriptors. These descriptors are examined below. It should be noted that the shape may not be fully reconstructable from the descriptors, but the descriptors must differ enough between shapes to distinguish the shapes from each other (Wirth, 2004).

Spatial Descriptors

To describe the overall location of the team on the pitch, the geometric center of the team shape is extracted. This center is calculated as the average of the coordinates of the shape vertices according to equation 1. In this relation, $n$ is the number of vertices and $(x_i, y_i)$ are the coordinates of the vertices of the shape.

$(\bar{x}, \bar{y}) = \left( \frac{1}{n} \sum_{i=1}^{n} x_i,\ \frac{1}{n} \sum_{i=1}^{n} y_i \right)$         (1)

Zone-based Descriptors

To better manage the pitch and the arrangement of the players, coaches divide the pitch into a set of zones. In this paper, the percentage of the team shape's area that overlaps each of these zones in each time frame is used as a parameter describing the shape. Figure 7 shows the zonings proposed by two football coaches. In this paper, the zoning proposed by Pep Guardiola, which divides the field into 26 zones, is used.

Figure 7: Zoning of the football pitch in different ways: a. Louis van Gaal's method, b. Pep Guardiola's method
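
The zone-overlap percentages can be computed entirely in vector space. The sketch below clips the convex team shape against each rectangular zone with the Sutherland-Hodgman algorithm and measures areas with the shoelace formula. The zone layout is a hypothetical regular 4 x 2 grid on a unit pitch, used only for illustration; it is not the 26-zone layout of Figure 7.b, and the function names are assumptions.

```python
def polygon_area(poly):
    """Shoelace formula; poly is a list of (x, y) vertices in order."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip_to_rect(poly, xmin, ymin, xmax, ymax):
    """Sutherland-Hodgman clipping of a polygon against an axis-aligned rectangle."""
    def clip(points, inside, intersect):
        out = []
        for i in range(len(points)):
            cur, prev = points[i], points[i - 1]
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))
        return out

    def x_cut(x0):
        return lambda p, q: (x0, p[1] + (q[1] - p[1]) * (x0 - p[0]) / (q[0] - p[0]))

    def y_cut(y0):
        return lambda p, q: (p[0] + (q[0] - p[0]) * (y0 - p[1]) / (q[1] - p[1]), y0)

    edges = [
        (lambda p: p[0] >= xmin, x_cut(xmin)),
        (lambda p: p[0] <= xmax, x_cut(xmax)),
        (lambda p: p[1] >= ymin, y_cut(ymin)),
        (lambda p: p[1] <= ymax, y_cut(ymax)),
    ]
    for inside, intersect in edges:
        if not poly:
            break
        poly = clip(poly, inside, intersect)
    return poly

def zone_overlap_percent(hull, zones):
    """Percentage of the hull area that falls inside each (xmin, ymin, xmax, ymax) zone."""
    total = polygon_area(hull)
    return [100.0 * polygon_area(clip_to_rect(hull, *z)) / total for z in zones]

# Hypothetical 4 x 2 grid of zones on a unit pitch (NOT the layout of Figure 7.b).
zones = [(i * 0.25, j * 0.5, (i + 1) * 0.25, (j + 1) * 0.5)
         for i in range(4) for j in range(2)]
team_shape = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.75), (0.25, 0.75)]
overlap = zone_overlap_percent(team_shape, zones)
```

Because the zones tile the pitch, the percentages for a shape that lies inside the pitch sum to 100.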

Geometrical Descriptors

In general, there are two main approaches to describing geometry: contour-based and region-based. Contour-based methods work in vector space and try to extract the characteristics of the shape from its boundary and turn them into a numerical representation (Fan, Zhao and Wenwen, 2021). Region-based methods first transfer the shape from vector space to image space and then describe the shape based on the characteristics of the pixels (Bueno et al., 2021). One well-known region-based shape representation method is the moment invariant. Moments are used to characterize a shape through a set of features in the form of scalar quantities (Flusser, Suk and Zitova, 2009). The definition of moments is derived from probability theory, where a moment is a numerical property used to describe the distribution of a random variable (Fu et al., 2018). Various methods have been presented to define moments, among them Hu invariant moments and Zernike moments.

Hu defines seven numerical quantities as shape descriptors, which are calculated from central moments up to the third order and are independent of the shape's translation, scale, and orientation (Keyes and Winstanley, 2001). The Zernike moment is a shape descriptor based on the image area, and its basis is a set of orthogonal radial polynomials (Chen and Lu, 2016). An important property of this method is that the moments are independent of the scale, orientation, and displacement of the shape (Marouf and Faez, 2013). However, the area of a team and its direction of movement during a match can play an important role in the efficiency and behavior of the team. Also, to calculate invariant moments, the geometric shape must be transferred from vector space to image space and the calculations performed there (Sabhara, Lee and Lim, 2013). Accordingly, moment invariants are not used in this paper. Instead, a new method based on vector space is presented that describes the shape of the team according to its dimensions and orientation. In the proposed method, the entire space is divided radially into 10 sectors centered on the shape centroid. For each sector, the number of vertices and their average distance to the centroid are calculated, which adds 20 parameters to the shape descriptors. Because the convex hull is built from the locations of 10 players (excluding the goalkeeper), the final shape has at most 10 vertices, and the space is therefore divided into the same number of sectors (Figure 8). This set of descriptors describes the team's shape efficiently, without transferring data to raster space and with less computation than the Hu method. It also better captures the shape and behavior of the team because it depends on the number of vertices and on the elongation of the shape.

Figure 8: Spread the space into 10 sectors and check the distribution of shape vertices to describe the shape of the team
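
A minimal sketch of this sector-based descriptor is given below. Several details are assumptions rather than statements from the paper: sector 0 starts at the positive x-axis and sectors are numbered counter-clockwise, the centroid is the average of the hull vertices, and an empty sector receives a distance of 0. The function name `sector_descriptor` is hypothetical.

```python
import math

def sector_descriptor(hull, n_sectors=10):
    """Return (counts, mean_dists): the number of hull vertices in each radial
    sector around the centroid and their mean distance to the centroid."""
    cx = sum(x for x, _ in hull) / len(hull)
    cy = sum(y for _, y in hull) / len(hull)

    counts = [0] * n_sectors
    dist_sums = [0.0] * n_sectors
    width = 2 * math.pi / n_sectors

    for x, y in hull:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)  # in [0, 2*pi)
        s = min(int(angle / width), n_sectors - 1)          # guard the upper edge
        counts[s] += 1
        dist_sums[s] += math.hypot(x - cx, y - cy)

    # Assumption: sectors without vertices get a mean distance of 0.
    mean_dists = [d / c if c else 0.0 for d, c in zip(dist_sums, counts)]
    return counts, mean_dists

counts, mean_dists = sector_descriptor([(0, 0), (4, 0), (4, 3), (0, 3)])
features = counts + mean_dists  # 20 parameters per frame
```

Since the sectors are fixed relative to the pitch axes and the distances are not normalized, the resulting vector changes when the shape is rotated or scaled, which is the property the descriptor is meant to retain.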

Event-Based Descriptors

Along with the location data, one of the most important types of data in football is the event data. Events can be classified in different ways according to their nature. An event can occur at a specific point (such as a foul) or have a specific origin and destination (such as a pass). Events can also express the players' activities (such as a pass or shot) or record a technical event (such as a foul or the beginning and end of a half) (Gudmundsson and Horton, 2017). In another classification, they can be divided into ball-based events (such as throw-out, pass, deep pass, shot, and receiving the ball) and non-ball events (such as tackles and dribbles) (Haaren et al., 2019). According to Goncalves et al. (2019), one of the most important parameters influencing the team's behavior is whether the team is in possession or out of possession of the ball. Therefore, in this paper, ball possession serves as one of the main decision-making parameters in the analysis, and the results are presented based on it. Ball possession is determined here from the event data. The times when the ball is in the possession of the team are determined from the events that require the ball (such as pass, shot, etc.) and from the winning of ground or aerial challenges (such as dribbling or tackling). In the same way, the times when these events are carried out by the opposing team are considered times of out-of-possession for the team.
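
One simple way to turn such events into a per-frame possession label is sketched below. It assumes that every ball event or won challenge is available as a `(frame, team)` pair and that possession persists from one event until the next; the paper does not specify how frames between events are treated, so this rule and the function name are assumptions.

```python
import numpy as np

def possession_by_frame(events, n_frames, team):
    """Label each frame True (in-possession) or False (out-of-possession).

    events : iterable of (frame, team_name) for ball events and won challenges.
    Assumption: a team keeps possession from its event until the next event.
    Frames before the first event are labeled out-of-possession.
    """
    labels = np.zeros(n_frames, dtype=bool)
    ordered = sorted(events)
    for k, (frame, owner) in enumerate(ordered):
        end = ordered[k + 1][0] if k + 1 < len(ordered) else n_frames
        if owner == team:
            labels[frame:end] = True
    return labels

in_possession = possession_by_frame([(2, "T1"), (5, "T2"), (8, "T1")], 10, "T1")
```

The resulting boolean array can be used directly as a mask to split the descriptor table into the two partitions.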

Data Partitioning

Given the importance of ball possession for the overall behavior of the team, the entire data set is divided into two partitions before clustering, based on the event-based descriptors. All time frames of the match are assigned to one of these partitions, and the remaining data processing and analysis are carried out separately in each partition. The interpretation of the results also follows this partitioning.

Spatio-Geometrical Clustering

One of the methods that can help to analyze a team is a post-match review of the matches in which the team participated (Carling, Williams, and Reilly, 2005). In this paper, the matches are reviewed by finding similar patterns of the team during the match. One of the most well-known machine learning methods for identifying hidden patterns in data is clustering (Chaudhry et al., 2023). This paper performs the clustering on the team shape descriptors extracted during the match. The output of this analysis is a set of similar behavior patterns for the team at different times of the match. There are different methods for clustering, and in this paper the K-Means algorithm is chosen. Its suitability for large amounts of data, ease of implementation, speed of execution, computational efficiency, and modest memory consumption make this algorithm a suitable option for team analysis (Morissette and Chartier, 2013; Gasparini and Álvaro, 2020). The process involves determining the number of clusters, selecting initial centers, calculating the distance of each data point to the cluster centers, and assigning each point to the nearest cluster. The mean of each cluster's data becomes the new center, and the process repeats until the assignments no longer change (Morissette and Chartier, 2013).

The clustering of team shapes is performed in two stages. First, spatial clustering with the K-Means algorithm considers the geometric center and the overlap with the 26 pitch zones, resulting in five clusters that represent the team states: defense, attack, transition from defense to attack, transition from attack to defense, and play in the middle zones. Next, geometric clustering, also with K-Means, is applied to the shapes within each main cluster based on the proposed geometric descriptors. This method provides a detailed analysis of team formations and transitions during a match.
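
The two-stage procedure can be sketched with Scikit-Learn as follows. The feature matrices are random placeholders with the dimensions implied by the text (2 centroid coordinates plus 26 zone overlaps, and 20 sector parameters). The function name, the fixed random seed, and the guard for very small main clusters are assumptions added for the sketch.

```python
import numpy as np
from sklearn.cluster import KMeans

def two_stage_kmeans(spatial, geometric, k_main=5, k_sub=5, n_init=20, seed=0):
    """Stage 1 clusters frames on spatial features; stage 2 clusters the frames
    of each main cluster on geometric features.

    spatial   : (n_frames, p) centroid and zone-overlap features.
    geometric : (n_frames, q) sector-based features.
    Returns (main_labels, sub_labels), both of shape (n_frames,).
    """
    main = KMeans(n_clusters=k_main, n_init=n_init, random_state=seed).fit_predict(spatial)
    sub = np.full(len(main), -1)
    for c in range(k_main):
        idx = np.flatnonzero(main == c)
        k = min(k_sub, len(idx))  # guard: a main cluster may hold fewer frames than k_sub
        if k > 0:
            model = KMeans(n_clusters=k, n_init=n_init, random_state=seed)
            sub[idx] = model.fit_predict(geometric[idx])
    return main, sub

# Placeholder data standing in for one possession partition of a match.
rng = np.random.default_rng(0)
spatial = rng.random((200, 28))    # 2 centroid coordinates + 26 zone overlaps
geometric = rng.random((200, 20))  # 10 vertex counts + 10 mean distances
main_labels, sub_labels = two_stage_kmeans(spatial, geometric, n_init=2)
```

In practice the features would likely need scaling before clustering, because centroid coordinates, percentages, counts, and distances have different ranges; the paper does not state whether this was done.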

Evaluation of results

To evaluate the clustering results, the shapes in each cluster are compared over time in terms of their difference in location, coverage area, and orientation relative to the center of the cluster. The shape closest to the cluster center is taken as the representative shape of the cluster. Equation 2 is used for this evaluation.

$D = w_L D_L + w_d D_d + w_n D_n$         (2)

Here, $D_L$, $D_d$, and $D_n$ express the difference based on the location, the difference based on the distance of the vertices in the 10 sectors, and the difference based on the number of vertices in the 10 sectors (Figure 8), respectively, defined according to equations 3 to 5. The parameters $w_L$, $w_d$, and $w_n$ are the corresponding weights, which are considered equal here.

$D_L = \lVert P_t - P_c \rVert$         (3)

where $P_t$ is the location of the team and $P_c$ is the location of the cluster center on the pitch.

$D_d = \sum_{s=1}^{10} \lvert d_{t,s} - d_{c,s} \rvert$         (4)

where $d_{t,s}$ is the distance of the vertices of the team shape from its geometric center in sector $s$ and $d_{c,s}$ is the distance of the vertices of the cluster center from its geometric center in the same sector.

$D_n = \sum_{s=1}^{10} \lvert n_{t,s} - n_{c,s} \rvert$         (5)

where $n_{t,s}$ is the number of team shape vertices in sector $s$ and $n_{c,s}$ is the number of cluster center vertices in that sector.
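
The evaluation can be illustrated with the sketch below. It assumes that the overall difference is an equally weighted sum of three terms, that the location term is a Euclidean distance, and that the two sector terms are sums of absolute differences over the 10 sectors. These functional forms, the dictionary layout, and the function names are assumptions made for illustration.

```python
import math

def shape_difference(team, center, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted difference between a team shape and a cluster center.

    team, center : dicts with 'loc' (x, y), 'dists' and 'counts' (one value per sector).
    Assumed forms: Euclidean distance for location, absolute differences per sector.
    """
    d_loc = math.dist(team["loc"], center["loc"])
    d_dist = sum(abs(a - b) for a, b in zip(team["dists"], center["dists"]))
    d_count = sum(abs(a - b) for a, b in zip(team["counts"], center["counts"]))
    w_loc, w_dist, w_count = weights
    return w_loc * d_loc + w_dist * d_dist + w_count * d_count

def representative(shapes, center):
    """Index of the shape closest to the cluster center."""
    return min(range(len(shapes)), key=lambda i: shape_difference(shapes[i], center))

center = {"loc": (0.5, 0.5), "dists": [0.1] * 10, "counts": [1] * 10}
near = {"loc": (0.5, 0.5), "dists": [0.1] * 10, "counts": [1] * 10}
far = {"loc": (0.8, 0.9), "dists": [0.2] * 10, "counts": [1] * 10}
best = representative([far, near], center)
```

With unnormalized inputs the three terms have different units, so equal weights do not imply equal influence; rescaling each term before weighting is one option.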

Implementation

In this section, the data and software used in the implementation process are explained, then the implementation results are evaluated and interpreted.

Data and software used

There are various sources for accessing spatio-temporal data of football matches. In this paper, the data for two matches were obtained from the Metrica Sports Sample Data source[1]. The source provides anonymized player and team information in CSV files, with data recorded every 0.04 seconds. The first match has 145,006 rows and the second 141,156 rows, plus smaller event files with 1,745 and 1,935 events, respectively. The analyzed teams won their matches 3-0 and 3-2, respectively. The data were imported into Oracle 19c for pre-processing, convex hull construction, and shape descriptor calculation, resulting in four datasets. Clustering and the display of results were developed using Python and Scikit-Learn.

Data Processing

This paper investigates the behavior of two different teams in two matches from a spatio-geometrical perspective. After pre-processing and descriptor extraction, the data are clustered in two stages. In the first stage, K-Means clustering is applied with five clusters and 20 repetitions. The spatial clustering results, both in-possession and out-of-possession, are illustrated in Figure 9.

Figure 9: The position of all T1 team shapes in five clusters in spatial clustering in two situations of in-possession and out-of-possession

In the second stage, for each of the main clusters identified in the first stage, the process of geometric clustering is performed using the K-Means method and considering 5 clusters in 20 repetitions. Figures 10 to 13 show the results of the clustering of the teams in the second stage in two matches and in two situations of in-possession or out-of-possession of the ball. In these figures, the time contribution of each sub-cluster is specified. In each sub-cluster, the dominant direction of the team's movement based on the amount of the team's displacement until the next time frame is also displayed. Also, the sub-clusters in which the match goals were scored have been identified.

Figure 10: The results of the geometric clustering of the shape of team T1 in match 1 in the situation of in-possession

Figure 11: The results of the geometric clustering of the shape of team T1 in match 1 in the situation of out-of-possession

Figure 12: The results of the geometric clustering of the shape of team T2 in match 2 in the situation of in-possession

Figure 13: The results of the geometric clustering of the shape of team T2 in match 2 in the situation of out-of-possession

In the next section, based on the results of clustering, the general behavior of the two teams in the two matches is investigated.

Analysis of the Overall Behavior of Teams

By clustering, similar patterns in team shape during the match are identified. To analyze overall team behavior, the effectiveness of shape changes is examined, considering game quality and team efficiency. For this purpose, this section provides meaningful indicators and interpretations. Several cluster-related indicators are introduced and examined, using event data to define good or mediocre team play. In the first stage of analysis, the team's location is clustered into five horizontal pitch areas. According to Figure 9, clusters A, B, C, D, and E represent the team's location in the defensive area, home half, midfield, opponent's half, and opponent's defense area, respectively. The first indicator examines changes in team shape within these clusters during the 2 minutes before goals are scored or conceded. Figure 14 illustrates this average time for teams T1 and T2 in two matches.

Figure 14: Average time share of main clusters in 2 minutes leading to match goals
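
As an illustration of the first indicator, the following sketch computes the share of each main cluster in the window before each goal and averages the shares over goals. It assumes uniformly sampled frames, so that the fraction of frames equals the fraction of time; the function and argument names are hypothetical and not taken from the paper.

```python
from collections import Counter

def cluster_share_before_event(frame_times, frame_clusters, event_time, window=120.0):
    """Share of each main cluster in the `window` seconds before `event_time`.

    Assumes frames are sampled at a constant rate, so frame counts are
    proportional to time.
    """
    in_window = [
        c for t, c in zip(frame_times, frame_clusters)
        if event_time - window <= t < event_time
    ]
    total = len(in_window)
    if total == 0:
        return {}
    return {c: n / total for c, n in Counter(in_window).items()}

def average_share(frame_times, frame_clusters, goal_times, window=120.0, clusters="ABCDE"):
    """Average the per-goal cluster shares over all goals in `goal_times`."""
    shares = [
        cluster_share_before_event(frame_times, frame_clusters, g, window)
        for g in goal_times
    ]
    return {c: sum(s.get(c, 0.0) for s in shares) / len(shares) for c in clusters}
```

The same function can be called once with the times of goals scored and once with the times of goals conceded to obtain the two views compared in the text.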

According to Figure 14, in the lead-up to T1's three goals the team was mostly in its home areas (clusters A and B) and out of possession, indicating a strategy of strong possession play and quick counterattacks. Conversely, T2 was more present on the opponent's side of the pitch before scoring, suggesting direct, forward play. Before conceding goals, T2 was fully present in its home areas, so the goals conceded were not due to the team's absence from those areas, and the figure alone cannot fully explain them.

The second indicator examines the relationship between the number of passes and team shape. Figure 15 shows that as the area covered by teams T1 and T2 increases, their number of passes also increases, while the opponents' passes decrease. This suggests that a larger covered area is associated with better play, and that the stronger this relationship is, the better the team's performance.

Figure 15: Ratio of passes to teams’ shape area
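
The paper does not state how the strength of the pass-to-area relationship was measured. One simple option, shown here purely as an assumption, is to compute the shape area with the shoelace formula and then the Pearson correlation between area and pass counts. The names `polygon_area` and `pearson` are illustrative.

```python
from math import sqrt

def polygon_area(vertices):
    """Shoelace area of a simple polygon given as an ordered list of (x, y)."""
    twice_area = sum(
        x0 * y1 - x1 * y0
        for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1])
    )
    return abs(twice_area) / 2

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Under this reading, a coefficient close to +1 between a team's shape area and its own passes, together with a negative coefficient for the opponent's passes, would correspond to the pattern described for Figure 15.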

The third indicator examines the relationship between player distribution within the team's shape and the outcomes of aerial and ground challenges, based on event data. The distribution is quantified by calculating the distance between the geometric center of the shape and the geometric center of all players. Figure 16 shows that a more homogeneous player distribution results in a smaller distance.

Figure 16: Distribution of players inside the team shape 
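
The distance used by the third indicator can be sketched as follows. Two interpretations are assumed here and are not confirmed by the paper: the "geometric center of the shape" is taken as the area centroid of the convex hull, and the "geometric center of all players" as the mean of the player positions. The hull is built with Andrew's monotone chain algorithm, chosen only for brevity.

```python
from math import hypot

def convex_hull(points):
    """Convex hull (counter-clockwise) via Andrew's monotone chain."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_centroid(poly):
    """Area centroid of a simple polygon; assumes a non-zero area."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        w = x0 * y1 - x1 * y0
        area2 += w
        cx += (x0 + x1) * w
        cy += (y0 + y1) * w
    return cx / (3 * area2), cy / (3 * area2)

def center_offset(players):
    """Distance between the hull centroid and the mean player position."""
    hx, hy = polygon_centroid(convex_hull(players))
    mx = sum(x for x, _ in players) / len(players)
    my = sum(y for _, y in players) / len(players)
    return hypot(hx - mx, hy - my)
```

For example, four players on the corners of a square plus one in its middle give an offset of zero, while moving the fifth player toward a corner increases the offset, matching the idea that a smaller distance reflects a more homogeneous distribution.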

In Figure 17, the horizontal axis represents the distance between the two geometric centers and the vertical axis shows the number of challenges. According to this figure, T1 won more challenges when its player distribution was more heterogeneous, whereas the number of challenges won by T2 decreased as heterogeneity increased. This indicates that T1 and T2 employ different strategies in match challenges.

Figure 17: The ratio of the number of team challenges to the distribution of players

The fourth indicator measures the average time a team stays continuously in the same cluster. Figure 18 shows that T2 had a more uniform pattern across the pitch, regardless of whether it was in possession or out of possession, while T1 spent more time in clusters A and E. Judging by this figure alone, one might conclude that T1 performed poorly in the defensive and attacking phases of the first match. However, given that T1 won without conceding a goal, it can instead be said that, when out of possession, the team adopts predetermined patterns to manage the game, especially in its own defensive area and in the opponent's defensive area. This indicator supports the explanation of the teams' overall strategies given for the first indicator.

Figure 18: The average time the team has been in a cluster continuously
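
The fourth indicator amounts to a run-length computation over the per-frame cluster labels. A minimal sketch is given below, assuming a constant frame interval; `frame_interval` and the function name are hypothetical.

```python
from itertools import groupby

def average_dwell_time(frame_clusters, frame_interval=1.0):
    """Average duration of uninterrupted stays in each cluster.

    frame_clusters is the time-ordered sequence of cluster labels and
    frame_interval is the (assumed constant) time between frames in seconds.
    """
    runs = {}
    for cluster, run in groupby(frame_clusters):
        duration = sum(1 for _ in run) * frame_interval
        runs.setdefault(cluster, []).append(duration)
    return {c: sum(d) / len(d) for c, d in runs.items()}
```

Applying the function separately to the in-possession and out-of-possession frame sequences would give the two views compared in the text.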

The proposed Post-Match Review method and the cluster-based indicators can assist coaches in evaluating team performance across matches. Examining these indicators over more games would allow coaches to understand and discuss the team's overall behavior more accurately.

Conclusion

This paper focuses on analyzing the general shape of football teams rather than individual player performance. It examines how a team's structure during a match aligns with the coach's strategy and how it affects results. This analysis helps coaches adjust tactics and evaluate overall team performance. The spatio-geometrical shape of the team reveals its structure, and identifying similar patterns in that shape serves the paper's goal. To this end, a convex hull is first taken as the overall shape of the team in each time frame of the match. Then, spatial and zone-based, geometric, and event-based descriptors are extracted to describe the team shape in each time frame. These descriptors are the basis of a two-stage spatio-geometrical clustering with the K-Means algorithm, which identifies similar behavioral patterns for the team. Once these patterns are identified, their relationship to the quality of the team's play can be examined through the number of passes, challenges won and lost, the team's behavior in the periods leading up to goals, and the team's overall strategy. Examining these patterns can give coaches valuable information for analyzing teams and identifying their strengths and weaknesses.

While clustering-based analysis of team shape has been explored in other research, this paper distinguishes itself by introducing a descriptor based on the distribution of shape vertices around the geometric center and by evaluating the results against team performance. It also performs two-stage clustering that considers spatial and geometrical characteristics, using only vector data and no image processing techniques. For more detailed analysis, future research could consider temporal changes in the data and the team shape, simultaneous analysis of both teams in the same match, and the overlap between the team's range and the shape of the opposing team. The density and spatial distribution of players inside the team shape were not considered in this paper and need to be examined in the Post-Match Review.

References