Keywords: Generalized Linear Models, Machine Learning Techniques, motor insurance, data mining, non-life insurance, financial risk coverages, risk classification, policyholders, linear regression models, canonical parameter, Maximum Likelihood Estimation, betas validation, Akaike Information Criterion, decision trees, Adaboost, data cleansing, frequency models
The concept of insurance traces its roots back to Antiquity (around 2,500 BC) and the Babylonian civilization, where the first recorded insurance transaction took the form of a loan. As carved in the Code of Hammurabi, a bottomry represents an arrangement between a ship master and a lender in which the ship is used as security against a loan financing a journey: the lender bore the entire risk of loss if the ship was damaged or lost. One thousand years later, Greek merchants invented the still-current concept of mutualization, collectively indemnifying the mercantile losses suffered during a storm.
These two concepts would be reused centuries later by Edward Lloyd, whose establishment quickly became the main place to exchange maritime coverages. The insurance contract in the modern sense of the term appeared only at the end of the seventeenth century, after the Great Fire of London prompted Nicholas Barbon to start a business insuring the city's buildings. Although insurance products have grown more complex over the years, the underlying principle has always remained the same, namely: to provide a guarantee of compensation for a loss that may occur, in return for payment of a specified premium. In other words, this financial mechanism allows individuals to transfer a given risk that may threaten their wealth in exchange for a certain amount of money.
[...] The a posteriori variables, on the other hand, are pieces of information collected over the contract's validity period, and they constitute the most valuable source of information for defining a driver's risk profile. Indeed, both the size and the number of claims experienced bear witness to the policyholder's driving habits. With this information, it is easier for an insurance company to assess whether or not a client's premium should be reevaluated. [...]
[...] However, it must be stressed that this choice is not straightforward and requires an analysis of the nature of the response variable. Indeed, the model could lead to counter-intuitive results if the response's characteristics were not appropriately taken into account (e.g. negative predictions when estimating a frequency). The choice of another link function should therefore always be considered as an alternative when building a model. When the canonical link function of the response variable is chosen, the model predictions always reproduce, on average, the observations in the data. [...]
[...] Eventually, the outcome of the process is summarized in the following figure, which shows the missing values for each variable in the dataset.
Figure 11: Frequency model - Number of missing values per variable
Once again, one can clearly observe that the variables KM, Experience, NBPeople and NB_KM contain significant proportions of missing values (41% in the case of the last of them). As mentioned previously, it is preferable to discard this type of variable to avoid any sizable loss of information. Finally, all the claims still displaying dubious observations are also removed from the dataset. [...]
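This cleansing step can be sketched with pandas (a toy frame with the thesis's variable names; the values and the 40% threshold are illustrative assumptions, not taken from the actual study):

```python
import pandas as pd

# Toy dataset mimicking the thesis variables; values are illustrative only.
df = pd.DataFrame({
    "KM":         [12000, None, 8000, None],
    "Experience": [5, None, None, 12],
    "NBPeople":   [2, 3, None, 1],
    "Age":        [34, 51, 28, 46],
})

# Share of missing values per variable, as in the figure described above.
missing_share = df.isna().mean().sort_values(ascending=False)
print(missing_share)

# Drop variables whose missing share exceeds a chosen threshold (40% here).
to_drop = missing_share[missing_share > 0.40].index
clean = df.drop(columns=to_drop)
```

In this toy frame, `KM` and `Experience` (50% missing each) are dropped while `NBPeople` and `Age` survive.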
[...] Thereafter, it rewards the "weak learners" displaying the best results when computing the output of the "strong learner". In a nutshell, the different steps of the algorithm can be summarized as shown in the following figure.
Figure: Schematic overview of the Adaboost algorithm
Here, the highlighted term corresponds to the predictions made by the "weak learner" obtained at iteration m. Considered the "greatest off-the-shelf classifier in the world" by its inventors, Adaptive Boosting presents valuable features that explain its acceptance across a large spectrum of fields. [...]
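The procedure above can be tried out with scikit-learn's `AdaBoostClassifier` (a generic sketch on synthetic data, not the thesis's insurance dataset). Its default weak learner is a depth-1 decision tree, and each iteration m stores the weight given to that learner in the final "strong learner":

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem (illustrative only).
X, y = make_classification(n_samples=400, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# 50 boosting rounds; the default base estimator is a decision stump.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(Xtr, ytr)

acc = clf.score(Xte, yte)
print(acc)
# estimator_weights_ holds the weight of each weak learner: the better
# a stump performed on the reweighted sample, the larger its weight.
print(len(clf.estimator_weights_))
```

The `estimator_weights_` attribute makes the "rewarding" step concrete: weak learners with lower weighted error receive a larger say in the final vote.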
[...] The two statisticians were the first to provide a concrete implementation of the Boosting theory for a binary variable through the Adaptive Boosting algorithm, or Adaboost. This algorithm remains one of the most powerful and popular to this day, being extensively exploited not only in Machine Learning, but also in genomic research and neuroscience. Later on, building on this study, Bauer and Kohavi showed that Boosting is capable of reducing both the variance and the bias of the prediction, whereas Bagging solely operates on the former. [...]