Last month, on January 18th, Sir David Cox passed away. For those who studied statistics, engineering, or finance, this name ought to ring a bell. Among the many things he is famous for, two stand out.
The first is his solution to a very popular question, asked multiple times a day in hospitals, banks, and control rooms all over the world: “How long before X happens?”
How long before this customer defaults? How long before this patient dies?
In his 1972 paper Regression Models and Life-Tables, he proposed a regression model that assesses the relationship between multiple factors and the survival time (or time to event). Before this, there was no general, widely adopted method for investigating how several variables jointly contribute to an outcome time. In essence, the method allows you to answer both “How long before…happens?” and “By how much does a factor contribute to the rate (speed) at which the event happens?” It has allowed us to take a big step forward in understanding and navigating some of life’s uncertainties, from better managing financial risks to providing evidence for treatments that help extend the lives of patients.
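To make the “by how much does a factor contribute” part concrete, here is a minimal sketch of how coefficients from a proportional hazards model are usually read. It assumes the standard form h(t | x) = h0(t) · exp(β · x); the coefficients, the covariates (age in decades, smoker yes/no), and the helper name `hazard_ratio` are made up for illustration and do not come from the paper.

```python
import math

def hazard_ratio(beta, x_a, x_b):
    """Hazard ratio of subject A relative to subject B under a Cox model.

    Assumes h(t | x) = h0(t) * exp(beta . x). The baseline hazard h0(t)
    cancels in the ratio, so only the covariate differences matter.
    """
    log_ratio = sum(coef * (a - b) for coef, a, b in zip(beta, x_a, x_b))
    return math.exp(log_ratio)

# Hypothetical fitted coefficients for two illustrative covariates:
# age (in decades) and smoker (0 or 1).
beta = [0.30, 0.70]
patient_a = [6.0, 1]  # 60 years old, smoker
patient_b = [6.0, 0]  # 60 years old, non-smoker

hr = hazard_ratio(beta, patient_a, patient_b)
print(f"hazard ratio (smoker vs non-smoker): {hr:.2f}")
```

With these invented numbers, the smoker's event rate is roughly twice that of the non-smoker at every point in time, which is exactly the kind of statement the model is designed to support.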
Cox’s second important contribution is logistic regression (1958), a classification method used to predict binary outcomes (yes/no, 0/1, like/dislike).
This regression technique can be used to estimate the probability that customers will renew their contracts, or the probability that you will purchase a certain item given your search history.
Logistic regression is still hugely popular because its results are relatively simple to interpret and explain to non-technical audiences. Its simplicity also means shorter training times and lower computational requirements than more complex models.
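As a rough illustration of why the results are easy to explain, the sketch below turns an already-fitted logistic regression into a probability and an odds ratio. The intercept, coefficients, and features (years as a customer, number of support tickets) are hypothetical values chosen for this example, not estimates from real data.

```python
import math

def predict_probability(intercept, coefs, features):
    """Probability of the positive class under a logistic regression model."""
    log_odds = intercept + sum(c * x for c, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical fitted model for contract renewal with two illustrative features:
# years as a customer and number of support tickets.
intercept = -1.0
coefs = [0.5, -0.4]

# A customer of 4 years with 1 support ticket: log-odds = -1.0 + 2.0 - 0.4 = 0.6
p_renew = predict_probability(intercept, coefs, [4, 1])

# exp(coefficient) is the multiplicative change in the odds per unit of the feature.
odds_ratio_per_year = math.exp(coefs[0])

print(f"renewal probability: {p_renew:.2f}")
print(f"odds ratio per extra year: {odds_ratio_per_year:.2f}")
```

Each coefficient translates directly into a sentence such as “every extra year as a customer multiplies the odds of renewal by about 1.65”, which is the kind of explanation a non-technical audience can follow.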
You can already learn a lot about your data by starting with logistic regression before moving on to more sophisticated methods.
We would like to thank Sir David Cox for his contributions to statistics and beyond, and wish him a great afterlife.
References.
- Papers: Regression Models and Life-Tables on JSTOR (1972), The Regression Analysis of Binary Sequences on JSTOR (1958).
- For those interested in interpreting proportional hazards models, we recommend: Elashoff, Surviving Proportional Hazards, Hepatology (1983), Wiley Online Library.
- Multivariate survival analysis using Cox’s regression model: https://ecstep.com/wp-content/uploads/2020/03/Multivariate-Survival-Analysis-using-Coxs-regression-model.pdf