XREAP 2018-03: Machine Learning Forecasts of Public Transport Demand: A comparative analysis of supervised algorithms using smart card data

Public transport smart cards are widely used around the world. However, while they provide information about various aspects of passenger behavior, they have not been properly exploited to predict demand. Indeed, traditional methods in economics employ linear unbiased estimators that pay little attention to accuracy, which is the main problem faced by the sector’s regulators. This paper reports the application of various supervised machine learning (SML) techniques to smart card data in order to forecast demand, and it compares these outcomes with traditional linear model estimates. We conclude that the forecasts obtained from these algorithms are much more accurate.

Palacio, S. M. (GiM, XREAP)


XREAP 2018-02: Detecting Outliers with Semi-Supervised Machine Learning: A Fraud Prediction Application

Abnormal pattern prediction has received a great deal of attention from both academia and industry, with applications that range from fraud, terrorism and intrusion detection to sensor events, medical diagnoses and weather patterns. In practice, most abnormal pattern prediction problems are characterized by a small amount of labeled data and a huge amount of unlabeled data. While this points most obviously to the adoption of a semi-supervised approach, most empirical studies have opted for a simplification and treated it as a supervised problem, resulting in a severe bias toward false negatives. In this paper, we propose an innovative methodology based on semi-supervised techniques and introduce a new metric, the Cluster-Score, for measuring the homogeneity of abnormalities. Specifically, the methodology involves transmuting unsupervised models into supervised models using the Cluster-Score metric, which defines the objective boundaries between clusters and evaluates the homogeneity of the abnormalities in the cluster construction. We apply this methodology to a problem of fraud detection among property insurance claims. The objectives are to increase the number of fraudulent claims detected and to reduce the proportion of claims investigated that are, in fact, non-fraudulent. Applying our methodology yields considerable improvements on both objectives.

Palacio, S. M. (GiM, XREAP)



Previous results show that gender diversity increases the probability that firms invest in R&D and engage in innovation. This paper explores the relationship between the gender diversity of R&D departments and their capacity to patent. Based on the Spanish Community Innovation Survey between 2004 and 2014, we apply a two-step procedure in order to control for endogeneity. Although gender diversity affects OEPM patents negatively, its impact is non-significant for patents with international coverage (EPO, USPTO, or PCT). A relevant result is that the generation of patents is positively affected by the diversity of professional categories in the R&D labs. Our results highlight that the gender diversity of R&D teams does not have a relevant impact on the capacity of the firm to register patents. However, diversity in the professional roles within R&D teams exerts a positive influence. In sum, the key question is not gender diversity per se but gender diversity jointly with professional status.

Teruel, M. (GRIT, XREAP); Segarra-Blasco, A. (GRIT, XREAP)