SLU publication database (SLUpub)

Research article | 2020 | Peer reviewed | Open access

Comparison of methods for predicting cow composite somatic cell counts

Anglart, Dorota; Hallen-Sandgren, Charlotte; Emanuelson, Ulf; Ronnegard, Lars


One of the most common and reliable ways of monitoring udder health and milk quality in dairy herds is by monthly cow composite somatic cell counts (CMSCC). However, such sampling can be time consuming, and more automated sampling tools entail extra costs. Machine learning methods for prediction have been widely investigated in mastitis detection research, and CMSCC is normally used as a predictor or gold standard in such models. Predicted CMSCC between samplings could supply important information and be used as an input for udder health decision-support tools. To our knowledge, methods to predict CMSCC are lacking. Our aim was to find a method to predict CMSCC by using regularly recorded quarter milk data such as milk flow or conductivity. The milk data were collected at the quarter level for 8 wk when milking 372 Holstein-Friesian cows, resulting in a data set of 30,734 records with information on 87 variables. The cows were milked in an automatic milking rotary and sampled once weekly to obtain CMSCC values. The machine learning methods chosen for evaluation were the generalized additive model (GAM), random forest, and multilayer perceptron (MLP). For each method, 4 models with different predictor variable setups were evaluated: models based on 7-d lagged or 3-d lagged records before the CMSCC sampling, each fitted with and without cow number as a predictor variable (cow number captures indirect information regarding a cow's overall level of CMSCC based on previous samplings). The methods were evaluated by 5-fold cross-validation and by predictions on future data, using models with the 4 different variable setups. The results indicated that GAM was the superior model, although MLP was equally good when fewer data were used. Information regarding the cows' level of previous CMSCC was shown to be important for prediction, lowering prediction error in both GAM and MLP. We conclude that the use of GAM or MLP for CMSCC prediction is promising.
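
The evaluation design described in the abstract (lagged predictors plus 5-fold cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration only: the helper names (`lagged_mean`, `k_fold_indices`, `fit_line`, `cv_rmse`) are hypothetical, the data are synthetic, a lagged record is interpreted here as the mean of one variable over the days preceding sampling, a simple least-squares line stands in for GAM, random forest, and MLP, and root mean squared error is used as the error measure. None of this is taken from the article itself.

```python
import math
import random
from statistics import mean


def lagged_mean(daily_values, sample_day, lag_days):
    """Mean of a daily variable over the lag_days days before sample_day.

    Assumed interpretation of a "3-d" or "7-d" lagged predictor.
    """
    window = daily_values[max(0, sample_day - lag_days):sample_day]
    return mean(window)


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (placeholder for GAM/RF/MLP)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def cv_rmse(xs, ys, k=5, seed=0):
    """Root mean squared prediction error pooled over k held-out folds."""
    sq_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_errors += [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
    return math.sqrt(mean(sq_errors))


# Synthetic example: 50 made-up cows, 7 days of one made-up milk variable each.
rng = random.Random(1)
x3, x7, y = [], [], []
for _ in range(50):
    level = rng.uniform(4.0, 6.0)                       # hypothetical cow level
    series = [level + rng.gauss(0, 0.3) for _ in range(7)]
    x3.append(lagged_mean(series, 7, 3))                # 3-d lagged predictor
    x7.append(lagged_mean(series, 7, 7))                # 7-d lagged predictor
    y.append(2.0 + 0.5 * level + rng.gauss(0, 0.1))     # made-up target

print("3-d lag CV RMSE:", cv_rmse(x3, y))
print("7-d lag CV RMSE:", cv_rmse(x7, y))
```

Swapping `fit_line` for another learner leaves the cross-validation loop unchanged, which is the sense in which several methods can be compared on the same folds and variable setups.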


Keywords: generalized additive model; multilayer perceptron; random forest; udder health

Published in

Journal of Dairy Science
2020, Volume: 103, Number: 9, Pages: 8433-8442
Publisher: ELSEVIER SCIENCE INC