Reliability is closely intertwined with validity. On the one hand, reliability is a necessary, but not a sufficient, condition for validity. On the other hand, the pursuit of high reliability tends to come at the expense of validity, for example, when measurement procedures are oversimplified or annotation instructions are made overly strict. More often than not, reliability is treated as the primary criterion for the adequacy of a measurement. However, the concept of reliability is not clearly defined for predictions in NLP and data science. A multitude of metrics is commonly applied to measure the reliability of data annotation, and a different set of measures addresses the reliability of model predictions. The goal of this chapter is to provide a clear definition of reliability that applies to data annotation and model prediction alike, and that can be operationalized into a procedure for assessing the reliability of both in concrete applications in NLP and data science.