We had a multi-month project to get a subset of our data considered 'clean', and it required a consultant, a stats PhD and many dev hours. It was healthcare, so on the high end of paranoia (justifiably) but nowhere it is as simple as dropping the "name" column
We had a multi-month project to get a subset of our data considered 'clean', and it required a consultant, a stats PhD and many dev hours. It was healthcare, so on the high end of paranoia (justifiably) but nowhere it is as simple as dropping the "name" column