Data exploration is the initial step in any data science project, where users explore a data set to uncover possible initial patterns,
characteristics, and points of interest. This process is not meant to be exhaustive, but rather to help create a broad picture of
important trends that will need to be studied in greater detail.
Some elements to look at:
- Source of the data: is it reliable?
- The data themselves: are there missing data? Are the values reasonable?
- Basic statistics: measures of central tendency and variation. Are there outliers?
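The checks listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The `temperatures` values are made-up sample data, `summarize` is a hypothetical helper name, and the 1.5 × IQR rule is one common convention for flagging outliers, not the only one.

```python
import statistics

# Hypothetical sample: daily temperatures in °C, with None marking missing readings.
temperatures = [21.5, 22.0, None, 20.8, 23.1, 95.0, 21.9, None, 22.4, 20.5]

def summarize(values):
    """Return the missing-value count, basic statistics, and IQR-rule outliers."""
    # Separate the observed values from the missing ones.
    present = [v for v in values if v is not None]

    # Quartiles give the median and the interquartile range (IQR).
    q1, median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    return {
        "n_missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": median,
        "stdev": statistics.stdev(present),
        # Values outside [low, high] are flagged for a closer look, not deleted.
        "outliers": [v for v in present if v < low or v > high],
    }

summary = summarize(temperatures)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the reading of 95.0 is flagged as an outlier. Whether it is a sensor error or a real event is exactly the kind of question exploration should raise for later study.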
A key element of data exploration is visualization!
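Even a crude plot can reveal the shape of a distribution at a glance. The sketch below builds a text histogram with the standard library only, so it runs anywhere; in practice a plotting library such as matplotlib would normally be used instead. The `scores` data and the `text_histogram` helper are hypothetical, chosen purely for illustration.

```python
from collections import Counter

# Hypothetical sample: exam scores out of 100.
scores = [55, 61, 64, 67, 68, 70, 71, 72, 72, 74, 75, 78, 81, 83, 98]

def text_histogram(values, bin_width=10):
    """Return one line per bin, with a '#' for each value falling in that bin."""
    # Map each value to the lower edge of its bin and count the occurrences.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    # Walk every bin between the smallest and largest, including empty ones.
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        bar = "#" * counts.get(start, 0)
        lines.append(f"{start:3d}-{start + bin_width - 1:<3d} | {bar}")
    return lines

histogram = text_histogram(scores)
print("\n".join(histogram))
```

The printed bars make it easy to see that most scores cluster in the 70s, with a single high value standing apart.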
Lecture Notes
Further Reading